Data Ingestion

Populating Your Collections with Data

This section walks you through populating your Vantage collections with documents. It covers how to prepare your data in Vantage's data ingestion format and how to use metadata fields to improve your search results.


Preparation Steps

Before you can ingest data, you need two things: an existing collection to ingest into, and data in the proper format. Follow these steps:

  1. Create a new collection either in the Console UI or via the API.
  2. Prepare your data according to the Vantage Ingest Format, as a JSONL (preferred) or Parquet file.
  3. Upload your data through the Console UI or via the API.

👍

Search Away

Once your data has been ingested into a Vantage collection, you can run searches against it.

Ingestion and Search

The document ids and the meta_ fields that you ingest into your collection are closely tied to how you can search your data. The Vantage Ingestion Format page explains how these fields affect your search.
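As a rough illustration, the sketch below shows what a single document record might look like and how its meta_ fields can be separated from the rest. Only the idea of a document id and the meta_ prefix come from this page; the key names `id` and `text`, the example meta_ keys, and the helper `split_meta` are assumptions made for this example. The Vantage Ingestion Format page remains the authoritative reference.

```python
from typing import Any

# Hypothetical document record. The exact key names are assumptions;
# only the "meta_" prefix and the notion of a document id come from this page.
doc: dict[str, Any] = {
    "id": "product-001",
    "text": "Waterproof hiking boots with ankle support",
    "meta_category": "footwear",
    "meta_color": "brown",
}


def split_meta(record: dict[str, Any]) -> tuple[dict[str, Any], dict[str, Any]]:
    """Separate meta_ fields from the remaining fields of a record."""
    meta = {k: v for k, v in record.items() if k.startswith("meta_")}
    core = {k: v for k, v in record.items() if not k.startswith("meta_")}
    return core, meta


core, meta = split_meta(doc)
print(core["id"])    # product-001
print(sorted(meta))  # ['meta_category', 'meta_color']
```

Keeping metadata under a common prefix makes it easy to check, before upload, which fields will be available for narrowing down search results.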

Different Data Ingestion Formats

JSONL Format (preferred)

The Vantage JSONL format is used to bulk upload your data via the Console UI and via the direct upload links available in the REST API. You can take a closer look at the Vantage JSONL format on its documentation page.
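For orientation, here is a minimal sketch of producing a JSONL file with the Python standard library: one JSON object per line. The record keys (`id`, `text`, `meta_lang`), the file name `documents.jsonl`, and the helpers `write_jsonl` and `read_jsonl` are hypothetical; check the Vantage JSONL format documentation for the fields Vantage actually expects.

```python
import json
import os
import tempfile

# Hypothetical records; the field names are assumptions for illustration.
records = [
    {"id": "1", "text": "First document", "meta_lang": "en"},
    {"id": "2", "text": "Second document", "meta_lang": "de"},
]


def write_jsonl(rows, path):
    """Write one JSON object per line (the JSON Lines convention)."""
    with open(path, "w", encoding="utf-8") as f:
        for row in rows:
            f.write(json.dumps(row, ensure_ascii=False) + "\n")


def read_jsonl(path):
    """Read a JSONL file back into a list of dicts, skipping blank lines."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


# Write to a temporary directory and read the file back as a sanity check.
path = os.path.join(tempfile.mkdtemp(), "documents.jsonl")
write_jsonl(records, path)
round_trip = read_jsonl(path)

# Document ids should be unique within the file.
ids = [row["id"] for row in round_trip]
assert len(ids) == len(set(ids)), "duplicate document ids"
```

Reading the file back before uploading is a cheap way to catch malformed lines or duplicate ids early.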

Parquet Format

The Vantage Parquet format is used to bulk upload your data via the Console UI and via the direct upload links available in the REST API. You can take a closer look at the Vantage Parquet format on its documentation page.