JSONL Format
The Vantage JSONL format is used to bulk-upload data through the Console UI and through the direct upload links available in the REST API. The Upload Documents API uses the same format.
This page describes the schema and format of documents for ingestion.
Required Fields
id: Physical type is BYTE_ARRAY, logical type is String
text: Physical type is BYTE_ARRAY, logical type is String
embeddings: Physical type is a list of DOUBLE
Optional Fields
operation: Specifies the action to be performed on the document. The available options are update, add, and delete.
meta_[...]: Supports querying and filtering; also used for Facets.
meta_ordered_[...]: Supports sorting.
variants: Describes variants of a document.
Typically, a document has only one of text or embeddings. For more details, see the Vantage Documents page.
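The field rules above can be checked locally before uploading. The following is a minimal sketch, not part of the Vantage SDK: the names validate_jsonl_line and is_number_list are hypothetical, and the rule that a document needs at least one of text or embeddings is an assumption based on the note above. Requirements that may differ per operation (for example, for delete) are not covered.

```python
import json

# Operation values listed in the Optional Fields section above.
ALLOWED_OPERATIONS = ("update", "add", "delete")


def is_number_list(value):
    """True if value is a list containing only ints or floats (bools excluded)."""
    return isinstance(value, list) and all(
        isinstance(x, (int, float)) and not isinstance(x, bool) for x in value
    )


def validate_jsonl_line(line):
    """Return a list of problems found in one JSONL line (empty if none)."""
    try:
        doc = json.loads(line)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg}"]
    if not isinstance(doc, dict):
        return ["document must be a JSON object"]

    problems = []
    if not isinstance(doc.get("id"), str):
        problems.append("id must be a string")
    if "text" in doc and not isinstance(doc["text"], str):
        problems.append("text must be a string")
    if "embeddings" in doc and not is_number_list(doc["embeddings"]):
        problems.append("embeddings must be a list of numbers")
    # Assumption: at least one of text or embeddings must be present.
    if "text" not in doc and "embeddings" not in doc:
        problems.append("expected text or embeddings")
    if "operation" in doc and doc["operation"] not in ALLOWED_OPERATIONS:
        problems.append("operation must be one of update, add, delete")
    return problems


print(validate_jsonl_line('{"id": "1", "text": "Example text", "meta_color": "green"}'))
print(validate_jsonl_line('{"id": 2, "operation": "merge"}'))
```

Returning a list of problems rather than raising on the first one makes it easier to report every issue in a large file in a single pass.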
JSONL Documents
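Examples of correctly prepared JSONL data are below. The first two documents contain only text; the last two are the same documents with embeddings included: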
{"id": "1", "text": "Example text", "meta_color": "green", "variants": [{"id": "size-xl", "size": "XL"}, {"id": "size-l", "size": "L"}]}
{"id": "2", "text": "Sample text", "meta_color": "blue", "variants": [{"id": "size-m", "size": "M"}, {"id": "size-s", "size": "S"}]}
{"id": "1", "text": "Example text", "meta_color": "green", "embeddings": [1,2,3, ...], "variants": [{"id": "size-xl", "size": "XL"}, {"id": "size-l", "size": "L"}]}
{"id": "2", "text": "Sample text", "meta_color": "blue", "embeddings": [4,5,6, ...], "variants": [{"id": "size-m", "size": "M"}, {"id": "size-s", "size": "S"}]}
Note: The meta_color and variants fields are optional.
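If the documents start out as Python dictionaries, the JSONL text can be generated rather than written by hand. The sketch below uses only the standard json module; the helper name to_jsonl is hypothetical and the sample documents mirror the examples above.

```python
import json


def to_jsonl(documents):
    """Serialize a list of dicts as JSONL: one JSON object per line."""
    # json.dumps without an indent never emits a raw newline, so each
    # document is guaranteed to stay on a single line.
    return "\n".join(json.dumps(doc, ensure_ascii=False) for doc in documents)


docs = [
    {"id": "1", "text": "Example text", "meta_color": "green"},
    {"id": "2", "text": "Sample text", "meta_color": "blue"},
]

documents_jsonl = to_jsonl(docs)
print(documents_jsonl)
```

Generating the lines with json.dumps also takes care of escaping quotes and newlines inside text values, which is easy to get wrong when assembling the string manually.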
Once your data is ready, you can upload it with the Vantage Python SDK:
from vantage_sdk import VantageClient

# VANTAGE_API_KEY and ACCOUNT_ID are placeholders for your own credentials
vantage_instance = VantageClient.using_vantage_api_key(
    vantage_api_key=VANTAGE_API_KEY,
    account_id=ACCOUNT_ID,
)

# JSONL data from above: one JSON document per line, separated by "\n"
documents = "\n".join([
    '{"id": "1", "text": "Example text", "meta_color": "green", "embeddings": [1,2,3, ...]}',
    '{"id": "2", "text": "Sample text", "meta_color": "blue", "embeddings": [4,5,6, ...]}',
])

vantage_instance.upsert_documents_from_jsonl_string(
    collection_id="example-collection",
    documents_jsonl=documents,
)

# Alternatively, upload the documents from a JSONL file on disk
from vantage_sdk import VantageClient

vantage_instance = VantageClient.using_vantage_api_key(
    vantage_api_key=VANTAGE_API_KEY,
    account_id=ACCOUNT_ID,
)

# Proper JSONL data written in file
documents_path = "my_documents.jsonl"

vantage_instance.upsert_documents_from_jsonl_file(
    collection_id="example-collection",
    jsonl_file_path=documents_path,
)
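The file passed to upsert_documents_from_jsonl_file has to exist first. As a rough sketch of one way to produce it with the standard library only, the helpers below write one JSON object per line and read the file back to confirm that it parses. The names write_jsonl and read_jsonl are hypothetical and are not part of the Vantage SDK.

```python
import json


def write_jsonl(documents, path):
    """Write one JSON object per line and return the number of documents written."""
    with open(path, "w", encoding="utf-8") as f:
        for doc in documents:
            f.write(json.dumps(doc, ensure_ascii=False) + "\n")
    return len(documents)


def read_jsonl(path):
    """Parse a JSONL file back into a list of dicts, skipping blank lines."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]
```

Reading the file back with read_jsonl before uploading is a cheap way to catch a document that was accidentally split across lines.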