A collection is the fundamental object in the Vantage platform for organizing, managing, and searching your data sets. Your data records, called documents, are ingested into a collection, and your search queries run against a collection. Collections currently support text data, with support for other data types coming soon.
When creating a collection, you give it an ID and a name, and you specify parameters for the AI model that will embed your collection data. You can create many collections in your account to keep the data sets you search against separate.
The collection ID is what you pass to our API or Console when you upload and search your data. It tells the Vantage platform which of your collections to use. The ID must be unique within your account and must not collide with the ID of any active or deleted collection. Collection IDs follow a few rules:
- Characters: the collection ID can only contain lowercase letters [a-z], digits [0-9], and hyphens [-]
- Length: the maximum length for a collection ID is 36 characters
- Immutable: the collection ID cannot be changed after the collection is created
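The character and length rules above can be checked locally before you try to create a collection. The following is a minimal sketch, not part of the Vantage API: the function name `is_valid_collection_id` is hypothetical, and the one-character minimum length is an assumption, since only the maximum is documented. Uniqueness against active and deleted IDs can only be verified by the platform itself.

```python
import re

# Documented rules: lowercase letters, digits, and hyphens; at most 36 characters.
# The minimum of 1 character is an assumption, not a documented rule.
_COLLECTION_ID_PATTERN = re.compile(r"[a-z0-9-]{1,36}")


def is_valid_collection_id(collection_id: str) -> bool:
    """Return True if the ID satisfies the character and length rules."""
    # fullmatch ensures the whole string, not just a prefix, fits the pattern.
    return _COLLECTION_ID_PATTERN.fullmatch(collection_id) is not None


if __name__ == "__main__":
    for candidate in ("product-catalog-2024", "Product_Catalog", "a" * 37):
        print(candidate, is_valid_collection_id(candidate))
```

Because the ID is immutable, a check like this is most useful before creation, when a rejected ID is still cheap to change.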
A collection name is an easy, descriptive way to identify your collections in the Console. Collection names follow a few rules:
- Length: the maximum length for a collection name is 256 characters
- Mutable: the collection name can be changed after the collection is created
By far the most common case is to have the Vantage platform manage the translation of your data to AI embeddings. This means that during ingestion and search, the Vantage platform will automate the translation of your data and search queries into embeddings to support semantic search. We call this Vantage Managed Embeddings (VME).
For VME collections you will be required to enter:
- LLM provider: OpenAI or Hugging Face (coming soon)
- LLM model: Select or enter the name of the model that you'll use from your LLM provider.
- LLM API key: Your LLM provider API key. The Vantage platform securely stores and uses this key on your behalf to embed your data and your search queries.
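As an illustration, the three VME inputs above can be gathered and sanity-checked before you submit them. This is only a sketch under stated assumptions: the function `vme_setting_problems` and the lowercase provider string `"openai"` are hypothetical and are not Vantage API names, and the sketch accepts only OpenAI because Hugging Face is listed as coming soon.

```python
import os
from typing import List

# Hypothetical identifier for the one provider currently listed as available.
AVAILABLE_PROVIDERS = {"openai"}


def vme_setting_problems(provider: str, model: str, api_key: str) -> List[str]:
    """Return a list of human-readable problems; an empty list means the inputs look complete."""
    problems = []
    if provider.lower() not in AVAILABLE_PROVIDERS:
        problems.append(f"unsupported LLM provider: {provider!r}")
    if not model.strip():
        problems.append("LLM model name is empty")
    if not api_key.strip():
        problems.append("LLM API key is empty")
    return problems


if __name__ == "__main__":
    # Read the key from the environment rather than hard-coding it in source.
    key = os.environ.get("OPENAI_API_KEY", "")
    print(vme_setting_problems("openai", "text-embedding-ada-002", key))
```

Reading the key from an environment variable keeps it out of source control. The variable name `OPENAI_API_KEY` is a common convention rather than a Vantage requirement.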
A less common, but supported, option is for you to upload embeddings from the LLM of your choice into your collection. We call this User Provided Embeddings (UPE). When creating a collection with UPE, no additional LLM configuration is necessary.
In this mode, instead of uploading text data, you embed your data yourself (text, images, and so on) and send the embeddings to the Vantage platform. You must also provide the embedding for every search query sent to the Vantage platform. The platform supports embedding dimensions up to 2048. If you need higher dimensions, please contact support.
The semantic search by text query endpoint is not available for UPE collections.
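Since UPE leaves the embedding step to you, it can help to check each vector before sending it, whether it belongs to a document or to a search query. The sketch below is an assumption-laden illustration, not Vantage code: the function name `embedding_problem` is hypothetical, and the finite-value check is an added sanity check rather than a documented rule. It also assumes documents and queries in one collection share a single dimension, as comparing vectors for similarity requires.

```python
import math
from typing import Sequence

MAX_EMBEDDING_DIM = 2048  # documented upper limit for UPE collections


def embedding_problem(vector: Sequence[float], expected_dim: int) -> str:
    """Return an empty string if the vector looks usable, otherwise a short reason."""
    if expected_dim > MAX_EMBEDDING_DIM:
        return f"dimension {expected_dim} exceeds the limit of {MAX_EMBEDDING_DIM}"
    if len(vector) != expected_dim:
        return f"expected {expected_dim} values, got {len(vector)}"
    # Assumption: NaN or infinite components are treated as errors.
    if not all(math.isfinite(x) for x in vector):
        return "vector contains NaN or infinite values"
    return ""


if __name__ == "__main__":
    print(repr(embedding_problem([0.1] * 1536, 1536)))
    print(repr(embedding_problem([0.1] * 4096, 4096)))
```

Running the same check on query embeddings as on document embeddings catches the case where the two were produced by different models with different output sizes.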