Vantage supports several parameters that are common to all search endpoints:

Collection ID: The only required field, specifying the collection to search within.
Accuracy: Defines the accuracy threshold for the search results.
Filter: Allows for narrowing down search results based on specific criteria.
Facets: Categorizes documents based on common attribute values.
Sort: Determines the sorting order of the search results.
Pagination: Controls the pagination settings for navigating through search results.
Weighted Field Values: Applies specific weights to certain fields to influence search relevance.

Required Parameters

Collection Identification (required)

An account can hold many collections, each with different types and compositions of data.

To tell the Vantage platform which collection in your account to search, provide account_id and collection_id as part of the endpoint path.

account_id: The Vantage account ID that the collection is contained within. This can be found in the Console UI and it is typically your company or organization name.
collection_id: The unique identifier of the collection you are searching. You specified this ID when you created the collection. This can be found in the Console UI or by API request.

💻
SDK Usage

If you are accessing the Vantage platform through one of our SDKs, Account ID can be provided during the client initialization process, while Collection ID can be provided during the method call.

from vantage_sdk import VantageClient

vantage_instance = VantageClient.using_vantage_api_key(
    vantage_api_key=VANTAGE_API_KEY,
    account_id=ACCOUNT_ID,
)

vantage_instance.semantic_search(
    collection_id="example-collection",
    text="some query text"
)

import { VantageClientConfiguration, VantageClient } from "@vantage-sdk";

const configuration: VantageClientConfiguration = {
    vantageApiKey: vantageApiKey,
    accountId: accountId,
};
const client = new VantageClient(configuration);

const collectionId = "example-collection"
const queryText = "Example query"

const searchResults = client.semanticSearch(collectionId, queryText);

import com.vantagediscovery.sdk.VantageClient;

public static void main(String[] args) {
    final VantageClient client = VantageClient.usingVantageApiKey()
        .withAccountId(ACCOUNT_ID)
        .withVantageApiKey(VANTAGE_API_KEY)
        .build();
}

Additional Optional Parameters

Accuracy

The Vantage platform lets you tune the recall of every search query by controlling how much of your collection data is searched. Generally, a lower accuracy number gives great results with exceptional speed (tens of milliseconds). A higher accuracy number may provide additional or better results, but takes longer to process (one to three seconds).

collection.accuracy: A number between 0.001 and 1.000 that tells the Vantage platform how much of the collection to search across. A higher number will search across more of the collection but take longer. If unsure, a good place to start is 0.2.

{
  ...
  "collection": {
    "accuracy" : 0.2
    ...
  }
  ...
}

{
  ...
  "collection": {
    "accuracy" : 0.5
    ...
  }
  ...
}
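The accuracy fragment above can also be built and range-checked in code before a request is sent. The following is a minimal sketch using only the Python standard library; the helper name accuracy_body is hypothetical and is not part of any Vantage SDK.

```python
import json


def accuracy_body(accuracy: float) -> dict:
    """Build the request-body fragment for the documented accuracy range.

    Hypothetical helper for illustration; not part of the Vantage SDK.
    """
    if not 0.001 <= accuracy <= 1.0:
        raise ValueError("accuracy must be between 0.001 and 1.000")
    return {"collection": {"accuracy": accuracy}}


# 0.2 is the starting point suggested in the text above.
print(json.dumps(accuracy_body(0.2)))
```

Validating on the client side simply surfaces an out-of-range value before the request leaves your application.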

💻
SDK Usage

If you are accessing the Vantage platform through one of our SDKs, accuracy can be provided during the method call.

...

vantage_instance.semantic_search(
    collection_id="example-collection",
    text="some query text",
    accuracy=0.15
)

...

...

const collectionId = "example-collection"
const queryText = "Example query"
const accuracy = 0.15

const searchResults = client.semanticSearch(collectionId, queryText, accuracy);

...


final String collectionId = "example-collection";

final SearchResult result = client.search()
    .collection(collectionId)
    .semantic()
    .withSearchProperties(
        CommonSearchProperties
           .builder()
           .withAccuracy(BigDecimal.valueOf(0.15))
           .build()
    )
    .withSearchText("Example query")
    .execute();

vantage search-semantic --vantage-api-key 'API_KEY' --text "Example query" --accuracy 0.15 example-collection

Filtering

Filters enable your collection's ingested features or categorical data to be used in conjunction with semantic similarity search. For example, in an e-commerce product catalog search, a shopper may only want product results within a single category, brand, size or color.

Document and variant fields are indexed separately, and therefore the caller must specify filters for document fields and variant fields separately using filter.boolean_filter and filter.variant_filter, respectively.

Filters can be applied via structured JSON, or via a single string using parentheses to group AND, OR, and NOT clauses. Note that in order to use range filters, you must use the structured JSON approach.

Structured JSON

A tree of JSON objects specifies the filter through a combination of field/value specifications, field/range specifications, and AND, OR, and NOT nodes:

field:value: Limits results based on exact matching to a document or variant field provided during ingestion. Both field and value are case sensitive.
field:[value_1, value_2, ..., value_n]: Limits results based on exact, case-sensitive matching to one of the specified values.
field:{from: value_1, to: value_2}: Limits results to those with a value for field within the range of value_1 (inclusive) and value_2 (exclusive).
and: Combines an array of filter clauses with an AND. A document or variant must match all of these clauses to be returned.
or: Combines an array of filter clauses with an OR. A document or variant must match one or more of these clauses to be returned.
not: Combines an array of filter clauses with a NOT. A document or variant must match none of these clauses to be returned.

# product_category was ingested as meta_product_category
{ "product_category":"Fashion" }

{ 
  "and": 
  [
    { "product_category": "Fashion" },
    { "product_brand": "Brand XYZ" }
  ]
}

{ "product_category": ["Fashion", "Clothing"] }

{ "price": { "from": 50.0, "to": 100.0 } }

{ 
  "not": 
  [
    { "content_rating": "TV-14" }
  ]
}

{ 
  "and": 
  [
    { "product_category": ["Fashion", "Clothing"] },
    { "product_brand": "Brand XYZ" }
  ]
}
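Structured filters like the ones above are plain nested JSON, so they can be composed from small functions instead of being written by hand. The sketch below uses only the Python standard library; every helper name (field_equals, field_in, field_range, all_of, any_of, none_of) is hypothetical and chosen for illustration.

```python
import json


def field_equals(field, value):
    """Exact, case-sensitive match on a single value."""
    return {field: value}


def field_in(field, values):
    """Match any one of several values."""
    return {field: list(values)}


def field_range(field, start, end):
    """Range match: start is inclusive, end is exclusive."""
    return {field: {"from": start, "to": end}}


def all_of(*clauses):
    return {"and": list(clauses)}


def any_of(*clauses):
    return {"or": list(clauses)}


def none_of(*clauses):
    return {"not": list(clauses)}


# Rebuilds the last example above: (Fashion OR Clothing) AND Brand XYZ.
combined = all_of(
    field_in("product_category", ["Fashion", "Clothing"]),
    field_equals("product_brand", "Brand XYZ"),
)
print(json.dumps(combined, indent=2))
```

Building the tree from functions keeps brackets balanced automatically, which is the most common mistake when nesting clauses by hand.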

Single Filter String

Either an empty string (no filters) or a boolean clause that filters the results while the Vantage platform scores for semantic similarity. The string is composed of:

field:"value": Limits results based on exact matching to a meta_ field provided during ingestion. Both field and value are case sensitive.
Combinations of these limits, joined with AND and OR.
Parentheses ( and ), which group clauses so that filters can be nested into trees of complex filters.
NOT, which negates the filter it precedes.
Note: Range filters currently cannot be expressed in the filter string. Please use the Structured JSON approach if you need range filters.

# product_category was ingested as meta_product_category
product_category:"Fashion"

product_brand:"Brand XYZ"

(product_category:"Fashion" AND product_brand:"Brand XYZ")

(product_category:"Fashion" OR product_category:"Clothing")

NOT content_rating:"TV-14"

(
  (product_category:"Fashion" OR product_category:"Clothing")
  AND 
  product_brand:"Brand XYZ"
)

Both boolean_filter and variant_filter are sent in JSON, so the quotes (") in a filter string must be escaped in the JSON request. Most JSON libraries do this automatically when you serialize an object whose string values contain quotes.

{
  "filter": {
    "boolean_filter": "((product_category:\"Fashion\" OR product_category:\"Clothing\") AND product_brand:\"Brand XYZ\")"
  }
}
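To illustrate the automatic escaping described above, the following sketch serializes a filter string with Python's standard json module. Only the filter and boolean_filter keys come from this page; the rest is illustrative.

```python
import json

# The filter string as you would write it, with plain double quotes.
filter_string = (
    '((product_category:"Fashion" OR product_category:"Clothing") '
    'AND product_brand:"Brand XYZ")'
)

body = {"filter": {"boolean_filter": filter_string}}

# json.dumps escapes the embedded quotes as \" in the serialized request body.
encoded = json.dumps(body)
print(encoded)

# Decoding the body returns the original, unescaped filter string.
decoded = json.loads(encoded)["filter"]["boolean_filter"]
```

Because the library handles escaping, the filter string should be written unescaped in application code and escaped only once, at serialization time.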

💻
SDK Usage

If you are accessing the Vantage platform through one of our SDKs, filter options can be provided during the method call, using the Filter object.

from vantage_sdk.model.search import Filter

...

filter_options = Filter(
    boolean_filter='(product_category:"Fashion" AND product_brand:"Brand XYZ")',
    variant_filter='(color:"Black" OR color:"Brown")',
)

vantage_instance.semantic_search(
    collection_id="example-collection",
    text="some query text",
    filter=filter_options,
)

...

const collectionId = "example-collection"
const queryText = "Example query"

const filterOptions = new Filter(
    '(product_category:"Fashion" AND product_brand:"Brand XYZ")', // boolean filter
    '(color:"Black" OR color:"Brown")' // variant filter
)

client.semanticSearch(
    collectionId,
    queryText, 
    undefined, // accuracy
    undefined, // pagination
    filterOptions
)

...


final String collectionId = "example-collection";

final SearchResult result = client.search()
    .collection(collectionId)
    .semantic()
    .withSearchProperties(
        CommonSearchProperties
           .builder()
           .withFilter(
               Filter.builder()
                   .withBooleanFilter("(product_category:\"Fashion\" AND product_brand:\"Brand XYZ\")")
                   .withVariantFilter("(color:\"Black\" OR color:\"Brown\")")
                   .build()
           )
           .build()
    )
    .withSearchText("Example query")
    .execute();

vantage search-semantic --vantage-api-key API_KEY --text "Example query" --boolean-filter '(product_category:"Fashion" AND product_brand:"Brand XYZ")' --variant-filter '(color:"Black" OR color:"Brown")' example-collection

Facets

Facets enable categorization of search results based on common attribute values. Facets can be used in conjunction with filters to provide a user experience to progressively narrow down search results.

In the request, the caller specifies which attribute dimensions should be summarized (e.g. brand or price). These must be fields that were indexed with a meta_ prefix.

For each attribute dimension, the results will contain a summary of the unique values for that attribute, and the count of documents in the result set that have that attribute value.

facets: An array of objects containing name, type, and ranges fields. The name represents the field name to be faceted (upserted during ingestion as meta_<name>). The type is an enum that defines whether we want to specify a concrete value (count) or a range of values (range). The ranges parameter is an array of ranges, which can be open-ended and overlap. The min value is included in the range and the max value is excluded.

{
  "facets": [
    {
        "name": "color", "type": "count"
    },
    {
        "name": "price", "type": "range", "ranges": [
          { 
              "value": "under_50", "max": 50.0
          },
          {
              "value": "50_to_100", "min": 50.0, "max": 100.0
          },
          {
              "value": "100_and_up", "min": 100.0
          } 
      ]
    }
  ]
}
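The range semantics described above (min included, max excluded, open ends allowed) can be illustrated with a small client-side check. This is only a sketch of the documented boundary rule, not the platform's implementation; the function names are hypothetical.

```python
from typing import List, Optional


def in_facet_range(x: float, minimum: Optional[float] = None,
                   maximum: Optional[float] = None) -> bool:
    """Return True if x falls in [minimum, maximum); a missing bound is open-ended."""
    if minimum is not None and x < minimum:
        return False
    if maximum is not None and x >= maximum:
        return False
    return True


# The same ranges as in the request example above.
price_ranges = [
    {"value": "under_50", "max": 50.0},
    {"value": "50_to_100", "min": 50.0, "max": 100.0},
    {"value": "100_and_up", "min": 100.0},
]


def matching_ranges(x: float, ranges: List[dict]) -> List[str]:
    """List the labels of every range containing x (ranges may overlap)."""
    return [r["value"] for r in ranges
            if in_facet_range(x, r.get("min"), r.get("max"))]


print(matching_ranges(50.0, price_ranges))
```

A price of exactly 50.0 is counted in 50_to_100 and not in under_50, because the max bound is exclusive.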

💻
SDK Usage

If you are accessing the Vantage platform through one of our SDKs, facet options can be provided during the method call, using the Facet object.

from vantage_sdk.model.search import Facet, FacetType

...

facets = [
    Facet(
        name="color",
        type=FacetType.COUNT,
    ),
    Facet(
        name="size",
        type=FacetType.COUNT,
        values=["sm", "md"],
    ),
]

vantage_instance.semantic_search(
    collection_id="example-collection",
    text="some query text",
    facets=facets,
)

...

const collectionId = "example-collection"
const queryText = "Example query"

const facets = [
    new Facet("color", FacetTypeEnum.Count),
    new Facet("size", FacetTypeEnum.Count, ["sm", "md"]),
]

client.semanticSearch(
    collectionId,
    queryText, 
    undefined, // accuracy
    undefined, // pagination
    undefined, // filter
    undefined, // sort
    undefined, // field value weighting
    facets,
)

...


final SearchResult result = client
    .search()
    .collection("example-collection")
    .semantic()
    .withSearchProperties(
        CommonSearchProperties
            .builder()
            .withAccuracy(BigDecimal.ONE)
            .withFacets(List.of(
                 Facet.countAllFacet("color"),
                 Facet.countValuesFacet("size", List.of("sm", "md", "lg"))
            ))
            .build()
    )
    .withSearchText("test search")
    .execute();

vantage search-semantic --vantage-api-key API_KEY --text "Example query" --facets '[ { "name": "color", "type": "count", "values": [] }, { "name": "size", "type": "count", "values": [ "sm", "md", "lg" ] } ]' example-collection

Sort

To enable sorting of your search results, follow the steps outlined below:

Data Ingestion: When ingesting your data, ensure that the column names intended for sorting have the prefix meta_ordered_. This prefix differentiates sortable columns from other metadata fields, which typically use the prefix meta_. For instance, if you wish to sort by price, name the column meta_ordered_price.

Value Type Restriction: Values provided for the meta_ordered_ columns must be of type float.

Executing a Search: During your search query, refer to the field by its base name without the prefix. For example, use price to sort by the previously defined meta_ordered_price column.

field: The name of the field by which search results are sorted. In the example above, price is the field value to use when you want to order search results by price.
order: Specifies the direction in which search results are ordered. It can be either ascending (asc) to sort from lowest to highest values, or descending (desc) to sort from highest to lowest values. The default sorting order is descending (desc).
mode: Indicates the criteria used for sorting search results. Options include field_selection, which orders results based on the values of field, and semantic_threshold, which sorts results based on their relevance or similarity to the search query. The default sorting mode is field_selection.

{
  "sort": {
    "field": "price",
    "order": "asc",
    "mode": "field_selection"
  }
}

{
  "sort": {
    "field": "price",
    "order": "desc",
    "mode": "semantic_threshold"
  }
}
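The naming rule above (ingest as meta_ordered_<name>, query as <name>) is easy to get wrong, so it can help to derive the sort object from the ingestion column name. The sketch below uses the documented defaults (desc and field_selection); the helper name sort_options is hypothetical.

```python
ORDERED_PREFIX = "meta_ordered_"
VALID_ORDERS = ("asc", "desc")
VALID_MODES = ("field_selection", "semantic_threshold")


def sort_options(column: str, order: str = "desc",
                 mode: str = "field_selection") -> dict:
    """Build a sort fragment from an ingestion column name.

    Hypothetical helper: strips the meta_ordered_ prefix so the query
    refers to the field by its base name.
    """
    if not column.startswith(ORDERED_PREFIX):
        raise ValueError(f"sortable columns must start with {ORDERED_PREFIX!r}")
    if order not in VALID_ORDERS:
        raise ValueError(f"order must be one of {VALID_ORDERS}")
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {VALID_MODES}")
    return {"sort": {"field": column[len(ORDERED_PREFIX):],
                     "order": order, "mode": mode}}


print(sort_options("meta_ordered_price", order="asc"))
```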

💻
SDK Usage

If you are accessing the Vantage platform through one of our SDKs, sort options can be provided during the method call, using the Sort object.

from vantage_sdk.model.search import Sort

...

sort_options = Sort(
    field="price",
    order="asc",
    mode="field_selection",
)

vantage_instance.semantic_search(
    collection_id="example-collection",
    text="some query text",
    sort=sort_options,
)

...

const collectionId = "example-collection"
const queryText = "Example query"

const sortOptions = new Sort(
    "price", // field
    "asc", // order
    "field_selection" // mode
)

client.semanticSearch(
    collectionId,
    queryText, 
    undefined, // accuracy
    undefined, // pagination
    undefined, // filter
    sortOptions
)

...


final String collectionId = "example-collection";

final SearchResult result = client.search()
    .collection(collectionId)
    .semantic()
    .withSearchProperties(
        CommonSearchProperties
           .builder()
           .withAccuracy(BigDecimal.valueOf(0.15))
           .withSort(new Sort("price", Sort.SortOrderType.ASC, Sort.SortModeType.FIELD_SELECTION))
           .build()
    )
    .withSearchText("Example query")
    .execute();

vantage search-semantic --vantage-api-key API_KEY --text "Example query" --accuracy 0.15 --sort-field price --sort-order asc --sort-mode field_selection example-collection

Pagination

Pagination lets you control which results you receive within the larger set of results. You can call the endpoint repeatedly to page your results, requesting batches of results up to a total of 1,000 results.

pagination.page: A number, starting at 0, that indicates the page of results to return, where each page is of size pagination.count.
pagination.count: The number of results to return for this request. Must be greater than 0.
pagination.threshold: Determines the "pool" of records to match before sorting. Must be lower than 10,000.

{
  ...
  "pagination": {
    "page": 0,
    "count": 40
  }
  ...
}

{
  ...
  "pagination": {
    "page": 1,
    "count": 40
  }
  ...
}

{
  ...
  "pagination": {
    "page": 0,
    "count": 40,
    "threshold": 300
  }
  ...
}

{
  ...
  "pagination": {
    "page": 0,
    "count": 40,
    "threshold": 5000
  }
  ...
}
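Paging through results amounts to repeating the request with an increasing page number. The following sketch generates the pagination fragments for successive calls; it conservatively assumes that every page must fit entirely within the 1,000-result total mentioned above, and the helper name page_requests is hypothetical.

```python
from typing import Iterator

MAX_TOTAL_RESULTS = 1000  # total result cap stated in the text above


def page_requests(count: int,
                  max_total: int = MAX_TOTAL_RESULTS) -> Iterator[dict]:
    """Yield pagination fragments for page 0, 1, 2, ... within the cap.

    Assumption: a page is only requested if it fits fully under max_total.
    """
    if count <= 0:
        raise ValueError("count must be greater than 0")
    page = 0
    while (page + 1) * count <= max_total:
        yield {"pagination": {"page": page, "count": count}}
        page += 1


pages = list(page_requests(40))
print(pages[0], pages[-1])
```

With a page size of 40, this yields pages 0 through 24, which together cover exactly 1,000 results.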

🚧
Result order determinism
The overall search result set for a given query may change for a variety of reasons between requests. While it's very likely that the next page of results will begin on the precise next result from the overall set, it's possible that new content being ingested into the collection may alter the overall result set.

💻
SDK Usage

If you are accessing the Vantage platform through one of our SDKs, pagination options can be provided during the method call, using the Pagination object.

from vantage_sdk.model.search import Pagination

...

pagination_options = Pagination(
    page=0,
    count=40,
    threshold=300,
)

vantage_instance.semantic_search(
    collection_id="example-collection",
    text="some query text",
    pagination=pagination_options,
)

...

const collectionId = "example-collection"
const queryText = "Example query"

const paginationOptions = new Pagination(
    0, // page
    40, // count
    300 // threshold
)

client.semanticSearch(
    collectionId,
    queryText, 
    undefined, // accuracy
    paginationOptions,
)

...


final String collectionId = "example-collection";

final SearchResult result = client.search()
    .collection(collectionId)
    .semantic()
    .withSearchProperties(
        CommonSearchProperties
           .builder()
           .withAccuracy(BigDecimal.valueOf(0.15))
           .withSort(new Sort("price", Sort.SortOrderType.ASC, Sort.SortModeType.FIELD_SELECTION))
           .withPagination(
               Pagination.builder()
                   .withPage(0)
                   .withCount(40)
                   .withThreshold(300)
                   .build()
           )
           .build()
    )
    .withSearchText("Example query")
    .execute();

vantage search-semantic --vantage-api-key API_KEY --text "Example query" --accuracy 0.15 --page 0 --items-per-page 40 --pagination-threshold 300 example-collection

Request ID

To enable asynchronous calls to the search endpoints, an identifier is included in the request which is then returned with the results.

request_id: An integer that will be returned with the results. It should be unique across all in-progress calls to any search endpoint.
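One way to keep request_id unique across in-progress calls is a process-wide counter plus a table of pending requests. The sketch below is illustrative only: it assumes request_id sits at the top level of the request body and is echoed under the same key in the response, and the names tag_request and match_response are hypothetical.

```python
import itertools

_next_id = itertools.count(1)  # monotonically increasing integers
pending = {}                   # request_id -> request body still in flight


def tag_request(body: dict) -> dict:
    """Return a copy of body with a fresh request_id, and remember it."""
    request_id = next(_next_id)
    tagged = {**body, "request_id": request_id}
    pending[request_id] = tagged
    return tagged


def match_response(response: dict) -> dict:
    """Look up (and forget) the request that a response belongs to."""
    return pending.pop(response["request_id"])


first = tag_request({"text": "red shoes"})
second = tag_request({"text": "blue hats"})

# Responses may arrive out of order; the id ties each one to its request.
original = match_response({"request_id": second["request_id"], "results": []})
print(original["text"])
```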

Field Value Weighting

Keyword Support

If you are using Vantage Managed embeddings, the text field is processed during ingestion to support a straightforward keyword boosting method for search. The following two fields use the extracted tokens to boost direct keyword matches on top of the core semantic matching score. This is useful if you want to add a small amount of keyword help to the existing semantic search, so that direct and long-tail phrases from your users are well represented in the initial results.

field_value_weighting.query_key_word_max_overall_weight: A number that sets the largest change in score that keyword or phrase matches can produce. A value of 1 is neutral: the semantic score is unaffected regardless of how many keywords match. A value between 0 and 1 reduces the score when there are keyword matches. A value between 1 and 2 increases the score based on the number of matched words and phrases, up to the maximum.
field_value_weighting.query_key_word_weighting_mode: Instructs Vantage how to weight keywords. none indicates that no keyword matching will be part of the query. uniform treats all words and phrases (after stemming) in the query input as equal in weight, using consistent score additions for any keyword matches. weighted uses embeddings to match words and phrases (after stemming) to the query input, and lets the embedding distances determine the relative weights applied to any matches.

{
  "field_value_weighting": {
      "query_key_word_weighting_mode": "uniform",
      "query_key_word_max_overall_weight": 1.05
  }
}
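The keyword weighting fragment above can be assembled with a small validating helper. This is a sketch: the three mode names come from the description above, the 0 to 2 bound on the weight is inferred from that description rather than stated as a hard limit, and the helper name keyword_weighting is hypothetical.

```python
KEYWORD_MODES = ("none", "uniform", "weighted")


def keyword_weighting(mode: str, max_overall_weight: float) -> dict:
    """Build the field_value_weighting fragment for keyword boosting.

    Assumption: weights outside 0..2 are rejected, based on the ranges
    described in the text (0-1 reduces, 1 is neutral, 1-2 increases).
    """
    if mode not in KEYWORD_MODES:
        raise ValueError(f"mode must be one of {KEYWORD_MODES}")
    if not 0.0 <= max_overall_weight <= 2.0:
        raise ValueError("max_overall_weight is expected to lie between 0 and 2")
    return {
        "field_value_weighting": {
            "query_key_word_weighting_mode": mode,
            "query_key_word_max_overall_weight": max_overall_weight,
        }
    }


print(keyword_weighting("uniform", 1.05))
```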

Field Value Boosting

There are many use cases where items in a particular category, brand, color, or other defined attribute (in meta_ fields) should be boosted (or reduced) slightly to improve the overall results. A query such as red shoes will generally receive good semantic results for shoes and/or red things. But semantic scores computed without the context of the corpus often favor these ideas in general rather than the specific items in your collection. You can boost field values based on the context in which you call the search (for example, a brand-specific landing page on your site), or parse the search query itself for values. Vantage takes a set of fields, values, and weights and, where they match exactly, adjusts the scores for those items accordingly. If the field values don't match, no score adjustment occurs.

field_value_weighting.weighted_field_values: An array of objects that instruct Vantage to adjust the scores of items matching the specified fields, values, and weights. A weight of 1 is neutral; values between 0 and 1 reduce scores, and values between 1 and 2 increase scores for items that match.

{
  "field_value_weighting": {
    "weighted_field_values": [
      {
          "field": "category", "value": "shoes", "weight": 1.03
      },
      {
          "field": "color", "value": "red", "weight": 1.03
      },
      {
          "field": "style", "value": "bogus", "weight": 1.03
      }
    ]
  }
}

The score of any document with category:shoes is multiplied by 1.03, as is the score of any item with color:red. The bogus style entry above is ignored, because no item has a style value of bogus, and no adjustment is made for unmatched field values.
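The effect of the weights can be illustrated with a few lines of arithmetic. The sketch below is not the platform's scoring code: it assumes that when a document matches several weighted values the multipliers compound, which this page does not state explicitly, and the function name boosted_score is hypothetical.

```python
import math


def boosted_score(score: float, document: dict,
                  weighted_field_values: list) -> float:
    """Multiply score by the weight of every exactly matching field value.

    Assumption: multiple matches compound multiplicatively.
    """
    for rule in weighted_field_values:
        if document.get(rule["field"]) == rule["value"]:
            score *= rule["weight"]
    return score


rules = [
    {"field": "category", "value": "shoes", "weight": 1.03},
    {"field": "color", "value": "red", "weight": 1.03},
    {"field": "style", "value": "bogus", "weight": 1.03},
]

red_shoe = {"category": "shoes", "color": "red", "style": "classic"}
blue_hat = {"category": "hats", "color": "blue", "style": "classic"}

print(boosted_score(0.5, red_shoe, rules))  # boosted by two matching rules
print(boosted_score(0.5, blue_hat, rules))  # unchanged: nothing matches
```

The bogus style rule never matches either document, so it contributes nothing, mirroring the behavior described above.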

📘
Reference Guide for Search API
