Search Options
Parameters Common to All Search Endpoints
Vantage supports several parameters that are common to all search endpoints:
- Collection ID: The only required field, specifying the collection to search within.
- Accuracy: Defines the accuracy threshold for the search results.
- Pagination: Controls the pagination settings for navigating through search results.
- Filter: Allows for narrowing down search results based on specific criteria.
- Sort: Determines the sorting order of the search results.
- Weighted Field Values: Applies specific weights to certain fields to influence search relevance.
Required Parameters
Collection Identification (required)
You can have many collections with various types and composition of data.
To tell the Vantage platform which collection within your account to search, provide collection_id and account_id as part of the endpoint path.
- account_id: The Vantage account ID that contains the collection. You can find it in the Console UI; it is typically your company or organization name.
- collection_id: The unique identifier of the collection you are searching. You specified this ID when you created the collection. You can find it in the Console UI or through an API request.
SDK Usage
If you are accessing the Vantage platform through one of our SDKs, the Account ID can be provided during client initialization, and the Collection ID can be provided during the method call.
from vantage_sdk import VantageClient

vantage_instance = VantageClient.using_vantage_api_key(
    vantage_api_key=VANTAGE_API_KEY,
    account_id=ACCOUNT_ID,
)

vantage_instance.semantic_search(
    collection_id="example-collection",
    text="some query text",
)

import { VantageClientConfiguration, VantageClient } from "@vantage-sdk";

const configuration: VantageClientConfiguration = {
  vantageApiKey: vantageApiKey,
  accountId: accountId,
}

let client = new VantageClient(configuration)

const collectionId = "example-collection"
const queryText = "Example query"
const searchResults = client.semanticSearch(collectionId, queryText);

import com.vantagediscovery.sdk.VantageClient;

public static void main(String[] args) {
    final VantageClient client = VantageClient.usingVantageApiKey()
        .withAccountId(ACCOUNT_ID)
        .withVantageApiKey(VANTAGE_API_KEY)
        .build();
}
Additional Optional Parameters
Accuracy
The Vantage platform lets you tune the recall of every search query by controlling how much of your collection data is searched. Generally, a lower accuracy number gives great results with exceptional speed (tens of milliseconds). A higher accuracy number may provide additional or better results but takes longer to process (one to three seconds).
- collection.accuracy: A number between 0.001 and 1.000 that tells the Vantage platform how much of the collection to search across. A higher number will search across more of the collection but take longer. If unsure, a good place to start is 0.2.
{
...
"collection": {
"accuracy" : 0.15
...
}
...
}
{
...
"collection": {
"accuracy" : 0.5
...
}
...
}
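As a minimal sketch, the accuracy fragment shown above could be built and range-checked on the client side before the request is sent. The helper name accuracy_body is hypothetical and is not part of any Vantage SDK; only the collection.accuracy key and the 0.001 to 1.000 range come from this page.

```python
def accuracy_body(accuracy: float) -> dict:
    """Return the request fragment that sets collection.accuracy.

    Hypothetical helper for illustration; not part of the Vantage SDK.
    """
    # The documented range is 0.001 to 1.000, inclusive.
    if not 0.001 <= accuracy <= 1.0:
        raise ValueError(f"accuracy must be between 0.001 and 1.0, got {accuracy}")
    return {"collection": {"accuracy": accuracy}}


# 0.2 is the suggested starting point when unsure.
print(accuracy_body(0.2))  # {'collection': {'accuracy': 0.2}}
```

Validating locally simply surfaces an out-of-range value earlier; how the platform itself responds to such a value is not described here.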
SDK Usage
If you are accessing the Vantage platform through one of our SDKs, accuracy can be provided during the method call.
...
vantage_instance.semantic_search(
    collection_id="example-collection",
    text="some query text",
    accuracy=0.15,
)
...

...
const collectionId = "example-collection"
const queryText = "Example query"
const accuracy = 0.15
const searchResults = client.semanticSearch(collectionId, queryText, accuracy);
...

final String collectionId = "example-collection";

final SearchResult result = client.search()
    .collection(collectionId)
    .semantic()
    .withSearchProperties(
        CommonSearchProperties
            .builder()
            .withAccuracy(BigDecimal.valueOf(0.15))
            .build()
    )
    .withSearchText("Example query")
    .execute();
vantage search-semantic --vantage-api-key 'API_KEY' --text "Example query" --accuracy 0.15 example-collection
Sort
To enable sorting of your search results, follow the steps outlined below:
- Data Ingestion: When ingesting your data, ensure that the column names intended for sorting have the prefix meta_ordered_. This prefix differentiates sortable columns from other metadata fields, which typically use the prefix meta_. For instance, if you wish to sort by price, name the column meta_ordered_price.
- Value Type Restriction: Values provided for the meta_ordered_ columns must be of type float.
- Executing a Search: During your search query, refer to the field by its base name without the prefix. For example, use price to sort by the previously defined meta_ordered_price column.
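The naming and type rules above can be sketched as a small preprocessing step that runs before ingestion. The function to_ingest_columns and the sample record are hypothetical, and the sketch assumes every field in the record is metadata; only the meta_ordered_ and meta_ prefixes and the float requirement come from this page.

```python
def to_ingest_columns(record: dict, sortable: set) -> dict:
    """Rename metadata fields for ingestion (hypothetical helper).

    Fields listed in `sortable` get the meta_ordered_ prefix and are cast
    to float; every other field is assumed to be plain metadata (meta_).
    """
    columns = {}
    for name, value in record.items():
        if name in sortable:
            # Sortable columns must hold float values.
            columns[f"meta_ordered_{name}"] = float(value)
        else:
            columns[f"meta_{name}"] = value
    return columns


row = to_ingest_columns({"price": "19.99", "brand": "Brand XYZ"}, sortable={"price"})
print(row)  # {'meta_ordered_price': 19.99, 'meta_brand': 'Brand XYZ'}
```

At query time the sort field would then be referenced by its base name, price.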
- field: The name of the field by which search results are sorted. For instance, based on the example above, price would serve as the sort field when you want to order search results by price.
- order: Specifies the direction in which search results are organized. It can be either ascending (asc) to sort from lowest to highest values, or descending (desc) to sort from highest to lowest values. The default sorting order is descending (desc).
- mode: Indicates the criteria used for sorting search results. Options include field_selection, which organizes results based on the values of the sort field, and semantic_threshold, which sorts results based on their relevance or similarity to the search query. The default sorting mode is field_selection.
{
"sort": {
"field": "price",
"order": "asc",
"mode": "field_selection"
}
}
{
"sort": {
"field": "price",
"order": "desc",
"mode": "semantic_threshold"
}
}
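A sketch of how the sort object above might be assembled with the documented defaults and allowed values. The helper sort_body is hypothetical; it also strips a meta_ordered_ prefix if one is passed by mistake, which is a convenience of this sketch rather than documented platform behavior.

```python
import json

VALID_ORDERS = {"asc", "desc"}
VALID_MODES = {"field_selection", "semantic_threshold"}
PREFIX = "meta_ordered_"


def sort_body(field: str, order: str = "desc", mode: str = "field_selection") -> dict:
    """Build the "sort" request fragment (hypothetical helper)."""
    # Searches refer to the base name, so drop the ingestion prefix if present.
    if field.startswith(PREFIX):
        field = field[len(PREFIX):]
    if order not in VALID_ORDERS:
        raise ValueError(f"order must be one of {sorted(VALID_ORDERS)}")
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {sorted(VALID_MODES)}")
    return {"sort": {"field": field, "order": order, "mode": mode}}


# json.dumps never emits trailing commas, so the output is valid JSON.
print(json.dumps(sort_body("meta_ordered_price", order="asc"), indent=2))
```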
SDK Usage
If you are accessing the Vantage platform through one of our SDKs, sort options can be provided during the method call, using the Sort object.
from vantage_sdk.model.search import Sort
...
sort_options = Sort(
    field="price",
    order="asc",
    mode="field_selection",
)

vantage_instance.semantic_search(
    collection_id="example-collection",
    text="some query text",
    sort=sort_options,
)

...
const collectionId = "example-collection"
const queryText = "Example query"
const sortOptions = new Sort(
  "price", // field
  "asc", // order
  "field_selection" // mode
)

client.semanticSearch(
  collectionId,
  queryText,
  undefined, // accuracy
  undefined, // pagination
  undefined, // filter
  sortOptions
)
...

final String collectionId = "example-collection";

final SearchResult result = client.search()
    .collection(collectionId)
    .semantic()
    .withSearchProperties(
        CommonSearchProperties
            .builder()
            .withAccuracy(BigDecimal.valueOf(0.15))
            .withSort(new Sort("price", Sort.SortOrderType.ASC, Sort.SortModeType.FIELD_SELECTION))
            .build()
    )
    .withSearchText("Example query")
    .execute();
vantage search-semantic --vantage-api-key API_KEY --text "Example query" --accuracy 0.15 --sort-field price --sort-order asc --sort-mode field_selection example-collection
Pagination
Pagination lets you control which results you receive within the larger set of results. You can call the endpoint repeatedly to page your results, requesting batches of results up to a total of 1000 results.
- pagination.page: A number, starting at 0, that indicates the page of results to return, where each page is of size pagination.count.
- pagination.count: The number of results to return for this request. Must be greater than 0.
- pagination.threshold: Determines the "pool" of records to match before sorting. Must be lower than 10,000.
{
...
"pagination": {
"page": 0,
"count": 40
}
...
}
{
...
"pagination": {
"page": 1,
"count": 40
}
...
}
{
...
"pagination": {
"page": 0,
"count": 40,
"threshold": 300
}
...
}
{
...
"pagination": {
"page": 0,
"count": 40,
"threshold": 5000
}
...
}
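Paging through a result set amounts to repeating the request with an increasing pagination.page. The sketch below only generates the pagination fragments; pagination_bodies is a hypothetical helper, and it conservatively stops at whole pages so that the 1000-result total mentioned above is never exceeded. Whether a partial final page is accepted by the API is not covered here.

```python
from typing import Iterator

MAX_TOTAL_RESULTS = 1000  # documented upper bound across all pages


def pagination_bodies(count: int, max_results: int = MAX_TOTAL_RESULTS) -> Iterator[dict]:
    """Yield one "pagination" fragment per page (hypothetical helper)."""
    if count <= 0:
        raise ValueError("count must be greater than 0")
    page = 0
    # Only emit a page if all of its results fit within max_results.
    while (page + 1) * count <= max_results:
        yield {"pagination": {"page": page, "count": count}}
        page += 1


pages = list(pagination_bodies(40))
print(len(pages))  # 25
print(pages[1])    # {'pagination': {'page': 1, 'count': 40}}
```

Each fragment would be merged into the rest of the search request; in practice you would stop early once a page comes back with fewer than count results.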
Result order determinism
The overall search result set for a given query may change for a variety of reasons between requests. While it's very likely that the next page of results will begin on the precise next result from the overall set, it's possible that new content being ingested into the collection may alter the overall result set.
SDK Usage
If you are accessing the Vantage platform through one of our SDKs, pagination options can be provided during the method call, using the Pagination object.
from vantage_sdk.model.search import Pagination
...
pagination_options = Pagination(
    page=0,
    count=40,
    threshold=300,
)

vantage_instance.semantic_search(
    collection_id="example-collection",
    text="some query text",
    pagination=pagination_options,
)

...
const collectionId = "example-collection"
const queryText = "Example query"
const paginationOptions = new Pagination(
  0, // page
  40, // count
  300 // threshold
)

client.semanticSearch(
  collectionId,
  queryText,
  undefined, // accuracy
  paginationOptions,
)
...

final String collectionId = "example-collection";

final SearchResult result = client.search()
    .collection(collectionId)
    .semantic()
    .withSearchProperties(
        CommonSearchProperties
            .builder()
            .withAccuracy(BigDecimal.valueOf(0.15))
            .withSort(new Sort("price", Sort.SortOrderType.ASC, Sort.SortModeType.FIELD_SELECTION))
            .withPagination(
                Pagination.builder()
                    .withPage(0)
                    .withCount(40)
                    .withThreshold(300)
                    .build()
            )
            .build()
    )
    .withSearchText("Example query")
    .execute();
vantage search-semantic --vantage-api-key API_KEY --text "Example query" --accuracy 0.15 --page 0 --items-per-page 40 --pagination-threshold 300 example-collection
Filtering
Filters enable your collection's ingested features or categorical data to be used in conjunction with semantic similarity search. Filtered searches are generally very fast. Filters are frequently used in traditional faceted search interfaces. For example, in product catalog search, you may only want product results within a single category, brand, size or color.
- filter.boolean_filter: Either an empty string (no filters) or a boolean clause that filters the results while the Vantage platform scores for semantic similarity. The string is composed of:
  - field:"value": Limits results to exact matches against a meta_ field provided during ingestion. Both field and value are case sensitive.
  - Combinations of these clauses joined with AND and OR.
  - Parentheses ( and ), which group clauses so that filters can be nested into trees of complex filters.
  - NOT, which reverses the filter that follows it.
- filter.variant_filter: Either an empty string (no filters) or a boolean clause that filters the results while the Vantage platform scores for semantic similarity. The string is composed of:
  - field:"value": Limits results to exact matches against fields inside the variants list of objects provided during ingestion. Both field and value are case sensitive.
  - Combinations of these clauses joined with AND and OR.
  - Parentheses ( and ), which group clauses so that filters can be nested into trees of complex filters.
  - NOT, which reverses the filter that follows it.
# product_category was ingested as meta_product_category
product_category:"Fashion"
product_BrandName:"Brand XYZ"
(product_category:"Fashion" AND product_BrandName:"Brand XYZ")
(product_category:"Fashion" OR product_category:"Clothing")
NOT content_rating:"TV-14"
(
(product_category:"Fashion" OR product_category:"Clothing")
AND
product_BrandName:"Brand XYZ"
)
Both boolean_filter and variant_filter are sent in JSON, so the quotes (") inside a filter are typically escaped in the JSON request. Most JSON libraries do this automatically when you serialize an object containing a string with quotes.
{
"filter": {
"boolean_filter": "((product_category:\"Fashion\" OR product_category:\"Clothing\") AND product_BrandName:\"Brand XYZ\")"
}
}
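To illustrate both the filter grammar and the automatic escaping, here is a minimal sketch that composes the filter string from small pieces and serializes it with Python's standard json module. The helpers match, all_of, any_of, and negate are hypothetical names for this example, and values that themselves contain double quotes are not handled, since this page does not describe how to escape them.

```python
import json


def match(field: str, value: str) -> str:
    """Exact, case-sensitive match clause: field:"value"."""
    return f'{field}:"{value}"'


def all_of(*clauses: str) -> str:
    """Join clauses with AND inside parentheses."""
    return "(" + " AND ".join(clauses) + ")"


def any_of(*clauses: str) -> str:
    """Join clauses with OR inside parentheses."""
    return "(" + " OR ".join(clauses) + ")"


def negate(clause: str) -> str:
    """Reverse a clause by prefixing NOT."""
    return f"NOT {clause}"


boolean_filter = all_of(
    any_of(match("product_category", "Fashion"), match("product_category", "Clothing")),
    match("product_BrandName", "Brand XYZ"),
)
print(boolean_filter)
# ((product_category:"Fashion" OR product_category:"Clothing") AND product_BrandName:"Brand XYZ")

# json.dumps escapes the inner quotes as \" automatically.
print(json.dumps({"filter": {"boolean_filter": boolean_filter}}))
```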
SDK Usage
If you are accessing the Vantage platform through one of our SDKs, filter options can be provided during the method call, using the Filter object.
from vantage_sdk.model.search import Filter
...
filter_options = Filter(
    boolean_filter='(product_category:"Fashion" AND product_BrandName:"Brand XYZ")',
    variant_filter='(color:"Black" OR color:"Brown")',
)

vantage_instance.semantic_search(
    collection_id="example-collection",
    text="some query text",
    filter=filter_options,
)

...
const collectionId = "example-collection"
const queryText = "Example query"
const filterOptions = new Filter(
  '(product_category:"Fashion" AND product_BrandName:"Brand XYZ")', // boolean filter
  '(color:"Black" OR color:"Brown")' // variant filter
)

client.semanticSearch(
  collectionId,
  queryText,
  undefined, // accuracy
  undefined, // pagination
  filterOptions
)
...

final String collectionId = "example-collection";

final SearchResult result = client.search()
    .collection(collectionId)
    .semantic()
    .withSearchProperties(
        CommonSearchProperties
            .builder()
            .withFilter(
                Filter.builder()
                    .withBooleanFilter("(product_category:\"Fashion\" AND product_BrandName:\"Brand XYZ\")")
                    .withVariantFilter("(color:\"Black\" OR color:\"Brown\")")
                    .build()
            )
            .build()
    )
    .withSearchText("Example query")
    .execute();
vantage search-semantic --vantage-api-key API_KEY --text "Example query" --boolean-filter '(product_category:"Fashion" AND product_BrandName:"Brand XYZ")' --variant-filter '(color:"Black" OR color:"Brown")' example-collection
Request ID
To enable asynchronous calls to the search endpoints, an identifier is included in the request which is then returned with the results.
- request_id: An integer that will be returned with the results. It should be unique across all in-progress calls to any search endpoint.
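One way to keep request_id unique across in-progress calls is a simple counter plus a lookup table, sketched below. The RequestTracker class is hypothetical, and the sketch assumes the response is a dictionary that carries the identifier under a request_id key; the exact response shape is not specified on this page.

```python
import itertools


class RequestTracker:
    """Hand out unique integer IDs and match responses back to queries.

    Hypothetical client-side helper; not part of the Vantage SDK.
    """

    def __init__(self) -> None:
        self._next_id = itertools.count(1)
        self._in_flight = {}

    def register(self, query_text: str) -> int:
        """Reserve an ID for a query that is about to be sent."""
        request_id = next(self._next_id)
        self._in_flight[request_id] = query_text
        return request_id

    def resolve(self, response: dict) -> str:
        """Return the query for a response and free its ID.

        Assumes the response exposes the ID as response["request_id"].
        """
        return self._in_flight.pop(response["request_id"])


tracker = RequestTracker()
first = tracker.register("red shoes")
second = tracker.register("blue hats")
# Responses may arrive out of order; the ID ties each one to its query.
print(tracker.resolve({"request_id": second}))  # blue hats
print(tracker.resolve({"request_id": first}))   # red shoes
```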
Field Value Weighting
Keyword Support
If you are using Vantage Managed embeddings, the text field is processed during ingestion to support a straightforward keyword boosting method for search. You can boost direct keyword matches on the extracted tokens using the following two fields, which adjust the core semantic matching score. This is useful when you want to add a bit of keyword help to the existing semantic search so that direct and long-tail phrases from your users are well represented in the initial results.
- field_value_weighting.query_key_word_max_overall_weight: A number that represents the largest change in score that matched keywords or phrases can produce.
  - 1 is neutral: regardless of how many keywords match, the semantic score is not affected.
  - 0-1 reduces the score when there are keyword matches.
  - 1-2 increases the score based on the number of phrases and matches present, up to the maximum.
- field_value_weighting.query_key_word_weighting_mode: A field that instructs Vantage how to weight keywords.
  - none indicates that no keyword matching will be part of the query.
  - uniform treats all words and phrases (after stemming) in the query input as equal in weight, using consistent score additions for any keyword matches.
  - weighted uses embeddings to match words and phrases (after stemming) to the query input, and lets the embedding distances determine the relative weights to apply for any matches.
{
"field_value_weighting": {
"query_key_word_weighting_mode": "uniform",
"query_key_word_max_overall_weight": 1.05
}
}
Field Value Boosting
There are many use cases where items of a particular category, brand, color, or other defined attribute (in meta_ fields) should be boosted (or reduced) slightly to help improve the overall results. For example, a query for red shoes will generally receive good semantic results for shoes, for red things, and for both, but without the context of your corpus the semantic scores often favor these general ideas over the actual items in your collection. You can boost field values based on the context in which the search is called (for example, a brand-specific landing page on your site) or by parsing the search query itself for values. Vantage takes a set of fields, values, and weights and, where they match exactly, adjusts the scores for those items accordingly. If the field values don't match, no adjustments to scores occur.
- field_value_weighting.weighted_field_values: An array of objects that instruct Vantage to boost the scores for the fields, values, and weights specified. A weight of 1 is neutral, 0-1 reduces scores, and 1-2 increases scores for items that match.
{
"field_value_weighting": {
"weighted_field_values": [
{
"field": "category", "value": "shoes", "weight": 1.03
},
{
"field": "color", "value": "red", "weight": 1.03
},
{
"field": "style", "value": "bogus", "weight": 1.03
}
]
}
}
The score of any document with category:shoes will be multiplied by 1.03. The same applies to items with color:red. The third entry, where style is assumed never to have the value bogus in the collection, is ignored, and no adjustments are made for unmatched field values.
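The effect described above can be simulated locally to build intuition. This is a sketch of the documented behavior, not the platform's implementation: apply_field_weights and the sample items are hypothetical, and the sketch assumes that when several entries match the same item their weights compound by multiplication, which this page does not state explicitly.

```python
def apply_field_weights(items: list, weighted_field_values: list) -> list:
    """Multiply each item's score by the weight of every exactly matching rule.

    Local simulation for illustration only; compounding of multiple
    matching weights is an assumption.
    """
    adjusted = []
    for item in items:
        score = item["score"]
        for rule in weighted_field_values:
            # Exact match on field and value; anything else leaves the score alone.
            if item["fields"].get(rule["field"]) == rule["value"]:
                score *= rule["weight"]
        adjusted.append({**item, "score": score})
    return adjusted


rules = [
    {"field": "category", "value": "shoes", "weight": 1.03},
    {"field": "color", "value": "red", "weight": 1.03},
    {"field": "style", "value": "bogus", "weight": 1.03},
]
items = [
    {"id": "a", "score": 0.80, "fields": {"category": "shoes", "color": "red"}},
    {"id": "b", "score": 0.80, "fields": {"category": "shoes", "color": "blue"}},
    {"id": "c", "score": 0.50, "fields": {"category": "hats", "color": "blue"}},
]
for item in apply_field_weights(items, rules):
    print(item["id"], round(item["score"], 4))
```

Item a matches two rules, item b matches one, and item c matches none, so only c keeps its original score.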
Facets
Facets are like filters that allow users to drill down into specific attributes of the data. For example, if you're searching for clothing items in an online store, you might use facets like color and size to narrow down the results to just red shirts in medium size. Facets provide a structured way to explore data by enabling easy filtering on object attributes.
In our case, the API returns the count for each facet value provided, rather than the specific objects themselves. You can retrieve the objects by using boolean_filter to filter on different facet values.
- facets: An array of objects containing name, type, and values fields. The name is the facet's name (upserted during ingestion as meta_facet_<name>). The type is an enum that defines whether to count concrete values (count) or a range of values (range). Currently, only the count type is available. The values field is an array of values for which to return a count. If values is an empty list, the API returns a count for all possible values.
{
"facets": [
{
"name": "color", "type": "count", "values": []
},
{
"name": "size", "type": "count", "values": ["sm", "md"]
}
]
}
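As a small sketch, the facets array above could be generated with a helper that fixes type to count, the only type currently available. The function count_facet is a hypothetical name for this example.

```python
import json
from typing import Optional, Sequence


def count_facet(name: str, values: Optional[Sequence[str]] = None) -> dict:
    """Build one facet entry of type "count" (hypothetical helper).

    An empty values list requests counts for all possible values.
    """
    return {"name": name, "type": "count", "values": list(values or [])}


body = {"facets": [count_facet("color"), count_facet("size", ["sm", "md"])]}
print(json.dumps(body, indent=2))
```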
SDK Usage
If you are accessing the Vantage platform through one of our SDKs, facet options can be provided during the method call, using the Facet object.
from vantage_sdk.model.search import Facet, FacetType
...
facets = [
    Facet(
        name="color",
        type=FacetType.COUNT,
    ),
    Facet(
        name="size",
        type=FacetType.COUNT,
        values=["sm", "md"],
    ),
]

vantage_instance.semantic_search(
    collection_id="example-collection",
    text="some query text",
    facets=facets,
)

...
const collectionId = "example-collection"
const queryText = "Example query"
const facets = [
  new Facet("color", FacetTypeEnum.Count),
  new Facet("size", FacetTypeEnum.Count, ["sm", "md"])
]

client.semanticSearch(
  collectionId,
  queryText,
  undefined, // accuracy
  undefined, // pagination
  undefined, // filter
  undefined, // sort
  undefined, // field value weighting
  facets,
)
...

final SearchResult result = client
    .search()
    .collection("example-collection")
    .semantic()
    .withSearchProperties(
        CommonSearchProperties
            .builder()
            .withAccuracy(BigDecimal.ONE)
            .withFacets(List.of(
                Facet.countAllFacet("color"),
                Facet.countValuesFacet("size", List.of("sm", "md", "lg"))
            ))
            .build()
    )
    .withSearchText("test search")
    .execute();
vantage search-semantic --vantage-api-key API_KEY --text "Example query" --facets '[ { "name": "color", "type": "count", "values": [] }, { "name": "size", "type": "count", "values": [ "sm", "md", "lg" ] } ]' example-collection