vectorai

Search

class vectorai.api.search.ViSearchClient(username, api_key, url=None)

Search and Advanced Search Operations

hybrid_search(collection_name: str, text: str, vector: List, fields: List, text_fields: List, sum_fields: bool = True, metric: str = 'cosine', min_score: float = None, traditional_weight=0.075, page: int = 1, page_size: int = 10, include_vector: bool = False, include_count: bool = True, asc: bool = False, return_curl: bool = False, **kwargs)

Search a text field with both a vector and text, combining Vector Search and Traditional Search.

Vector similarity search plus traditional fuzzy text search, using the query text and the query vector together.

Parameters
  • text – Text Search Query (not encoded as vector)

  • vector – Vector, a list/array of floats that represents a piece of data.

  • text_fields – Text fields to search against

  • traditional_weight – Multiplier of traditional search. A value between 0.025 and 0.1 is good.

  • fuzzy – Fuzziness of the search. A value between 1 and 3 is good.

  • join – Whether to consider cases where there is a space in the word, e.g. Go Pro vs GoPro.

  • collection_name – Name of Collection

  • fields – Vector fields to search through

  • approx – Used for approximate search

  • sum_fields – Whether to sum the similarity scores of multiple vectors into a single score or keep them separate

  • page_size – Size of each page of results

  • page – Page of the results

  • metric – Similarity Metric, choose from [‘cosine’, ‘l1’, ‘l2’, ‘dp’]

  • min_score – Minimum score for similarity metric

  • include_vector – Include vectors in the search results

  • include_count – Include count in the search results

  • hundred_scale – Whether to scale up the metric by 100

  • asc – Whether to sort the score by ascending order (default is false, for getting most similar results)
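
How the vector score and the traditional score are merged is not spelled out on this page. The following is a minimal sketch, assuming the final score is the vector similarity plus traditional_weight times a text-match score. The helper names (cosine, hybrid_score) and the sample numbers are hypothetical and are not part of the vectorai API.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def hybrid_score(query_vector, doc_vector, text_score, traditional_weight=0.075):
    """Assumed combination: vector similarity + weighted traditional score."""
    return cosine(query_vector, doc_vector) + traditional_weight * text_score


# Hypothetical document: identical vector, traditional text score of 2.0
score = hybrid_score([1.0, 0.0], [1.0, 0.0], text_score=2.0)
```

Under this assumption a small traditional_weight keeps the text match as a tie-breaker rather than letting it dominate the vector similarity, which fits the suggested range of 0.025 to 0.1.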

search_by_id(collection_name: str, document_id: str, field: str, sum_fields: bool = True, metric: str = 'cosine', min_score=0, page: int = 1, page_size: int = 10, include_vector: bool = False, include_count: bool = True, asc: bool = False, approx: int = 0, hundred_scale: bool = False, return_curl: bool = False, **kwargs)

Single Product Recommendations (Search by an id)

Recommends documents by retrieving the vector stored in the document with the specified ID, then performing a search with that vector.

Parameters
  • document_id – ID of a document

  • collection_name – Name of Collection

  • field – Vector field to search through

  • approx – Used for approximate search

  • sum_fields – Whether to sum the similarity scores of multiple vectors into a single score or keep them separate

  • page_size – Size of each page of results

  • page – Page of the results

  • metric – Similarity Metric, choose from [‘cosine’, ‘l1’, ‘l2’, ‘dp’]

  • min_score – Minimum score for similarity metric

  • include_vector – Include vectors in the search results

  • include_count – Include count in the search results

  • hundred_scale – Whether to scale up the metric by 100

  • asc – Whether to sort the score by ascending order (default is false, for getting most similar results)

  • return_curl – Return the CURL statement relevant to the Python requests
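
The two steps described above (look up the stored vector, then search with it) can be pictured with an in-memory sketch. This is not how the service is implemented: search_by_id_sketch, the sample documents, the field name item_vector_ and the choice to leave the query document out of the results are all assumptions made for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search_by_id_sketch(documents, document_id, field,
                        min_score=0, page=1, page_size=10, asc=False):
    """Brute-force stand-in: rank other documents against one document's vector."""
    query = documents[document_id][field]          # step 1: retrieve the vector
    scored = [
        {"_id": doc_id, "score": cosine(query, doc[field])}
        for doc_id, doc in documents.items()
        if doc_id != document_id                   # assumption: skip the query document
    ]
    scored = [r for r in scored if r["score"] >= min_score]
    scored.sort(key=lambda r: r["score"], reverse=not asc)
    start = (page - 1) * page_size                 # pages are 1-indexed
    return scored[start:start + page_size]


docs = {
    "a": {"item_vector_": [1.0, 0.0]},
    "b": {"item_vector_": [0.9, 0.1]},
    "c": {"item_vector_": [0.0, 1.0]},
}
results = search_by_id_sketch(docs, "a", "item_vector_", page_size=2)
```

With asc left at False the most similar documents come first, which matches the default described for the asc parameter.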

search_by_ids(collection_name: str, document_ids: List, field: str, vector_operation: str = 'mean', sum_fields: bool = True, metric: str = 'cosine', min_score=0, page: int = 1, page_size: int = 10, include_vector: bool = False, include_count: bool = True, asc: bool = False, return_curl: bool = False, **kwargs)

Multi Product Recommendations (Search by ids)

Recommends documents by retrieving the vectors of the documents with the specified IDs, then performing a search with a single vector aggregated from them (by mean, sum, min or max, depending on vector_operation).

Parameters
  • document_ids – IDs of documents

  • vector_operation – Aggregation for the vectors, choose from [‘mean’, ‘sum’, ‘min’, ‘max’]

  • collection_name – Name of Collection

  • field – Vector field to search through

  • approx – Used for approximate search

  • sum_fields – Whether to sum the similarity scores of multiple vectors into a single score or keep them separate

  • page_size – Size of each page of results

  • page – Page of the results

  • metric – Similarity Metric, choose from [‘cosine’, ‘l1’, ‘l2’, ‘dp’]

  • min_score – Minimum score for similarity metric

  • include_vector – Include vectors in the search results

  • include_count – Include count in the search results

  • hundred_scale – Whether to scale up the metric by 100

  • asc – Whether to sort the score by ascending order (default is false, for getting most similar results)
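
To make the vector_operation options concrete, here is a small sketch that assumes each option is applied element-wise across the retrieved vectors. aggregate_vectors is a hypothetical helper written for this page, not a vectorai function.

```python
def aggregate_vectors(vectors, vector_operation="mean"):
    """Combine equal-length vectors element-wise into a single vector."""
    columns = list(zip(*vectors))  # one tuple per dimension
    if vector_operation == "mean":
        return [sum(col) / len(col) for col in columns]
    if vector_operation == "sum":
        return [sum(col) for col in columns]
    if vector_operation == "min":
        return [min(col) for col in columns]
    if vector_operation == "max":
        return [max(col) for col in columns]
    raise ValueError(f"unknown vector_operation: {vector_operation!r}")


vectors = [[1.0, 4.0], [3.0, 2.0]]
mean_vector = aggregate_vectors(vectors)          # default operation
max_vector = aggregate_vectors(vectors, "max")
```

The aggregated vector then plays the role of the query vector in an ordinary vector search.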

search_by_positive_negative_ids(collection_name: str, positive_document_ids: List, negative_document_ids: List, field: str, vector_operation: str = 'mean', sum_fields: bool = True, metric: str = 'cosine', min_score=0, page: int = 1, page_size: int = 10, include_vector: bool = False, include_count: bool = True, asc: bool = False, return_curl: bool = False, **kwargs)

Multi Product Recommendations with Likes and Dislikes (Search by ids)

Recommends documents by retrieving the vectors of the documents with the specified positive and negative IDs, then performing a search with an aggregated vector: the aggregate (mean, sum, min or max, depending on vector_operation) of the positive ID vectors minus the negative ID vectors.

Parameters
  • positive_document_ids – Positive Document IDs to get recommendations for, and the weightings of each document

  • negative_document_ids – Negative Document IDs to get recommendations for, and the weightings of each document

  • vector_operation – Aggregation for the vectors, choose from [‘mean’, ‘sum’, ‘min’, ‘max’]

  • collection_name – Name of Collection

  • field – Vector field to search through

  • approx – Used for approximate search

  • sum_fields – Whether to sum the similarity scores of multiple vectors into a single score or keep them separate

  • page_size – Size of each page of results

  • page – Page of the results

  • metric – Similarity Metric, choose from [‘cosine’, ‘l1’, ‘l2’, ‘dp’]

  • min_score – Minimum score for similarity metric

  • include_vector – Include vectors in the search results

  • include_count – Include count in the search results

  • hundred_scale – Whether to scale up the metric by 100

  • asc – Whether to sort the score by ascending order (default is false, for getting most similar results)
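
One plausible reading of "positive id vectors minus the negative id vectors" is sketched below: aggregate each group separately (here with the mean) and subtract. Whether the service aggregates before or after subtracting is not stated here, so treat the order as an assumption; mean_of and preference_vector are hypothetical names.

```python
def mean_of(vectors):
    """Element-wise mean of equal-length vectors."""
    return [sum(col) / len(col) for col in zip(*vectors)]


def preference_vector(positive_vectors, negative_vectors):
    """Assumed recipe: mean(positives) - mean(negatives)."""
    pos = mean_of(positive_vectors)
    neg = mean_of(negative_vectors)
    return [p - n for p, n in zip(pos, neg)]


liked = [[1.0, 0.0], [0.5, 0.5]]      # vectors of liked documents
disliked = [[0.0, 1.0]]               # vectors of disliked documents
query_vector = preference_vector(liked, disliked)
```

The resulting vector points toward the liked items and away from the disliked ones, so a nearest-neighbour search with it favours documents resembling the likes.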

search_with_positive_negative_ids_as_history(collection_name: str, vector: List, positive_document_ids: List, negative_document_ids: List, field: str, vector_operation: str = 'mean', sum_fields: bool = True, metric: str = 'cosine', min_score=0, page: int = 1, page_size: int = 10, include_vector: bool = False, include_count: bool = True, asc: bool = False, return_curl: bool = False, **kwargs)

Multi Product Recommendations with Likes and Dislikes (Search by ids)

Searches by retrieving the vectors of the documents with the specified positive and negative IDs, then performing a search with both the search query vector and an aggregated vector: the aggregate (mean, sum, min or max, depending on vector_operation) of the positive ID vectors minus the negative ID vectors.

Parameters
  • vector – Vector, a list/array of floats that represents a piece of data.

  • positive_document_ids – Positive Document IDs to get recommendations for, and the weightings of each document

  • negative_document_ids – Negative Document IDs to get recommendations for, and the weightings of each document

  • vector_operation – Aggregation for the vectors, choose from [‘mean’, ‘sum’, ‘min’, ‘max’]

  • collection_name – Name of Collection

  • field – Vector field to search through

  • approx – Used for approximate search

  • sum_fields – Whether to sum the similarity scores of multiple vectors into a single score or keep them separate

  • page_size – Size of each page of results

  • page – Page of the results

  • metric – Similarity Metric, choose from [‘cosine’, ‘l1’, ‘l2’, ‘dp’]

  • min_score – Minimum score for similarity metric

  • include_vector – Include vectors in the search results

  • include_count – Include count in the search results

  • hundred_scale – Whether to scale up the metric by 100

  • asc – Whether to sort the score by ascending order (default is false, for getting most similar results)

advanced_search(collection_name: str, multivector_query: Dict, sum_fields: bool = True, facets: List = [], filters: List = [], metric: str = 'cosine', min_score=None, page: int = 1, page_size: int = 10, include_vector: bool = False, include_count: bool = True, include_facets: bool = False, asc: bool = False, approx: int = 0, return_curl: bool = False, **kwargs)

Advanced Vector Similarity Search. Supports multiple vectors, vector weightings, facets and filtering.

Advanced Vector Similarity Search enables machine learning search through vector search. Search with multiple vectors to find the most similar documents.

For example: search with a product’s image and description vectors to find the most similar products by what they look like and what they are described to do.

You can also weight each vector field’s contribution to the search, e.g. image_vector_ weighted at 100% and description_vector_ at 50%.

Advanced search also supports filters, which restrict the search to the filtered results, and facets, which give an overview of the products available when a minimum score is set.

Parameters
  • collection_name – Name of Collection

  • multivector_query – Query for advanced search that allows for multiple vector and field querying

  • page – Page of the results

  • page_size – Size of each page of results

  • approx – Used for approximate search

  • sum_fields – Whether to sum the similarity scores of multiple vectors into a single score or keep them separate

  • metric – Similarity Metric, choose from [‘cosine’, ‘l1’, ‘l2’, ‘dp’]

  • filters – Query for filtering the search results

  • facets – Fields to include in the facets, if [] then all

  • min_score – Minimum score for similarity metric

  • include_vector – Include vectors in the search results

  • include_count – Include count in the search results

  • include_facets – Include facets in the search results

  • hundred_scale – Whether to scale up the metric by 100

  • asc – Whether to sort the score by ascending order (default is false, for getting most similar results)
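
A multivector_query can be assembled programmatically. The sketch below follows the shape used in this page's own example (a label mapping to a 'vector' and a list of 'fields'); build_multivector_query, the placeholder vectors and the field names are hypothetical.

```python
def build_multivector_query(**named_queries):
    """Map each label to {'vector': ..., 'fields': [...]}.

    Each keyword argument is a (vector, fields) pair.
    """
    return {
        name: {"vector": list(vector), "fields": list(fields)}
        for name, (vector, fields) in named_queries.items()
    }


# Placeholder vectors; in practice these come from your own encoders.
multivector_query = build_multivector_query(
    text=([0.1, 0.2, 0.3], ["description_vector_"]),
    image=([0.4, 0.5, 0.6], ["image_vector_"]),
)

# With a live collection client this would be passed on, e.g.:
# vi_client.advanced_search(multivector_query)
```

Keeping one entry per encoder makes it easy to add or drop a vector from the search without touching the rest of the query.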

Example

>>> vi_client = ViCollectionClient(username, api_key, collection_name, url)
>>> # encode_question is not defined here; it is assumed to return the query vector
>>> advanced_search_query = {
...     'text': {'vector': encode_question("How do I cluster?"), 'fields': ['function_vector_']}
... }
>>> vi_client.advanced_search(advanced_search_query)

advanced_hybrid_search(collection_name: str, text: str, multivector_query: Dict, text_fields: List, sum_fields: bool = True, facets: List = [], filters: List = [], metric: str = 'cosine', min_score=None, page: int = 1, page_size: int = 10, include_vector: bool = False, include_count: bool = True, include_facets: bool = False, asc: bool = False, return_curl: bool = False, **kwargs)

Advanced search of a text field with both vectors and text, combining Vector Search and Traditional Search.

Advanced vector similarity search plus traditional fuzzy text search, using the query text and the query vectors together.

You can also weight each vector field’s contribution to the search, e.g. image_vector_ weighted at 100% and description_vector_ at 50%.

Advanced search also supports filters, which restrict the search to the filtered results, and facets, which give an overview of the products available when a minimum score is set.

Parameters
  • collection_name – Name of Collection

  • page – Page of the results

  • page_size – Size of each page of results

  • approx – Used for approximate search

  • sum_fields – Whether to sum the similarity scores of multiple vectors into a single score or keep them separate

  • metric – Similarity Metric, choose from [‘cosine’, ‘l1’, ‘l2’, ‘dp’]

  • filters – Query for filtering the search results

  • facets – Fields to include in the facets, if [] then all

  • min_score – Minimum score for similarity metric

  • include_vector – Include vectors in the search results

  • include_count – Include count in the search results

  • include_facets – Include facets in the search results

  • hundred_scale – Whether to scale up the metric by 100

  • multivector_query – Query for advanced search that allows for multiple vector and field querying

  • text – Text Search Query (not encoded as vector)

  • text_fields – Text fields to search against

  • traditional_weight – Multiplier of traditional search. A value between 0.025 and 0.1 is good.

  • fuzzy – Fuzziness of the search. A value between 1 and 3 is good.

  • join – Whether to consider cases where there is a space in the word, e.g. Go Pro vs GoPro.

  • asc – Whether to sort the score by ascending order (default is false, for getting most similar results)
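
The fuzzy and join parameters can be pictured with a toy matcher. This sketch assumes that fuzzy is a maximum edit distance and that join compares the strings with spaces removed; neither is confirmed on this page, and edit_distance and fuzzy_match are hypothetical helpers.

```python
def edit_distance(a, b):
    """Levenshtein distance using two rows of the dynamic-programming table."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,               # delete ca
                current[j - 1] + 1,            # insert cb
                previous[j - 1] + (ca != cb),  # substitute (free if equal)
            ))
        previous = current
    return previous[-1]


def fuzzy_match(query, text, fuzzy=1, join=False):
    """True if query and text are within `fuzzy` edits (case-insensitive)."""
    q, t = query.lower(), text.lower()
    if join:
        # Assumed meaning of join: ignore spaces, so "Go Pro" equals "GoPro"
        q, t = q.replace(" ", ""), t.replace(" ", "")
    return edit_distance(q, t) <= fuzzy


exact_when_joined = fuzzy_match("Go Pro", "GoPro", fuzzy=0, join=True)
```

Under these assumptions a larger fuzzy value tolerates more typos at the cost of more false matches, which is consistent with the small suggested range of 1 to 3.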

advanced_search_by_id(collection_name: str, document_id: str, fields: Dict, sum_fields: bool = True, facets: List = [], filters: List = [], metric: str = 'cosine', min_score=None, page: int = 1, page_size: int = 10, include_vector: bool = False, include_count: bool = True, include_facets: bool = False, asc: bool = False, return_curl: bool = False, **kwargs)

Advanced Single Product Recommendations (Search by an id).

For example: search with the ID of a product in the database, using that product’s image and description vectors to find the most similar products by what they look like and what they are described to do.

You can also weight each vector field’s contribution to the search, e.g. image_vector_ weighted at 100% and description_vector_ at 50%.

Advanced search also supports filters, which restrict the search to the filtered results, and facets, which give an overview of the products available when a minimum score is set.

Parameters
  • collection_name – Name of Collection

  • page – Page of the results

  • page_size – Size of each page of results

  • approx – Used for approximate search

  • sum_fields – Whether to sum the similarity scores of multiple vectors into a single score or keep them separate

  • metric – Similarity Metric, choose from [‘cosine’, ‘l1’, ‘l2’, ‘dp’]

  • filters – Query for filtering the search results

  • facets – Fields to include in the facets, if [] then all

  • min_score – Minimum score for similarity metric

  • include_vector – Include vectors in the search results

  • include_count – Include count in the search results

  • include_facets – Include facets in the search results

  • hundred_scale – Whether to scale up the metric by 100

  • document_id – ID of a document

  • fields – Vector fields to search against, and the weightings for them.

  • asc – Whether to sort the score by ascending order (default is false, for getting most similar results)
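
The interaction between field weightings and sum_fields can be sketched as follows. It assumes each field's similarity is multiplied by its weight and that sum_fields=True adds the weighted values into one score; the service's exact formula is not given here. combine_field_scores, the field names and the sample similarities are hypothetical.

```python
def combine_field_scores(field_scores, fields, sum_fields=True):
    """Weight per-field similarities, then sum them or keep them separate."""
    weighted = {name: weight * field_scores[name] for name, weight in fields.items()}
    if sum_fields:
        return sum(weighted.values())   # one combined score
    return weighted                     # one score per vector field


# image_vector_ weighted at 100%, description_vector_ at 50%
fields = {"image_vector_": 1, "description_vector_": 0.5}
field_scores = {"image_vector_": 0.8, "description_vector_": 0.6}  # made-up similarities

total = combine_field_scores(field_scores, fields)
separate = combine_field_scores(field_scores, fields, sum_fields=False)
```

Keeping the scores separate can help when debugging which vector field drives a recommendation.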

Example

>>> filter_query = [
...     {'field': 'field_name',
...      'filter_type': 'text',
...      'condition_value': 'monkl',
...      'condition': '=='}
... ]
>>> results = client.advanced_search_by_id(
...     document_id=client.random_documents()['documents'][0]['_id'],
...     fields={'image_url_field_flattened_vector_': 1},
...     filters=filter_query)

advanced_search_by_ids(collection_name: str, document_ids: Dict, fields: Dict, vector_operation: str = 'mean', sum_fields: bool = True, facets: List = [], filters: List = [], metric: str = 'cosine', min_score=None, page: int = 1, page_size: int = 10, include_vector: bool = False, include_count: bool = True, include_facets: bool = False, asc: bool = False, return_curl: bool = False, **kwargs)

Advanced Multi Product Recommendations (Search by ids).

For example: search with the IDs of multiple products in the database, using those products’ image and description vectors to find the most similar products by what they look like and what they are described to do.

You can also weight each vector field’s contribution to the search, e.g. image_vector_ weighted at 100% and description_vector_ at 50%.

You can also weight each product, e.g. product ID-A weighted at 100% and product ID-B at 50%.

Advanced search also supports filters, which restrict the search to the filtered results, and facets, which give an overview of the products available when a minimum score is set.

Parameters
  • collection_name – Name of Collection

  • page – Page of the results

  • page_size – Size of each page of results

  • approx – Used for approximate search

  • sum_fields – Whether to sum the similarity scores of multiple vectors into a single score or keep them separate

  • metric – Similarity Metric, choose from [‘cosine’, ‘l1’, ‘l2’, ‘dp’]

  • filters – Query for filtering the search results

  • facets – Fields to include in the facets, if [] then all

  • min_score – Minimum score for similarity metric

  • include_vector – Include vectors in the search results

  • include_count – Include count in the search results

  • include_facets – Include facets in the search results

  • hundred_scale – Whether to scale up the metric by 100

  • document_ids – Document IDs to get recommendations for, and the weightings of each document

  • fields – Vector fields to search against, and the weightings for them.

  • vector_operation – Aggregation for the vectors, choose from [‘mean’, ‘sum’, ‘min’, ‘max’]

  • asc – Whether to sort the score by ascending order (default is false, for getting most similar results)
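
Per-document weightings can be pictured as a weighted mean of the stored vectors. The sketch below assumes document_ids maps each ID to its weight and that the weight scales that document's vector before averaging; this is an illustration, not the service's documented formula. weighted_mean and the stored vectors are hypothetical.

```python
def weighted_mean(vectors_by_id, document_ids):
    """Weighted mean of stored vectors; document_ids maps id -> weight."""
    total_weight = sum(document_ids.values())
    dim = len(next(iter(vectors_by_id.values())))
    result = [0.0] * dim
    for doc_id, weight in document_ids.items():
        for i, value in enumerate(vectors_by_id[doc_id]):
            result[i] += weight * value
    return [value / total_weight for value in result]


stored = {"ID-A": [1.0, 0.0], "ID-B": [0.0, 1.0]}
# ID-A weighted at 100%, ID-B at 50%
query_vector = weighted_mean(stored, {"ID-A": 1.0, "ID-B": 0.5})
```

The result leans toward ID-A, so recommendations resemble that product more strongly.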

advanced_search_by_positive_negative_ids(collection_name: str, positive_document_ids: Dict, negative_document_ids: Dict, fields: Dict, vector_operation: str = 'mean', sum_fields: bool = True, facets: List = [], filters: List = [], metric: str = 'cosine', min_score=None, page: int = 1, page_size: int = 10, include_vector: bool = False, include_count: bool = True, include_facets: bool = False, asc: bool = False, return_curl: bool = False, **kwargs)

Advanced Multi Product Recommendations with likes and dislikes (Search by ids).

For example: search with the IDs of multiple liked and disliked products in the database, using the products’ image and description vectors to find the products most similar to the positives and most dissimilar to the negatives, by what they look like and what they are described to do.

You can also weight each vector field’s contribution to the search, e.g. image_vector_ weighted at 100% and description_vector_ at 50%.

You can also weight each product, e.g. product ID-A weighted at 100% and product ID-B at 50%.

Advanced search also supports filters, which restrict the search to the filtered results, and facets, which give an overview of the products available when a minimum score is set.

Parameters
  • collection_name – Name of Collection

  • page – Page of the results

  • page_size – Size of each page of results

  • approx – Used for approximate search

  • sum_fields – Whether to sum the similarity scores of multiple vectors into a single score or keep them separate

  • metric – Similarity Metric, choose from [‘cosine’, ‘l1’, ‘l2’, ‘dp’]

  • filters – Query for filtering the search results

  • facets – Fields to include in the facets, if [] then all

  • min_score – Minimum score for similarity metric

  • include_vector – Include vectors in the search results

  • include_count – Include count in the search results

  • include_facets – Include facets in the search results

  • hundred_scale – Whether to scale up the metric by 100

  • positive_document_ids – Positive Document IDs to get recommendations for, and the weightings of each document

  • negative_document_ids – Negative Document IDs to get recommendations for, and the weightings of each document

  • fields – Vector fields to search against, and the weightings for them.

  • vector_operation – Aggregation for the vectors, choose from [‘mean’, ‘sum’, ‘min’, ‘max’]

  • asc – Whether to sort the score by ascending order (default is false, for getting most similar results)

advanced_search_with_positive_negative_ids_as_history(collection_name: str, vector: List, positive_document_ids: Dict, negative_document_ids: Dict, fields: Dict, vector_operation: str = 'mean', sum_fields: bool = True, facets: List = [], filters: List = [], metric: str = 'cosine', min_score=None, page: int = 1, page_size: int = 10, include_vector: bool = False, include_count: bool = True, include_facets: bool = False, asc: bool = False, return_curl: bool = False, **kwargs)

Advanced Search with Likes and Dislikes as history

For example: vector search with a query vector together with the IDs of multiple liked and disliked products in the database, using the products’ image and description vectors to find the products most similar to the positives and most dissimilar to the negatives, by what they look like and what they are described to do.

You can also weight each vector field’s contribution to the search, e.g. image_vector_ weighted at 100% and description_vector_ at 50%.

You can also weight each product, e.g. product ID-A weighted at 100% and product ID-B at 50%.

Advanced search also supports filters, which restrict the search to the filtered results, and facets, which give an overview of the products available when a minimum score is set.

Parameters
  • collection_name – Name of Collection

  • page – Page of the results

  • page_size – Size of each page of results

  • approx – Used for approximate search

  • sum_fields – Whether to sum the similarity scores of multiple vectors into a single score or keep them separate

  • metric – Similarity Metric, choose from [‘cosine’, ‘l1’, ‘l2’, ‘dp’]

  • filters – Query for filtering the search results

  • facets – Fields to include in the facets, if [] then all

  • min_score – Minimum score for similarity metric

  • include_vector – Include vectors in the search results

  • include_count – Include count in the search results

  • include_facets – Include facets in the search results

  • hundred_scale – Whether to scale up the metric by 100

  • positive_document_ids – Positive Document IDs to get recommendations for, and the weightings of each document

  • negative_document_ids – Negative Document IDs to get recommendations for, and the weightings of each document

  • fields – Vector fields to search against, and the weightings for them.

  • vector_operation – Aggregation for the vectors, choose from [‘mean’, ‘sum’, ‘min’, ‘max’]

  • vector – Vector, a list/array of floats that represents a piece of data

  • asc – Whether to sort the score by ascending order (default is false, for getting most similar results)

chunk_search(collection_name: str, vector: List, search_fields: list, chunk_scoring: str = 'max', facets: List = [], filters: List = [], metric: str = 'cosine', sum_fields: bool = True, approx: int = 0, min_score=None, page: int = 1, page_size: int = 20, include_vector: bool = False, include_count: bool = True, include_facets: bool = False, asc: bool = False, return_curl: bool = False, **kwargs)

Chunk search functionality

Parameters
  • collection_name – Name of collection

  • vector – A list of values

  • search_fields – A list of fields to search

  • chunk_scoring – How each chunk should be scored

  • approx – How many approximate neighbors to go through
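
As a rough picture of chunk_scoring, the sketch below assumes that every chunk of a document gets its own similarity score and that 'max' promotes the best chunk's score to the document. Only 'max' appears on this page, so no other strategy is modelled; score_document and the sample scores are hypothetical.

```python
def score_document(chunk_scores, chunk_scoring="max"):
    """Collapse per-chunk similarity scores into one document score."""
    if chunk_scoring == "max":
        return max(chunk_scores)
    # Other strategies are not listed on this page, so none are guessed here.
    raise ValueError(f"unsupported chunk_scoring: {chunk_scoring!r}")


# Made-up chunk similarities for two documents
chunk_scores_by_doc = {
    "doc-1": [0.2, 0.9, 0.4],
    "doc-2": [0.6, 0.7],
}
ranked = sorted(
    chunk_scores_by_doc,
    key=lambda doc_id: score_document(chunk_scores_by_doc[doc_id]),
    reverse=True,
)
```

With 'max', a document ranks highly as soon as one of its chunks matches the query well, even if its other chunks are unrelated.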


© Copyright 2020, OnSearch Pty Ltd.
