User text vector search #8

Spartee · 2022-08-08T22:10:11Z

Description

In addition to the current full-text search capability, we should offer a natural-language vector search that users can use to find products.

This is already largely implemented, but it is turned off because performance is not yet as good as it should be.

Related Code

Backend -> routes.py API route

@r.post("/vectorsearch/text/user",
        response_model=t.List[Product],
        name="product:find_similar_by_user_text",
        operation_id="compute_user_text_similarity")
async def find_products_by_user_text(similarity_request: UserTextSimilarityRequest) -> t.List[Product]:
    # Build the search query against the text_vector field
    q = create_query(similarity_request.search_type,
                     similarity_request.number_of_results,
                     vector_field_name="text_vector",
                     gender=similarity_request.gender,
                     category=similarity_request.category)

    redis_client = await Redis(host=config.REDIS_HOST, port=config.REDIS_PORT, db=0)

    # Obtain the vector from the text model defined in the top-level __init__.py
    vector = TEXT_MODEL.encode(similarity_request.user_text)
    # Run the query, passing the vector as raw bytes
    results = await redis_client.ft().search(q, query_params={"vec_param": vector.tobytes()})

    # Fetch the Product records for those results
    similar_product_pks = [p.product_pk for p in results.docs]
    similar_products = [await Product.get(pk) for pk in similar_product_pks]
    return similar_products
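
Since `create_query` is not shown in this issue, here is a minimal sketch of what a RediSearch KNN query string and its vector parameter could look like. The function names, the `vector_score` alias, and the assumption that `gender` and `category` are indexed as TAG fields are all hypothetical; only `text_vector` and `$vec_param` come from the route above.

```python
from array import array
from typing import Sequence


def build_knn_query(k: int, vector_field: str, gender: str = "", category: str = "") -> str:
    """Return a RediSearch KNN query string with optional tag pre-filters.

    Assumption: gender and category are TAG fields in the index.
    """
    filters = []
    if gender:
        filters.append(f"@gender:{{{gender}}}")
    if category:
        filters.append(f"@category:{{{category}}}")
    # "*" means no pre-filter: search across every indexed document.
    prefilter = "(" + " ".join(filters) + ")" if filters else "*"
    # vector_score is a hypothetical alias for the returned distance.
    return f"{prefilter}=>[KNN {k} @{vector_field} $vec_param AS vector_score]"


def to_float32_blob(values: Sequence[float]) -> bytes:
    """Pack floats as 32-bit values in native byte order (4 bytes per dimension)."""
    return array("f", values).tobytes()


if __name__ == "__main__":
    print(build_knn_query(15, "text_vector", gender="Men"))
    print(len(to_float32_blob([0.1, 0.2, 0.3])), "bytes")
```

With redis-py, a string like this would typically be wrapped in a `Query` object using dialect 2, which parameterized queries require. The byte blob has to match the type and dimension the index was created with, so it is worth checking that the model's output dtype agrees with the index definition.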

Backend -> Pydantic Schema for API route

class UserTextSimilarityRequest(BaseModel):
    user_text: str
    number_of_results: int = 15
    search_type: str = "KNN"
    gender: str = ""
    category: str = ""

The Hugging Face model is held as a global variable in the top-level __init__.py. This could probably be improved.
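
One possible improvement, sketched under assumptions: load the model lazily behind a cached accessor instead of at import time. `StubTextModel`, `MODEL_NAME`, and `get_text_model` are hypothetical names used so the example runs without downloading anything; the real loader would construct the actual encoder instead.

```python
from functools import lru_cache
from typing import List

MODEL_NAME = "example/text-encoder"  # hypothetical placeholder, not from this issue


class StubTextModel:
    """Stand-in for the real encoder so the sketch runs without a model download."""

    def __init__(self, name: str) -> None:
        self.name = name

    def encode(self, text: str) -> List[float]:
        # Not a real embedding: returns a fixed-size dummy vector.
        return [float(len(text)), 0.0, 0.0]


@lru_cache(maxsize=1)
def get_text_model() -> StubTextModel:
    """Build the model on first use and reuse the same instance afterwards."""
    return StubTextModel(MODEL_NAME)


if __name__ == "__main__":
    model = get_text_model()
    print(model is get_text_model())  # same cached instance
    print(model.encode("red dress"))
```

If the backend is FastAPI, as the route decorator suggests, an accessor like this could be injected into the route as a dependency, which also makes it easier to swap in a stub during tests.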

Frontend -> Header.tsx (currently commented out)

<Button
  onClick={() => queryProductsByUserText()}
  variant="outline-success"
  disabled={searchText.length < 1}>
  Vector Search
</Button>

Frontend JS to call backend -> api.ts

export const getSemanticallySimilarProductsbyText = async (text: string,
                                                           gender="",
                                                           category="",
                                                           search='KNN',
                                                           limit=15,
                                                           skip=0) => {
  const body = {
    user_text: text,
    search_type: search,
    number_of_results: limit,
    gender: gender,
    category: category
  };

  const url = MASTER_URL + "vectorsearch/text/user";
  return fetchFromBackend(url, 'POST', body);
};

TODO

- Investigate performance of user text vector search.
  - This will largely be in the data prep stage. We currently use the description as the text vector data. This is manually cleaned with regex and is probably not optimal.
- Once performance is acceptable, enable the vector search in addition to text search.
- Make sure the buttons look right on the front end.
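
As a starting point for the data prep investigation, a minimal sketch of a description cleaner is shown below. The actual regex rules used in the project are not shown in this issue, so `clean_description` and the specific steps (unescape HTML entities, drop tags, collapse whitespace) are assumptions for illustration only.

```python
import html
import re

_TAG_RE = re.compile(r"<[^>]+>")  # anything that looks like an HTML tag
_WS_RE = re.compile(r"\s+")       # runs of whitespace, including newlines


def clean_description(raw: str) -> str:
    """Normalize a product description before it is embedded."""
    text = html.unescape(raw)       # "&amp;" -> "&"
    text = _TAG_RE.sub(" ", text)   # replace tags with a space
    text = _WS_RE.sub(" ", text)    # collapse whitespace to single spaces
    return text.strip()


if __name__ == "__main__":
    print(clean_description("<p>Soft &amp; warm\n  fleece</p>"))
```

Keeping the cleaning step in one small function makes it easier to compare search quality across cleaning strategies, since the same function can be reused when re-embedding the catalog.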

Spartee added the enhancement New feature or request label Aug 8, 2022
