Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(plugins): allow multi-index categoryAttribute in query suggestions #981

Open
wants to merge 6 commits into
base: next
Choose a base branch
from

Conversation

andreyvolokitin
Copy link

Related discussion: #978

In short

Multiple indexes can be connected to a suggestion index. This cause JS error when there's a suggestion match from the index, which is not provided in categoryAttribute. Also there's a need to split a category across multiple indexes, and then accumulate values across all indexes. Example: storing "products" and "blogs" in separate indexes while using an "area_type" facet for each index, like area_type: "Products", then showing this categories for suggestions

Details

When we have multiple indexes connected to suggestions index — the only useable case related to categories/facets is putting the same facet attribute name containing unique value per each index (so it’s a single category split across indexes).

If we want to show this category for suggestions, we might use multi-index path syntax for categoryAttribute: ["instant_search_1|instant_search_2", "facets", "exact_matches", "facet"]

In this case only the first match in the array of matches for the provided path will be used in combination with others for remaining indexes — actually such matches should only contain 1 element by the logic described above, and If there’re multiple matches in a single index, then it’s not a proper usage of this approach, which won’t make much sense.

I didn't found documentation source to write a description, but I made a test which shows this

@codesandbox-ci
Copy link

codesandbox-ci bot commented May 30, 2022

This pull request is automatically built and testable in CodeSandbox.

To see build info of the built libraries, click here or the icon next to each commit SHA.

Latest deployment of this branch, based on commit 3ee4469:

Sandbox Source
@algolia/autocomplete-example-github-repositories-custom-plugin Configuration
@algolia/autocomplete-example-instantsearch Configuration
@algolia/autocomplete-example-playground Configuration
@algolia/autocomplete-example-preview-panel-in-modal Configuration
@algolia/autocomplete-example-react-renderer Configuration
@algolia/autocomplete-example-starter-algolia Configuration
@algolia/autocomplete-example-starter Configuration
@algolia/autocomplete-example-reshape Configuration
@algolia/autocomplete-example-vue Configuration

searchClient,
indexName: 'indexName',
categoryAttribute: [
'index_1|index_2',
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't really thought about this feature yet, but api-wise, I think it would be better if this was an array instead of a magic string:

Suggested change
'index_1|index_2',
['index_1', 'index_2'],

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably even nested arrays so that you can target different attribute names:

[
  ['index_1', 'facets', 'exact_matches', 'data_origin1'],
  ['index_2', 'facets', 'exact_matches', 'data_origin2']
]

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, looking at the example again, the whole item should be an array if we accept multiple indices

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought a lot about how it could work, but I didn't came out with a case where we need to use different attribute names (maybe there is a case). data_origin1 and data_origin2 basically need to be a single attribute — data_origin, but this attribute would contain different values per index. So basically it almost the same as current categoryAttribute, but it searches multiple indexes, and picks only the first match from each.

I need to wrap my head around a multi-attribute use-case, it wouldn't hurt probably, but currently I can't come up with an example usage for this — the current PR logic would accumulate and sort the multi-index matches like they're of a same "kind", but they can be completely different with multiple attributes

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also described this in a "Details" section of PR description. Will providing different attribute names indeed make sense?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see how in most/all use cases the facets would be completely identical, but I could imagine another index where the facet values are the same but translated, so they wouldn't be identical.

I prefer handling an optionally nested array over magic strings, as people could already be using | in their index names

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I also agree with arrays, just wasn't sure about multiple attributes, will redo it shortly

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did it, also outlined the docs in categoryAttribute description

Comment on lines 161 to 164
// use only the first facet value if multiple indexes needs to be targeted by multiple paths
return totalCategories.concat(
paths.length > 1 ? attrVal[0] : attrVal
);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you explain why only one should be kept?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When indexes gets split by some uniform "property", like data origin, each index will contain same attribute (i.e. data_origin), and the value will be different for each index (data_origin="Store" or data_origin="Blog"), and same for all records across the index. By this logic, attrVal should only contain 1 record, and if it's not the case — I thought it would "wrong" and took "precautions"

But thinking about it now — the categoryAttribute logic/intention is to provide a facet used for accumulating available "categories" (facet values) which produce hits per each "category". This might mean there's no reason which will forbid to pass multiple non-related facets/attributes, for any indexes — plugin will just accumulate all the things, sorted by count. It is all available "categories" after all, so it shouldn't cause anything unusual

If the user wants to use the logic described in a 1st paragraph — they can use a uniform facet value across the index and get their single element inside attrVal . If something else is required — all the categories will be accumulated by provided paths

If this seems like a proper solution I'll keep all the elements (I'm not very familiar with all algolia peculiarities so can't judge, also this gets back to multi-attribute discussion — now it seems like it should be actually useful quite often).

One thing is that if there're multiple indexes separated in described manner, and additionally some facet is used which can have different values across all indexes — how the actual searching will be done? This is multiple indexes needing to be filtered by facet and all hits accumulated in a single widget — but this is, as far as I understood, actually impossible to do I think? So could such attribute only be useful if applied for a single index? And could this be useful in a such "multi-index" scenario? It seems that for a single index it definitely could be useful

In short: I don't have enough algolia experience to decide on this

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I think it will be best to keep all the items — it will be just a generic way of accumulating categories, and the user will decide what should be done with it

@sarahdayan
Copy link
Member

Hey @andreyvolokitin and thanks for opening this. Before you do more work on this, could you please give us a higher-level idea of what you're trying to achieve and isn't possible with the current API?

A sandbox would help us better visualize it. Here's a starter one using the build generated by your PR (so, with your changes), you can use it to demonstrate what this API enables: https://codesandbox.io/s/algolia-autocomplete-example-starter-ecqk6z?file=/app.tsx

Feel free to share indications too, so it's clear to us what is now possible.

@andreyvolokitin
Copy link
Author

andreyvolokitin commented Jun 3, 2022

@sarahdayan for a full demo I'll need a couple of indexes, and suggestion index with properly generated suggestions, maybe I could use my own algolia app for this, I’ll see. Also the idea is demonstrated by the test case, I'll try to explain this again:

Multiple indexes can be connected to a suggestion index, so API response for suggestions might contain random combinations of suggestions facets (residing in index_name.facets.exact_matches) — facets for only 1 of the indexes, or for both. A single categoryAttribute path allows to target facets only for a single index. If suggestion results only contain facets for a different index (not the one provided in categoryAttribute) — this will result in Cannot read properties of undefined error when getting facets with provided path here, because the facets are present in API response, but not by the path which was provided in categoryAttribute (the index name will be different). This is a minimal problem here (edit: no longer relevant as per this)

My use-case is to show facets from multiple indexes, via targeting them by multiple paths with categoryAttribute — this also fixes the JS error. Motivation: when we store different data in different indexes (like store products in 1 index, and blog posts in another index), we might want to search them simultaneously, and use createQuerySuggestionsPlugin to categorise suggestions. I.e. if we set area_label facet at each index, which will contain same value for all items across the index (i.e. data_origin="Store” in one index and data_origin="Blog” in another index) then we will be able to achieve this — the data_origin facet will be “split across” multiple indexes, while in suggestion categories we will be getting this (if we provide multiple paths in categoryAttribute):

'cooktop', // Query Suggestions item
  'in Store’, // Category item
  'in Blog’, // Category item

While discussing this PR, the idea was proposed to also allow multiple attribute names for categoryAttribute (#981 (comment)). The code for this is what this PR is currently doing — it gets all the facets from all the provided paths. The result of this is also shown in a test case — when allowing different attribute names for categoryAttribute they can target anything, so total accumulated facets might be quite non-related, but this is left for the user to decide what to target and how to handle results

@andreyvolokitin
Copy link
Author

Is there still anything I should do to get this merged?

@jahid56
Copy link

jahid56 commented Sep 21, 2022

Any Update regarding this issue???

@Haroenv Haroenv requested review from sarahdayan and removed request for francoischalifour and Haroenv December 28, 2022 09:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants