Add embedding search #3774
A few quick thoughts:
I had some of the same thoughts, but found that I couldn't implement them. Right now, I'm only searching on the titles. This isn't ideal, but I'm using a simple weighted average to turn word embeddings into phrase embeddings, so including the body might drown out the title. Plus, using embeddings means that related terms will still work. It turns out GloVe doesn't have a word embedding for i2c. I should check whether the original tokenizer was tokenizing it differently.
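For readers following the thread, the weighted-average idea mentioned above can be sketched roughly as follows. This is only an illustration, not the PR's code: the toy 3-dimensional vectors, the `WEIGHTS` table, and the function name `phrase_embedding` are all assumptions made up for the example.

```python
# Toy vectors standing in for GloVe entries (made-up values, not real GloVe data).
EMBEDDINGS = {
    "light": [0.9, 0.1, 0.0],
    "component": [0.2, 0.8, 0.1],
    "sensor": [0.1, 0.7, 0.6],
}

# Hypothetical per-word weights; words not listed here get weight 1.0.
WEIGHTS = {"component": 0.3}


def phrase_embedding(phrase, embeddings=EMBEDDINGS, weights=WEIGHTS):
    """Return the weighted average of the vectors of the known words in `phrase`.

    Returns None when no word in the phrase has an embedding.
    """
    total = None
    weight_sum = 0.0
    for token in phrase.lower().split():
        vec = embeddings.get(token)
        if vec is None:
            continue  # e.g. "i2c" when the vocabulary has no entry for it
        w = weights.get(token, 1.0)
        if total is None:
            total = [0.0] * len(vec)
        for i, x in enumerate(vec):
            total[i] += w * x
        weight_sum += w
    if total is None or weight_sum == 0.0:
        return None
    return [x / weight_sum for x in total]
```

With a scheme like this, a long body text would contribute many more terms to the average than a short title, which is one way to see why including the body could drown out the title.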
Try "graph". Changelogs in the results are really annoying. Graph Component by itself doesn't appear in the list. Maybe it would be worth taking ref instances into account separately.
Because the graph component isn't its own page, it doesn't show up as a separate result in either this search or the default Sphinx search. (It would be nice if the subheaders like "light effects", "graph component", "pin" showed up, but that would probably require hardcoding or something)
That sounds interesting, but I'm not sure what that would look like in practice. Could you elaborate?
I just type in what I search for, and if there's nothing relevant showing up while typing, I press the button. I would expect to have a more detailed set of relevant results, without changelog entries, etc., based on similar criteria as the popups I got while typing. In my mind the two are one.
I understand where you're coming from: you would expect them to act similarly. However, this PR is just an incremental change that only adds a search while typing; the more detailed search is exactly the same as the one currently on https://esphome.io/. I could hook into Sphinx's search logic and add some custom filtering and sorting, but I would prefer to just edit the search while typing.
I understand that. But users will see it holistically; it would really help to have a consistent experience.
I think anything that improves searching esphome would be beneficial.
Looks like if I increase the embedding dimension to 50, the performance gets better. I'll have to see how I can increase the dimensions while not shipping too many word embeddings.
It’s somewhat better, yet “Pressure” still doesn’t show 3 components whose titles include the word pressure.
@RubyBailey For some reason one component is using "co_" (which we have no embeddings for) instead of "co", making it embed badly. Ideally it would just use "co", but I just turned the "co_" token into the "co" token, which fixes the problem.
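A token remap like the one described could look roughly like this. It is a sketch under assumptions: the `ALIASES` table, the regular expression used for tokenizing, and the name `normalize_tokens` are hypothetical and not taken from the PR.

```python
import re

# Hypothetical alias table: tokens without an embedding mapped to a close equivalent.
ALIASES = {"co_": "co"}


def normalize_tokens(title, aliases=ALIASES):
    """Lowercase and tokenize a title, then rewrite aliased tokens."""
    tokens = re.findall(r"[a-z0-9_]+", title.lower())
    return [aliases.get(token, token) for token in tokens]
```

For example, `normalize_tokens("CO_ Sensor")` yields `["co", "sensor"]`, so the title is embedded using the "co" vector instead of being skipped.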
Today I was going to update this to the newest embeddings when I saw that 4 days ago another PR was merged to replace the search system with something else. Oh well.
Description:
Shows 3 recommended pages when you start typing in the search box
Fast (it doesn't require hitting "go") and better at finding some pages (light -> light component is the first result, internet -> shows internet-related components), although it doesn't replace the current search
Uses GloVe-based embeddings shipped directly to the browser, with no third-party service. The current search index is ~250kb compressed; this index is ~300kb compressed, including a number of words that might be typed.
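As a rough illustration of how three recommendations could be picked from title embeddings, here is a minimal sketch. The description does not say which similarity measure is used; cosine similarity is an assumption here, and the toy 2-dimensional vectors in `PAGES` and the names `cosine` and `top_pages` are made up for the example.

```python
import math

# Hypothetical precomputed title embeddings (toy 2-dimensional values).
PAGES = {
    "Light Component": [1.0, 0.0],
    "Sensor Component": [0.0, 1.0],
    "Fan Component": [0.7, 0.7],
    "Switch Component": [-1.0, 0.0],
}


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_pages(query_vec, pages=PAGES, k=3):
    """Return the titles of the k pages most similar to the query vector."""
    ranked = sorted(
        pages.items(),
        key=lambda item: cosine(query_vec, item[1]),
        reverse=True,
    )
    return [title for title, _ in ranked[:k]]
```

Because the comparison is a linear scan over precomputed vectors, something like this can run on every keystroke without a server round trip, which matches the "no 3rd party" goal above.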
Merge first: #3773
Related issue (if applicable): N/A
Pull request in esphome with YAML changes (if applicable): N/A
Checklist:
next, because this is new documentation that has a matching pull-request in esphome as linked above, or
current, because this is a fix, change and/or adjustment in the current documentation and is not for a new component or feature.
/index.rst, when creating new documents for new components or cookbook.