feat: Introduce top-k algo to HNSW #4114
lib/common/common/src/top_k.rs
Outdated
@@ -11,7 +11,7 @@ use crate::types::{ScoreType, ScoredPointOffset};
 #[derive(Default)]
 pub struct TopK {
     k: usize,
-    elements: Vec<Reverse<ScoredPointOffset>>,
+    pub elements: Vec<Reverse<ScoredPointOffset>>,
I would prefer not to expose this.
You can either return a reference to a slice from a method, or deconstruct the whole thing into a vector if you need ownership.
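The two options suggested above could look roughly like the sketch below. It is only an illustration: `ScoredPointOffset` is a stand-in with assumed field types, and the names `from_vec`, `capacity`, `as_slice`, and `into_vec` are hypothetical, not necessarily what the PR ended up using.

```rust
use std::cmp::Reverse;

// Stand-in for the crate's ScoredPointOffset; the field types are assumptions.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct ScoredPointOffset {
    pub idx: u32,
    pub score: f32,
}

#[derive(Default)]
pub struct TopK {
    k: usize,
    // Stays private; callers go through the accessors below.
    elements: Vec<Reverse<ScoredPointOffset>>,
}

impl TopK {
    // Hypothetical constructor, used only to populate the example.
    // It keeps the first k items without ranking them.
    pub fn from_vec(k: usize, items: Vec<ScoredPointOffset>) -> Self {
        let elements = items.into_iter().take(k).map(Reverse).collect();
        TopK { k, elements }
    }

    pub fn capacity(&self) -> usize {
        self.k
    }

    // Option 1: read-only view, no ownership transfer.
    pub fn as_slice(&self) -> &[Reverse<ScoredPointOffset>] {
        &self.elements
    }

    // Option 2: consume the structure and hand the elements to the caller.
    pub fn into_vec(self) -> Vec<ScoredPointOffset> {
        self.elements.into_iter().map(|Reverse(x)| x).collect()
    }
}
```

Either way the field itself stays private, so the internal layout can change without breaking callers.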
Fixed.
3436510 to 0203d07
* feat: Allow building any branch with GH dev image builder workflow * fix: Trim feat, fix, etc * fix: Replace all / with -
None => true,
Some(removed) => removed.idx != score_point.idx,
};
let was_added = self.nearest.push(score_point);
Why is the deduplication based on removed.idx != score_point.idx no longer required?
We can't return the index of the point that was discarded, because in this case we would need to return a whole list. So the TopK return type is different: it is just a bool.
None => ScoreType::min_value(),
Some(worst_of_the_best) => worst_of_the_best.score,
}
self.nearest.threshold()
Why is it OK to use threshold here, given that it is lazy and might not reflect the full information about the top values?
This might not be OK; it is up to the benchmarks.
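To make the two concerns in this thread concrete, here is a minimal sketch of one possible lazy top-k scheme. It is an assumption for illustration, not necessarily the implementation in this PR: candidates are buffered up to 2k, then the buffer is cut back to the best k in one step. The names `LazyTopK` and `Scored` are hypothetical. It shows why `push` can only report a bool (up to k points are dropped at once) and why `threshold` can lag behind the true k-th best score between truncations.

```rust
// Hypothetical stand-in for a scored point; field types are assumptions.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Scored {
    pub idx: u32,
    pub score: f32,
}

pub struct LazyTopK {
    k: usize,
    buf: Vec<Scored>,
    threshold: f32,
}

impl LazyTopK {
    pub fn new(k: usize) -> Self {
        assert!(k > 0, "k must be positive");
        LazyTopK { k, buf: Vec::with_capacity(2 * k), threshold: f32::MIN }
    }

    // Returns true if the candidate was buffered, false if it was rejected
    // outright. Buffered candidates may still be dropped by a later truncation.
    pub fn push(&mut self, item: Scored) -> bool {
        if item.score <= self.threshold {
            return false;
        }
        self.buf.push(item);
        if self.buf.len() >= 2 * self.k {
            // Sort descending by score and keep the best k. Up to k items are
            // discarded here in one step, so there is no single "removed" item.
            self.buf.sort_unstable_by(|a, b| b.score.total_cmp(&a.score));
            self.buf.truncate(self.k);
            self.threshold = self.buf[self.k - 1].score;
        }
        true
    }

    // Only updated on truncation, so it can be lower than the true k-th best.
    pub fn threshold(&self) -> f32 {
        self.threshold
    }

    pub fn into_vec(mut self) -> Vec<Scored> {
        self.buf.sort_unstable_by(|a, b| b.score.total_cmp(&a.score));
        self.buf.truncate(self.k);
        self.buf
    }
}
```

With k = 2, pushing scores 1.0, 2.0, and 3.0 leaves the threshold at its initial minimum even though the true second-best score is already 2.0. Only the fourth push triggers a truncation and raises the threshold. A search loop that prunes on `threshold()` would therefore prune less aggressively than one using an exact heap, which is the trade-off the benchmarks would have to settle.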
It seems this PR is no longer needed. Closing it.
HNSW equivalent of #4037 PR.
New Feature Submissions:
cargo +nightly fmt --all command prior to submission?
cargo clippy --all --all-features command?
Todo: