feat: new text embedding for sparse vector #466
The failed CI is due to an upstream incompatibility:
2089054 to 0e0c58b
The reason why CI fails is that
src/datatype/text_svecf32.rs
Outdated
if *x != F32::zero() {
    match need_splitter {
        true => {
            buffer.push_str(format!("{}:{}", i + 1, x).as_str());
I don't feel good about indexing from 1. It's not consistent with subscripting.
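To make the concern concrete, here is a minimal sketch of the text encoding under discussion. It is not the PR's code: `dense_to_sparse_text` is a hypothetical name, plain `f32` stands in for the project's `F32` wrapper, and the `{index:value,...}/dims` layout is assumed from pgvector's sparse vector text format. The slice is subscripted from 0, but the emitted indices start at 1.

```rust
/// Format a dense vector as sparse text, for example `{1:1.5,3:2}/4`.
/// Hypothetical helper for illustration only.
fn dense_to_sparse_text(dense: &[f32]) -> String {
    let mut buffer = String::from("{");
    let mut need_splitter = false;
    for (i, x) in dense.iter().enumerate() {
        // Only non-zero entries are written out.
        if *x != 0.0 {
            if need_splitter {
                buffer.push(',');
            }
            // `i` is the 0-based subscript; the text index is 1-based.
            buffer.push_str(&format!("{}:{}", i + 1, x));
            need_splitter = true;
        }
    }
    // `}}` is an escaped closing brace, followed by the dimension count.
    buffer.push_str(&format!("}}/{}", dense.len()));
    buffer
}
```

With this sketch, element `dense[2]` appears in the text as index `3`, which is exactly the mismatch with subscripting raised above.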
69e527e to 2823e00
Let's hold this PR for now because of the conflict between 1-based and 0-based arrays.
f86433b to 2d6c196
This is used to support the bm25 extension. It can produce a string instead of depending on pgvecto.rs/pgvector. cc @cutecutecat
cbf9900 to 12c8d9f
1b376ab to 2581c60
be449e4 to 8cb55a3
Parsing must accept all valid inputs and reject all invalid inputs. It would be better if it could be written trivially.
e2d685d to 81b3a58
81b3a58 to 02d15a9
What about {}/1/2?
ed1f8ad to 9a17411
Fixed.
Since you have defined ParseState, you can write the code in automata style.
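As an illustration of the automata style, here is a sketch of a parser driven by a single `match` on `(state, char)`. It is not the PR's implementation: the `ParseState` variants, the `Sparse` struct, and `parse_sparse_text` are hypothetical, and the sketch assumes no whitespace, strictly ascending 1-based indices, and plain `f32` values. Any transition not listed is rejected, so an input such as `{}/1/2` fails on the second `/`.

```rust
/// Parsed result with 0-based indexes (hypothetical type for this sketch).
#[derive(Debug, PartialEq)]
struct Sparse {
    dims: u32,
    indexes: Vec<u32>,
    values: Vec<f32>,
}

/// States of the automaton (variant names are assumptions).
#[derive(Debug, Clone, Copy)]
enum ParseState {
    Start, // expecting '{'
    Open,  // just after '{': an index digit or '}'
    Comma, // just after ',': an index digit is required
    Index, // inside the digits of an index
    Value, // inside a value, up to ',' or '}'
    Close, // just after '}': expecting '/'
    Dims,  // inside the digits of the dimension count
}

fn parse_sparse_text(input: &str) -> Result<Sparse, String> {
    use ParseState::*;
    let mut state = Start;
    let mut token = String::new();
    let mut indexes: Vec<u32> = Vec::new();
    let mut values: Vec<f32> = Vec::new();
    for c in input.chars() {
        state = match (state, c) {
            (Start, '{') => Open,
            (Open, '}') => Close,
            (Open | Comma | Index, '0'..='9') => {
                token.push(c);
                Index
            }
            (Index, ':') => {
                let one_based: u32 = token
                    .parse()
                    .map_err(|_| format!("bad index `{}`", token))?;
                // Text indices are 1-based; store them 0-based.
                let index = one_based
                    .checked_sub(1)
                    .ok_or_else(|| "index must start from 1".to_string())?;
                if let Some(&last) = indexes.last() {
                    if last >= index {
                        return Err("indexes must be strictly ascending".to_string());
                    }
                }
                indexes.push(index);
                token.clear();
                Value
            }
            (Value, ',' | '}') => {
                let value: f32 = token
                    .parse()
                    .map_err(|_| format!("bad value `{}`", token))?;
                values.push(value);
                token.clear();
                if c == ',' { Comma } else { Close }
            }
            (Value, _) => {
                token.push(c);
                Value
            }
            (Close, '/') => Dims,
            (Dims, '0'..='9') => {
                token.push(c);
                Dims
            }
            // Every other transition is invalid, including a second '/'.
            (s, c) => return Err(format!("unexpected `{}` in state {:?}", c, s)),
        };
    }
    // The only accepting configuration is Dims with at least one digit.
    match state {
        Dims if !token.is_empty() => {}
        _ => return Err("incomplete input".to_string()),
    }
    let dims: u32 = token
        .parse()
        .map_err(|_| format!("bad dims `{}`", token))?;
    if let Some(&last) = indexes.last() {
        if last >= dims {
            return Err("index out of bounds".to_string());
        }
    }
    Ok(Sparse { dims, indexes, values })
}
```

Because the transition table is explicit, checking that the parser accepts every valid input and rejects every invalid one comes down to reviewing the listed arms and the single accepting state.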
7bbb0e8 to 746bb52
Signed-off-by: cutecutecat <junyuchen@tensorchord.ai>
Signed-off-by: usamoi <usamoi@outlook.com>
746bb52 to a6c78d5
Part of #459
proc_macro_byte_character from upstream
Reminder
The index is from 1 instead of 0 at pgvector