Supporting in-document IDs and joined doc imports/exports #51

Open
ashvardanian opened this issue Sep 6, 2022 · 0 comments
Labels
good first issue Good for newcomers

Comments

@ashvardanian
Contributor

Benefits

Java, Go, and many other bindings will receive "upsert" functionality with just a single char const * argument.
Similarly, streaming exports can embed the ID into the packed document, which simplifies post-processing for the user.
This form is compatible with MongoDB and the Elastic Stack, which are behind in terms of Apache Arrow adoption.

Changes

  1. If no docs_count is set:
    • If the format is JSON, we count the newlines to infer the number of documents.
    • At least the first length variable must be set.
  2. Every input document is checked for a top-level _id field that can be cast to an integer.
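
The two changes above can be illustrated with a minimal Python sketch. It is not the project's implementation: the function names `split_joined_docs`, `extract_id`, and `parse_joined_docs` are hypothetical, the input is assumed to be newline-delimited JSON with one document per line, and the exact meaning of "integer-castable" (here: integers, whole-number floats, and numeric strings, but not booleans) is an assumption.

```python
import json
from typing import Any, Dict, List, Tuple


def split_joined_docs(payload: str) -> List[str]:
    """Split a joined payload into documents, one per non-empty line."""
    return [line for line in payload.splitlines() if line.strip()]


def extract_id(doc: Dict[str, Any]) -> int:
    """Return the top-level `_id` as an int, or raise ValueError."""
    if "_id" not in doc:
        raise ValueError("document has no top-level _id field")
    raw = doc["_id"]
    # Assumption: booleans are rejected even though Python can cast them.
    if isinstance(raw, bool):
        raise ValueError("_id must not be a boolean")
    # Assumption: floats are accepted only when they hold a whole number.
    if isinstance(raw, float) and not raw.is_integer():
        raise ValueError(f"_id {raw!r} is not a whole number")
    try:
        return int(raw)
    except (TypeError, ValueError) as exc:
        raise ValueError(f"_id {raw!r} is not integer-castable") from exc


def parse_joined_docs(payload: str) -> List[Tuple[int, Dict[str, Any]]]:
    """Parse every document and pair it with its extracted ID."""
    pairs = []
    for line in split_joined_docs(payload):
        doc = json.loads(line)
        if not isinstance(doc, dict):
            raise ValueError("each document must be a JSON object")
        pairs.append((extract_id(doc), doc))
    return pairs


if __name__ == "__main__":
    joined = '{"_id": 1, "name": "a"}\n{"_id": "42", "name": "b"}\n'
    for key, doc in parse_joined_docs(joined):
        print(key, doc)
```

With this shape, a binding only needs to pass one string: the document count comes from the line count and each key comes from the document itself.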