Ingestion bug when using partition key and unsorted data #4925
Comments
The number of published documents is probably a limitation of the UI. I think we download a partial list of splits metadata and do the computation in JavaScript. @PieReissyet Concretely, what is the other issue you observe?
My hunch is that heavy partitioning makes the load on merges heavier, especially if the merge policy is inappropriate and the number of partitions is high. In that case the default setting in … You can confirm this hypothesis by looking at the pending merge / ongoing merge curves. After 10 × commit_timeout, large merges will occur and the peak will get even higher. The graph should show the number of pending merges eventually coming back to 0 or close to 0, not diverging.
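As a rough back-of-envelope illustration of that hypothesis (a sketch only; the partition count, commit timeout, and the one-split-per-partition-per-commit assumption are illustrative, not Quickwit defaults or internals):

```python
# Rough back-of-envelope: how many small splits the merge pipeline must absorb.
# All numbers are assumptions for illustration, not Quickwit defaults or internals.
num_partitions = 1_000        # distinct partition key values touched per commit window
commit_timeout_secs = 60      # commit window length
ingest_duration_secs = 3_600  # one hour of ingestion

commits = ingest_duration_secs // commit_timeout_secs

# Worst case with unsorted data: every commit touches every partition,
# so each commit can emit up to one small split per partition.
splits_unsorted = commits * num_partitions

# With data sorted by partition key, each commit touches only a handful
# of partitions, so far fewer small splits are produced.
partitions_per_commit_sorted = 2
splits_sorted = commits * partitions_per_commit_sorted

print(f"unsorted: ~{splits_unsorted} small splits to merge")  # ~60,000
print(f"sorted:   ~{splits_sorted} small splits to merge")    # ~120
```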
Ingestion gets blocked, the numbers go crazy on the UI, then eventually ingestion resumes after a while. We had use cases where it was stuck for many hours and others where it eventually recovered, depending on the numbers. We did not dig into the issue further since we managed to get rid of it by just sorting our data by partition key. I will definitely try the …
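For reference, a minimal sketch of the workaround mentioned above (sorting the NDJSON by partition key before ingestion), assuming the partition key is a plain field named tenant_id (a hypothetical name) and that the file fits in memory:

```python
import json

# Hypothetical partition key field name; replace with the actual field used in the index.
PARTITION_KEY = "tenant_id"


def sort_ndjson_by_partition_key(src_path: str, dst_path: str) -> None:
    """Sort an NDJSON file by partition key before ingesting it.

    Loads the whole file into memory, so for datasets in the tens of GB
    split the input first or use an external sort instead.
    """
    with open(src_path, "r", encoding="utf-8") as src:
        lines = src.readlines()
    lines.sort(key=lambda line: json.loads(line)[PARTITION_KEY])
    with open(dst_path, "w", encoding="utf-8") as dst:
        dst.writelines(lines)


if __name__ == "__main__":
    sort_ndjson_by_partition_key("docs.ndjson", "docs.sorted.ndjson")
```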
Yes, that is in line with what I thought.
Actually, scratch that, I am wrong.
We had issues with ~130M docs / ~45 GB of uncompressed data.
Describe the bug
Can't ingest data while using a partition key if the number of distinct partition key values is high and the NDJSON data is not sorted by partition key.
Steps to reproduce (if applicable)
Ingest 100M or more docs with a partition key and unsorted data, using the default configuration.
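A minimal sketch of a generator for such a dataset; the field names, partition-key cardinality, and output path are assumptions for illustration and not taken from the original report:

```python
import json
import random

NUM_DOCS = 100_000_000   # matches the "100M or more docs" above
NUM_PARTITIONS = 1_000   # hypothetical high partition-key cardinality

# Documents are written in random partition order, i.e. NOT sorted by partition key.
with open("unsorted.ndjson", "w", encoding="utf-8") as out:
    for i in range(NUM_DOCS):
        doc = {
            "tenant_id": random.randrange(NUM_PARTITIONS),  # hypothetical partition key field
            "body": f"document {i}",
        }
        out.write(json.dumps(doc) + "\n")
```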
Expected behavior
Data is ingested into the index without any issue.
Configuration:
Default configuration, version 0.8.1. For further details, please see this thread (https://discord.com/channels/908281611840282624/1233364603480576092) where we discuss the issue and hopefully find a solution.