Backpressure when updating points to avoid OOM #4169
Comments
I like this idea, though I wonder how well it would work in practice. With this, you'd have to define some threshold, which may be quite arbitrary. Also, preventing updates with a 429 does not mean it won't OOM: ongoing optimizations may still claim a lot of memory and cause a crash. Out of curiosity, what operations are you sending, and at what rate?
Could the threshold be some percentage of memory that would leave enough space for background optimizations? Each operation is a batch update with 20 upserts + 20 deletes with a filter selector, issued sequentially by the client, i.e. there is never more than one batch update in flight at a time (called with
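For context, a minimal sketch of the workload described above: strictly sequential batch updates, each combining 20 upserts with a filter-based delete, sent to the REST batch-update endpoint with `wait` enabled. The collection name, vector size, and payload field are placeholders, and the endpoint path and payload shape follow my reading of the Qdrant REST API, so verify them against your server version.

```python
import random
import requests

QDRANT_URL = "http://localhost:6333"   # assumed local node
COLLECTION = "demo"                     # hypothetical collection
DIM = 128                               # hypothetical vector size

def one_batch(batch_no: int) -> None:
    """Send one batch update: 20 upserts plus one filter-based delete."""
    upserts = [
        {
            "id": batch_no * 20 + i,
            "vector": [random.random() for _ in range(DIM)],
            "payload": {"batch": batch_no},
        }
        for i in range(20)
    ]
    body = {
        "operations": [
            {"upsert": {"points": upserts}},
            # Delete the points of the previous batch via a filter selector.
            {"delete": {"filter": {"must": [
                {"key": "batch", "match": {"value": batch_no - 1}}
            ]}}},
        ]
    }
    resp = requests.post(
        f"{QDRANT_URL}/collections/{COLLECTION}/points/batch",
        params={"wait": "true"},
        json=body,
        timeout=30,
    )
    resp.raise_for_status()

# Batches are issued strictly one at a time, as in the report.
for n in range(1, 1000):
    one_batch(n)
```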
Fluent Bit's `storage.max_chunks_up` setting is a similar mechanism: it limits how much buffered chunk data is kept in memory.
Is your feature request related to a problem? Please describe.
A single-threaded client issuing sequential batch updates can cause Qdrant to be OOM-killed.
We've seen this occur regularly in a 3-node cluster with 1 GB of memory per node, for a collection of ~4,000 vectors with replication factor 3 and write consistency factor 3.
Describe the solution you'd like
Qdrant should implement backpressure / flow control and respond with HTTP status 429 to mutating requests when it is close to its memory limit.
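If Qdrant returned 429 when close to its memory limit, the client side could look roughly like the sketch below: back off and retry the mutating request instead of continuing to push writes. This is only an illustration of the proposed protocol; the 429 response and any Retry-After header are the behaviour being requested, not something Qdrant does today.

```python
import time
import requests

def send_with_backpressure(url: str, body: dict,
                           max_retries: int = 8,
                           base_delay: float = 0.5) -> requests.Response:
    """POST a mutating request, retrying with exponential backoff on 429."""
    delay = base_delay
    for _ in range(max_retries):
        resp = requests.post(url, json=body, params={"wait": "true"}, timeout=30)
        if resp.status_code != 429:
            resp.raise_for_status()
            return resp
        # Server signalled backpressure: wait and try again. A numeric
        # Retry-After header (if the server chose to send one) takes
        # precedence over our own growing delay.
        retry_after = resp.headers.get("Retry-After")
        time.sleep(float(retry_after) if retry_after and retry_after.isdigit() else delay)
        delay *= 2
    raise RuntimeError("update still rejected with 429 after retries")
```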
Describe alternatives you've considered
Adding a sleep to the client to slow throughput.
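Roughly, that workaround is an unconditional pause between batches, as sketched below; `one_batch` is the hypothetical helper from the earlier sketch. The delay value is found by trial and error, which is exactly why server-side backpressure would be preferable.

```python
import time

BATCH_DELAY_SECONDS = 0.5   # arbitrary value, tuned by experimentation

for n in range(1, 1000):
    one_batch(n)            # hypothetical helper defined in the earlier sketch
    time.sleep(BATCH_DELAY_SECONDS)
```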
Additional context
See similar implementations of this feature:
Happy to provide a script to replicate the OOMs if it helps.