Backpressure when updating points to avoid OOM #4169

Open
tekumara opened this issue May 3, 2024 · 4 comments

Comments

tekumara commented May 3, 2024

Is your feature request related to a problem? Please describe.

A single-threaded client doing sequential batch updates can cause qdrant to be OOM-killed.

We've seen this occur regularly in a 3-node cluster with 1 GB of memory per node, for a collection of ~4000 vectors, with a replication factor of 3 and write consistency of 3.

Describe the solution you'd like

qdrant should implement backpressure / flow control and respond with HTTP 429 (Too Many Requests) to mutating requests when it is close to its memory limit.
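For illustration, here is a rough sketch of how a client could honour such a response. The URL, collection name, and batch shape are made up, and the 429 behaviour itself is the requested feature, not something qdrant returns today:

```python
# Hypothetical client-side handling of the proposed 429 backpressure signal.
# The endpoint is qdrant's REST upsert endpoint; the retry loop is the new part.
import time

import requests

QDRANT_URL = "http://localhost:6333"  # assumption: local node
COLLECTION = "my_collection"          # hypothetical collection name


def upsert_with_backoff(points: list[dict], max_retries: int = 5) -> None:
    """Upsert a batch, backing off exponentially while the server reports overload."""
    delay = 0.5
    for _ in range(max_retries):
        resp = requests.put(
            f"{QDRANT_URL}/collections/{COLLECTION}/points",
            params={"wait": "true"},
            json={"points": points},
            timeout=30,
        )
        if resp.status_code != 429:
            resp.raise_for_status()
            return
        # Server signals it is near its memory limit: wait and retry.
        time.sleep(delay)
        delay *= 2
    raise RuntimeError("server still overloaded after retries")
```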

Describe alternatives you've considered

Adding a sleep to the client to slow throughput.
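As a very rough sketch of that workaround (the interval is an arbitrary guess; in practice it can only be tuned by trial and error, which is why a server-side signal would be preferable):

```python
# Crude client-side throttle: enforce a minimum interval between batch updates.
import time


class Throttle:
    """Blocks so that successive wait() calls are at least `interval` seconds apart."""

    def __init__(self, interval: float) -> None:
        self.interval = interval
        self._last = float("-inf")

    def wait(self) -> None:
        now = time.monotonic()
        sleep_for = self.interval - (now - self._last)
        if sleep_for > 0:
            time.sleep(sleep_for)
        self._last = time.monotonic()


throttle = Throttle(interval=0.5)  # 0.5s between batches, chosen arbitrarily
# Call throttle.wait() before issuing each batch update.
```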

Additional context

See similar implementations of this feature:

Happy to provide a script to replicate the OOMs if it helps.

timvisee (Member) commented May 3, 2024

I like this idea though I wonder how well it would work in practice.

With this, you'd have to define some threshold, which may be quite arbitrary. Also, rejecting updates with a 429 does not guarantee it won't OOM: ongoing optimizations may still claim a lot of memory and cause a crash.

Out of curiosity: what operations are you sending, and at what rate?

tekumara (Author) commented May 3, 2024

Could the threshold be some percentage of memory that would leave enough headroom for background optimizations?

Each operation is a batch update with 20 upserts + 20 deletes with a filter selector, issued sequentially by the client, i.e. there is never more than one batch update in flight at a time (each call uses wait=true). See these logs, which end when the qdrant-0 node is killed. The batch updates take ~60ms each, one after another.
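For context, each operation looks roughly like the following (a sketch with the Python client; the collection name, vector size, and payload fields are invented, and the delete side is shown as a single filter-based delete):

```python
# Approximate reconstruction of one batch update: 20 upserts plus a
# filter-selector delete, sent with wait=True so only one batch is in flight.
import random

from qdrant_client import QdrantClient, models

client = QdrantClient(url="http://localhost:6333")

upserts = [
    models.PointStruct(
        id=i,
        vector=[random.random() for _ in range(384)],  # assumed vector size
        payload={"batch": "current"},
    )
    for i in range(20)
]

client.batch_update_points(
    collection_name="my_collection",
    update_operations=[
        models.UpsertOperation(upsert=models.PointsList(points=upserts)),
        models.DeleteOperation(
            delete=models.FilterSelector(
                filter=models.Filter(
                    must=[
                        models.FieldCondition(
                            key="batch",
                            match=models.MatchValue(value="stale"),
                        )
                    ]
                )
            )
        ),
    ],
    wait=True,  # block until applied before the next batch is sent
)
```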

@Stanleylail

1. Back pressure for HTTP requests doesn't prevent OutOfMemoryError (#1016).
2. WebfluxUploadController.java looks like the option.

@Stanleylail

Fluent Bit's 'storage.max_chunks_up' is a similar backpressure mechanism: it caps the number of buffer chunks held in memory.
