
Add ability for prometheus & thanos sidecar to flush on graceful shutdown #6540

Open · Nashluffy opened this issue Apr 20, 2024 · 3 comments

Nashluffy commented Apr 20, 2024

Component(s)

Prometheus

What is missing? Please describe.

We run several short-lived clusters (sometimes only an hour old). When using the Thanos sidecar approach, scaling down a Prometheus replica (either permanently or by removing shards) loses all chunks still in the head, since they have not yet been compacted into a block and uploaded.

Several issues have touched on this:
#4967
prometheus/prometheus#12261
thanos-io/thanos#1849

It would be great to have native support in prometheus-operator for flushing and uploading what's in the head (likely requiring changes to other components as well).

Unfortunately there's no TSDB API for "flushing" the head, but you can create a snapshot of the TSDB, which persists head data as a block, and then move any new blocks from that snapshot into the top-level data dir.

The Thanos sidecar can then perform its own "flushing" by uploading blocks one last time.

prometheus-operator feels like the most natural place to orchestrate this, but I'm open to discussion!

```yaml
kind: Prometheus
spec:
  thanos:
    flushOnShutdown: true
```

Describe alternatives you've considered.

I'm currently achieving this with a separate container that uses a preStop hook (sketched below) to:

  1. call the snapshot endpoint of Prometheus,
  2. move the new blocks from that snapshot dir into the top-level data dir, and
  3. run `thanos tools bucket upload-blocks`.

The snapshot adds little extra storage, as blocks that already exist on disk appear in the snapshot only as symlinks to the actual block.
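For concreteness, here is a minimal sketch of what that container could look like, assuming Prometheus runs with `--web.enable-admin-api`, the TSDB volume is mounted at `/prometheus`, and an object storage config is mounted at `/etc/thanos/objstore.yml` (the container name, image tag, and paths are all illustrative, not exactly what we run):

```yaml
# Pod spec fragment; names, paths, and the image tag are assumptions.
# Requires Prometheus to run with --web.enable-admin-api.
containers:
  - name: flush-on-shutdown
    image: quay.io/thanos/thanos:v0.34.1  # any release with `tools bucket upload-blocks`
    volumeMounts:
      - name: prometheus-data             # the volume backing the Prometheus TSDB
        mountPath: /prometheus
      - name: objstore-config
        mountPath: /etc/thanos
    lifecycle:
      preStop:
        exec:
          command:
            - /bin/sh
            - -c
            - |
              # 1. Snapshot the TSDB; this persists the in-memory head as a block.
              snap=$(wget -qO- --post-data='' http://localhost:9090/api/v1/admin/tsdb/snapshot \
                | sed -n 's/.*"name":"\([^"]*\)".*/\1/p')
              # 2. Move blocks that exist only in the snapshot into the data dir;
              #    blocks already on disk show up as symlinks and are skipped.
              for b in "/prometheus/snapshots/$snap"/*; do
                [ -e "/prometheus/$(basename "$b")" ] || mv "$b" /prometheus/
              done
              # 3. Upload any blocks the sidecar hasn't shipped yet.
              thanos tools bucket upload-blocks \
                --path=/prometheus \
                --objstore.config-file=/etc/thanos/objstore.yml
```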

We previously used a Thanos Receive setup, which avoided this problem altogether, but it was wildly more expensive and carried a lot of operational overhead.

Environment Information.

Kubernetes Version: 1.27
Prometheus-Operator Version: 0.73

Nashluffy added the kind/feature and needs-triage labels on Apr 20, 2024
nicolastakashi added the area/sharding label and removed needs-triage on Apr 20, 2024

nicolastakashi (Contributor) commented

Hey @Nashluffy, thanks for this new issue. 😄
Yes, the described steps would work and sound pretty nice.
I'd like something less hacky that doesn't rely on lifecycle hooks, though.
I just opened a new issue on the Thanos project; let's see what people think about it:
thanos-io/thanos#7295

Nashluffy (Author) commented

Thanks! I'll keep the prometheus-operator discussion here.

Just another point: I think a call to the flush endpoint should be part of the Prometheus finalizer as well, not just part of scaling down shards. This would capture my use case, as we don't use shards.

ArthurSens (Member) commented

This seems aligned with one of the ideas we had for graceful shutdown (see https://github.com/prometheus-operator/prometheus-operator/blob/main/Documentation/proposals/202310-shard-autoscaling.md#snapshot--upload-on-shutdown).

I think we could extend the proposed API to also offer this alternative as a shutdown option. Of course, that requires me to get back to my PR and finish it 😅
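For illustration, the extension might look something like the sketch below; the field name and value are hypothetical placeholders, not anything agreed on in the proposal:

```yaml
kind: Prometheus
spec:
  # Hypothetical field and value, for illustration only; the real shape
  # would come out of the shard-autoscaling proposal discussion.
  onShutdown: FlushAndUpload
```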
