Compactor blocks cleaner: retry operations that could interfere with rewriting bucket index #8071

Open · wants to merge 15 commits into base: main

Conversation

@seizethedave (Contributor) commented May 6, 2024

What this PR does

This PR adds retries to the operations in BlocksCleaner.cleanUser whose failures can prevent the bucket index from being rewritten (ReadIndex and WriteIndex).

And why:

When the blocks cleaner runs for a tenant, it carries out a series of steps to perform one cleanUser pass. Most of these steps involve an objstore invocation (fetching a block index, iterating the paths under a block folder, deleting a marker, and so on).
In this series of steps, there are currently two avenues for "retries":

  1. Retries performed by the objstore provider SDKs (GCS, Minio, and so on). For example, the GCS SDK automatically retries operations it deems idempotent, and it has a suite of rules to determine which errors it will retry. Minio has similar (but different) policies for automatic retries.
  2. Every 15 minutes (by default), the tenant's blocks cleaner job is run again.

We are currently relying on Avenue 2 to eventually recover from past blocks cleaner failures. But the crux of a recent incident was that every step in cleanUser must complete for the updated bucket index to be written. If cleanUser fails enough consecutive times, store-gateways will refuse to load the "stale" bucket index, and some queries will begin to fail. In that incident, a larger-than-usual percentage of objstore calls were exceeding their context deadlines (which looks like network flakiness), hence the >=4 consecutive cleanUser failures and a bucket index that stayed stale for an hour or more.

Notes:
  • ReadIndex and WriteIndex already have hardcoded 1-minute deadlines, so the new outer request deadlines I've chosen for those are safe.
  • UpdateIndex could use a retry too, because if that method returns an error, the bucket index won't be rewritten. However, from some analysis of our logs, UpdateIndex for some tenants can take 5+ minutes while it updates scads of deletion markers. I'm not adding time-based retries on that method so as not to accidentally interrupt legitimate work. There's room for improvement to come back and add finer-grained retries on the operations inside UpdateIndex. (A rough sketch of the timeout/backoff knobs involved follows these notes.)
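To make the shape of that concrete: the retries are driven by a per-call timeout plus a dskit backoff.Config (the bc parameter and c.retryConfig seen in the diff further down). A rough sketch with illustrative values only, not the PR's actual defaults:

// Illustrative values only; the PR's actual defaults may differ.
// backoff.Config comes from github.com/grafana/dskit/backoff.
retryCfg := backoff.Config{
    MinBackoff: 10 * time.Second, // minimum wait between attempts
    MaxBackoff: time.Minute,      // cap for the exponential backoff
    MaxRetries: 3,                // give up (until the next cleanup pass) after this many attempts
}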

Which issue(s) this PR fixes or relates to

Checklist

  • Tests updated.
  • Documentation added.
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX].
  • about-versioning.md updated with experimental features.

@dimitarvdimitrov (Contributor) left a comment

SGTM, but maybe we can retry on temporary object store errors too?

pkg/compactor/blocks_cleaner.go: outdated inline review thread (resolved)
@seizethedave seizethedave marked this pull request as ready for review May 7, 2024 19:35
@seizethedave seizethedave requested a review from a team as a code owner May 7, 2024 19:35
@seizethedave seizethedave changed the title [WIP] Compactor blocks cleaner: retry operations that could interfere with rewriting bucket index Compactor blocks cleaner: retry operations that could interfere with rewriting bucket index May 7, 2024
@seizethedave seizethedave requested review from dimitarvdimitrov and jhalterman and removed request for dimitarvdimitrov May 7, 2024 19:36
// the backoff config. Each invocation of f will be given perCallTimeout to
// complete. This is specifically designed to retry timeouts due to flaky
// connectivity with the objstore backend.
func withRetries(ctx context.Context, perCallTimeout time.Duration, bc backoff.Config, logger log.Logger, f func(context.Context) error) error {
Member
I like this functional approach towards doing retries. Maybe something like this could go into dskit? Alternatively, for this functional approach to retries, we could use Failsafe-go (which we're using for circuit breaking).

Contributor Author (seizethedave)
I was just looking at failsafe-go yesterday, randomly. Whoever wrote that has been around the block a few times. :)

"Maybe something like this could go into dskit?"

Yeah, I looked to see if it already existed in dskit. There's probably room for something there. For this application, the coupling of per-call timeout contexts and a shouldRetry function that looks for DeadlineExceeded seemed a little single-purpose.

Failsafe's backoff looks nice. But there is a plus to sticking with dskit/backoff as it is so pervasive in this codebase.

I expect to come back into this file to add similar retries inside of UpdateIndex ("sometime") so maybe we can keep this dialogue open/rule of three and all that?
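(Roughly, the shape being discussed: a sketch reconstructed from the signature and description above, using dskit's backoff, per-call timeout contexts, and a DeadlineExceeded check. This is an illustration of the approach, not the PR's exact body.)

import (
    "context"
    "errors"
    "time"

    "github.com/go-kit/log"
    "github.com/go-kit/log/level"
    "github.com/grafana/dskit/backoff"
)

// Sketch: run f with a fresh per-call deadline on each attempt, retrying only
// when the attempt itself timed out, with waits governed by the backoff config.
func withRetries(ctx context.Context, perCallTimeout time.Duration, bc backoff.Config, logger log.Logger, f func(context.Context) error) error {
    retry := backoff.New(ctx, bc)
    var lastErr error
    for retry.Ongoing() {
        callCtx, cancel := context.WithTimeout(ctx, perCallTimeout)
        lastErr = f(callCtx)
        cancel()
        if lastErr == nil {
            return nil
        }
        if !errors.Is(lastErr, context.DeadlineExceeded) {
            // Non-timeout errors are not retried here.
            return lastErr
        }
        level.Warn(logger).Log("msg", "call timed out, will retry", "err", lastErr)
        retry.Wait()
    }
    if lastErr != nil {
        return lastErr
    }
    return retry.Err()
}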

@pstibrany (Member) left a comment
Thank you. LGTM overall.

var idx *bucketindex.Index
err := withRetries(ctx, 1*time.Minute, c.retryConfig, log.With(userLogger, "op", "readIndex"), func(ctx context.Context) error {
    var err error
    idx, err = c.readIndex(ctx, c.bucketClient, userID, c.cfgProvider, userLogger)
    return err
})
Member
I'd prefer to just call bucketindex.ReadIndex here, and same for c.writeIndex vs bucketindex.WriteIndex later.

If we want to inject errors from read/write index calls, I'd suggest doing it at bucket level (see ErrorInjectedBucketClient, deceivingUploadBucket or errBucket), instead of introducing indirection in BlocksCleaner.
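(For illustration, a bucket-level error-injection wrapper along those lines could look roughly like the following. failingBucket is a hypothetical name; the codebase's ErrorInjectedBucketClient / errBucket helpers are the real counterparts, and the sketch assumes the github.com/thanos-io/objstore Bucket interface.)

import (
    "context"
    "io"

    "github.com/thanos-io/objstore"
)

// failingBucket (hypothetical test helper): embeds a real bucket and returns a
// configured error for selected object names on Get, so read failures can be
// simulated at the bucket level without adding indirection to BlocksCleaner.
type failingBucket struct {
    objstore.Bucket
    getFailures map[string]error // object name -> error to return from Get
}

func (b *failingBucket) Get(ctx context.Context, name string) (io.ReadCloser, error) {
    if err, ok := b.getFailures[name]; ok {
        return nil, err
    }
    return b.Bucket.Get(ctx, name)
}

Other operations (Upload, Delete, Iter, and so on) could be overridden the same way when a test needs to fail writes as well.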

Labels: none yet
Projects: none yet
Linked issues: none yet
4 participants