pageserver: skip waiting for logical size on shard >0 #7744
This `Timeline::await_initial_logical_size` does not seem to be called from anywhere except init, to my surprise. I cannot see how this could have any negative effect. Should go into the next release.
3060 tests run: 2933 passed, 0 failed, 127 skipped (full report)
Code coverage* (full report)
* collected from Rust tests only
The comment gets automatically updated with the latest test results
4dc4c79 at 2024-05-14T13:17:55.639Z
The delete resumption code path apparently relied on the "unexpected" code path; this needs investigation.
I think it's all right, the error is:
The above in release builds, an assertion failure in debug builds. Now with the hint/ad to try out `check_allowed_errors`. This is why I was thinking of putting these into two different PRs, but I failed to put it into words yesterday. However, these are not sharded tests, and the PR already contains a fix, which means that we are about to create another racy shutdown logging error situation. In #7733 I am growing tired of those (it'd be the second follow-up).
## Problem

Shards with number >0 could hang waiting for `await_initial_logical_size`, as we don't calculate logical size on these shards. This causes them to hold onto semaphore units and starve other tenants out from proceeding with warmup activation.

That doesn't hurt availability (we still have on-demand activation), but it does mean that some background tasks like consumption metrics would omit some tenants.

## Summary of changes

- Skip waiting for logical size calculation on shards >0
- Upgrade unexpected code paths to use `debug_assert!()`, which acts as an implicit regression test for this issue, and make the `info()` one into a `warn()`
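
The two changes can be sketched roughly as follows. This is a minimal illustration, not the pageserver code: `ShardNumber`, `WarmupStep`, `warmup_step`, and `check_wait_is_expected` are hypothetical names made up for this example, and `eprintln!` stands in for whatever `warn()` logging the real code uses. It assumes that only shard 0 ever calculates an initial logical size.

```rust
/// Hypothetical stand-in for a shard's index within a tenant.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct ShardNumber(u8);

/// What the warmup path does about the initial logical size (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
enum WarmupStep {
    /// Shard 0 calculates logical size, so waiting for it will complete.
    AwaitInitialLogicalSize,
    /// Shards >0 never calculate it, so waiting would hold a semaphore
    /// unit indefinitely; skip the wait instead.
    SkipLogicalSize,
}

/// Decide whether warmup should wait for the initial logical size.
fn warmup_step(shard: ShardNumber) -> WarmupStep {
    if shard.0 == 0 {
        WarmupStep::AwaitInitialLogicalSize
    } else {
        WarmupStep::SkipLogicalSize
    }
}

/// Guard for the "unexpected" path: a shard >0 reaching the wait.
/// Debug builds panic here, which makes any test that hits this path fail.
/// Release builds skip the assertion and only log a warning.
fn check_wait_is_expected(shard: ShardNumber) -> bool {
    let expected = shard.0 == 0;
    debug_assert!(
        expected,
        "shard {} should not wait for initial logical size",
        shard.0
    );
    if !expected {
        // Reached only in release builds, where debug_assert! is not evaluated.
        eprintln!(
            "WARN: shard {} unexpectedly waiting for initial logical size",
            shard.0
        );
    }
    expected
}
```

Under these assumptions, `warmup_step(ShardNumber(0))` asks the caller to wait, while any other shard number skips the wait and can release its semaphore unit promptly. The `debug_assert!` guard is what lets the test suite, which runs debug builds, act as an implicit regression test.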