Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[1.10.x] backport of a series of blocking query readability and functionality improvements #12456

Merged
merged 12 commits into from
Feb 25, 2022

Conversation

rboyer
Copy link
Member

@rboyer rboyer commented Feb 25, 2022

This is an omnibus PR including all blocking query fixes found in:

NOTE: the blocking query code was incidentally modified due to #10299 but that change was not backported.

To remove the TODO, and make it more readable.

In general this reduces the scope of variables, making them easier to reason about.
It also introduces more early returns so that we can see the flow from the structure
of the function.
This safeguard should be safe to apply in general. We are already
applying it to non-blocking queries that call blockingQuery, so it
should be fine to apply it to others.
This helps keep the logic in blockingQuery more focused. In the future we
may have a separate struct for RPC queries which may allow us to move this
off of Server.
Remove some unnecessary comments around query_blocking metric. The only
line that needs any comments in the atomic decrement.

Cleanup the block and return comments and logic. The old comment about
AbandonCh may have been relevant before, but it is expected behaviour
now.

The logic was simplified by inverting the err condition.
Follow the Go convention of accepting a small interface that documents
the methods used by the function.

Clarify the rules for implementing a query function passed to
blockingQuery.
This test shows how blocking queries are not efficient when the query
returns no results.  The test fails with 100+ calls instead of the
expected 2.

This test is still a bit flaky because it depends on the timing of the
writes. It can sometimes return 3 calls.

A future commit should fix this and make blocking queries even more
optimal for not-found results.
By using the query results as state.

Blocking queries are efficient when the query matches some results,
because the ModifyIndex of those results, returned as queryMeta.Mindex,
will never change unless the items themselves change.

Blocking queries for non-existent items are not efficient because the
queryMeta.Index can (and often does) change when other entities are
written.

This commit reduces the churn of these queries by using a different
comparison for "has changed". Instead of using the modified index, we
use the existence of the results. If the previous result was "not found"
and the new result is still "not found", we know we can ignore the
modified index and continue to block.

This is done by setting the minQueryIndex to the returned
queryMeta.Index, which prevents the query from returning before a state
change is observed.
Any query that returns a list of items is not part of this commit.
Otherwise when the query times out we might incorrectly send a value for
the reply, when we should send an empty reply.

Also document errNotFound and how to handle the result in that case.
@rboyer rboyer added the pr/no-changelog PR does not need a corresponding .changelog entry label Feb 25, 2022
@rboyer rboyer requested a review from a team February 25, 2022 20:33
@rboyer rboyer self-assigned this Feb 25, 2022
@kisunji kisunji self-requested a review February 25, 2022 20:45
@vercel vercel bot temporarily deployed to Preview – consul-ui-staging February 25, 2022 20:57 Inactive
@vercel vercel bot temporarily deployed to Preview – consul February 25, 2022 20:57 Inactive
Because we are not backporting #11632 which came before this set of
blocking query refactors on main, the changes to TestTxn_Read_ACLDeny
made the test agnostic to the QueryMeta.Index value.

fd0a9fd changed
the setQueryMeta function to set index=1 if index=0, which Txn.Read uses.

So in this commit we change the test assertion to expect 1 because
that's actually the new value.
@vercel vercel bot temporarily deployed to Preview – consul February 25, 2022 21:21 Inactive
@vercel vercel bot temporarily deployed to Preview – consul-ui-staging February 25, 2022 21:21 Inactive
@rboyer rboyer merged commit 2193b6b into release/1.10.x Feb 25, 2022
@rboyer rboyer deleted the 1-10-x-blocking-query-improvements branch February 25, 2022 21:55
@hc-github-team-consul-core
Copy link
Collaborator

🍒 If backport labels were added before merging, cherry-picking will start automatically.

To retroactively trigger a backport after merging, add backport labels and re-run https://circleci.com/gh/hashicorp/consul/597659.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
pr/no-changelog PR does not need a corresponding .changelog entry
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants