priority and ring_hash LBs: fix interactions when using ring_hash under priority #29332

markdroth · 2022-04-06T17:06:18Z

Fixes interactions between the priority and ring_hash LB policies, as per the changes in grpc/proposal#296. Also fixes some unrelated bugs found while looking at the code.

Specific changes in priority:

- Use the same idempotent logic to choose a priority in all cases (e.g., when processing a child connectivity state update and when processing an update from the parent). This avoids problems such as failing to stop using the current priority when it transitions from READY to CONNECTING, but then seeing a duplicate TRANSIENT_FAILURE update from a higher priority that causes us to deselect the current priority because it is in state CONNECTING but does not have a failover timer pending.
- Keep track of whether a child has seen TRANSIENT_FAILURE more recently than IDLE or READY, and use this to decide whether to restart the failover timer when a child reports CONNECTING. This ensures that we properly start the failover timer when the ring_hash child policy transitions from IDLE to CONNECTING at startup.
- Do not cancel failover timer when deactivating a child. This avoids problems where a priority is deactivated while connecting and then reactivated before it finishes connecting.
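The failover-timer bookkeeping described above can be sketched roughly as follows. This is a minimal illustration, not the code in the priority policy: the class `ChildFailoverTracker`, its method names, and the choice to clear the pending timer on READY, IDLE, and TRANSIENT_FAILURE are all assumptions made for the example. The timer itself is modeled as a boolean.

```cpp
// Hypothetical model of one priority child's failover-timer bookkeeping.
enum class State { kIdle, kConnecting, kReady, kTransientFailure };

class ChildFailoverTracker {
 public:
  void OnStateUpdate(State state) {
    switch (state) {
      case State::kTransientFailure:
        // Remember the failure so that a later CONNECTING does not restart
        // the timer (assumption: the pending timer is also dropped here).
        seen_ready_or_idle_since_failure_ = false;
        timer_pending_ = false;
        break;
      case State::kReady:
      case State::kIdle:
        seen_ready_or_idle_since_failure_ = true;
        timer_pending_ = false;
        break;
      case State::kConnecting:
        // Start the timer only if the child was healthy more recently than
        // it was failing, and only if a timer is not already pending.
        if (seen_ready_or_idle_since_failure_ && !timer_pending_) {
          timer_pending_ = true;
        }
        break;
    }
  }

  // Deactivation intentionally leaves a pending timer alone, so a child
  // that is reactivated while still connecting keeps its failover window.
  void Deactivate() {}

  void OnTimerFired() { timer_pending_ = false; }

  bool timer_pending() const { return timer_pending_; }

 private:
  // Assumption: a new child starts out as "not failed yet".
  bool seen_ready_or_idle_since_failure_ = true;
  bool timer_pending_ = false;
};
```

With this model, an IDLE to CONNECTING transition at startup starts the timer, while a TRANSIENT_FAILURE to CONNECTING transition does not.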

Specific changes in ring_hash:

- Apply new aggregation rule: if there is at least one subchannel in state TRANSIENT_FAILURE and there is more than one subchannel, report state CONNECTING. If we hit this rule, proactively start a subchannel connection attempt.
- Properly start a proactive subchannel connection attempt at subchannel list creation if the subchannels are already in a state that makes that appropriate. Also update the connectivity state visible to the picker at subchannel list creation.
- Use a ring_hash picker regardless of what connectivity state we report.
- Treat subchannels that go from READY to TRANSIENT_FAILURE as being in state IDLE, as per the gRFC.
- Use logical connectivity state (after applying special cases for sticky-TF and IDLE) for both aggregation rules and for the state visible to the picker.
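As a rough illustration of the ring_hash changes, the sketch below maps raw subchannel states to logical states and then aggregates them. Only two things come from the description above: the READY to TRANSIENT_FAILURE special case and the new "one TRANSIENT_FAILURE among several subchannels reports CONNECTING" rule. The other aggregation rules, their ordering, the meaning given to sticky-TF (stay in TRANSIENT_FAILURE until READY), and all names here are assumptions for the example rather than the policy's actual code.

```cpp
#include <cstddef>
#include <vector>

enum class State { kIdle, kConnecting, kReady, kTransientFailure };

// Logical state used for aggregation and by the picker (hypothetical helper).
State LogicalState(State previous_logical, State raw) {
  // A subchannel that drops from READY to TRANSIENT_FAILURE is treated as IDLE.
  if (previous_logical == State::kReady && raw == State::kTransientFailure) {
    return State::kIdle;
  }
  // Assumed sticky-TF behavior: stay in TRANSIENT_FAILURE until READY.
  if (previous_logical == State::kTransientFailure && raw != State::kReady) {
    return State::kTransientFailure;
  }
  return raw;
}

struct Aggregate {
  State state;
  bool start_connection_attempt;  // hint to proactively connect a subchannel
};

Aggregate AggregateRingHashState(const std::vector<State>& logical_states) {
  std::size_t idle = 0, connecting = 0, ready = 0, failed = 0;
  for (State s : logical_states) {
    switch (s) {
      case State::kIdle: ++idle; break;
      case State::kConnecting: ++connecting; break;
      case State::kReady: ++ready; break;
      case State::kTransientFailure: ++failed; break;
    }
  }
  if (ready > 0) return {State::kReady, false};
  if (failed >= 2) return {State::kTransientFailure, true};
  if (connecting > 0) return {State::kConnecting, false};
  // New rule from this PR: one failure among several subchannels.
  if (failed == 1 && logical_states.size() > 1) {
    return {State::kConnecting, true};
  }
  if (idle > 0) return {State::kIdle, false};
  return {State::kTransientFailure, true};
}
```

For example, two IDLE subchannels plus one in TRANSIENT_FAILURE aggregate to CONNECTING with the proactive-connection hint set, so the policy does not depend on new picks to make progress.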

apolcyn

f80fcac looks mostly good

A couple of comments on the test

test/cpp/end2end/xds/xds_end2end_test.cc

markdroth

Note that I need to merge #29320 before this, so please review that as well. Thanks!

test/cpp/end2end/xds/xds_end2end_test.cc

markdroth · 2022-04-07T17:01:08Z

Looks like the test needs a little more work here. Since merging #29321, the test is no longer failing without the code change here. I'll work on changing it so that it fails without the fix.

markdroth · 2022-04-07T22:24:04Z

I significantly rewrote the test to cover a simpler case but to do it in a way that exerts more control over the timing of the individual subchannel connection attempts. I also simplified the logic in the priority policy so that we always use the same code to choose a priority whenever any child reports a connectivity state change, which should systemically avoid the class of bugs where the logic for choosing a child does not match the logic for reacting to the current child's state changes.

PTAL.
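A single priority-selection routine of the kind described in this comment might look like the sketch below. It is a simplified model, not the policy's code: child creation and reactivation are left out, the `Child` struct and `ChoosePriority` name are hypothetical, and the two fallback passes (first CONNECTING child, then the last priority) are assumptions about how the "nothing usable" case could be handled.

```cpp
#include <cstddef>
#include <vector>

enum class State { kIdle, kConnecting, kReady, kTransientFailure };

// Hypothetical snapshot of one priority's child, highest priority first.
struct Child {
  State state;
  bool failover_timer_pending;
};

// Returns the index of the priority to use, or -1 if there are none.
// The function is idempotent: it depends only on the current snapshot, so it
// can be re-run after any child state change or any update from the parent.
int ChoosePriority(const std::vector<Child>& children) {
  // Pass 1: the highest priority that is usable (READY or IDLE) or that is
  // still inside its failover window.
  for (std::size_t i = 0; i < children.size(); ++i) {
    const Child& child = children[i];
    if (child.state == State::kReady || child.state == State::kIdle) {
      return static_cast<int>(i);
    }
    if (child.failover_timer_pending) return static_cast<int>(i);
  }
  // Pass 2 (assumption): otherwise prefer the first child still connecting.
  for (std::size_t i = 0; i < children.size(); ++i) {
    if (children[i].state == State::kConnecting) return static_cast<int>(i);
  }
  // Pass 3 (assumption): everything is failing, so report the last priority.
  return static_cast<int>(children.size()) - 1;
}
```

Because every event runs the same function, a higher priority that is CONNECTING without a pending failover timer cannot displace a lower priority that is READY.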

markdroth · 2022-04-07T23:50:05Z

Eric pointed out that the approach of always starting the failover timer in CONNECTING is not quite right, since that will cause us to select a higher-than-current priority that transitions back to CONNECTING when we have already been using a lower priority that is in state READY. We're still discussing the right way to fix this.

priority:
- go back to starting failover timer upon CONNECTING, but only if seen READY or IDLE more recently than TRANSIENT_FAILURE

ring_hash:
- don't flap back and forth between IDLE and CONNECTING; once we go CONNECTING, we stay there until either TF or READY
- after the first subchannel goes TF, we proactively start another subchannel connecting, just like we do after a second subchannel reports TF, to ensure that we don't stay in CONNECTING indefinitely if we aren't getting any new picks
- always return ring hash's picker, regardless of connectivity state
- update the subchannel connectivity state seen by the picker upon subchannel list creation
- start proactive subchannel connection attempt upon subchannel list creation if needed

test/cpp/end2end/xds/xds_end2end_test.cc

src/core/ext/filters/client_channel/lb_policy/ring_hash/ring_hash.cc

test/cpp/end2end/xds/xds_end2end_test.cc

…es (#9084) (#9094) per gRFC change grpc/grpc#29332: Apply new aggregation rule: if there is at least one subchannel in state TRANSIENT_FAILURE and there is more than one subchannel, report state CONNECTING. If we hit this rule, proactively start a subchannel connection attempt.

…es (#9084) (#9107) part of the ring-hash part of the change grpc/grpc#29332: Apply new aggregation rule: if there is at least one subchannel in state TRANSIENT_FAILURE and there is more than one subchannel, report state CONNECTING. If we hit this rule, proactively start a subchannel connection attempt.

…es (#9084) (#9108) part of the ring-hash part of the change grpc/grpc#29332: Apply new aggregation rule: if there is at least one subchannel in state TRANSIENT_FAILURE and there is more than one subchannel, report state CONNECTING. If we hit this rule, proactively start a subchannel connection attempt.

ringHashPicker changes per gRFC: grpc/grpc#29332: the previous picker logic appears to have been wrong, e.g. it did not request connecting on any subchannel if it was in TRANSIENT_FAILURE. Refactored the logic to mirror the pseudo-code more closely so that it is easier to understand.

…er priority (grpc#29332)

* refactor connection delay injection from client_lb_end2end_test
* fix build
* fix build on older compilers
* clang-format
* buildifier
* a bit of code cleanup
* start failover time whenever the child reports CONNECTING, and don't cancel when deactivating
* clang-format
* rewrite test
* simplify logic in priority policy
* clang-format
* switch to using a bit to indicate child healthiness
* fix reversed comment
* more changes in priority and ring_hash.

  priority:
  - go back to starting failover timer upon CONNECTING, but only if seen READY or IDLE more recently than TRANSIENT_FAILURE

  ring_hash:
  - don't flap back and forth between IDLE and CONNECTING; once we go CONNECTING, we stay there until either TF or READY
  - after the first subchannel goes TF, we proactively start another subchannel connecting, just like we do after a second subchannel reports TF, to ensure that we don't stay in CONNECTING indefinitely if we aren't getting any new picks
  - always return ring hash's picker, regardless of connectivity state
  - update the subchannel connectivity state seen by the picker upon subchannel list creation
  - start proactive subchannel connection attempt upon subchannel list creation if needed
* ring_hash: fix connectivity state seen by aggregation and picker
* fix obiwan error
* swap the order of ring_hash aggregation rules 3 and 4
* restore original test
* refactor connection injector QueuedAttempt code
* add test showing that ring_hash will continue connecting without picks
* clang-format
* don't actually need seen_failure_since_ready_ anymore
* fix TSAN problem
* address code review comments

…er priority (#29332) (#30253): same squashed commit list as the preceding entry. Co-authored-by: Mark D. Roth <roth@google.com>

markdroth added 7 commits April 6, 2022 00:20

refactor connection delay injection from client_lb_end2end_test

0c6ccf1

fix build

96119c0

fix build on older compilers

6462019

clang-format

c9ea149

buildifier

58378ae

a bit of code cleanup

033c31e

start failover time whenever the child reports CONNECTING, and don't cancel when deactivating

f80fcac

markdroth added the release notes: no label Apr 6, 2022

markdroth requested a review from apolcyn April 6, 2022 17:06

github-actions bot added lang/c++ lang/core labels Apr 6, 2022

grpc-checks bot added bloat/low per-call-memory/neutral labels Apr 6, 2022

clang-format

44eaf30

apolcyn reviewed Apr 6, 2022

markdroth commented Apr 6, 2022

Merge remote-tracking branch 'upstream/master' into priority_lb_fix

d1588cb

grpc-checks bot added bloat/none and removed bloat/low labels Apr 6, 2022

markdroth added 4 commits April 7, 2022 21:55

rewrite test

2467440

simplify logic in priority policy

75e26c2

Merge remote-tracking branch 'upstream/master' into priority_lb_fix

73ef75f

clang-format

830f049

switch to using a bit to indicate child healthiness

b245c89

markdroth changed the title ~~priority LB: start failover timer when child reports CONNECTING, and don't cancel when deactivating~~ priority LB: don't cancel failover timer when deactivating, and simplify logic Apr 8, 2022

markdroth added 2 commits April 8, 2022 17:57

fix reversed comment

eb103bf

grpc-checks bot added the bloat/improvement label Apr 13, 2022

Merge remote-tracking branch 'upstream/master' into priority_lb_fix

79127fc

YifeiZhuang mentioned this pull request Apr 14, 2022

xds: change ring_hash LB aggregation rule to handles transient_failures grpc/grpc-java#9084

Merged

fix TSAN problem

ba54014

grpc-checks bot added bloat/none and removed bloat/improvement labels Apr 14, 2022

apolcyn reviewed Apr 14, 2022

markdroth added 2 commits April 14, 2022 21:27

address code review comments

4f8fcd9

Merge remote-tracking branch 'upstream/master' into priority_lb_fix

9c99a8a

apolcyn reviewed Apr 14, 2022

apolcyn approved these changes Apr 14, 2022

markdroth merged commit 6273832 into grpc:master Apr 15, 2022

YifeiZhuang mentioned this pull request Apr 15, 2022

xds: fix ring-hash-picker behaviour grpc/grpc-java#9085

Merged

copybara-service bot added the imported label Apr 15, 2022

YifeiZhuang mentioned this pull request Apr 19, 2022

xds: change ring_hash LB aggregation rule to handle transient_failure (1.46.x backport) grpc/grpc-java#9094

Merged

YifeiZhuang mentioned this pull request Apr 20, 2022

xds: fix ring-hash-picker behaviour (1.46.x backport) grpc/grpc-java#9096

Merged

markdroth mentioned this pull request Apr 20, 2022

xds_ring_hash_end2end_test: fix flake in ContinuesConnectingWithoutPicks #29461

Merged

This was referenced Apr 25, 2022

xds: change ring_hash LB aggregation rule to handles transient_failures (#9084) (1.44.x backport) grpc/grpc-java#9107

Merged

xds: change ring_hash LB aggregation rule to handles transient_failures (#9084)(1.45.x backport) grpc/grpc-java#9108

Merged

This was referenced Apr 25, 2022

xds: fix ring-hash-picker behaviour (#9085) (1.44.x backport) grpc/grpc-java#9109

Merged

xds: fix ring-hash-picker behaviour (#9085) (1.45.x backport) grpc/grpc-java#9110

Merged

yashykt mentioned this pull request Jul 8, 2022

Backport to 1.46.x: priority and ring_hash LBs: fix interactions when using ring_hash under priority (#29332) #30253

Merged
