New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
grpclb: skip fallback if the LB is already in fallback mode #8253
grpclb: skip fallback if the LB is already in fallback mode #8253
Conversation
…gered by receiving updated addresses containing backend addresses only.
Updated to eliminate the strange code reuse of FallbackModeTask for checking if fallback operation is needed. Always manually check if it is already in fallback and do so directly if fallback operations are needed. PTAL. |
When merging, include all the useful details from this PR's description in the commit message, including the reference to the bug number. |
… receiving no LB addresses.
I built our repro test against this PR and deployed, looks good so far. With grpc 1.38.0 the bigtable client would reliably fail writes but with this PR I'm not seeing any errors. |
Manually checks if the gRPCLB policy is already in fallback mode when trying to fallback due to receiving address update without LB addresses. Commit b956f88 added an invariant check in the FallbackModeTask runnable to ensure the task is fired only when the LB is not already in fallback mode. However, that commit missed the case that receiving address updates without LB addresses can trigger the run of FallbackModeTask runnable, because the existing implementation chose to reuse the code in FallbackModeTask. In such case, running FallbackModeTask could break the invariant check as the LB policy may already in fallback mode. This change eliminates the reuse of FallbackModeTask for handling address update without LB address. That is, every time receiving address update, we manually check if it is already in fallback instead of reusing to FallbackModeTask perform the check. Note there was a discussion brought up whether we should force entering fallback (shutdown existing subchannels) or we should still keep the balancer connection. Different languages have already diverged on this. Go shuts down the balancer connection and all subchannel connections to force using fallback addresses. C-core keep the balancer connection working and does not shutdown subchannels, only let fallback happens after the existing balancer connection and subchannel connections become broken. Java shuts down the balancer connection but not subchannels. This change does not try to change the existing behavior, but only fixes the invariant check breakage. ------------------- See bug reported in b/190700476
… v1.38.x) (#8253) (#8259) Manually checks if the gRPCLB policy is already in fallback mode when trying to fallback due to receiving address update without LB addresses. Commit b956f88 added an invariant check in the FallbackModeTask runnable to ensure the task is fired only when the LB is not already in fallback mode. However, that commit missed the case that receiving address updates without LB addresses can trigger the run of FallbackModeTask runnable, because the existing implementation chose to reuse the code in FallbackModeTask. In such case, running FallbackModeTask could break the invariant check as the LB policy may already in fallback mode. This change eliminates the reuse of FallbackModeTask for handling address update without LB address. That is, every time receiving address update, we manually check if it is already in fallback instead of reusing to FallbackModeTask perform the check. Note there was a discussion brought up whether we should force entering fallback (shutdown existing subchannels) or we should still keep the balancer connection. Different languages have already diverged on this. Go shuts down the balancer connection and all subchannel connections to force using fallback addresses. C-core keep the balancer connection working and does not shutdown subchannels, only let fallback happens after the existing balancer connection and subchannel connections become broken. Java shuts down the balancer connection but not subchannels. This change does not try to change the existing behavior, but only fixes the invariant check breakage. ------------------- See bug reported in b/190700476 Backport of #8253
When receiving updated addresses from the resolver with backend addresses only, gRPCLB will executes the fallback task. However, the invariant check added by #8035 was trying to ensure the fallback task is never fired if the LB is already in fallback mode (avoid overwriting the
fallbackReason
). But seems #8035 missed the case that the fallback task can be run by receiving updated addresses with backend addresses only.This change eliminates the reuse of FallbackModeTask for handling address update without LB address. That is, every time receiving address update, we manually check if it is already in fallback instead of reusing to FallbackModeTask perform the check.
See reported error in b/190700476