Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(NODE-3769): retryable writes are not compliant with specification #3144

Merged
merged 10 commits into from Mar 14, 2022

Conversation

nbbeeken
Copy link
Contributor

@nbbeeken nbbeeken commented Feb 12, 2022

Description

Fix NODE-3769, test NODE-2111

What is changing?

  • Add a new MongoUnexpectedServerResponseError which is a subclass of MongoRuntimeError I could continue using MongoRuntimeError to avoid adding a new class, but the refactor/fix to execute operation called for it.
  • Refactoring / correcting of executeOperation's retry logic
  • isRetryableWrite now only checks the label if we're connected to a topology above a certain wire version, as required by the spec.
  • Refactored the retryable write runner, no logical changes, just async/await, I ran my changes against the unmodified runner on main, so I'm confident that my test runner changes are safe
  • Moved the labels to an enum MONGODB_ERROR_LABELS, This allows us to type check our labels, (mostly could catch future spelling mistakes if we miss actual test coverage) felt like it was still relevant to this work to keep in.
  • Deleted the unused clearAndRemoveTimerFrom function. entirely internal, never reached.
  • Deleted connectServers called only from one place and simply created a level of indirection that is frustrating to debug
  • Deleted isRetryableEndTransactionError, it had one condition, checking for the retryWriteLabel, inlining the condition has better readability, there could be reason to bring it back if there is more conditions.
  • Expanded the conditionals in supportsRetryableWrites and made it accept an optional server (removes an existence gate from a caller if stmt)
  • Removed the constrained timeout from LB testing, causes a lot of timeout failures when you debug locally
  • async-ify testOperations in the legacy spec runner, debugging.
  • Added to our error unit testing, we had a public error that wasn't exported at the top level, I think my test should catch all those cases now.
  • New instructions for launching LB cluster
  • ran prettier on global.d.ts, it doesn't run automagically currently 😿

What is the motivation for this change?

Our retryable logic was falling back on error codes, the spec explicitly states that on newer server versions the label should be the ONLY source of truth. So we were still retrying errors, and we will continue to do so, but the rules were incorrect.

Double check the following

  • Ran npm run check:lint script
  • Self-review completed using the steps outlined here
  • PR title follows the correct format: <type>(NODE-xxxx)<!>: <description>
  • Changes are covered by tests
  • New TODOs have a related JIRA ticket
    • TODO can we break the type of hasErrorLabel, addErrorLabel, narrowing from string to known labels
    • TODO file ticket about try catch in execOperation
    • TODO inside commandHandler of endTransaction is it okay that we only check the retry read condition before unpinning / adding the unknown commit label? I don't think so, but also the retry has already occurred at this stage, so I don't think the session gets reused, unpinning might be correct.

@nbbeeken nbbeeken marked this pull request as draft February 12, 2022 03:15
@nbbeeken nbbeeken force-pushed the NODE-3769/retryable-writes branch 4 times, most recently from 69d655d to 9d7894b Compare February 15, 2022 22:54
@nbbeeken nbbeeken marked this pull request as ready for review February 16, 2022 21:20
global.d.ts Outdated Show resolved Hide resolved
Copy link
Contributor

@dariakp dariakp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Partial feedback so that I don't lose it, still got the main files left to go through

src/cmap/connection.ts Outdated Show resolved Hide resolved
docs/errors.md Outdated Show resolved Hide resolved
docs/errors.md Outdated Show resolved Hide resolved
docs/errors.md Outdated Show resolved Hide resolved
src/utils.ts Outdated Show resolved Hide resolved
test/tools/unified-spec-runner/entities.ts Outdated Show resolved Hide resolved
test/unit/error.test.js Outdated Show resolved Hide resolved
test/types/community/client.test-d.ts Outdated Show resolved Hide resolved
@dariakp dariakp added the Primary Review In Review with primary reviewer, not yet ready for team's eyes label Feb 18, 2022
@dariakp dariakp self-assigned this Feb 18, 2022
src/error.ts Outdated Show resolved Hide resolved
src/sessions.ts Outdated Show resolved Hide resolved
src/operations/execute_operation.ts Outdated Show resolved Hide resolved
src/operations/execute_operation.ts Outdated Show resolved Hide resolved
src/operations/execute_operation.ts Outdated Show resolved Hide resolved
src/error.ts Outdated Show resolved Hide resolved
src/operations/execute_operation.ts Show resolved Hide resolved
src/error.ts Outdated Show resolved Hide resolved
src/error.ts Outdated Show resolved Hide resolved
src/utils.ts Outdated Show resolved Hide resolved
src/sdam/topology.ts Outdated Show resolved Hide resolved
src/sdam/topology.ts Show resolved Hide resolved
src/error.ts Outdated Show resolved Hide resolved
Copy link
Contributor

@dariakp dariakp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some partial feedback on the revisions

src/cmap/connection.ts Show resolved Hide resolved
test/tools/unified-spec-runner/runner.ts Outdated Show resolved Hide resolved
test/tools/unified-spec-runner/match.ts Outdated Show resolved Hide resolved
test/tools/unified-spec-runner/match.ts Outdated Show resolved Hide resolved
src/operations/execute_operation.ts Outdated Show resolved Hide resolved
src/operations/execute_operation.ts Outdated Show resolved Hide resolved
Copy link
Contributor

@dariakp dariakp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just the one additional comment here

src/error.ts Outdated Show resolved Hide resolved
test/unit/error.test.ts Outdated Show resolved Hide resolved
test/unit/error.test.ts Outdated Show resolved Hide resolved
test/unit/error.test.ts Outdated Show resolved Hide resolved
test/unit/error.test.ts Outdated Show resolved Hide resolved
test/unit/error.test.ts Outdated Show resolved Hide resolved
test/unit/error.test.ts Outdated Show resolved Hide resolved
test/unit/error.test.ts Outdated Show resolved Hide resolved
test/unit/error.test.ts Outdated Show resolved Hide resolved
test/unit/error.test.ts Outdated Show resolved Hide resolved
test/unit/error.test.ts Outdated Show resolved Hide resolved
@nbbeeken nbbeeken force-pushed the NODE-3769/retryable-writes branch 2 times, most recently from 65232e8 to 39b0636 Compare March 7, 2022 21:06
@nbbeeken nbbeeken force-pushed the NODE-3769/retryable-writes branch 2 times, most recently from abc766f to 6984fca Compare March 9, 2022 16:41
dariakp
dariakp previously approved these changes Mar 11, 2022
Copy link
Contributor

@dariakp dariakp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Putting an approve because these are minor:

global.d.ts Outdated Show resolved Hide resolved
test/unit/error.test.ts Show resolved Hide resolved
@dariakp dariakp added Team Review Needs review from team and removed Primary Review In Review with primary reviewer, not yet ready for team's eyes labels Mar 11, 2022
src/cmap/connection.ts Outdated Show resolved Hide resolved
src/operations/execute_operation.ts Show resolved Hide resolved
src/operations/execute_operation.ts Show resolved Hide resolved
src/utils.ts Show resolved Hide resolved
@nbbeeken nbbeeken merged commit ff26b12 into main Mar 14, 2022
@nbbeeken nbbeeken deleted the NODE-3769/retryable-writes branch March 14, 2022 18:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Team Review Needs review from team
Projects
None yet
4 participants