Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Default reconnect strategy uses exponential backoff and jitter #2736

Open
wants to merge 5 commits into
base: v5
Choose a base branch
from

Conversation

qsymmachus
Copy link

@qsymmachus qsymmachus commented Apr 5, 2024

Description

The current default reconnect strategy increases delay linearly with no jitter. This can lead to a thundering herd problem if many clients lose their connection at once, which is common during events like a Redis upgrade.

A reconnect strategy that uses exponential backoff plus a random jitter helps mitigate this risk and is recommended in many best practices documents for Redis clients, so I propose making this the default.

The default strategy I implemented will do the following:

  • Delay will be equal to (n^2) * 50 ms where n is the current retry count, with a maximum value of 2000 ms.
  • Jitter is a random value between 0 and 200 ms.

The millisecond values I choose are somewhat arbitrary, and I'm happy to change them if we think they're not right.

Checklist

  • Does npm test pass with this change (including linting)?
  • Is the new or changed code fully tested?
    • Existing unit tests for reconnectStrategy still pass. We have no tests for the default strategy, nor am I sure if and how we'd want to test it. Any thoughts on this?
  • Is a documentation update included (if this change modifies existing APIs, or introduces new ones)?

Both are recommended parts of client reconnect strategies to prevent
thundering herd problems when many clients lose their connection at once
(for example, during a Redis upgrade).
@qsymmachus qsymmachus changed the title fDefault reconnect strategy uses exponential backoff and jitter Default reconnect strategy uses exponential backoff and jitter Apr 5, 2024
@leibale
Copy link
Collaborator

leibale commented Apr 9, 2024

Hi @qsymmachus, thanks for contributing!
We are working on V5 which will add support for sentinel, RESP3, and client-side caching. If you can redo this change but for v5 it would be awesome! :)

@qsymmachus qsymmachus changed the base branch from master to v5 April 11, 2024 19:24
@qsymmachus
Copy link
Author

@leibale no problem. I changed the base branch to v5 and reworked my code slightly in 4a36c6f

Copy link
Collaborator

@leibale leibale left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks good, just a few small comments and questions. You can ignore the nits and I'll do them myself.

@@ -13,7 +13,7 @@
| socket.keepAlive | `true` | Toggle [`keep-alive`](https://nodejs.org/api/net.html#socketsetkeepaliveenable-initialdelay) functionality |
| socket.keepAliveInitialDelay | `5000` | If set to a positive number, it sets the initial delay before the first keepalive probe is sent on an idle socket |
| socket.tls | | See explanation and examples [below](#TLS) |
| socket.reconnectStrategy | `retries => Math.min(retries * 50, 500)` | A function containing the [Reconnect Strategy](#reconnect-strategy) logic |
| socket.reconnectStrategy | `((retries^2) * 50 ms) + 0-200 ms` | A function containing the [Reconnect Strategy](#reconnect-strategy) logic |
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the ^ operator is "Bitwise XOR" in JavaScript, not power.. we need to find a better way to document this..

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I went with a plain English explanation of the strategy in 38eb603, rather than try to figure out how to pseudo-code this in a legible way.

reconnectStrategy: retries => Math.min(retries * 50, 1000)
reconnectStrategy: retries => {
// Generate a random jitter between 0 – 200 ms:
const jitter = Math.floor(Math.random() * 200);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we add an option to pass { jitter: { min: number; max: number; }; delay: number; }?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To me that sounds like optimizing how clients can define their own custom strategy. But I see value in providing boilerplate to allow people to quickly define custom backoff + jitter strategies. Could we scope this PR to just the default strategy update, and then introduce this idea in a separate one?

@@ -82,6 +82,15 @@ export default class RedisSocket extends EventEmitter {
}

#createReconnectStrategy(options?: RedisSocketOptions): ReconnectStrategyFunction {
const defaultStrategy = (retries: number) => {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should be moved to a #defaultReconnectStrategy function and used from lines 109 and 114

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in e7304ad 👍

@qsymmachus
Copy link
Author

@leibale I responded to your latest feedback; sorry for the delay!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants