Fix script loading behavior on script flush with pipelines #1497

marcbachmann · 2022-01-31T00:15:53Z

Lua scripts get loaded once, and loading is not retried unless a client-side interval expires: https://github.com/luin/ioredis/blob/master/lib/pipeline.ts#L376

This causes issues with redis instances restarting or failing over.
There's no way of knowing whether a script is still present in the new instance.
We should reset the local state and just load them again when a reconnect happens and rely on the existing code that checks whether the scripts are present on the server.
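
A minimal sketch of the idea, assuming a per-connection set of script hashes that the client believes are loaded. The class and method names here are hypothetical and are not ioredis internals; the point is only that a reconnect invalidates everything the client assumed about the server's script cache.

```typescript
// Hypothetical stand-in for a connection object; not the ioredis API.
class FakeConnection {
  // sha1 hashes this client assumes are present in the server's script cache
  private loadedHashes = new Set<string>();

  markLoaded(sha: string): void {
    this.loadedHashes.add(sha);
  }

  isAssumedLoaded(sha: string): boolean {
    return this.loadedHashes.has(sha);
  }

  // Called when the socket is re-established. The server may have restarted
  // or failed over, so nothing can be assumed about its script cache anymore.
  onReconnect(): void {
    this.loadedHashes.clear();
  }
}

const fakeConn = new FakeConnection();
fakeConn.markLoaded("abc123");
const assumedBeforeReconnect = fakeConn.isAssumedLoaded("abc123"); // true
fakeConn.onReconnect();
const assumedAfterReconnect = fakeConn.isAssumedLoaded("abc123"); // false
```

After the reset, the next use of a script goes through the normal "is it present on the server?" path again instead of trusting stale local state.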

~~I ran into that issue while testing the reliability of a service, but it might also fix #1405~~ See #1499 for the cluster fixes.

There's still a small time window where NOSCRIPT errors occur as messages are already queued for execution. ~~Previously the script handler only executed EVAL in the non-pipeline mode. This now changed and always gets executed.~~

Changelog

🐛 Reset loaded script hashes to force a reload of scripts after reconnect of redis

test/functional/pipeline.ts

marcbachmann · 2022-01-31T10:20:21Z

Regarding the tests, maybe listing the redis version during the docker run could be useful.

Somehow the node 6 tests failed before some small fix.
Probably because it uses redis 3.2.6 https://packages.debian.org/stretch/redis-server

But that should have worked with the CLIENT KILL ID clientId and CLIENT KILL addr:port commands.

artur-ma · 2022-01-31T10:58:46Z

Isn't this case handled by this code?
https://github.com/luin/ioredis/blob/b8177479c348aa4bbd467fa944d61fe9b35aec19/lib/script.ts#L42

If the script is missing then we will get the NOSCRIPT error, which will force ioredis to load it.

marcbachmann · 2022-01-31T11:24:03Z

Not in pipeline mode. That gets skipped as the result of sendCommand is not a promise.
await redis.pipeline(['something']).exec()

    const result = container.sendCommand(evalsha);
    // This is false, result is an instance of Pipeline
    if (isPromise(result)) {
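
To illustrate why the guard is skipped, here is a small sketch with two hypothetical classes (they are simplified stand-ins, not the real ioredis Redis and Pipeline classes). It assumes a direct client whose sendCommand returns a promise and a pipeline whose sendCommand returns the pipeline itself for chaining.

```typescript
// Duck-typed promise check, similar in spirit to the isPromise guard quoted above.
function isPromiseLike(value: unknown): value is PromiseLike<unknown> {
  return (
    typeof value === "object" &&
    value !== null &&
    typeof (value as { then?: unknown }).then === "function"
  );
}

// Hypothetical direct client: sendCommand resolves with the reply.
class MiniClient {
  sendCommand(name: string): Promise<string> {
    return Promise.resolve(`reply:${name}`);
  }
}

// Hypothetical pipeline: sendCommand only queues and returns `this`.
class MiniPipeline {
  readonly queued: string[] = [];
  sendCommand(name: string): this {
    this.queued.push(name);
    return this;
  }
}

const directResult = new MiniClient().sendCommand("evalsha");
const pipelineResult = new MiniPipeline().sendCommand("evalsha");

const directIsPromise = isPromiseLike(directResult); // true
const pipelineIsPromise = isPromiseLike(pipelineResult); // false
```

Any error handling attached inside the `if (isPromise(result))` branch therefore never runs for the pipeline case.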

marcbachmann · 2022-01-31T12:15:05Z

I did another commit to handle the NOSCRIPT errors in 53472a7

Resetting the scripts is still useful with this as without it, it would execute EVAL all the time, which is quite expensive.

I'm not at all sure whether those changes are fine. They might have some side effects on the ordering (in the tests it was fine), and I'm not sure about the cluster support.
I'd prefer to just have the first commits in.

marcbachmann · 2022-01-31T12:46:31Z

I'm still getting NOSCRIPT errors even with the changes above. 🤔
It's able to recover for new queries, but queued ones fail despite the promise wrapping for the script. I guess it's because pipeline.sendCommand hooks up the promise earlier. Edit: I just hadn't run npm run build.

Now everything's working properly.

marcbachmann · 2022-01-31T14:11:43Z

lib/script.ts

+      if (err.toString().indexOf("NOSCRIPT") === -1) {
+        throw err;
+      }
+      const command = new Command("eval", [this.lua].concat(args), options);


Replacing that with a SCRIPT LOAD would be another task.

    redis._addedScriptHashes[this.sha] = true
    return redis.pipeline([['script', 'load', this.lua], ['evalsha', this.sha]])
      .exec()
      .then(([, res]) => res)
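
For comparison, the NOSCRIPT fallback from the lib/script.ts diff above can be sketched in isolation. This assumes a hypothetical in-memory server that replies synchronously to keep the example short; `FakeScriptServer` and `runScript` are illustrative names, not part of ioredis.

```typescript
// Hypothetical in-memory stand-in for a Redis server's script cache.
class FakeScriptServer {
  private cache = new Map<string, string>();
  evalCalls = 0;

  // EVALSHA fails with NOSCRIPT when the hash is unknown to the server.
  evalsha(sha: string): string {
    const lua = this.cache.get(sha);
    if (lua === undefined) {
      throw new Error("NOSCRIPT No matching script. Please use EVAL.");
    }
    return `ran:${lua}`;
  }

  // EVAL carries the full script body and also caches it on the server.
  // The sha is passed in by the caller here instead of being computed.
  eval(sha: string, lua: string): string {
    this.evalCalls += 1;
    this.cache.set(sha, lua);
    return `ran:${lua}`;
  }

  // Models SCRIPT FLUSH, a restart, or a failover to a fresh instance.
  flush(): void {
    this.cache.clear();
  }
}

function runScript(server: FakeScriptServer, sha: string, lua: string): string {
  try {
    return server.evalsha(sha);
  } catch (err) {
    // Only NOSCRIPT triggers the fallback; every other error is rethrown.
    if (String(err).indexOf("NOSCRIPT") === -1) {
      throw err;
    }
    return server.eval(sha, lua);
  }
}

const fakeServer = new FakeScriptServer();
runScript(fakeServer, "sha-1", "return 1"); // cache miss, falls back to EVAL
runScript(fakeServer, "sha-1", "return 1"); // cache hit, EVALSHA succeeds
const evalCallsAfterWarmup = fakeServer.evalCalls; // 1
fakeServer.flush();
runScript(fakeServer, "sha-1", "return 1"); // miss again after the flush
const evalCallsAfterFlush = fakeServer.evalCalls; // 2
```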

I might refactor the script load behavior once I have some time. Scripts themselves are loaded synchronously instead of using another pipeline or the same pipeline.

I have another fix that uses pipeline-based script loading: https://github.com/marcbachmann/ioredis/compare/fix-redis-reconnect-NOSCRIPT-errors...marcbachmann:pipelined-evalsha?expand=1
It's a bit bigger, so I'd rather not merge it into this PR.

The only disadvantage is that it doesn't use SCRIPT EXISTS checks anymore and just uses SCRIPT LOAD after the interval has expired. I guess redis already hashes by sha internally and only loads the script if it's not present, so that should be fine.
But there's no roundtrip anymore, and the script caching time can be set much higher with it. NOSCRIPT errors are also retried in there (which should also work better for the cluster support).

edit: Ok. it looks like I can simplify the code there. An eval also automatically caches the script in redis.

This is now ready in #1499
@artur-ma want to try? 😁

artur-ma · 2022-01-31T16:06:04Z

> Resetting the scripts is still useful with this as without it, it would execute EVAL all the time, which is quite expensive.

IMO this is a serious statement. We have really high load on our services with a lot of Lua scripts (we use Redis with Redis Labs as our main persistent DB), and from time to time we lose the connection, so we have to reconnect. If that causes the script string to be sent and EVAL to run each time, it may be a disaster :)
It will also be hard to understand where this extra latency comes from after a reconnect.

marcbachmann · 2022-01-31T16:14:32Z

Yes. This PR now fixes this behavior. New commands coming in will first load the script again. And at the moment it always does an exists check: https://github.com/luin/ioredis/blob/f275bc24de3825f80415a69ff227a45251dd1a3b/lib/pipeline.ts#L282

With the update, the only remaining issue is that already queued messages will do an eval. Something like #1497 (comment) would fix this.

luin · 2022-02-05T16:33:05Z

Hey @marcbachmann 👋

Thanks for the contribution. I wasn't aware of this issue, so it's a great fix!

The reason we don't retry on NOSCRIPT error in pipeline mode is we want to ensure the command order, so for example, in .pipeline().mycustom('foo').set('foo', 'bar').exec(), mycustom shouldn't be executed after the set command. Breaking this rule results in a breaking change, and I don't think we should include this change in a bug fix PR.
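
The ordering concern can be made concrete with a small simulation. It assumes a hypothetical `mycustom` script that simply reads `foo`, and an in-memory map in place of Redis; none of these helpers exist in ioredis.

```typescript
type Store = Map<string, string>;
type Step = (store: Store) => string | undefined;

// Hypothetical custom command: returns whatever "foo" holds when it runs.
const mycustom: Step = (store) => store.get("foo");
// SET foo bar
const setFoo: Step = (store) => {
  store.set("foo", "bar");
  return "OK";
};

// Normal pipeline semantics: every command runs in the order it was queued.
function runInOrder(steps: Step[]): Array<string | undefined> {
  const store: Store = new Map();
  return steps.map((step) => step(store));
}

// What a NOSCRIPT retry would do: the failed command is skipped on the first
// pass and executed again only after the rest of the pipeline has run.
function runWithLateRetry(steps: Step[], failingIndex: number): Array<string | undefined> {
  const store: Store = new Map();
  const replies = new Array<string | undefined>(steps.length).fill(undefined);
  steps.forEach((step, i) => {
    if (i !== failingIndex) {
      replies[i] = step(store);
    }
  });
  const retried = steps[failingIndex];
  if (retried) {
    replies[failingIndex] = retried(store); // the retry runs last
  }
  return replies;
}

const inOrderReplies = runInOrder([mycustom, setFoo]); // [undefined, "OK"]
const reorderedReplies = runWithLateRetry([mycustom, setFoo], 0); // ["bar", "OK"]
```

In the reordered run the script observes the value written by a command that was queued after it, which is the behavior change described above.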

> Resetting the scripts is still useful with this as without it, it would execute EVAL all the time, which is quite expensive.

Not sure if I understand this: why would not resetting the scripts result in EVAL being executed all the time? Won't it be always EVALSHA first and then EVAL when failed? So only first several commands will execute EVAL in a small time window.

#1499 looks promising, but I still need some time to review it. So for this PR, should we just delete the _addedScriptHashes flag for the sha in case of NOSCRIPT error so following custom commands in pipeline mode should work?
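
A rough sketch of that suggestion, assuming the flag store is a plain object keyed by sha (as the `_addedScriptHashes[this.sha] = true` snippet earlier in the thread suggests). The helper name `forgetScriptOnNoscript` is hypothetical.

```typescript
// Hypothetical record of hashes the client assumes are loaded on the server.
const addedScriptHashes: Record<string, boolean> = {
  abc123: true,
  def456: true,
};

// On a NOSCRIPT reply, drop only the flag for the affected sha so that the
// next pipeline using this script goes through the loading path again.
function forgetScriptOnNoscript(
  err: Error,
  sha: string,
  hashes: Record<string, boolean>
): void {
  if (err.message.indexOf("NOSCRIPT") !== -1) {
    delete hashes[sha];
  }
}

forgetScriptOnNoscript(
  new Error("NOSCRIPT No matching script. Please use EVAL."),
  "abc123",
  addedScriptHashes
);
// An unrelated error leaves the flags untouched.
forgetScriptOnNoscript(new Error("ERR wrong number of arguments"), "def456", addedScriptHashes);

const abcStillFlagged = "abc123" in addedScriptHashes; // false
const defStillFlagged = "def456" in addedScriptHashes; // true
```

The failing command itself is not retried here, so the pipeline's command order is preserved; only later commands benefit.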

marcbachmann · 2022-02-05T17:39:47Z

You're welcome. Yes, then let's remove the fallback logic again and simplify that PR. #1499 might be a better approach to handle the script caches.

> The reason we don't retry on NOSCRIPT error in pipeline mode is we want to ensure the command order, so for example, in .pipeline().mycustom('foo').set('foo', 'bar').exec(), mycustom shouldn't be executed after the set command. Breaking this rule results in a breaking change, and I don't think we should include this change in a bug fix PR.

Good to know, not sure this was mentioned anywhere.
That could make sense in some cases. But redis doesn't cancel the commands that follow an evalsha command that failed with NOSCRIPT. Therefore multiple commands that belong together will always be an issue, even when they are in the same pipeline or within a MULTI transaction.
With that in mind, I'm not sure that is the better approach. I guess it completely depends on the use-case.
E.g. I'm doing an xack and some other script in the same pipeline. If the second script never executes, I'll have a worse state as the exactly-once delivery isn't guaranteed for those commands that failed.

> Won't it be always EVALSHA first and then EVAL when failed? So only first several commands will execute EVAL in a small time window.

Yes, but it will execute the EVAL for all commands in the offline queue, mainly because EVAL is queued after all the other commands already in the queue. With #1499 we'll have the trade-off of loading the script once on every reconnect, but that will resolve those issues.
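
A deliberately simplified model of that effect, under the assumption that the whole offline queue is flushed as EVALSHA before any fallback EVAL gets a chance to run. The function is hypothetical and only counts requests that carry the full script body (EVAL or SCRIPT LOAD).

```typescript
// Counts how many requests carry the full script body after a reconnect to a
// server with an empty script cache.
function countFullScriptSends(queuedCommands: number, loadFirst: boolean): number {
  let scriptCached = false;
  let fullScriptSends = 0;

  // Variant with a reset: the script is loaded once before the queue is sent.
  if (loadFirst) {
    scriptCached = true;
    fullScriptSends += 1;
  }

  // Phase 1: every queued command is sent as EVALSHA. Without a prior load
  // they all fail with NOSCRIPT, because no EVAL has run yet.
  const needsFallback: number[] = [];
  for (let i = 0; i < queuedCommands; i++) {
    if (!scriptCached) {
      needsFallback.push(i);
    }
  }

  // Phase 2: the fallbacks are queued behind everything else, so each failed
  // command sends its own EVAL.
  needsFallback.forEach(() => {
    fullScriptSends += 1;
    scriptCached = true;
  });

  return fullScriptSends;
}

const sendsWithoutReset = countFullScriptSends(100, false); // 100
const sendsWithReset = countFullScriptSends(100, true); // 1
```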

> for this PR, should we just delete the _addedScriptHashes flag for the sha in case of NOSCRIPT error so following custom commands in pipeline mode should work?

👍

marcbachmann · 2022-02-05T17:49:02Z

This is now ready to merge

luin · 2022-02-06T02:24:26Z

> With that in mind, I'm not sure that is the better approach. I guess it completely depends on the use-case.

Fair point. I 💯 agree with that. I think we should then just prioritize code maintainability over the behavior choice, though introducing breaking changes means we have to wait for the next major version.

## [4.28.5](v4.28.4...v4.28.5) (2022-02-06)

### Bug Fixes

* Reset loaded script hashes to force a reload of scripts after reconnect of redis ([#1497](#1497)) ([f357a31](f357a31))

ioredis-robot · 2022-02-06T02:29:39Z

🎉 This PR is included in version 4.28.5 🎉

The release is available on:

Your semantic-release bot 📦🚀

marcbachmann force-pushed the fix-redis-reconnect-NOSCRIPT-errors branch from 020bac9 to b7b213c Compare January 31, 2022 00:41

marcbachmann force-pushed the fix-redis-reconnect-NOSCRIPT-errors branch from b7b213c to bdf88bd Compare January 31, 2022 10:19

marcbachmann force-pushed the fix-redis-reconnect-NOSCRIPT-errors branch 2 times, most recently from 731bbb4 to f36a582 Compare January 31, 2022 14:01

marcbachmann changed the title ~~Reset loaded script hashes to force a reload of scripts after reconnect of redis~~ Fix script loading behavior on script flush with pipelines Jan 31, 2022

marcbachmann mentioned this pull request Feb 1, 2022

Pipeline-based script loading and retries #1499

Merged

marcbachmann force-pushed the fix-redis-reconnect-NOSCRIPT-errors branch from f36a582 to 6439c20 Compare February 5, 2022 17:46

luin merged commit f357a31 into redis:master Feb 6, 2022

ioredis-robot added the released label Feb 6, 2022

ceconcarlsen mentioned this pull request May 31, 2024

[Snyk] Upgrade ioredis from 4.27.9 to 4.28.5 ceconcarlsen/circuit-breaker-2021-09-18#4

Open

rattus69 mentioned this pull request Jun 30, 2024

[Snyk] Upgrade ioredis from 4.19.4 to 4.28.5 rattus69/economy-simulator#8

Open
