Description
Delivery of server-sent events to clients will become unreliable if more than one instance of touca-app is running as a cluster behind a single service. This may not be an important configuration to support, but I thought I'd flag it. Feel free to close if not relevant.
Environment
Any configuration that can deploy multiple instances of the server container, e.g. K8s, Docker Swarm.
Steps To Reproduce
Deploy as per Environment above, with at least 2 instances of touca-app running. Connect multiple clients to touca-app. Complete an action from one of the clients that should trigger the delivery of an event to all connected clients. The event may or may not be broadcast to all appropriate clients.
Expected Behavior
All clients should receive all relevant events, regardless of which instance of touca-app they are connected to.
Additional Context
Not a high-priority fix, but it will definitely come up if you ever need to horizontally scale Touca.
It happens because BullMQ does not support a 'fanout' job distribution pattern, and there is no indication they will add one any time soon. Since each instance of touca-app keeps track of server-event subscriptions in its own process, and each enqueued job is consumed by exactly one worker and then discarded, only the clients that happen to be connected to the same process as the worker that consumed the job can receive the event.
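To make the failure mode concrete, here is a minimal in-process sketch (not Touca's actual code, and deliberately simplified: round-robin dispatch stands in for BullMQ's worker polling) of competing-consumer semantics. Each enqueued job reaches exactly one worker, so clients connected to the other instance never see the event:

```typescript
// Toy model of competing-consumer job distribution: every job is
// delivered to exactly one worker, never broadcast to all of them.
type Handler = (event: string) => void;

class WorkQueue {
  private workers: Handler[] = [];
  private next = 0;

  addWorker(w: Handler): void {
    this.workers.push(w);
  }

  // Exactly one worker consumes each job (round-robin here; in BullMQ
  // the winner is simply whichever worker polls the queue first).
  enqueue(event: string): void {
    const worker = this.workers[this.next % this.workers.length];
    this.next += 1;
    worker(event);
  }
}

// Two app instances, each tracking its own connected SSE clients.
const instanceAClients: string[] = [];
const instanceBClients: string[] = [];

const queue = new WorkQueue();
queue.addWorker((e) => instanceAClients.push(e)); // instance A's worker
queue.addWorker((e) => instanceBClients.push(e)); // instance B's worker

queue.enqueue("suite_updated");

// Only the instance whose worker consumed the job can notify its clients.
console.log(instanceAClients); // ["suite_updated"]
console.log(instanceBClients); // [] -- instance B's clients miss the event
```

The event name `suite_updated` is hypothetical; the point is only the delivery semantics, not any specific Touca event.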
Hi @Rossh87. Great point. Thanks for raising this issue. This is an important limitation to keep in mind, even if supporting this configuration (horizontal scaling) is not a high priority at the moment. Do you have any suggestions on how to overcome this limitation?
If you ever need a quick fix for this specific issue, I think Redis Streams can do what you need within the current architecture.
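A sketch of why streams help, under one assumption: each app instance reads the stream with its own cursor (i.e. without a consumer group), so every instance observes every event. The classes below are an in-process stand-in for Redis Streams (XADD/XREAD), not the real client API:

```typescript
// Toy model of the stream approach: an append-only log that every app
// instance reads independently. Reading an entry does not remove it, so
// all instances see all events and can forward them to their own SSE
// clients. (Stand-in for Redis XADD/XREAD, not the actual client API.)
class EventLog {
  private entries: string[] = [];

  append(event: string): void {
    this.entries.push(event);
  }

  // Return all entries after the caller's cursor. The cursor belongs to
  // the reader, so readers never steal events from each other.
  readFrom(cursor: number): { events: string[]; cursor: number } {
    return { events: this.entries.slice(cursor), cursor: this.entries.length };
  }
}

class AppInstance {
  readonly delivered: string[] = [];
  private cursor = 0;

  poll(log: EventLog): void {
    const { events, cursor } = log.readFrom(this.cursor);
    this.cursor = cursor;
    this.delivered.push(...events); // fan out to this instance's SSE clients
  }
}

const log = new EventLog();
const a = new AppInstance();
const b = new AppInstance();

log.append("suite_updated");
a.poll(log);
b.poll(log);

// Both instances deliver the event to their own connected clients.
console.log(a.delivered); // ["suite_updated"]
console.log(b.delivered); // ["suite_updated"]
```

Redis Pub/Sub would give the same fanout with weaker delivery guarantees; streams additionally let a briefly disconnected instance catch up from its last cursor.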
IMO it's more likely to come up in the context of a general move towards distributed services. You may find you want to move some of the cron processes out into their own service, have separate machines doing batch processing, or that you need better failover behavior. At that point, this issue will be one consideration among many, and you may realize you no longer want Redis in the mix at all. So if you can, I would wait until you understand those other considerations better.
On Dec 6, 2022, ghorbanzade changed the title from "Server events will not reach consumers if more than 1 instance of touca-app (server) is running" to "Server-sent events may not work correctly if multiple server instances are running".