[RFC] Fallback mechanism for SingleFlight #1792

Open
r-ashish opened this issue Sep 19, 2022 · 2 comments
Labels
proposal A proposal for discussion and possibly a vote

Comments

@r-ashish
Member

r-ashish commented Sep 19, 2022

Issue

Currently, Athens doesn't work at all if the SingleFlight store (etcd, Redis, etc.) is down or unavailable. New instances fail to start, and running instances stop working.

Proposal

I'm creating this as a proposal and a placeholder to get some comments before I actually start working on this.

The idea is to have another config to define the fallback mechanism, something like:

SingleFlightType = "redis"
SingleFlightFallbackType = "memory" # defaults to in-memory; possible values: none + all the values currently supported by SingleFlightType

When fallback is enabled and the primary SingleFlight store is down, Athens will fall back to the mechanism specified in the config. It will keep checking the status of the primary store in the background (retries with backoff) and switch back to it once it's available.

This will only be a single-layer fallback, so if the fallback is also down, Athens still won't work.

Also, this is mostly a nice-to-have feature to improve the availability and resilience of the system. Our current fallback plan is to redeploy after changing the config, which is good enough for the general use case.

@manugupt1
Member

I am not sure this will work. Consider this: you have 3 nodes behind an LB. If Redis fails and each node falls back to an in-memory singleflight, you may still get 3 singleflight requests overwriting each other. I am not sure there is a good way to guarantee uniqueness.

The easiest way is probably to write a new singleflight type backed by a service managed by a cloud provider, just like storage.
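The concern about multiple nodes can be illustrated with a small simulation. This is a sketch under stated assumptions, not Athens code: `keyedOnce` and `simulate` are hypothetical names, the module path is made up, and "run at most once per key" is a simplification of real singleflight, which only deduplicates calls that are in flight at the same time. One `keyedOnce` value models one lock domain: a single process for in-memory, or one shared store for Redis/etcd.

```go
package main

import (
	"fmt"
	"sync"
)

// keyedOnce runs a function at most once per key within one lock domain.
type keyedOnce struct {
	mu   sync.Mutex
	once map[string]*sync.Once
}

func newKeyedOnce() *keyedOnce {
	return &keyedOnce{once: map[string]*sync.Once{}}
}

func (k *keyedOnce) do(key string, fn func()) {
	k.mu.Lock()
	o, ok := k.once[key]
	if !ok {
		o = new(sync.Once)
		k.once[key] = o
	}
	k.mu.Unlock()
	o.Do(fn)
}

// simulate sends reqsPerNode concurrent requests for the same module to each
// node and returns how many upstream fetches (and storage writes) happened.
func simulate(nodes []*keyedOnce, reqsPerNode int) int {
	var (
		wg      sync.WaitGroup
		mu      sync.Mutex
		fetches int
	)
	for _, n := range nodes {
		for i := 0; i < reqsPerNode; i++ {
			wg.Add(1)
			go func(n *keyedOnce) {
				defer wg.Done()
				n.do("example.com/mod@v1.0.0", func() {
					mu.Lock()
					fetches++
					mu.Unlock()
				})
			}(n)
		}
	}
	wg.Wait()
	return fetches
}

func main() {
	// In-memory fallback: every node has its own lock domain.
	perNode := []*keyedOnce{newKeyedOnce(), newKeyedOnce(), newKeyedOnce()}
	fmt.Println("in-memory fetches:", simulate(perNode, 10))

	// Distributed store: all nodes share one lock domain.
	shared := newKeyedOnce()
	fmt.Println("shared fetches:", simulate([]*keyedOnce{shared, shared, shared}, 10))
}
```

With three independent lock domains the module is fetched once per node, whereas a shared lock domain yields a single fetch. That is the uniqueness gap an in-memory fallback would accept.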

@r-ashish
Member Author

r-ashish commented Sep 20, 2022

Consider You have 3 nodes behind an LB. If redis fails and then you have in-memory singleflight request, you may still get 3 single flight requests overwriting each other.

Yes, if the fallback is configured to use in-memory, that's the expected behaviour after this implementation. The idea is to let Athens keep working even when Redis is down; currently it doesn't work at all.

you may still get 3 single flight requests overwriting each other. I am not sure if there is a good way to guarantee uniqueness.

Also, as I described above, another distributed store (Redis, etcd, etc.) can be used as the fallback, so it can still guarantee uniqueness. It just depends on your config.

Basically, there are 3 possible scenarios:

  • Fallback configured to "none": Don't use any fallback and return an error, as Athens does today.
    • Mainly for backwards compatibility, or if you'd like to keep things simple and are okay with handling these situations manually.
  • Fallback configured to "memory": Fall back to memory until the distributed store is up. Athens will continue working, but distributed singleflight won't work.
    • Use this if you want Athens to continue working without distributed locking.
  • Fallback configured to "etcd/redis/etc.": Fall back to another distributed store until the primary one is up. Athens will continue working, and distributed singleflight will also work.
    • Use this if you want Athens to continue working with distributed locking and are okay with maintaining a secondary distributed store.

So, based on their resiliency requirements, users will be able to configure Athens accordingly.
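The three scenarios above can be summarised as a small lookup. This is only a sketch: `behavior` and `fallbackBehavior` are hypothetical names, the accepted values are limited to the ones named in this thread, and treating an empty value as "memory" follows the proposed default rather than any existing Athens behaviour.

```go
package main

import "fmt"

// behavior describes what happens while the primary SingleFlight store is down.
type behavior struct {
	KeepsServing bool // Athens keeps working
	Distributed  bool // deduplication still spans all instances
}

// fallbackBehavior maps a proposed SingleFlightFallbackType value to its
// expected behavior. The set of values here is illustrative, not exhaustive.
func fallbackBehavior(fallbackType string) (behavior, error) {
	switch fallbackType {
	case "none":
		// Current behavior: requests fail until the primary is back.
		return behavior{}, nil
	case "", "memory":
		// Proposed default: keep serving, but only per-instance deduplication.
		return behavior{KeepsServing: true}, nil
	case "redis", "etcd":
		// Secondary distributed store: keep serving with distributed locking.
		return behavior{KeepsServing: true, Distributed: true}, nil
	default:
		return behavior{}, fmt.Errorf("unknown SingleFlightFallbackType %q", fallbackType)
	}
}

func main() {
	for _, t := range []string{"none", "memory", "etcd"} {
		b, err := fallbackBehavior(t)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s: keeps serving=%t, distributed=%t\n", t, b.KeepsServing, b.Distributed)
	}
}
```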

matt0x6F added the "proposal" label (A proposal for discussion and possibly a vote) on Mar 28, 2024

3 participants