[Feature] Push jobs should fail eagerly to all data centers if primary data center fails #649

Open
2 of 10 tasks
ZacAttack opened this issue Sep 21, 2023 · 0 comments
Labels
enhancement New feature or request

ZacAttack commented Sep 21, 2023

Feature Request Proposal

Background

In Venice's batch push architecture, data is first written to a primary site, the 'nativeReplicationSourceFabric'. Once the data has arrived there, the other sites can download it from that source.

The Ask

If the source fabric cannot successfully receive the pushed data for any reason, the other sites have little hope of finishing the job. They should eagerly abort and mark the push as errored as soon as the upstream source fails.
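To make the requested behavior concrete, here is a minimal sketch of the decision, not Venice's actual implementation. The class name `EagerPushFailureSketch`, the `PushStatus` enum, the method `fabricsToAbort`, and the fabric names `dc-0`, `dc-1`, `dc-2` are all hypothetical and chosen only for illustration. The idea is that once the source fabric reports an error, every other fabric that is still in progress gets aborted instead of waiting for a timeout.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: none of these names come from the Venice codebase.
public class EagerPushFailureSketch {

    // Simplified per-fabric push states, assumed for this example.
    enum PushStatus { STARTED, COMPLETED, ERROR }

    /**
     * Returns the fabrics whose push should be aborted right away.
     * If the source fabric has not failed, nothing is aborted.
     */
    static List<String> fabricsToAbort(String sourceFabric,
                                       Map<String, PushStatus> statusByFabric) {
        if (statusByFabric.get(sourceFabric) != PushStatus.ERROR) {
            return List.of();
        }
        List<String> toAbort = new ArrayList<>();
        for (Map.Entry<String, PushStatus> entry : statusByFabric.entrySet()) {
            boolean isSource = entry.getKey().equals(sourceFabric);
            // Only fabrics still waiting on the source need to be aborted.
            if (!isSource && entry.getValue() == PushStatus.STARTED) {
                toAbort.add(entry.getKey());
            }
        }
        return toAbort;
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps insertion order so the result is deterministic.
        Map<String, PushStatus> statuses = new LinkedHashMap<>();
        statuses.put("dc-0", PushStatus.ERROR);   // source fabric failed
        statuses.put("dc-1", PushStatus.STARTED);
        statuses.put("dc-2", PushStatus.STARTED);
        System.out.println(fabricsToAbort("dc-0", statuses));
    }
}
```

Where such a check would live (for example in the controller's push status handling or in VenicePushJob's polling loop) is left open here; the sketch only captures the fail-fast rule itself.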

Motivation

What is the use case for this feature?

This is a fail-fast feature meant to save users time. Currently the job has to time out, which wastes everyone's time and also consumes some (albeit small) resources on the server.

Details

No response

What component(s) does this bug affect?

  • Controller: This is the control-plane for Venice. Used to create/update/query stores and their metadata.
  • Router: This is the stateless query-routing layer for serving read requests.
  • Server: This is the component that persists all the store data.
  • VenicePushJob: This is the component that pushes derived data from Hadoop to Venice backend.
  • VenicePulsarSink: This is a Sink connector for Apache Pulsar that pushes data from Pulsar into Venice.
  • Thin Client: This is a stateless client users use to query Venice Router for reading store data.
  • Fast Client: This is a stateful client users use to query Venice Server for reading store data.
  • Da Vinci Client: This is an embedded, stateful client that materializes store data locally.
  • Samza: This is the library users use to make nearline updates to store data.
  • Admin Tool: This is the stand-alone client used for ad-hoc operations on Venice.
@ZacAttack ZacAttack added the enhancement New feature or request label Sep 21, 2023