Add Leader Election for Housekeeping in Sharded Cluster using MongoDB #529

Open
wants to merge 5 commits into
base: main

Member

@krapie krapie commented May 3, 2023

What this PR does / why we need it:

Add leader election for housekeeping in a sharded cluster using MongoDB.

  • Adds the server/backend/election package for leader election, with a database-backed implementation.
  • The database implementation uses MongoDB and TTL indexes to ensure that only one node can hold the housekeeping leader lease at a time: when the lease document expires, other nodes are prevented from updating the same document simultaneously.
  • Configures the yorkie-cluster Helm chart to enable --housekeeping-leader-election, so that only the leader performs housekeeping in a sharded cluster.
  • Also fixes a small error in the hostname value in server/backend/backend.go. For more information, see the "Add user agent metrics" PR.
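
The lease behavior described in the second bullet can be sketched without a database. This is a minimal in-memory illustration in Go, not the code in this PR: `LeaseStore`, `TryAcquire`, `leaseTTL`, and `epoch` are hypothetical names, the mutex stands in for MongoDB's atomic single-document writes, and the expiry check stands in for a TTL index removing the lease document. In a real deployment the TTL monitor deletes expired documents in a periodic background pass, so expiry would not be instantaneous as it is here.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// leaseTTL is a hypothetical lease duration; the PR does not state one.
const leaseTTL = 10 * time.Second

// epoch is a fixed reference time that keeps the demo deterministic.
var epoch = time.Unix(0, 0)

// lease mirrors a single "leader" document: who holds it and when it expires.
type lease struct {
	holder    string
	expiresAt time.Time
}

// LeaseStore stands in for the lease collection (assumption: one document at most).
type LeaseStore struct {
	mu      sync.Mutex
	current *lease
}

// TryAcquire reports whether node holds the lease after the call.
// The current holder may call it again to renew.
func (s *LeaseStore) TryAcquire(node string, now time.Time) bool {
	s.mu.Lock()
	defer s.mu.Unlock()

	// An expired lease is treated as absent, like a document removed by a TTL index.
	if s.current != nil && !now.Before(s.current.expiresAt) {
		s.current = nil
	}
	// A live lease held by another node blocks acquisition.
	if s.current != nil && s.current.holder != node {
		return false
	}
	s.current = &lease{holder: node, expiresAt: now.Add(leaseTTL)}
	return true
}

func main() {
	store := &LeaseStore{}
	var wg sync.WaitGroup
	var mu sync.Mutex
	winners := 0

	// Five nodes race for the lease at the same instant; only one should win.
	for i := 0; i < 5; i++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			if store.TryAcquire(fmt.Sprintf("node-%d", id), epoch) {
				mu.Lock()
				winners++
				mu.Unlock()
			}
		}(i)
	}
	wg.Wait()
	fmt.Println("leaders elected:", winners)
}
```

Passing `now` explicitly keeps the sketch testable; a real implementation would more likely rely on the database's clock for expiry so that nodes with skewed clocks agree on who holds the lease.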

Which issue(s) this PR fixes:

Fixes #505

Special notes for your reviewer:

I tested this in a K8s environment (minikube) using the yorkie-cluster Helm chart, and it worked as expected.

Does this PR introduce a user-facing change?:


Additional documentation:


Checklist:

  • Added relevant tests or not required
  • Didn't break anything

@krapie krapie added the enhancement 🌟 New feature or request label May 3, 2023
@krapie krapie requested a review from hackerwins May 3, 2023 10:20
@krapie krapie self-assigned this May 3, 2023

codecov bot commented May 3, 2023

Codecov Report

Merging #529 (23e9f1d) into main (2b15ac5) will increase coverage by 0.20%.
The diff coverage is 62.18%.

@@            Coverage Diff             @@
##             main     #529      +/-   ##
==========================================
+ Coverage   51.50%   51.70%   +0.20%     
==========================================
  Files          67       68       +1     
  Lines        6932     7048     +116     
==========================================
+ Hits         3570     3644      +74     
- Misses       2891     2930      +39     
- Partials      471      474       +3     
Impacted Files                              Coverage Δ
server/backend/backend.go                   0.00% <0.00%> (ø)
server/backend/config.go                    55.55% <ø> (ø)
server/backend/database/memory/database.go  52.61% <0.00%> (-0.55%) ⬇️
server/backend/database/mongo/indexes.go    50.00% <ø> (ø)
server/backend/database/mongo/client.go     40.81% <62.26%> (+1.45%) ⬆️
server/backend/election/mongo/election.go   82.97% <82.97%> (ø)
server/config.go                            45.23% <100.00%> (+1.33%) ⬆️

Member

@hackerwins hackerwins left a comment

Thanks for your contribution.
I left a simple question. 🙏

Review thread on server/backend/election/database/election.go (outdated, resolved)
Member

@hackerwins hackerwins left a comment

Thank you for the explanation.

Could you please add the tests for the two situations I asked about?

  1. Lease renewal while handling a long task.
  2. Handling background routines when shutting down the server.
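
Both scenarios could be exercised with something like the following sketch. It is an assumption about the shape of such a test, not the election package from this PR: `Elector`, `StartElector`, `simulate`, `renewInterval`, and `longTask` are hypothetical names, and the `renew` callback stands in for the database write that extends the lease.

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// Hypothetical timings, chosen so the task spans several renewal intervals.
const (
	renewInterval = 10 * time.Millisecond
	longTask      = 100 * time.Millisecond
)

// Elector renews a lease in a background goroutine until Stop is called.
type Elector struct {
	cancel context.CancelFunc
	wg     sync.WaitGroup
}

// StartElector launches the renewal loop; renew is called once per interval.
func StartElector(interval time.Duration, renew func()) *Elector {
	ctx, cancel := context.WithCancel(context.Background())
	e := &Elector{cancel: cancel}
	e.wg.Add(1)
	go func() {
		defer e.wg.Done()
		ticker := time.NewTicker(interval)
		defer ticker.Stop()
		for {
			select {
			case <-ctx.Done():
				return // shutdown requested
			case <-ticker.C:
				renew()
			}
		}
	}()
	return e
}

// Stop cancels the loop and blocks until the goroutine has exited.
func (e *Elector) Stop() {
	e.cancel()
	e.wg.Wait()
}

// simulate runs a long task under an active lease, shuts down, and returns the
// renewal count right after Stop and again after a short wait.
func simulate() (atStop, afterStop int64) {
	var renewals int64
	e := StartElector(renewInterval, func() { atomic.AddInt64(&renewals, 1) })

	time.Sleep(longTask) // scenario 1: work that outlives one renewal interval
	e.Stop()             // scenario 2: server shutdown
	atStop = atomic.LoadInt64(&renewals)

	time.Sleep(5 * renewInterval) // a leaked goroutine would keep renewing here
	afterStop = atomic.LoadInt64(&renewals)
	return atStop, afterStop
}

func main() {
	atStop, afterStop := simulate()
	fmt.Println("renewed during long task:", atStop > 0)
	fmt.Println("renewals after shutdown:", afterStop-atStop)
}
```

Waiting on the `sync.WaitGroup` inside `Stop` is what makes the second check meaningful: once `Stop` returns, the renewal goroutine has exited, so the counter cannot change.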

Review thread on server/backend/election/database/election.go (outdated, resolved)
@krapie krapie force-pushed the shard-cluster-housekeeping-with-mongodb branch 5 times, most recently from b5a41c0 to dad28f9 Compare July 8, 2023 12:44
Member Author

krapie commented Jul 8, 2023

I personally think we could also use a K8s CronJob object to run housekeeping in cluster mode, without leader election.

Using a K8s CronJob would spare the server cluster the overhead of running leader elections. This could matter a lot in a large cluster of servers.

@krapie krapie force-pushed the shard-cluster-housekeeping-with-mongodb branch from dad28f9 to 23e9f1d Compare July 8, 2023 12:52
Member Author

krapie commented Jul 21, 2023

I forgot to re-request review on this PR 😄 Requesting it now.

@krapie krapie requested a review from hackerwins July 21, 2023 10:09
Member

@hackerwins hackerwins left a comment

Thanks for applying my requests.

> I personally think we could also use a K8s CronJob object to run housekeeping in cluster mode, without leader election.
>
> Using a K8s CronJob would spare the server cluster the overhead of running leader elections. This could matter a lot in a large cluster of servers.

Using CronJob seems to be a good and straightforward approach. If we choose to use it, we can run the cluster without a leader election.

However, we should consider how to provide private APIs that are meant to be called only by internal components and must not be exposed publicly. The proposed idea doesn't seem to depend on K8s, since we only need to expose the private APIs.
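
One possible reading of the private-API idea is to keep internal routes on a separate router that is bound to a listener reachable only from inside the cluster. The sketch below is an assumption for illustration and uses plain `net/http` rather than the server's actual RPC stack; `housekeepingPath`, `newMuxes`, `status`, and the ports in the comment are hypothetical.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// housekeepingPath is a hypothetical route name, not an existing endpoint.
const housekeepingPath = "/internal/housekeeping"

// newMuxes returns separate routers for the public and the private listener.
// Only the private router knows the housekeeping route.
func newMuxes(runHousekeeping func()) (public, private *http.ServeMux) {
	public = http.NewServeMux()
	public.HandleFunc("/healthz", func(w http.ResponseWriter, _ *http.Request) {
		w.WriteHeader(http.StatusOK)
	})

	private = http.NewServeMux()
	private.HandleFunc(housekeepingPath, func(w http.ResponseWriter, r *http.Request) {
		if r.Method != http.MethodPost {
			w.WriteHeader(http.StatusMethodNotAllowed)
			return
		}
		runHousekeeping()
		w.WriteHeader(http.StatusNoContent)
	})
	return public, private
}

// status sends a POST to the handler in-process and returns the response code.
func status(h http.Handler, path string) int {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodPost, path, nil))
	return rec.Code
}

func main() {
	runs := 0
	public, private := newMuxes(func() { runs++ })

	// In a real server each router would get its own listener, for example
	// http.ListenAndServe(":8080", public) and
	// http.ListenAndServe("127.0.0.1:9090", private) (hypothetical ports).
	fmt.Println("public:", status(public, housekeepingPath))
	fmt.Println("private:", status(private, housekeepingPath))
	fmt.Println("runs:", runs)
}
```

Separating the routers keeps the decision about exposure at the listener level, so it does not depend on K8s; any scheduler that can reach the private address could trigger housekeeping.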

https://kubernetes.io/docs/concepts/workloads/controllers/cron-jobs/#concurrency-policy

How would you prefer to proceed?

Member Author

krapie commented Jul 31, 2023

@hackerwins At first glance, I thought we could introduce a housekeeping-only mode in the server or CLI and run it periodically with a K8s CronJob (or simply run a housekeeping-only server pod). This should work because housekeeping takes no input from its caller and returns nothing to it: it fetches all projects and clients from the database, and its results are applied only to the database.
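
A housekeeping-only mode along these lines could be a program that performs one pass and exits. The following is a rough sketch under stated assumptions, not the server's housekeeping code: `memStore`, `client`, `RunHousekeepingOnce`, and `staleAfter` are hypothetical, the in-memory map stands in for the database, and deactivating idle clients is assumed here as the work a pass performs.

```go
package main

import (
	"fmt"
	"time"
)

// staleAfter is a hypothetical inactivity threshold; the PR does not state one.
const staleAfter = 24 * time.Hour

// client is a simplified record of one client in a project.
type client struct {
	id     string
	idle   time.Duration // time since the client was last seen
	active bool
}

// memStore stands in for the database: projects mapped to their clients.
type memStore struct {
	projects map[string][]*client
}

// RunHousekeepingOnce performs a single pass over all projects and clients,
// writes the result back to the store, and returns how many clients changed.
// It takes nothing from the caller besides the store, so any scheduler can run it.
func RunHousekeepingOnce(s *memStore) int {
	deactivated := 0
	for _, clients := range s.projects {
		for _, c := range clients {
			if c.active && c.idle > staleAfter {
				c.active = false
				deactivated++
			}
		}
	}
	return deactivated
}

func main() {
	store := &memStore{projects: map[string][]*client{
		"project-a": {
			{id: "c1", idle: 2 * staleAfter, active: true},
			{id: "c2", idle: staleAfter / 2, active: true},
		},
	}}
	fmt.Println("deactivated:", RunHousekeepingOnce(store))
}
```

Because a second pass over the same data changes nothing, overlapping runs would be wasteful rather than harmful in this sketch; the CronJob concurrency policy linked above (for example `Forbid`) could still be used to prevent them.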

I will search for more context and information to get more ideas.

Labels
enhancement 🌟 New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Handle housekeeping with sharded cluster
2 participants