Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ignore MemberAlreadyExist exception during startup. #5765

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

khushboobhatia01
Copy link
Contributor

During service startup, if service registry is enabled we register the service.
If the service member ID already exists in the registry, the service start up fails with an exception MemberAlreadyExist.
This change will ignore this exception and continue with service startup.

Why do we need this change?

Consider a scenario where pod with a Stackstorm service is running.

  1. During startup this pod would have been registered in the service registry with hostname = pod name.
  2. Let's say the service container inside the pod is OOMKilled.
  3. Now, because of OOM kill the de-reregistration of service from the registry might not complete.
  4. After some backoff, when the container is able to start up again in the same pod with the same pod name, the service startup fails with MemberAlreadyExist exception because the de-reregistration process did not complete.

@pull-request-size pull-request-size bot added the size/XS PR that changes 0-9 lines. Quick fix/merge. label Oct 10, 2022
@CLAassistant
Copy link

CLAassistant commented Oct 10, 2022

CLA assistant check
All committers have signed the CLA.

Copy link
Member

@cognifloyd cognifloyd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. Needs a changelog entry, and one of your commits is using an email address that is not attached to your github account, so you'll need to rewrite the commits to fix that (or the CLA bot will keep complaining).

Also, I wonder if there's a decent way to add a test for this. Can you look into that?

@cognifloyd
Copy link
Member

@khushboobhatia01 I rebased and fixed the author email. Now the CLA bot is not complaining.

try:
return coordinator.join_group(group_id, capabilities=capabilities).get()
except MemberAlreadyExist:
pass
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

More comment on legacy, but in the try we return whatever coordinator.join_group returns, but if memberalreadyexists we don't return anything...
I think in fact the join_group doesn't return anything - so probably just worth changing it so that we change the try to remove the "return" on the call to coordinator.join_group. Or if its expected to return something then we need to adjust the except clause...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
size/XS PR that changes 0-9 lines. Quick fix/merge.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants