[vault-configurer] Pod should become unhealthy after failing many retries to apply config #1749

Open
Gentoli opened this issue Dec 19, 2022 · 2 comments
Labels
lifecycle/keep Denotes an issue or PR that should be preserved from going stale.

Comments

Gentoli commented Dec 19, 2022

Is your feature request related to a problem? Please describe.

The vault-configurer pod created by the Vault CR fails to apply the Vault config. There is no clear indication that it failed, either in the pod health check or in the Vault CR status; I had to check the logs to see the failure.

Vault CR status:

status:
  conditions:
    - status: 'True'
      type: Healthy
  leader: vault-0
  nodes:
    - vault-0

Describe the solution you'd like

  • The vault-configurer pod should become unhealthy after a number of retries, indicating an issue (see the sketch after this list)
  • The operator should update the Vault CR status to show that vault-configurer failed
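
For illustration only, here is a sketch of the kind of status the operator could publish when vault-configurer keeps failing. The ConfigurerHealthy condition type and ApplyConfigFailed reason are made-up names, not fields the operator emits today:

status:
  conditions:
    - status: 'False'
      type: ConfigurerHealthy     # hypothetical condition type, not emitted by the operator today
      reason: ApplyConfigFailed   # hypothetical reason string
      message: vault-configurer exceeded its retry limit while applying the external config
  leader: vault-0
  nodes:
    - vault-0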

Describe alternatives you've considered
None

Additional context

I am running in GKE with usually only a single node, but with the Vault CR configured to schedule two pods on different nodes in case of an upgrade or node replacement (one pod is usually stuck as unschedulable).
Edit: I scaled the size down to 1, and the Vault CR still shows healthy even while vault-configurer is stuck retrying.
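
For reference, a minimal sketch of this kind of Vault CR, assuming the standard vault.banzaicloud.com/v1alpha1 apiVersion and the spec.size field; the actual manifest and anti-affinity settings from my setup are not shown here:

apiVersion: vault.banzaicloud.com/v1alpha1
kind: Vault
metadata:
  name: vault
spec:
  size: 2   # two Vault pods requested; on a single-node cluster one usually stays unschedulable
  # anti-affinity / scheduling settings omitted; the exact fields used are not shown in this issue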

@Gentoli Gentoli changed the title vault-configurer should become unhealthy after failing many retries to apply config [vault-configurer] Pod should become unhealthy after failing many retries to apply config Dec 19, 2022

Thank you for your contribution! This issue has been automatically marked as stale because it has no recent activity in the last 60 days. It will be closed in 20 days, if no further activity occurs. If this issue is still relevant, please leave a comment to let us know, and the stale label will be automatically removed.

@github-actions github-actions bot added the lifecycle/stale Denotes an issue or PR that has become stale and will be auto-closed. label Dec 10, 2023
@ramizpolic (Member) commented

Hiya @Gentoli, thanks for reporting this! We could add an option (flag/env) to control the number of allowed retries before exiting with an error, which should resolve this issue.

However, as this requires extensions to other components as well (including the CRs), we'd have to plan it ahead of time. If there's still interest in this feature, we could add it to our backlog. Cheers!
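
Purely as a sketch of what that option could look like, assuming a hypothetical VAULT_CONFIGURER_MAX_RETRIES environment variable on the vault-configurer container (no such variable or flag exists in bank-vaults today): once the limit is exceeded the process would exit with an error, the pod would restart and eventually sit in CrashLoopBackOff, which makes the failure visible.

# Purely illustrative: neither this env var nor a retry-limit flag exists in bank-vaults yet.
containers:
  - name: vault-configurer
    env:
      - name: VAULT_CONFIGURER_MAX_RETRIES   # hypothetical: give up after this many failed applies
        value: "10"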

@ramizpolic ramizpolic added question lifecycle/keep Denotes an issue or PR that should be preserved from going stale. and removed lifecycle/stale Denotes an issue or PR that has become stale and will be auto-closed. labels Dec 22, 2023
@github-actions github-actions bot removed the question label Feb 11, 2024