[BUG] [Gitops] source-controller of Flux has the CrashLoopBackOff status #3586

Closed
selmison opened this issue Mar 31, 2023 · 5 comments
Labels
action-required bug Needs Attention 👋 Issues needs attention/assignee/owner


@selmison

Describe the bug
When I follow this tutorial to deploy apps using GitOps on AKS, the sample app runs normally and the walk-through completes successfully.

But when I look at the details of the Flux pods inside the AKS cluster, I notice that the source-controller container has the CrashLoopBackOff status and restarts about every 5 minutes.

I am using Azure as the network plugin and Calico as the network policy.

The behavior is the same even when the readiness/liveness probes are disabled.

To Reproduce
Steps to reproduce the behavior:
  1. Run:

az k8s-configuration flux create -g flux-demo-rg \
-c flux-demo-arc \
-n cluster-config \
--namespace cluster-config \
-t connectedClusters \
--scope cluster \
-u https://github.com/Azure/gitops-flux2-kustomize-helm-mt \
--branch main \
--kustomization name=infra path=./infrastructure prune=true \
--kustomization name=apps path=./apps/staging prune=true dependsOn=\["infra"\]

  2. Run kubectl get pod -n flux-system -w and wait for about 5 minutes:

Screenshot 2023-03-31 135122

Note: the logs of this pod do not show any errors.

Expected behavior
The source-controller deployment stays healthy.

Environment (please complete the following information):

  • CLI Version 2.37
  • Kubernetes version: 1.24.9
  • CLI Extension version:
    • azure-devops: 0.26.0
    • k8s-extension: 1.4.0
    • k8s-configuration: 1.7.0
@selmison selmison added the bug label Mar 31, 2023
@selmison selmison changed the title [BUG] [BUG] [Gitops] source-controller of Flux has the CrashLoopBackOff status Mar 31, 2023
@ghost ghost added the action-required label Apr 25, 2023
@ghost

ghost commented Apr 30, 2023

Action required from @Azure/aks-pm

@ghost ghost added the Needs Attention 👋 Issues needs attention/assignee/owner label Apr 30, 2023
@ghost

ghost commented May 16, 2023

Issue needing attention of @Azure/aks-leads

@ghost

ghost commented May 31, 2023

Issue needing attention of @Azure/aks-leads

@carvido1

Hello @selmison.

Can you try to get the events produced by the source controller? To do so, you can use:

kubectl describe deployment/source-controller -n flux-system
kubectl describe pod/source-controller... -n flux-system

The output includes an Events section that can add some context here. In the screenshot you shared, we can see that the pod has been killed once due to OOM.

In addition, if you are monitoring your cluster, maybe you can share some memory metrics for that particular pod.
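
As a rough sketch of the checks suggested above, the helper functions below wrap standard kubectl calls. The function names and the app=source-controller label selector are assumptions for illustration, not something confirmed in this thread; adjust them to match your cluster. The memory check also assumes that metrics-server is available.

```shell
# Sketch only: the label selector is an assumption about how the
# source-controller pods are labelled; adjust it for your cluster.
SELECTOR="app=source-controller"
NAMESPACE="flux-system"

# Print "<pod name> <restart count> <last termination reason>" per pod.
# The reason reads "OOMKilled" when the container was killed for
# exceeding its memory limit.
last_termination() {
  kubectl get pod -n "$NAMESPACE" -l "$SELECTOR" \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.containerStatuses[0].restartCount}{"\t"}{.status.containerStatuses[0].lastState.terminated.reason}{"\n"}{end}'
}

# Show current memory usage per container (requires metrics-server).
memory_usage() {
  kubectl top pod -n "$NAMESPACE" -l "$SELECTOR" --containers
}

# List events in the namespace, sorted by last timestamp.
recent_events() {
  kubectl get events -n "$NAMESPACE" --sort-by=.lastTimestamp
}
```

Calling last_termination a few minutes after a restart should show whether the previous container instance ended with OOMKilled or with another reason.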

@selmison
Author

Hello, the root cause of this issue was related to fluxcd/source-controller#929. My repository was very large.
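
For anyone who wants to check whether their own repository might be large enough to matter, a minimal sketch follows. It only measures the on-disk size of a local clone; the function name is hypothetical, and this thread does not state any size threshold at which the problem appears.

```shell
# Print the on-disk size of a directory (for example a local clone,
# including its .git folder) in KiB. Defaults to the current directory.
repo_size_kib() {
  du -sk "${1:-.}" | cut -f1
}
```

For example, repo_size_kib ./gitops-flux2-kustomize-helm-mt would report the size of a clone checked out in that folder.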

@ghost ghost locked as resolved and limited conversation to collaborators Jul 13, 2023