[Intermittent - RHEL cluster] Knative activator pod restarting continuously in CrashLoopBackOff with liveness and readiness probe failures #15171
Labels: kind/bug
What version of Knative?
v1.11.0
Expected Behavior
As part of the KServe deployment, we deploy Istio, cert-manager, and Knative as dependencies. Intermittently, the Knative deployment step fails: the activator pod does not come up properly and goes into CrashLoopBackOff repeatedly, while the other pods in the knative-serving namespace run normally.
Versions of Dependencies:
Environment Details:
Activator pod description (kubectl describe pod):
Knative activator:
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
kube-api-access-8nkbz:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional:
DownwardAPI: true
QoS Class: Burstable
Node-Selectors:
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
Normal Scheduled 12m default-scheduler Successfully assigned knative-serving/activator-59dff6d45c-wqt8w to v16regressionnode00002
Normal Pulled 12m kubelet Container image "gcr.io/knative-releases/knative.dev/serving/cmd/activator@sha256:6b98eed95dd6dcc3d957e673aea3d271b768225442504316d713c08524f44ebe" already present on machine
Normal Created 12m kubelet Created container activator
Normal Started 12m kubelet Started container activator
Warning Unhealthy 11m (x5 over 12m) kubelet Liveness probe failed: HTTP probe failed with statuscode: 500
Warning Unhealthy 2m20s (x137 over 12m) kubelet Readiness probe failed: HTTP probe failed with statuscode: 500
Activator pod logs:
[root@v16regressionnode00003 ~]# kubectl logs activator-7bcc758ddd-wk7cd -n knative-serving
2024/04/25 11:22:05 Registering 2 clients
2024/04/25 11:22:05 Registering 3 informer factories
2024/04/25 11:22:05 Registering 4 informers
{"severity":"INFO","timestamp":"2024-04-25T11:22:05.716400581Z","logger":"activator","caller":"activator/main.go:140","message":"Starting the knative activator","commit":"f1617ef","knative.dev/controller":"activator","knative.dev/pod":"activator-7bcc758ddd-wk7cd"}
{"severity":"INFO","timestamp":"2024-04-25T11:22:05.718542578Z","logger":"activator","caller":"activator/main.go:200","message":"Connecting to Autoscaler at ws://autoscaler.knative-serving.svc.cluster.local:8080","commit":"f1617ef","knative.dev/controller":"activator","knative.dev/pod":"activator-7bcc758ddd-wk7cd"}
{"severity":"INFO","timestamp":"2024-04-25T11:22:05.718768882Z","logger":"activator","caller":"websocket/connection.go:161","message":"Connecting to ws://autoscaler.knative-serving.svc.cluster.local:8080","commit":"f1617ef","knative.dev/controller":"activator","knative.dev/pod":"activator-7bcc758ddd-wk7cd"}
{"severity":"INFO","timestamp":"2024-04-25T11:22:05.719123778Z","logger":"activator","caller":"profiling/server.go:65","message":"Profiling enabled: false","commit":"f1617ef","knative.dev/controller":"activator","knative.dev/pod":"activator-7bcc758ddd-wk7cd"}
{"severity":"INFO","timestamp":"2024-04-25T11:22:05.7237912Z","logger":"activator","caller":"activator/request_log.go:45","message":"Updated the request log template.","commit":"f1617ef","knative.dev/controller":"activator","knative.dev/pod":"activator-7bcc758ddd-wk7cd","template":""}
{"severity":"WARNING","timestamp":"2024-04-25T11:22:06.685484891Z","logger":"activator","caller":"handler/healthz_handler.go:36","message":"Healthcheck failed: connection has not yet been established","commit":"f1617ef","knative.dev/controller":"activator","knative.dev/pod":"activator-7bcc758ddd-wk7cd"}
{"severity":"WARNING","timestamp":"2024-04-25T11:22:07.686801714Z","logger":"activator","caller":"handler/healthz_handler.go:36","message":"Healthcheck failed: connection has not yet been established","commit":"f1617ef","knative.dev/controller":"activator","knative.dev/pod":"activator-7bcc758ddd-wk7cd"}
{"severity":"ERROR","timestamp":"2024-04-25T11:22:08.719008181Z","logger":"activator","caller":"websocket/connection.go:144","message":"Websocket connection could not be established","commit":"f1617ef","knative.dev/controller":"activator","knative.dev/pod":"activator-7bcc758ddd-wk7cd","error":"dial tcp: lookup autoscaler.knative-serving.svc.cluster.local: i/o timeout","stacktrace":"knative.dev/pkg/websocket.NewDurableConnection.func1\n\tknative.dev/pkg@v0.0.0-20230718152110-aef227e72ead/websocket/connection.go:144\nknative.dev/pkg/websocket.(*ManagedConnection).connect.func1\n\tknative.dev/pkg@v0.0.0-20230718152110-aef227e72ead/websocket/connection.go:225\nk8s.io/apimachinery/pkg/util/wait.ConditionFunc.WithContext.func1\n\tk8s.io/apimachinery@v0.26.5/pkg/util/wait/wait.go:222\nk8s.io/apimachinery/pkg/util/wait.runConditionWithCrashProtectionWithContext\n\tk8s.io/apimachinery@v0.26.5/pkg/util/wait/wait.go:235\nk8s.io/apimachinery/pkg/util/wait.runConditionWithCrashProtection\n\tk8s.io/apimachinery@v0.26.5/pkg/util/wait/wait.go:228\nk8s.io/apimachinery/pkg/util/wait.ExponentialBackoff\n\tk8s.io/apimachinery@v0.26.5/pkg/util/wait/wait.go:423\nknative.dev/pkg/websocket.(*ManagedConnection).connect\n\tknative.dev/pkg@v0.0.0-20230718152110-aef227e72ead/websocket/connection.go:222\nknative.dev/pkg/websocket.NewDurableConnection.func2\n\tknative.dev/pkg@v0.0.0-20230718152110-aef227e72ead/websocket/connection.go:162"}
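The key line in the log above is the websocket dial error: `lookup autoscaler.knative-serving.svc.cluster.local: i/o timeout`. The activator's health endpoint returns 500 until its connection to the autoscaler is established, so the liveness/readiness probe failures are a symptom of in-cluster DNS resolution timing out, not of the activator itself crashing. A hedged diagnostic sketch to narrow this down (the test pod name, busybox image tag, and `--tail` value are illustrative assumptions, not taken from this report; the `k8s-app=kube-dns` label is the stock CoreDNS label in kube-system):

```shell
# 1. Try the same lookup the activator performs, from a throwaway pod
#    in the same namespace. If this times out, cluster DNS is the problem.
kubectl run dns-test --rm -it --restart=Never \
  --image=busybox:1.36 -n knative-serving -- \
  nslookup autoscaler.knative-serving.svc.cluster.local

# 2. Confirm the autoscaler Service and its endpoints actually exist.
kubectl get svc,endpoints autoscaler -n knative-serving

# 3. Check that CoreDNS pods are healthy and inspect their recent logs
#    for timeouts or upstream resolution errors.
kubectl get pods -n kube-system -l k8s-app=kube-dns
kubectl logs -n kube-system -l k8s-app=kube-dns --tail=50
```

If step 1 fails intermittently, the issue is likely in cluster DNS or the CNI path to it (which would also explain why the failure is intermittent and why only the activator, which dials the autoscaler at startup, is affected).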