Playbook for Boskos failures

Arnaud M edited this page Apr 11, 2022 · 3 revisions

When do we need this?

Use this playbook when the "Prow Monitoring" app starts posting alerts in the #testing-ops channel on the Kubernetes Slack, or when CI jobs show Boskos-related failures.

Pods related to boskos

Check whether the following pods are healthy, and look at their logs as well. If a pod is unhealthy, delete it so that a fresh instance gets created.

  • kubectl --context gke_k8s-infra-prow-build_us-central1_prow-build -n test-pods -l app=boskos get pods
  • kubectl --context gke_k8s-infra-prow-build_us-central1_prow-build -n test-pods -l app=boskos-reaper get pods
  • kubectl --context gke_k8s-infra-prow-build_us-central1_prow-build -n test-pods -l app=boskos-janitor get pods

Alternatively, use a single set-based label selector to list all three at once:

kubectl --context gke_k8s-infra-prow-build_us-central1_prow-build get pods -n test-pods -l 'app in (boskos,boskos-janitor,boskos-reaper)'
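The check, inspect, delete sequence described above can be sketched as a few shell helpers. This is a minimal sketch, not an official tool: the function names (`boskos_kubectl`, `boskos_list`, `boskos_logs`, `boskos_delete`) and the 100-line log tail are assumptions made for illustration. The context, namespace, and labels are copied from the commands on this page.

```shell
#!/usr/bin/env bash
# Sketch only. Context, namespace, and labels come from the commands on this page;
# the helper names and the --tail value are illustrative assumptions.
CONTEXT="gke_k8s-infra-prow-build_us-central1_prow-build"
NAMESPACE="test-pods"
SELECTOR='app in (boskos,boskos-janitor,boskos-reaper)'

# Wrapper so every call targets the same cluster and namespace.
boskos_kubectl() {
  kubectl --context "$CONTEXT" -n "$NAMESPACE" "$@"
}

# 1. List the Boskos pods with their status and restart counts.
boskos_list() {
  boskos_kubectl get pods -l "$SELECTOR"
}

# 2. Show the most recent log lines of one pod (name taken from the listing).
boskos_logs() {
  boskos_kubectl logs "$1" --tail=100
}

# 3. Delete one unhealthy pod so that a fresh instance gets created.
boskos_delete() {
  boskos_kubectl delete pod "$1"
}
```

After sourcing the snippet, run `boskos_list`, then `boskos_logs <pod-name>` and, if the pod is unhealthy, `boskos_delete <pod-name>`, substituting a pod name from the listing.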

More info

TODO: update this page with the tips in https://github.com/kubernetes/k8s.io/issues/3600

Resources

Boskos resource usage dashboard