Playbook for Boskos failures

When do we need this?

Use this playbook when the "Prow Monitoring" app starts posting alerts in the #testing-ops channel on the Kubernetes Slack, or when CI jobs show Boskos-related failures.

Pods related to boskos

Check whether the pods listed by the commands below are healthy, and look at their logs as well. If a pod is unhealthy, delete it so that a fresh instance gets created.

kubectl --context gke_k8s-infra-prow-build_us-central1_prow-build -n test-pods -l app=boskos get pods
kubectl --context gke_k8s-infra-prow-build_us-central1_prow-build -n test-pods -l app=boskos-reaper get pods
kubectl --context gke_k8s-infra-prow-build_us-central1_prow-build -n test-pods -l app=boskos-janitor get pods

Alternatively, you can list all three with a single set-based label selector:

kubectl --context gke_k8s-infra-prow-build_us-central1_prow-build get pods -n test-pods -l 'app in (boskos,boskos-janitor,boskos-reaper)'
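
The log check and pod deletion described above can be sketched as two small shell helpers. This is only a sketch: the function names `boskos_logs` and `boskos_restart`, the `KUBECTL` override, and the `--tail=100` value are assumptions made for illustration and are not part of the original playbook. The context, namespace, and labels are the ones used in the commands above.

```shell
# Context and namespace taken from the commands in this playbook.
CONTEXT=gke_k8s-infra-prow-build_us-central1_prow-build
NAMESPACE=test-pods

# Hypothetical override: set KUBECTL=echo to print a command instead of running it.
KUBECTL="${KUBECTL:-kubectl}"

# Show recent log lines for every pod matching a label selector, e.g. app=boskos.
# The tail length of 100 is an arbitrary choice.
boskos_logs() {
  "$KUBECTL" --context "$CONTEXT" -n "$NAMESPACE" logs -l "$1" --tail=100
}

# Delete every pod matching a label selector so that fresh instances get created.
boskos_restart() {
  "$KUBECTL" --context "$CONTEXT" -n "$NAMESPACE" delete pods -l "$1"
}
```

For example, `boskos_logs app=boskos-janitor` shows the janitor logs, and `boskos_restart app=boskos-janitor` deletes the janitor pods. Running with `KUBECTL=echo` first is a cheap way to confirm what would be executed before touching the cluster.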

More info

TODO: update this page with the tips in https://github.com/kubernetes/k8s.io/issues/3600

Resources

Boskos resource usage dashboard
