A Kubernetes DaemonSet to gracefully delete pods 2 minutes before an EC2 Spot Instance gets terminated

bombbomb/kube-spot-termination-notice-handler

Kube Spot Termination Notice Handler

A Kubernetes DaemonSet that runs one container per node to periodically poll the EC2 Spot Instance Termination Notices endpoint. Once a termination notice is received, it tries to gracefully stop all the pods running on the Kubernetes node, up to 2 minutes before the EC2 Spot Instance backing the node is terminated.

BombBomb Fork

Making Modifications

  • Check out a new branch.
  • Increment the version in version.txt.
  • Build using the script below.
VERSION=$(<version.txt)
docker build . -t docker-private.bombbomb.io/kube-spot-termination-notice-handler:$VERSION
  • PR into master.
  • Tag the commit and push the tag:
git tag v$VERSION
git push origin --tags
  • Push to Nexus using docker push docker-private.bombbomb.io/kube-spot-termination-notice-handler:$VERSION.
  • Deploy the new tag via the infrastructure-as-code project in the lighthouse Kubernetes configs.
    • Use the docker.bombbomb.io host when deploying to Kubernetes. docker-private.bombbomb.io is for pushing only.
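
The command steps above could be collected into one helper, sketched here under a few assumptions: the `run` wrapper, the `DRY_RUN` switch, and the `release` function name are hypothetical and not part of this repository, and the function is meant to be called only after the PR has been merged into master.

```shell
#!/usr/bin/env bash
# Sketch only: wraps the build, tag, and push commands from the steps above.

REGISTRY="docker-private.bombbomb.io"
IMAGE="kube-spot-termination-notice-handler"

# Run a command, or only print it when DRY_RUN=1
# (hypothetical switch for rehearsing a release).
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# $1: path to the version file (defaults to version.txt).
release() {
  local version
  version="$(<"${1:-version.txt}")" || return 1
  # Stop at the first failing step so a broken build is never tagged or pushed.
  run docker build . -t "${REGISTRY}/${IMAGE}:${version}" &&
    run git tag "v${version}" &&
    run git push origin --tags &&
    run docker push "${REGISTRY}/${IMAGE}:${version}"
}
```

Calling `release` with `DRY_RUN=1` set prints the four commands without executing them, which is a cheap way to confirm the version picked up from version.txt before anything is pushed.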

Installation

Helm

A Helm chart has been created for this tool; at the time of writing it was in the stable repository.

$ helm install stable/k8s-spot-termination-handler

Available docker images/tags

Tags denote Kubernetes/kubectl versions. Using the same version for your Kubernetes cluster and spot-termination-notice-handler is recommended. Note that the -1 (or similar) suffix is the revision of this tool, in case we need versioning.

  • kubeaws/kube-spot-termination-notice-handler:1.8.5-1
  • kubeaws/kube-spot-termination-notice-handler:1.9.0-1
  • kubeaws/kube-spot-termination-notice-handler:1.10.11-2
  • kubeaws/kube-spot-termination-notice-handler:1.11.3-1
  • kubeaws/kube-spot-termination-notice-handler:1.12.0-2
  • kubeaws/kube-spot-termination-notice-handler:1.13.7-1
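
Because each tag combines a Kubernetes version with a tool revision, a script can split a tag before comparing it with the cluster version. The helper names below are hypothetical and only illustrate the naming scheme described above.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; not part of this project.

# Everything before the last "-" is the Kubernetes/kubectl version.
kube_version_of() { echo "${1%-*}"; }

# Everything after the last "-" is the revision of this tool.
tool_revision_of() { echo "${1##*-}"; }

tag="1.10.11-2"
echo "Kubernetes version: $(kube_version_of "$tag")"   # prints "Kubernetes version: 1.10.11"
echo "Tool revision: $(tool_revision_of "$tag")"       # prints "Tool revision: 2"
```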

Why use it

  • So that your Kubernetes jobs backed by spot instances can keep running on other instances (typically on-demand instances).

How it works

Each spot-termination-notice-handler pod polls the notice endpoint until it returns an HTTP status 200. According to my study, that status means a termination is scheduled for the EC2 spot instance running the handler pod.
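
The polling behaviour can be sketched roughly as follows. This is a minimal illustration, not the project's actual script: the function names and the `NOTICE_URL` and `POLL_INTERVAL` variables are assumptions, while the endpoint URL and the 5-second interval are taken from the handler's own log output.

```shell
#!/usr/bin/env bash
# Sketch of the polling loop; names are hypothetical.

NOTICE_URL="${NOTICE_URL:-http://169.254.169.254/latest/meta-data/spot/termination-time}"
POLL_INTERVAL="${POLL_INTERVAL:-5}"

# Print the HTTP status code returned by the notice endpoint.
# Instances that enforce IMDSv2 would also need a session token header,
# which this sketch leaves out.
fetch_status() {
  curl -s -o /dev/null -w '%{http_code}' --max-time 2 "$NOTICE_URL"
}

# Poll until the status is 200, logging each status with a UTC timestamp.
# $1: command that prints a status code (defaults to fetch_status;
#     the override is a hypothetical hook for testing).
wait_for_notice() {
  local fetch="${1:-fetch_status}" status
  while true; do
    status="$("$fetch")"
    echo "$(date -u): ${status}"
    if [ "$status" = "200" ]; then
      return 0
    fi
    sleep "$POLL_INTERVAL"
  done
}
```

`wait_for_notice` returns only once a 200 is seen, so whatever should happen on termination can simply follow the call.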

Run kubectl logs against the handler pod to watch how it works.

$ kubectl logs --namespace kube-system spot-termination-notice-handler-ibyo6
This script polls the "EC2 Spot Instance Termination Notices" endpoint to gracefully stop and then reschedule all the pods running on this Kubernetes node, up to 2 minutes before the EC2 Spot Instance backing the node is terminated.
See https://aws.amazon.com/jp/blogs/aws/new-ec2-spot-instance-termination-notices/ for more information.
`kubectl drain minikubevm` will be executed once a termination notice is made.
Polling http://169.254.169.254/latest/meta-data/spot/termination-time every 5 second(s)
Fri Jul 29 07:38:59 UTC 2016: 404
Fri Jul 29 07:39:04 UTC 2016: 404
Fri Jul 29 07:39:09 UTC 2016: 404
Fri Jul 29 07:39:14 UTC 2016: 404
...
Fri Jul 29 hh:mm:ss UTC 2016: 200
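
Once the 200 shows up, the log says `kubectl drain` is run against the node. A hedged sketch of that step follows; the `drain_node` name and the `KUBECTL` override are hypothetical, and the flags are common choices for `kubectl drain`, not necessarily the ones this project's script passes.

```shell
#!/usr/bin/env bash
# Sketch of the drain step; not the project's actual script.

# Evict the pods on a node so they can be rescheduled elsewhere.
# $1: node name, e.g. the "minikubevm" seen in the log above.
drain_node() {
  local node="$1" kubectl_bin="${KUBECTL:-kubectl}"
  # --ignore-daemonsets: DaemonSet pods (including the handler itself) cannot be evicted.
  # --force: also remove pods that are not managed by a controller.
  # --grace-period=120: give each pod at most the 2-minute notice window to stop.
  "$kubectl_bin" drain "$node" --force --ignore-daemonsets --grace-period=120
}
```

Setting `KUBECTL=echo` prints the command instead of running it, which is handy for checking the arguments outside a cluster.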

Building against a specific version of Kubernetes

Run KUBE_VERSION=<your desired k8s version> make build to specify the version number of k8s/kubectl.

Slack Notifications

Since version 0.9.2 of this application (the @mumoshu version), you can set up a Slack incoming webhook to send notifications to a channel, telling users that an instance has been terminated.

Incoming webhooks require that you set the SLACK_URL environment variable as part of your PodSpec.

You can also set SLACK_CHANNEL to send messages to a different Slack channel instead of the webhook URL's default channel.

The URL should look something like: https://hooks.slack.com/services/T67UBFNHQ/B4Q7WQM52/1ctEoFjkjdjwsa22934

Slack Setup:

Show where things are happening by setting the CLUSTER environment variable to whatever you call your cluster. Very handy if you have several clusters that report to the same Slack channel.

Example Pod Spec:

        env:
          - name: POD_NAME
            valueFrom:
              fieldRef:
                fieldPath: metadata.name
          - name: NAMESPACE
            valueFrom:
              fieldRef:
                fieldPath: metadata.namespace
          - name: SLACK_URL
            value: "https://hooks.slack.com/services/T67UBFNHQ/B4Q7WQM52/1ctEoFjkjdjwsa22934"
          - name: SLACK_CHANNEL
            value: "#devops"
          - name: CLUSTER
            value: development
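
To show how these variables might fit together, here is a minimal sketch of a notification helper. It is an assumption about the general shape, not the handler's real code: the function names are hypothetical, and the payload uses the `text` and `channel` fields accepted by Slack incoming webhooks.

```shell
#!/usr/bin/env bash
# Sketch of a Slack notification helper; names are hypothetical.

# Build the JSON payload for the webhook.
# $1: message text, assumed to contain no double quotes or backslashes,
#     because this sketch does no JSON escaping.
slack_payload() {
  local text="$1"
  if [ -n "${SLACK_CHANNEL:-}" ]; then
    printf '{"channel": "%s", "text": "%s"}' "$SLACK_CHANNEL" "$text"
  else
    printf '{"text": "%s"}' "$text"
  fi
}

# Post a message when SLACK_URL is set; do nothing otherwise.
# $1: message text; it is prefixed with the CLUSTER name when that is set.
notify_slack() {
  [ -n "${SLACK_URL:-}" ] || return 0
  local prefix=""
  if [ -n "${CLUSTER:-}" ]; then
    prefix="[${CLUSTER}] "
  fi
  curl -s -X POST -H 'Content-Type: application/json' \
    --data "$(slack_payload "${prefix}$1")" "$SLACK_URL"
}
```

Leaving SLACK_CHANNEL unset omits the `channel` field, so the message goes to the channel configured on the webhook itself.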

Credits

kube-spot-termination-notice-handler is a collaborative project to unify @mumoshu and @kylegato's initial work and @egeland's fork with various enhancements and simplifications.

The project is currently maintained by:

  • @egeland
  • @kylegato
  • @mumoshu

RipSecrets

We implement pipeline secret scanning on all pull request events to prevent credentials from being merged. If the pipeline scanner detects a secret in your changed files, it will gate the pull request, and you will need to purge the found credential from your code and re-open the PR. To avoid being gated by this tool, and as a best practice, install the secret scanner locally in a pre-commit hook so that the secret is never committed to the repo in the first place. You can find documentation on how to set it up locally here.
Ripsecrets has ways to bypass secret scanning, although we should not be ignoring secrets that turn up in the scans. If something is out of your control and blocking the pipeline, you can bypass it in one of the following ways:

  1. Adding "# pragma: allowlist secret" to the end of the line with the secret.
  2. Adding the specific secret underneath the "[secrets]" block in .secretsignore.
  3. Adding the filepath above the "[secrets]" block in .secretsignore to ignore the whole file.
