This repository has been archived by the owner on Nov 17, 2022. It is now read-only.

k8s-spot-rescheduler doesn't trigger cluster autoscale up if no spot instances available #53

Open
morganwalker opened this issue Jan 17, 2019 · 4 comments



morganwalker commented Jan 17, 2019

We're running kops 1.10.0 and k8s 1.10.11, with two separate instance groups (IGs): nodes (on-demand) and spots (spot), both spread across 3 availability zones. I've applied the appropriate nodeLabels and defined the following in my k8s-spot-rescheduler deployment manifest:

- --on-demand-node-label=on-demand
- --spot-node-label=spot
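
For reference, a minimal sketch of how the two kops InstanceGroup specs might carry these labels. The exact label values here are assumptions, and the taint, IG sizes, and cluster-autoscaler tags are the ones described below:

  apiVersion: kops.k8s.io/v1alpha2
  kind: InstanceGroup
  metadata:
    name: nodes                    # on-demand IG
    labels:
      kops.k8s.io/cluster: kubernetes.metis.wtf
  spec:
    role: Node
    nodeLabels:
      on-demand: "true"            # matched by --on-demand-node-label=on-demand
    taints:
      - spot=false:PreferNoSchedule
    cloudLabels:
      k8s.io/cluster-autoscaler/enabled: ""
      kubernetes.io/cluster/kubernetes.metis.wtf: ""
  ---
  apiVersion: kops.k8s.io/v1alpha2
  kind: InstanceGroup
  metadata:
    name: spots                    # spot IG
    labels:
      kops.k8s.io/cluster: kubernetes.metis.wtf
  spec:
    role: Node
    minSize: 1
    maxSize: 3
    maxPrice: "0.10"               # assumed bid; makes kops request spot instances
    nodeLabels:
      spot: "true"                 # matched by --spot-node-label=spot
    cloudLabels:
      k8s.io/cluster-autoscaler/enabled: ""
      kubernetes.io/cluster/kubernetes.metis.wtf: ""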

The nodes IG has the spot=false:PreferNoSchedule taint so that the spots IG is preferred. The cluster autoscaler autodiscovers both IGs via --node-group-auto-discovery=asg:tag=k8s.io/cluster-autoscaler/enabled,kubernetes.io/cluster/kubernetes.metis.wtf, and these tags exist on both IGs. I've confirmed that pods on most nodes (on-demand) nodes can be drained and moved to spots nodes, with one exception:

  • The spots IG was set to minSize=1 and maxSize=3, and we had one spots node up and running in us-east-1c
  • k8s-spot-rescheduler attempted to drain the pods on a nodes node but failed with:
I0117 02:16:49.099271       1 rescheduler.go:288] Considering ip-172-20-127-232.ec2.internal for removal
I0117 02:16:49.099797       1 rescheduler.go:293] Cannot drain node: pod metis-internal/rabbitmq-0 can't be rescheduled on any existing spot node
  • metis-internal/rabbitmq-0 belongs to a StatefulSet with a PVC
  • the PVC resides in us-east-1a, so it makes sense that the pod couldn't be scheduled on the existing spots node (see the sketch below)
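
For context on why the volume pins the pod: on k8s 1.10, an EBS-backed PV carries zone labels that the scheduler's volume-zone predicate enforces, so rabbitmq-0 can only be placed on a node in us-east-1a. Roughly (the PV name, size, and volume ID are illustrative):

  apiVersion: v1
  kind: PersistentVolume
  metadata:
    name: pvc-0a1b2c3d                                      # illustrative
    labels:
      failure-domain.beta.kubernetes.io/region: us-east-1
      failure-domain.beta.kubernetes.io/zone: us-east-1a    # pins pods using this volume to 1a
  spec:
    capacity:
      storage: 10Gi                                         # illustrative
    accessModes: [ReadWriteOnce]
    awsElasticBlockStore:
      volumeID: aws://us-east-1a/vol-0123456789abcdef0      # illustrative
      fsType: ext4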

Why didn't the failure to schedule metis-internal/rabbitmq-0 trigger the cluster autoscaler to keep provisioning spots nodes until it created one in the same availability zone? I'm wondering whether, if k8s-spot-rescheduler had actually evicted the pod, the cluster autoscaler would have noticed a pending pod and spun up a new node in the spots IG.

@obellagamba

Any news on this front?

If you have a strategy in mind, I'm more than willing to help with the implementation of this feature, as it's important for us.

@Antony450

A taint can be added to the on-demand instance group instead of the spot-instance IG, like below:

  labels = "kubernetes.io/role=common,lifecycle=OnDemand"
  taints = "lifecycle=OnDemand:PreferNoSchedule"

This works for me.
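
(If you're not driving node config through tooling that accepts those settings, the equivalent one-off commands would be something like the following; the node name is a placeholder:)

  kubectl label nodes <node-name> kubernetes.io/role=common lifecycle=OnDemand
  kubectl taint nodes <node-name> lifecycle=OnDemand:PreferNoSchedule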

@CharlieC3

In my experience the taint just tells the K8s scheduler to prefer scheduling unscheduled pods onto an existing spot instance node; it doesn't tell the cluster autoscaler to scale up the spot group to make room when no spot instances are available.

@yogeshkk

yogeshkk commented Apr 2, 2020

Hi,

I'm having the same issue, so I was thinking of creating automation that checks whether an on-demand node is up in the environment; if so, it would add a few spot nodes so that k8s-spot-rescheduler can move the pods onto them and we can get rid of the on-demand node.

We could implement something similar in k8s-spot-rescheduler itself: a parameter that takes the name of the spot IG or ASG, and if there's no spot capacity, the rescheduler scales that IG or ASG up (it could reuse the cluster autoscaler's code for scaling). A rough sketch of that scale-up step follows.
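
A minimal sketch of what that scale-up step could look like against the AWS API, assuming a hypothetical --spot-asg-name flag supplies the ASG name; none of this exists in k8s-spot-rescheduler today:

  package main

  import (
      "fmt"

      "github.com/aws/aws-sdk-go/aws"
      "github.com/aws/aws-sdk-go/aws/session"
      "github.com/aws/aws-sdk-go/service/autoscaling"
  )

  // scaleUpSpotASG bumps the spot ASG's desired capacity by one so the
  // drained pods have somewhere to go. Hypothetical sketch only.
  func scaleUpSpotASG(asgName string) error {
      svc := autoscaling.New(session.Must(session.NewSession()))

      // Look up the ASG's current desired capacity and max size.
      out, err := svc.DescribeAutoScalingGroups(&autoscaling.DescribeAutoScalingGroupsInput{
          AutoScalingGroupNames: []*string{aws.String(asgName)},
      })
      if err != nil {
          return err
      }
      if len(out.AutoScalingGroups) == 0 {
          return fmt.Errorf("ASG %q not found", asgName)
      }
      asg := out.AutoScalingGroups[0]

      desired := *asg.DesiredCapacity + 1
      if desired > *asg.MaxSize {
          return fmt.Errorf("ASG %q is already at its max size %d", asgName, *asg.MaxSize)
      }

      // Request one more spot instance. Note this does not pick an AZ, so
      // zone-pinned pods (like the PVC case above) would need per-AZ ASGs.
      _, err = svc.SetDesiredCapacity(&autoscaling.SetDesiredCapacityInput{
          AutoScalingGroupName: aws.String(asgName),
          DesiredCapacity:      aws.Int64(desired),
          HonorCooldown:        aws.Bool(true),
      })
      return err
  }

  func main() {
      // ASG name is illustrative; kops names ASGs after the IG and cluster.
      if err := scaleUpSpotASG("spots.kubernetes.metis.wtf"); err != nil {
          fmt.Println(err)
      }
  }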
