Unable to deploy EKS-A to vSphere cluster #7954
Comments
I would check the |
@Darth-Weider thanks for the reply. For those who are having similar issues, here is a TL;DR:
I've finally figured out what is going on here. A few things weren't really clear when reading the docs:
|
The fun fact is that this is not consistent. I've created the same config multiple times on the same environment, and sometimes the process fails at the end with
Here is the config:
apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: Cluster
metadata:
  name: awsemu
spec:
  clusterNetwork:
    cniConfig:
      cilium: {}
    pods:
      cidrBlocks:
      - 172.18.0.0/16
    services:
      cidrBlocks:
      - 10.96.0.0/12
  controlPlaneConfiguration:
    count: 3
    endpoint:
      host: "172.16.1.1"
    machineGroupRef:
      kind: VSphereMachineConfig
      name: awsemu-cp
  datacenterRef:
    kind: VSphereDatacenterConfig
    name: datacenter
  externalEtcdConfiguration:
    count: 3
    machineGroupRef:
      kind: VSphereMachineConfig
      name: awsemu-etcd
  kubernetesVersion: "1.29"
  managementCluster:
    name: awsemu
  workerNodeGroupConfigurations:
  - count: 1
    machineGroupRef:
      kind: VSphereMachineConfig
      name: awsemu
    name: md-0
---
apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: VSphereDatacenterConfig
metadata:
  name: datacenter
spec:
  datacenter: datacenter
  insecure: false
  network: workload
  server: 192.168.8.12
  thumbprint: "27:44:A2:74:89:B4:D3:4E:97:30:D7:AF:3B:88:06:F4:08:0C:4F:D7"
---
apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: VSphereMachineConfig
metadata:
  name: awsemu-cp
spec:
  cloneMode: linkedClone
  datastore: vsandatastore
  folder: Kubernetes/Management/Control Plane
  memoryMiB: 8192
  numCPUs: 2
  osFamily: bottlerocket
  resourcePool: /datacenter/host/hwcluster/Resources
  storagePolicyName: ""
  users:
  - name: ec2-user
    sshAuthorizedKeys:
    - ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAACAQDGidVzdPHSLPNq7i4+r1AD2bfAQmEC8NmZM1V0vN7jMIW2QZSflL2LrCpGk0969FHesOUTM1x61B5oYepsLjYgSKDC2mNxIg2jZONPYCg30fxE5vOxWUJObCGuc4trKfz9DLPx7+C3fGgXQaFmnugMgRbqYurdrr8HDeXsavwN361x/MesKpY4E26SBt/RG/sZEssVnzeIPbM8S9LDOX62znFYIXRlgmmx9un68TqQpMti6CnIWUlYwx90MJkV0avL5BeSg9ex3JxYH1THQw3tcj5gyh9GY9yWVxXA7bs3wh5vd8JAJEtPpeqaafRaqXfBFWzC3/L21GxVCwgvGAjovhdDGk3vn6PNRKf4b1MydHnVK7/lZnpNpenDYCszSEebkS5joqehpkaJ4eED1ACvJeh/0urupu47RMN6DcwLUR7j3o7sxcXZK31lecgogC7yvC5eZGK/B6rwHyV3xX7KaVcfabJJeiiJgrb2cKesiKDFgR8DlQ+sUrdwUIcsxsoOskYZJQuvH/h2Gi7lZv71uABnQLvcAeF6OSj7vnrsQ7oUKdcJhAfoRdJCOEt1PtgyDfe2WJ9gH3KRbuHxnNVyQKNZaI5OtEPCxlPIyXbGQnsTwZ1AiWj/RYbj3DP3aCM3Iu7Lg7z/dVGSnRfWJk0zdcZekGch0O43H0EX7611kQ==
---
apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: VSphereMachineConfig
metadata:
  name: awsemu
spec:
  cloneMode: linkedClone
  datastore: vsandatastore
  folder: Kubernetes/Management/Worker Nodes
  memoryMiB: 8192
  numCPUs: 2
  osFamily: bottlerocket
  resourcePool: /datacenter/host/hwcluster/Resources
  storagePolicyName: ""
  users:
  - name: ec2-user
    sshAuthorizedKeys:
    - ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAACAQDGidVzdPHSLPNq7i4+r1AD2bfAQmEC8NmZM1V0vN7jMIW2QZSflL2LrCpGk0969FHesOUTM1x61B5oYepsLjYgSKDC2mNxIg2jZONPYCg30fxE5vOxWUJObCGuc4trKfz9DLPx7+C3fGgXQaFmnugMgRbqYurdrr8HDeXsavwN361x/MesKpY4E26SBt/RG/sZEssVnzeIPbM8S9LDOX62znFYIXRlgmmx9un68TqQpMti6CnIWUlYwx90MJkV0avL5BeSg9ex3JxYH1THQw3tcj5gyh9GY9yWVxXA7bs3wh5vd8JAJEtPpeqaafRaqXfBFWzC3/L21GxVCwgvGAjovhdDGk3vn6PNRKf4b1MydHnVK7/lZnpNpenDYCszSEebkS5joqehpkaJ4eED1ACvJeh/0urupu47RMN6DcwLUR7j3o7sxcXZK31lecgogC7yvC5eZGK/B6rwHyV3xX7KaVcfabJJeiiJgrb2cKesiKDFgR8DlQ+sUrdwUIcsxsoOskYZJQuvH/h2Gi7lZv71uABnQLvcAeF6OSj7vnrsQ7oUKdcJhAfoRdJCOEt1PtgyDfe2WJ9gH3KRbuHxnNVyQKNZaI5OtEPCxlPIyXbGQnsTwZ1AiWj/RYbj3DP3aCM3Iu7Lg7z/dVGSnRfWJk0zdcZekGch0O43H0EX7611kQ==
---
apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: VSphereMachineConfig
metadata:
  name: awsemu-etcd
spec:
  cloneMode: linkedClone
  datastore: vsandatastore
  folder: Kubernetes/Management/ETCD
  memoryMiB: 8192
  numCPUs: 2
  osFamily: bottlerocket
  resourcePool: /datacenter/host/hwcluster/Resources
  storagePolicyName: ""
  users:
  - name: ec2-user
    sshAuthorizedKeys:
    - ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAACAQDGidVzdPHSLPNq7i4+r1AD2bfAQmEC8NmZM1V0vN7jMIW2QZSflL2LrCpGk0969FHesOUTM1x61B5oYepsLjYgSKDC2mNxIg2jZONPYCg30fxE5vOxWUJObCGuc4trKfz9DLPx7+C3fGgXQaFmnugMgRbqYurdrr8HDeXsavwN361x/MesKpY4E26SBt/RG/sZEssVnzeIPbM8S9LDOX62znFYIXRlgmmx9un68TqQpMti6CnIWUlYwx90MJkV0avL5BeSg9ex3JxYH1THQw3tcj5gyh9GY9yWVxXA7bs3wh5vd8JAJEtPpeqaafRaqXfBFWzC3/L21GxVCwgvGAjovhdDGk3vn6PNRKf4b1MydHnVK7/lZnpNpenDYCszSEebkS5joqehpkaJ4eED1ACvJeh/0urupu47RMN6DcwLUR7j3o7sxcXZK31lecgogC7yvC5eZGK/B6rwHyV3xX7KaVcfabJJeiiJgrb2cKesiKDFgR8DlQ+sUrdwUIcsxsoOskYZJQuvH/h2Gi7lZv71uABnQLvcAeF6OSj7vnrsQ7oUKdcJhAfoRdJCOEt1PtgyDfe2WJ9gH3KRbuHxnNVyQKNZaI5OtEPCxlPIyXbGQnsTwZ1AiWj/RYbj3DP3aCM3Iu7Lg7z/dVGSnRfWJk0zdcZekGch0O43H0EX7611kQ==
---
This also leaves all the created VMs behind, and the cluster ends up in a state where it is neither ready nor deletable with eksctl, so all we can do is manually stop and delete each VM... |
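One cheap thing to rule out before another deploy attempt is a dangling reference inside the config itself. The sketch below is only an illustration: the names are copied by hand from the config above, the helper `missing_machine_configs` is hypothetical, and none of this is part of the EKS-A tooling or explains the vSphere error.

```python
# Names copied by hand from the config above; nothing here talks to EKS-A or vSphere.
defined_machine_configs = {"awsemu-cp", "awsemu", "awsemu-etcd"}

# Where each machineGroupRef appears in the Cluster spec, and the name it points at.
machine_group_refs = {
    "controlPlaneConfiguration": "awsemu-cp",
    "externalEtcdConfiguration": "awsemu-etcd",
    "workerNodeGroupConfigurations[md-0]": "awsemu",
}


def missing_machine_configs(refs, defined):
    """Return the refs whose target VSphereMachineConfig is not defined (hypothetical helper)."""
    return {where: name for where, name in refs.items() if name not in defined}


if __name__ == "__main__":
    missing = missing_machine_configs(machine_group_refs, defined_machine_configs)
    if missing:
        for where, name in missing.items():
            print(f"{where} references undefined VSphereMachineConfig {name!r}")
    else:
        print("all machineGroupRef names resolve")
```

For the config as posted, every reference resolves, so the intermittent failure is presumably not caused by a naming mismatch.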
@galvesribeiro Can you try fullClone instead of linkedClone? Also, the CP node IP address is set to "172.16.1.1"? Is that your VLAN gateway IP? And does your EKS-A VLAN have access to your vCenter API endpoint? I |
Full clone is what was causing vSphere to fail with that message, as you can see in the picture (A specified parameter was not correct: spec.config.deviceChange[0].operation). I was only able to get past it and deploy the VMs with
No. The network is:
Yep. vCenter is 192.168.8.12, which is routable through the 172.16.0.1 gateway. |
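The addresses discussed above can be checked offline with Python's standard `ipaddress` module. This is a minimal sketch, not an EKS-A feature: the helper names `cidr_conflicts` and `same_subnet` are hypothetical, and the prefix length of the workload network is not stated in the thread, so it is left as a parameter you have to supply.

```python
import ipaddress

# Values copied from the thread and the config.
endpoint = ipaddress.ip_address("172.16.1.1")    # controlPlaneConfiguration.endpoint.host
gateway = ipaddress.ip_address("172.16.0.1")     # gateway mentioned in the reply
vcenter = ipaddress.ip_address("192.168.8.12")   # VSphereDatacenterConfig.server

cluster_cidrs = {
    "pods": ipaddress.ip_network("172.18.0.0/16"),
    "services": ipaddress.ip_network("10.96.0.0/12"),
}


def cidr_conflicts(address, cidrs):
    """Return the names of the cluster CIDRs that contain the given address."""
    return [name for name, net in cidrs.items() if address in net]


def same_subnet(a, b, prefix_len):
    """True if a and b share a subnet of the given (assumed) prefix length."""
    net = ipaddress.ip_network(f"{a}/{prefix_len}", strict=False)
    return b in net


if __name__ == "__main__":
    for label, addr in (("endpoint", endpoint), ("gateway", gateway), ("vcenter", vcenter)):
        print(label, addr, "conflicts with:", cidr_conflicts(addr, cluster_cidrs))
    # The prefix length is an assumption; replace 16 with the real mask of the workload network.
    print("endpoint and gateway share a /16:", same_subnet(endpoint, gateway, 16))
```

None of the three addresses falls inside the pod or service CIDRs. Whether the endpoint and gateway sit on the same subnet depends on the mask: they do under a /16, but would not under a /24.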
What happened:
Unable to deploy EKS-A on ESXi 8 U1
What you expected to happen:
The initial cluster to be deployed
How to reproduce it (as minimally and precisely as possible):
Just follow the process from the documentation to deploy the initial cluster.
When it tries to deploy the first etcd VM from the templates, the VM is created, but shortly after creation it is removed and I see the following error:
I've tried multiple BR versions from 1.26 to 1.29 and all of them fail. I also tried on two completely separate ESXi/vSphere clusters with the same results.
Environment:
Latest EKS-A CLI (from brew) on macOS Sonoma (fully updated) deploying to ESXi/vCenter/vSAN 8U1.