Windows Node SystemUUID #201

Open
rhockenbury opened this issue Jun 15, 2019 · 12 comments
Labels
kind/feature Categorizes issue or PR as related to a new feature. lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete.

Comments

@rhockenbury

Is this a BUG REPORT or FEATURE REQUEST?:

/kind feature

The out-of-tree VCP depends on having systemUUID set on the node object. The systemUUID property is not set on Windows; see kubernetes/kubernetes#75978.

This may prevent the Windows kubelet from starting when VCP is enabled. Although the vSphere CSI won't work on Windows, it would be nice to be able to run the out-of-tree VCP on Windows so that it sets the topology tags.
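
For anyone wanting to confirm whether a node is affected, the value lives at `status.nodeInfo.systemUUID` on the Node object. Below is a minimal sketch, assuming the node JSON came from something like `kubectl get node <name> -o json`. The helper name `system_uuid` and both sample payloads are hypothetical and trimmed to the one field of interest; this is not taken from the cloud provider's code.

```python
import json


def system_uuid(node_json: str):
    """Return status.nodeInfo.systemUUID from a Node object, or None if unset or empty."""
    node = json.loads(node_json)
    uuid = node.get("status", {}).get("nodeInfo", {}).get("systemUUID", "")
    return uuid or None


# Hypothetical, made-up payloads for illustration only.
node_with_uuid = json.dumps(
    {"status": {"nodeInfo": {"systemUUID": "00000000-0000-0000-0000-000000000001"}}}
)
node_without_uuid = json.dumps({"status": {"nodeInfo": {"systemUUID": ""}}})

for label, payload in (("with", node_with_uuid), ("without", node_without_uuid)):
    uuid = system_uuid(payload)
    print(label, "->", uuid if uuid else "systemUUID not set")
```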

@dvonthenen
Contributor

@rhockenbury As soon as that is implemented in k/k for Windows, the vSphere CCM should automatically work without any code modification.

@frapposelli frapposelli added this to the Next milestone Jul 3, 2019
@frapposelli frapposelli added the priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. label Jul 3, 2019
@rhockenbury
Author

Should be resolved with the 1.16 release - kubernetes/kubernetes#80486

@frapposelli
Member

/kind feature
/lifecycle frozen

@k8s-ci-robot k8s-ci-robot added kind/feature Categorizes issue or PR as related to a new feature. lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. labels Aug 5, 2019
@dvonthenen
Contributor

dvonthenen commented Aug 29, 2019

Just a general FYI. The PR that implements Windows UUID was merged upstream... In theory, this should just work using the latest release of the CPI. Unfortunately, I have no means of testing this.

@frapposelli
Member

This looks like a testing effort and currently we have no CI or testing environment with Windows.

@rhockenbury do you know if there's any Windows testing infra we can leverage for this?

@frapposelli frapposelli removed this from the Next milestone Sep 4, 2019
@rhockenbury
Author

I suspect there is. I believe @benmoss had set up verification tests for Windows nodes on vSphere. Hopefully he can chime in on the state of this, what could be leveraged, and the level of effort.

On a related note, there's also been a lot of recent work on the Windows csi-proxy, which would enable running the vsphere-csi-driver on Windows. I'm unsure what test infra (if any) has been proposed for testing that.

It certainly feels like it could be beneficial for both sig-windows and sig-vmware to discuss what's needed going forward to run tests on Windows for the vSphere cloud providers (in-tree, out-of-tree), the vsphere-csi-driver, and the Windows csi-proxy. @PatrickLang and @michmike are probably the best people to help coordinate this effort.

@benmoss
Member

benmoss commented Sep 4, 2019

I have a CI system that deploys a cluster with a Windows node and runs conformance tests against it; the results are posted here. The CI isn't accessible from the public internet and isn't easily portable: it's a rather complicated setup that uses BOSH and kubo to deploy the cluster.

I'm open to ideas on how we could make something more portable, but I don't know much about how you currently test vSphere functionality.

@dvonthenen
Contributor

I have no idea what I am looking at other than a lot of red 🙃

A couple of things I would look for in the logs: that the node name and the internal/external hostnames/IPs are being set correctly and, if zones are being used, that the zone/region labels are being populated properly.
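
The same checks can also be made against the Node object itself rather than the logs. Here is a rough sketch, assuming the node has been loaded as a dict (for example from `kubectl get node <name> -o json`). The helper `node_problems`, the sample nodes, and the choice of which address types to require are all hypothetical. The label keys are the well-known Kubernetes zone/region labels, both the older `failure-domain.beta.kubernetes.io/*` form and the newer `topology.kubernetes.io/*` form; which of them a given CPI release sets is an assumption to verify.

```python
ZONE_LABELS = (
    "topology.kubernetes.io/zone",
    "failure-domain.beta.kubernetes.io/zone",
)
REGION_LABELS = (
    "topology.kubernetes.io/region",
    "failure-domain.beta.kubernetes.io/region",
)


def node_problems(node, required_types=("Hostname", "InternalIP"), expect_zones=False):
    """Return a list of human-readable problems found on a Node object (dict)."""
    problems = []
    meta = node.get("metadata", {})
    status = node.get("status", {})

    if not meta.get("name"):
        problems.append("node name is empty")

    # status.addresses is a list of {"type": ..., "address": ...} entries.
    by_type = {a.get("type"): a.get("address") for a in status.get("addresses", [])}
    for addr_type in required_types:
        if not by_type.get(addr_type):
            problems.append(f"missing {addr_type} address")

    if expect_zones:
        labels = meta.get("labels", {})
        if not any(labels.get(key) for key in ZONE_LABELS):
            problems.append("missing zone label")
        if not any(labels.get(key) for key in REGION_LABELS):
            problems.append("missing region label")

    return problems


# Hypothetical sample nodes, made up for illustration.
good_node = {
    "metadata": {
        "name": "win-node-1",
        "labels": {
            "failure-domain.beta.kubernetes.io/zone": "zone-a",
            "failure-domain.beta.kubernetes.io/region": "region-1",
        },
    },
    "status": {
        "addresses": [
            {"type": "Hostname", "address": "win-node-1"},
            {"type": "InternalIP", "address": "10.0.0.5"},
        ]
    },
}
bad_node = {
    "metadata": {"name": "win-node-2"},
    "status": {"addresses": [{"type": "Hostname", "address": "win-node-2"}]},
}

print(node_problems(good_node, expect_zones=True))
print(node_problems(bad_node, expect_zones=True))
```

Pass `required_types=("Hostname", "InternalIP", "ExternalIP")` if the environment is expected to report an external IP as well.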

@frapposelli
Member

@benmoss thanks for chiming in. Our current testing rig runs on vSphere on AWS (a.k.a. VMC), which VMware is sponsoring. It is triggered by Prow, and we have a set of presubmits and postsubmits that test the cloud provider functionality end-to-end.

If there is a way for us to trigger presubmit and postsubmit jobs on your infra, we would have a way to test changes on Windows. Do you think that's doable?

@benmoss
Member

benmoss commented Sep 6, 2019

Yeah, the Windows tests tend to be quite flaky in my experience. I haven't had time to debug the problems. I know others have recommended not running tests in parallel to avoid this flakiness, but that's never sat well with me. Even with that, the other builds from Microsoft and Google are still pretty flaky.

My setup isn't through Prow; it's a custom CI pipeline. It'd take some work to get it to build/deploy arbitrary branches/forks. I don't think it's very sustainable to run it this way.

Kubeadm Windows support is going to be in alpha with Kubernetes 1.16, so maybe we can figure out a way to use that as part of the Prow deploys.

@frapposelli
Member

@benmoss was there any progress on getting the tests running through Prow?

@benmoss
Member

benmoss commented Oct 2, 2019

No, I don't have the bandwidth for this right now. I know there is some talk of getting kubeadm test signal for Windows clusters; maybe we can piggyback on that work when it is complete: kubernetes-sigs/sig-windows-tools#14

@dvonthenen dvonthenen added this to the Next milestone Feb 5, 2020
5 participants