
versions: Upgrade to Cloud Hypervisor v39.0 #9575

Open: likebreath wants to merge 2 commits into main
Conversation

likebreath
Contributor

This patch upgrades Cloud Hypervisor from v36.0 to v39.0, which contains
fixes for several security advisories from dependencies. Details can be
found in #9574.

Fixes: #8694, #9574

Signed-off-by: Bo Chen <chen.bo@intel.com>

@likebreath
Contributor Author

/test

@katacontainersbot added the size/huge label (Largest and most complex task, probably needs breaking into small pieces) on Apr 30, 2024
@likebreath
Contributor Author

@jodh-intel As we discussed offline, let's move Cloud Hypervisor to the latest release so that it will have the fixes for several security advisories from its dependencies. We will make sure the Cloud Hypervisor-related CI workers pass for both the golang and rust runtimes, particularly the ones recently added in #9525. Meanwhile, we will follow up with additional PRs to tackle #9522.

@jodh-intel
Contributor

Thanks @likebreath. Fwics, the API changes are simply additions, and since the rust definition of CH's VmConfig doesn't have (or need) them yet, those objects won't be serialised, so they will default to whatever defaults CH uses itself.

Speaking of defaults, I did notice one potential issue though: openapi.yaml is supposed to be identical to cloud-hypervisor.yaml with the addition of examples only. In other words:

openapi.yaml - examples = cloud-hypervisor.yaml

However, after sorting the files, I did notice a couple of differences: cloud-hypervisor.yaml specifies the following, neither of which is present in openapi.yaml:

MemoryConfig.properties.size.default = 512MB
MemoryZoneConfig.properties.size.default = 512MB
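
For reference, a minimal sketch of that sorted comparison, assuming the Go-based yq (v4) CLI is available and using the two file names as referenced above:

# Normalize both specs by recursively sorting keys, then diff them; ideally the
# only remaining hunks would be the added "example:" entries in openapi.yaml.
yq -P 'sort_keys(..)' cloud-hypervisor.yaml > /tmp/ch-sorted.yaml
yq -P 'sort_keys(..)' openapi.yaml > /tmp/openapi-sorted.yaml
diff -u /tmp/ch-sorted.yaml /tmp/openapi-sorted.yaml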

@likebreath
Contributor Author

> Thanks @likebreath. Fwics, the API changes are simply additions, and since the rust definition of CH's VmConfig doesn't have (or need) them yet, those objects won't be serialised, so they will default to whatever defaults CH uses itself.

I agree. We can definitely take a "lazy evaluation" approach here and add them only when they are needed. This also keeps the VmConfig we carry in the rust runtime simpler, which is better, IMHO.

> Speaking of defaults, I did notice one potential issue though: openapi.yaml is supposed to be identical to cloud-hypervisor.yaml with the addition of examples only. In other words:
>
> openapi.yaml - examples = cloud-hypervisor.yaml
>
> However, after sorting the files, I did notice a couple of differences: cloud-hypervisor.yaml specifies the following, neither of which is present in openapi.yaml:
>
> MemoryConfig.properties.size.default = 512MB
> MemoryZoneConfig.properties.size.default = 512MB

These are good catches. I took a closer look and found they are caused by inconsistencies in the openapi spec in Cloud Hypervisor, e.g. default values that do not make sense for required fields. Thanks for reporting it. I just sent a PR to fix it on the CH side: cloud-hypervisor/cloud-hypervisor#6431.

@likebreath
Contributor Author

@jodh-intel @amshinde While most CI workers are working fine (including the ones on runtime-rs), a set of tests, e.g. run-k8s-tests (cbl-mariner, clh, *), is constantly failing for the reason below. It looks like the API endpoint is not available (i.e. Cloud Hypervisor was not launched successfully). Does it ring any bells for you? Thank you.

# Warning FailedCreatePodSandBox 7s (x7 over 89s) kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = failed to create containerd task: failed to create shim task: error: Put "http://localhost/api/v1/vm.boot": EOF reason: : unknown

Logs: https://github.com/kata-containers/kata-containers/actions/runs/8899253970/job/24474391384?pr=9575
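
If it helps narrow this down, a rough sketch of probing the CH API socket on an affected node; the sandbox ID is a placeholder, and the socket path follows the pattern that shows up in later logs:

# The real sandbox ID appears in the kubelet/containerd events.
sock=/run/vc/vm/<sandbox-id>/clh-api.sock
ls -l "$sock"
# A healthy cloud-hypervisor answers vmm.ping; a missing socket or an immediate
# EOF points at the CH process dying rather than an API mismatch.
curl --unix-socket "$sock" -i -X GET 'http://localhost/api/v1/vmm.ping'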

@danmihai1
Contributor

> @jodh-intel @amshinde While most CI workers are working fine (including the ones on runtime-rs), a set of tests, e.g. run-k8s-tests (cbl-mariner, clh, *), is constantly failing for the reason below. It looks like the API endpoint is not available (i.e. Cloud Hypervisor was not launched successfully). Does it ring any bells for you? Thank you.
>
> # Warning FailedCreatePodSandBox 7s (x7 over 89s) kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = failed to create containerd task: failed to create shim task: error: Put "http://localhost/api/v1/vm.boot": EOF reason: : unknown
>
> Logs: https://github.com/kata-containers/kata-containers/actions/runs/8899253970/job/24474391384?pr=9575

@likebreath, sharing this just in case you don't already know: the cbl-mariner Host VMs are using /dev/mshv instead of /dev/kvm. Let me know in case you need help with these Hosts.

@likebreath
Contributor Author

> @jodh-intel @amshinde While most CI workers are working fine (including the ones on runtime-rs), a set of tests, e.g. run-k8s-tests (cbl-mariner, clh, *), is constantly failing for the reason below. It looks like the API endpoint is not available (i.e. Cloud Hypervisor was not launched successfully). Does it ring any bells for you? Thank you.
> # Warning FailedCreatePodSandBox 7s (x7 over 89s) kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = failed to create containerd task: failed to create shim task: error: Put "http://localhost/api/v1/vm.boot": EOF reason: : unknown
> Logs: https://github.com/kata-containers/kata-containers/actions/runs/8899253970/job/24474391384?pr=9575

> @likebreath, sharing this just in case you don't already know: the cbl-mariner Host VMs are using /dev/mshv instead of /dev/kvm. Let me know in case you need help with these Hosts.

@danmihai1 That's very helpful context. Given these tests are constantly failing, we need to look into them manually. Are you able to help with setting up an environment to reproduce the errors? Or is there a better person (say the maintainer of these CI workers) I should talk with? Thank you.

@danmihai1
Contributor

@likebreath, these are AKS Nodes/Kata Hosts that CI is dynamically provisioning, so setting up one of them is not trivial. I will add my hacky script below to simulate what Kata CI is doing, just in case anyone is brave enough to try these steps.

But either @sprt or I should be able to try your changes in our clusters if you want.

On the CI cluster that I have just created, it appears that the problem took place during the cl.BootVM() call:

containerd[14174]: time="2024-05-01T20:39:50.201387897Z" level=debug msg="Booting VM" name=containerd-shim-v2 pid=49895 
virtiofsd[49904]: Client connected, servicing requests
kernel: mshv_vp_ioctl: invalid ioctl: 0xc010b80a
systemd[1]: Started Process Core Dump (PID 49916/UID 0).
...
Process 49905 (cloud-hyperviso) of user 0 dumped core.
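
For anyone poking at one of these hosts, a few illustrative checks (nothing Kata-specific; coredumpctl assumes systemd-coredump is in use, as the journal entry above suggests):

ls -l /dev/mshv /dev/kvm    # confirm which hypervisor device the host exposes
uname -r                    # host kernel version, to compare against what CLH v39 expects
coredumpctl list            # locate the cloud-hypervisor crash entry
coredumpctl info <PID>      # then inspect it by PID (placeholder)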

@danmihai1
Contributor

These were the hacky steps I used to simulate the Kata CI set-up:

export AZ_RG=<Some resource group name>

export GH_PR_NUMBER=9575 && \
export DOCKER_TAG=9575-12492b48612e6da1950e1c05986ed6180b27f8a1-amd64 && \
export DOCKER_REGISTRY=ghcr.io && \
export DOCKER_REPO=kata-containers/kata-deploy-ci && \
export K8S_TEST_HOST_TYPE=normal && \
export KATA_HYPERVISOR=clh && \
export KATA_HOST_OS=cbl-mariner && \
export KBS_INGRESS=aks && \
export KUBERNETES=vanilla && \
export USING_NFD=false

az extension add --allow-preview true --name aks-preview

az login --use-device-code

az group create --name "${AZ_RG}" --location eastus

export GOROOT=/usr/local/go && export GOPATH=$HOME/gopath && export PATH=$GOPATH/bin:$GOROOT/bin:$PATH

cd kata-containers/tests/integration/kubernetes/

git reset HEAD --hard

bash gha-run.sh create-cluster

Get your cluster name from your Azure Resource Group UI. It looks similar to: k8s-9566-f04a7a55ed9b-clh-cbl-mariner-amd64-n-o

Then:

my_cluster_name=<your cluster name>

az aks get-credentials --resource-group "${AZ_RG}" --name ${my_cluster_name} --overwrite-existing

bash gha-run.sh deploy-kata-aks

pushd kata-containers/src/tools/genpolicy
make LIBC=gnu
sudo -s
umask 0022
mkdir -p /opt/kata/bin/ && \
cp target/x86_64-unknown-linux-gnu/release/genpolicy /opt/kata/bin/ && \
chmod a+rx /opt/kata/bin/genpolicy

mkdir -p /opt/kata/share/defaults/kata-containers/ && \
cp ./rules.rego /opt/kata/share/defaults/kata-containers/ && \
chmod a+r /opt/kata/share/defaults/kata-containers/rules.rego && \
cp ./genpolicy-settings.json /opt/kata/share/defaults/kata-containers/ && \
chmod a+r /opt/kata/share/defaults/kata-containers/genpolicy-settings.json
exit
popd

K8S_TEST_UNION="k8s-exec.bats" bash gha-run.sh run-tests

@likebreath
Contributor Author

> On the CI cluster that I have just created, it appears that the problem took place during the cl.BootVM() call:
>
> containerd[14174]: time="2024-05-01T20:39:50.201387897Z" level=debug msg="Booting VM" name=containerd-shim-v2 pid=49895
> virtiofsd[49904]: Client connected, servicing requests
> kernel: mshv_vp_ioctl: invalid ioctl: 0xc010b80a
> systemd[1]: Started Process Core Dump (PID 49916/UID 0).
> ...
> Process 49905 (cloud-hyperviso) of user 0 dumped core.

@danmihai1 Thank you for reproducing it locally. This is a crucial step. It suggests that the VM created by Cloud Hypervisor failed to boot due to an invalid mshv_vp_ioctl call.

Since I don't have access to the reproducing environment, would you or @sprt please help with debugging?

First, I would check if this error is reproducible with Cloud Hypervisor v39.0 by itself (i.e. without Kata involved). You can simply try to launch a VM with Cloud Hypervisor in the CI environment. In case you need a reference, you can take a look at this: https://github.com/cloud-hypervisor/cloud-hypervisor?tab=readme-ov-file#firmware-booting
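
For reference, the standalone launch could look roughly like the firmware-booting example from that README; a minimal sketch, where the firmware and disk image paths are placeholders:

./cloud-hypervisor \
    --kernel ./hypervisor-fw \
    --disk path=focal-server-cloudimg-amd64.raw \
    --cpus boot=1 \
    --memory size=1024M \
    --net "tap=,mac=,ip=,mask="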

If you can, it will be much easier to bisect and locate the breaking commit for further investigation outside of Kata's environment. Otherwise, we will need to bisect with Kata. This is still doable, as I think the Cloud Hypervisor binary is easily swappable when using Kata to reproduce the failed tests.
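
A minimal sketch of that swap on a test node, assuming the usual kata-deploy layout (paths may differ on these hosts):

# Drop a candidate cloud-hypervisor build over the one kata-deploy installed,
# then re-run the failing test; repeat while bisecting CH between v36.0 and v39.0.
sudo install -m 0755 ./cloud-hypervisor /opt/kata/bin/cloud-hypervisor
# Confirm which binary the clh configuration actually points at.
grep '^path' /opt/kata/share/defaults/kata-containers/configuration-clh.toml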

Since this is likely an issue concerning mshv support in Cloud Hypervisor, could any of you help with the debugging here too? @liuw @russell-islam @jinankjain

@liuw

liuw commented May 1, 2024

> On the CI cluster that I have just created, it appears that the problem took place during the cl.BootVM() call:
>
> containerd[14174]: time="2024-05-01T20:39:50.201387897Z" level=debug msg="Booting VM" name=containerd-shim-v2 pid=49895
> virtiofsd[49904]: Client connected, servicing requests
> kernel: mshv_vp_ioctl: invalid ioctl: 0xc010b80a
> systemd[1]: Started Process Core Dump (PID 49916/UID 0).
> ...
> Process 49905 (cloud-hyperviso) of user 0 dumped core.
>
> @danmihai1 Thank you for reproducing it locally. This is a crucial step. It suggests that the VM created by Cloud Hypervisor failed to boot due to an invalid mshv_vp_ioctl call.
>
> Since I don't have access to the reproducing environment, would you or @sprt please help with debugging?
>
> First, I would check if this error is reproducible with Cloud Hypervisor v39.0 by itself (i.e. without Kata involved). You can simply try to launch a VM with Cloud Hypervisor in the CI environment. In case you need a reference, you can take a look at this: https://github.com/cloud-hypervisor/cloud-hypervisor?tab=readme-ov-file#firmware-booting
>
> If you can, it will be much easier to bisect and locate the breaking commit for further investigation outside of Kata's environment. Otherwise, we will need to bisect with Kata. This is still doable, as I think the Cloud Hypervisor binary is easily swappable when using Kata to reproduce the failed tests.
>
> Since this is likely an issue concerning mshv support in Cloud Hypervisor, could any of you help with the debugging here too? @liuw @russell-islam @jinankjain

@danmihai1 is this blocking? Perhaps we should upgrade the test host? The breakage could be due to our recent changes in the kernel IOCTL interfaces. CC @NunoDasNeves

@danmihai1
Contributor

@liuw @NunoDasNeves yes, it is blocking. I will ping you tomorrow to make a plan together.

@NunoDasNeves

NunoDasNeves commented May 2, 2024

@danmihai1 yes, it is probably due to cloud-hypervisor/cloud-hypervisor#6339.
Likely the kernel is out of sync with CLH.

@danmihai1
Contributor

@likebreath,

@sprt is experimenting with a short-term workaround in #9588. He'll let us know what he finds. Thanks for your patience!

Separately, we discussed the longer-term fix with @liuw and his team at MSFT - basically updating the AKS host kernel images to support newer versions of CLH.

sprt added a commit to sprt/kata-containers that referenced this pull request May 3, 2024
The CH v39 upgrade in kata-containers#9575 is currently blocked because of a bug in the
Mariner host kernel. To address this, we temporarily tweak the Mariner
CI to use an Ubuntu host and the Kata guest kernel, while retaining the
Mariner initrd. This is tracked in kata-containers#9594.

Importantly, this allows us to preserve CI for genpolicy. We had to
tweak the default rules.rego however, as the OCI version is now
different in the Ubuntu host. This is tracked in kata-containers#9593.

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
sprt added a commit to sprt/kata-containers that referenced this pull request May 3, 2024
The CH v39 upgrade in kata-containers#9575 is currently blocked because of a bug in the
Mariner host kernel. To address this, we temporarily tweak the Mariner
CI to use an Ubuntu host and the Kata guest kernel, while retaining the
Mariner initrd. This is tracked in kata-containers#9594.

Importantly, this allows us to preserve CI for genpolicy. We had to
tweak the default rules.rego however, as the OCI version is now
different in the Ubuntu host. This is tracked in kata-containers#9593.

This change has been tested together with CH v39 in kata-containers#9588.

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
@sprt
Contributor

sprt commented May 3, 2024

Hi @likebreath, I implemented the workaround in #9592; once that gets merged, we'll be able to proceed here. We're adapting the Mariner CI to unblock the CH v39 upgrade, and I've tested that that change works with CH v39.

@likebreath
Contributor Author

> Hi @likebreath, I implemented the workaround in #9592; once that gets merged, we'll be able to proceed here. We're adapting the Mariner CI to unblock the CH v39 upgrade, and I've tested that that change works with CH v39.

That's great to know. Thank you for the quick fix.

@likebreath
Contributor Author

With the cbl-mariner tests being fixed, the only relevant test failure now is run-vfio (clh). The test failed for a very similar reason, i.e. the HTTP endpoint is not available, which likely suggests Cloud Hypervisor failed to launch. Also, the test log shows some warnings related to mshv.

@sprt @danmihai1 Does the test-vfio worker also run on top of mshv? Does the error message below ring any bells? Thank you.

2024-05-01T16:32:03.4534012Z May 01 16:31:44 test kata[1692]: time="2024-05-01T16:31:44.866497591Z" level=warning msg="Could not add /dev/mshv to the devices cgroup" name=containerd-shim-v2 pid=1692 sandbox=vfio-guest-kernel-11614 source=cgroups
2024-05-01T16:32:03.4534913Z May 01 16:31:44 test containerd[1533]: time="2024-05-01T16:31:44.866497591Z" level=warning msg="Could not add /dev/mshv to the devices cgroup" name=containerd-shim-v2 pid=1692 sandbox=vfio-guest-kernel-11614 source=cgroups
2024-05-01T16:32:03.4536817Z May 01 16:31:44 test kata[1692]: time="2024-05-01T16:31:44.897132677Z" level=warning msg="clh.VmmPingGet API call failed" error="Get \"http://localhost/api/v1/vmm.ping\": dial unix /run/vc/vm/vfio-guest-kernel-11614/clh-api.sock: connect: no such file or directory" name=containerd-shim-v2 pid=1692 sandbox=vfio-guest-kernel-11614 source=virtcontainers/hypervisor subsystem=cloudHypervisor
2024-05-01T16:32:03.4538560Z May 01 16:31:44 test containerd[1533]: time="2024-05-01T16:31:44.897132677Z" level=warning msg="clh.VmmPingGet API call failed" error="Get \"http://localhost/api/v1/vmm.ping\": dial unix /run/vc/vm/vfio-guest-kernel-11614/clh-api.sock: connect: no such file or directory" name=containerd-shim-v2 pid=1692 sandbox=vfio-guest-kernel-11614 source=virtcontainers/hypervisor subsystem=cloudHypervisor
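
If someone has access to the run-vfio node, a couple of illustrative commands for pulling more context (the syslog identifiers and sandbox name come from the log lines above; the time window is approximate):

# Shim and containerd messages around the failure:
journalctl -t kata -t containerd --since '2024-05-01 16:31' --until '2024-05-01 16:33'
# Was the CH API socket ever created for this sandbox?
ls -l /run/vc/vm/vfio-guest-kernel-11614/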

@sprt
Contributor

sprt commented May 6, 2024

@likebreath Once #9592 is merged, you'll have to rebase this PR on top of it, as right now this PR is still testing code without the fix.

The test-vfio test is passing in my PR - the mshv warning is expected.

Note there's a failing Jenkins test that's currently blocking #9592 and other PRs; I'm looking into it.

@likebreath
Contributor Author

> @likebreath Once #9592 is merged, you'll have to rebase this PR on top of it, as right now this PR is still testing code without the fix.
>
> Note there's a failing Jenkins test that's currently blocking #9592 and other PRs; I'm looking into it.

Understood. I will rebase once your PR lands.

> The test-vfio test is passing in my PR - the mshv warning is expected.

Yes, I saw that too. The test-vfio test is only failing on this PR, i.e. with CH v39.0. And it is failing with similar behavior to what we observed in the cbl-mariner tests, i.e. the HTTP endpoint is not available, which suggests the CH process likely failed to launch.

@sprt I am assuming you mean the test_vfio failure is not related to mshv, correct? If it is clear this is a separate issue, we can look into it now. Do you know who the maintainer of the test_vfio test is?

@sprt
Contributor

sprt commented May 6, 2024

It doesn't seem to be directly related to this PR; I see it failing in #9583 and #9598 as well. Rerunning in case it's a transient issue.

@lifupan marked this pull request as ready for review May 7, 2024 04:06
@sprt
Contributor

sprt commented May 7, 2024

@likebreath Could you rebase your PR on top of main? The Mariner fix was just merged.

This patch upgrades Cloud Hypervisor from v36.0 to v39.0, which contains
fixes for several security advisories from dependencies. Details can be
found in kata-containers#9574.

Fixes: kata-containers#8694, kata-containers#9574

Signed-off-by: Bo Chen <chen.bo@intel.com>

This patch re-generates the client code for Cloud Hypervisor v39.0.
Note: The client code of cloud-hypervisor's OpenAPI is automatically
generated by openapi-generator.

Fixes: kata-containers#8694, kata-containers#9574

Signed-off-by: Bo Chen <chen.bo@intel.com>
@likebreath
Contributor Author

I looked into the test-vfio failure and found it is related to a recent change on the Cloud Hypervisor side regarding vfio: cloud-hypervisor/cloud-hypervisor#6175. (Note that Cloud Hypervisor CI has its own vfio integration test, and that works fine.)

The real reason for the test-vfio failure in this PR is:

2024-05-08T11:15:27.8069926Z May 08 11:15:16 test kata[1700]: time="2024-05-08T11:15:16.39235711Z" level=error msg="failed to hotplug VFIO device" error="Failed to hotplug device &{ID:vfio-da08bad4f90d65dc0 BDF:0000:00:03.0 SysfsDev:/sys/bus/pci/devices/0000:00:03.0 VendorID: DeviceID: Class:0x020000 Bus: GuestPciPath: Type:1 IsPCIe:false APDevices:[] Rank:-1 Port:root-port} error: 500  reason: Error from device manager: VfioMapRegion(DmaMap(IommuDmaMap(Error(22))))" name=containerd-shim-v2 pid=1700 sandbox=vfio-guest-kernel-18252 source=virtcontainers subsystem=sandbox vfio-device-BDF="0000:00:03.0" vfio-device-ID=vfio-da08bad4f90d65dc0

This error is reported while hot-plugging a VFIO device to Cloud Hypervisor [1]. We need to reproduce and debug this offline, as we need to collect more logs to understand the issue further.

Any good points of contact for setting up a reproducing environment for the test-vfio worker? Thanks.

[1] here: https://github.com/cloud-hypervisor/cloud-hypervisor/blob/7e25cc2aa0b7a34a5f42a4edb2c5477ddcfa2381/pci/src/vfio.rs#L1657-L1665
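
To reproduce outside of Kata, a rough sketch of the same hot-plug path, assuming a standalone CH VM started with an API socket and the ch-remote tool shipped with the CH release; the BDF is simply the one from the failing log and the socket path is a placeholder:

# Bind the device to vfio-pci on the host first (run as root).
echo 0000:00:03.0 > /sys/bus/pci/devices/0000:00:03.0/driver/unbind
echo vfio-pci > /sys/bus/pci/devices/0000:00:03.0/driver_override
echo 0000:00:03.0 > /sys/bus/pci/drivers_probe
# Hot-plug it into the running CH VM; a failure here should surface the same
# VfioMapRegion/DmaMap error without Kata in the picture.
ch-remote --api-socket /tmp/ch.sock add-device path=/sys/bus/pci/devices/0000:00:03.0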

/cc @amshinde @jodh-intel @chavafg
