runtime-rs: Add pci_hotplug as a config option #9595

Open
wants to merge 1 commit into
base: main
from

Conversation

ananos
Member

@ananos ananos commented May 4, 2024

There are cases where enabling PCI hotplug by default breaks
Dragonball's initial setup [see #9596], resulting in the following
error message:

FATA[0000] failed to create shim task: Others("failed to handle message try
init runtime instance\n\nCaused by:\n    0: init runtime handler\n
1: start sandbox\n    2: start vm\n  3: start vmm instance\n
4: Failed to start vmm\n    5: Failed to start MicroVm\n
6: vmm action error: StartMicroVm(CreateVfioDevice(NoResource))"): unknown

Making PCI hotplug a config option allows a normal boot even on nodes
that lack the functionality enabled by default.

Fixes: #9509

@katacontainersbot katacontainersbot added the size/small Small and simple task label May 4, 2024
@ananos ananos added runtime-rs ok-to-test area/gpu Issues specific to GPU/PCIe labels May 4, 2024
Contributor

@Apokleos Apokleos left a comment

Thx @ananos LGTM!

@Apokleos
Contributor

Apokleos commented May 6, 2024

/test

Member

@studychao studychao left a comment

thanks! a few comments.

# PCI hotplug
# Enable or disable pci_hotplug
# will be set to @DEFPCIHOTPLUG@
pci_hotplug = @DEFPCIHOTPLUG@
Member

Could you add a comment here noting that this is currently a Dragonball-only config, so that others know it won't take effect on other VMMs?

Member Author

ACK, thanks!

Comment on lines +454 to +457
pub pci_hotplug: bool,

Member

Could you add a comment here noting that this is currently a Dragonball-only config, so that others know it won't take effect on other VMMs?

Member Author

ack, thanks!
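
For illustration only, the requested doc comment on the Rust side might look roughly like the sketch below. Only the `pci_hotplug: bool` field comes from this PR; the struct name `DeviceInfo` is a hypothetical stand-in, and the `true` default is an assumption based on the PR description saying hotplug is currently enabled by default.

```rust
/// Hypothetical, trimmed-down stand-in for the struct that holds the field;
/// only `pci_hotplug` is taken from this PR.
#[derive(Debug, Clone, PartialEq)]
pub struct DeviceInfo {
    /// Enable or disable PCI hotplug.
    ///
    /// NOTE: currently a Dragonball-only option. Other VMMs do not read
    /// this value, so setting it has no effect for them.
    pub pci_hotplug: bool,
}

impl Default for DeviceInfo {
    fn default() -> Self {
        // Assumed default: keep the existing behaviour (hotplug enabled).
        DeviceInfo { pci_hotplug: true }
    }
}
```

Keeping the note on the field itself, and not only in the configuration template, means it also shows up in generated API docs.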

There are cases where enabling PCI hotplug by default breaks
Dragonball's initial setup [see kata-containers#9596], resulting in the following
error message:

```
FATA[0000] failed to create shim task: Others("failed to handle message try
init runtime instance\n\nCaused by:\n    0: init runtime handler\n
1: start sandbox\n    2: start vm\n  3: start vmm instance\n
4: Failed to start vmm\n    5: Failed to start MicroVm\n
6: vmm action error: StartMicroVm(CreateVfioDevice(NoResource))"): unknown
```

Adding this as a config option enables normal booting even on nodes that
lack the default functionality.

Fixes: kata-containers#9509

Signed-off-by: Anastassios Nanos <ananos@nubificus.co.uk>
@BbolroC
Member

BbolroC commented May 6, 2024

I am working on the s390x CI failures and will get them green again. Sorry for the inconvenience.


UPDATE: all CI jobs for s390x have passed.

@zvonkok
Contributor

zvonkok commented May 6, 2024

We should try to keep the discrepancies between Dragonball and other VMMs as small as possible, so that we do not introduce too many different config variables. For QEMU/CLH, for example, we have hot_plug_vfio="..." or cold_plug_vfio="...".

@Apokleos
Contributor

Apokleos commented May 6, 2024

/test

@studychao
Member

studychao commented May 7, 2024

We have introduced #9596 to fix the CreateVfioDevice(NoResource) error on the Dragonball side.
So this PR could be closed, because the problem won't appear once #9596 is merged.

@ananos
Member Author

ananos commented May 7, 2024

I tend to agree that we should ditch this PR. My only thought is whether we would be interested in keeping this option for the sake of "faster" booting.

Off the top of my head, skipping the PCI hotplug initialization when the user knows it is not needed could save some time, so it might be useful to some people. Not sure though. Any thoughts on this?
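
To make the idea concrete, here is a minimal sketch of gating the initialization on the flag. Everything in it is hypothetical (`setup_pci_hotplug`, `PciSetup`, `SetupError`); it is not Dragonball's actual startup code, and it says nothing about how much time would really be saved.

```rust
/// Hypothetical outcome of the PCI hotplug setup step during VM start.
#[derive(Debug, PartialEq)]
pub enum PciSetup {
    Initialized,
    Skipped,
}

/// Hypothetical error, named after the `NoResource` failure quoted in the
/// PR description.
#[derive(Debug, PartialEq)]
pub enum SetupError {
    NoResource,
}

/// Runs `init` only when `pci_hotplug` is enabled.
///
/// `init` stands in for whatever work the VMM does to prepare PCI hotplug.
/// When the flag is off, the closure is never called, so neither its cost
/// nor its failure modes apply.
pub fn setup_pci_hotplug(
    pci_hotplug: bool,
    init: impl FnOnce() -> Result<(), SetupError>,
) -> Result<PciSetup, SetupError> {
    if !pci_hotplug {
        return Ok(PciSetup::Skipped);
    }
    init()?;
    Ok(PciSetup::Initialized)
}
```

With the flag off, a node on which the initialization would fail with `NoResource` still gets past this step, which is the behaviour the PR description is after.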

Successfully merging this pull request may close these issues.

runtime-rs: It fails to run kata with dragonball with error message StartMicroVm(CreateVfioDevice(NoResource))
9 participants