Enabling cgroup v2 in Guest Containers #9555

Open
haswelliris opened this issue Apr 25, 2024 · 7 comments
Labels
question Requires an answer

Comments

@haswelliris

As this doc and this issue say: set agent.unified_cgroup_hierarchy to 1 or true to enable cgroups v2 in the guest.
I've added agent.unified_cgroup_hierarchy=true to the kernel command line in Kata's config file. However, when I run ls -la /sys/fs/cgroup/, the directory is empty, which suggests that cgroup v2 is not enabled:

# ls -la /sys/fs/cgroup/
total 0
dr-xr-xr-x 2 root root 0 Apr 25 10:41 .
drwxr-xr-x 9 root root 0 Apr 25 10:41 ..

At the same time, the output of /proc/filesystems indicates that the kernel does support cgroup v2:

# grep cgroup /proc/filesystems
nodev   cgroup
nodev   cgroup2

This leads me to believe there might be some confusion regarding the usage of cgroup v2.

My question is: How can I enable cgroup v2 within a guest container so that I can manage cgroups as if I were operating in a genuine VM?
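
The two checks above answer different questions: /proc/filesystems lists what the kernel supports, while the mount table shows what is actually mounted at /sys/fs/cgroup in the current mount namespace. A minimal sketch of a check that separates the two cases follows. The function name cgroup_mode and its optional file argument are hypothetical helpers for illustration, not part of Kata or containerd.

```shell
#!/bin/sh
# cgroup_mode [MOUNTS_FILE]
# Classify the cgroup setup from a mounts table (defaults to /proc/mounts).
# Prints "v2" if a cgroup2 filesystem is mounted at /sys/fs/cgroup,
# "v1" if only legacy cgroup mounts exist, and "none" otherwise.
cgroup_mode() {
    mounts_file="${1:-/proc/mounts}"
    if awk '$2 == "/sys/fs/cgroup" && $3 == "cgroup2" { found = 1 }
            END { exit !found }' "$mounts_file"; then
        echo "v2"
    elif awk '$3 == "cgroup" { found = 1 }
              END { exit !found }' "$mounts_file"; then
        echo "v1"
    else
        echo "none"
    fi
}

# Report the state of the current mount namespace.
cgroup_mode
```

A result of "none" together with cgroup2 listed in /proc/filesystems matches the situation described here: the kernel supports cgroup v2, but nothing is mounted at /sys/fs/cgroup inside the container.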

@haswelliris haswelliris added the question Requires an answer label Apr 25, 2024
@lifupan
Member

lifupan commented Apr 26, 2024

Hi @haswelliris

Where did you run the "ls -la /sys/fs/cgroup/" command: in the guest system or in the container? Hi @Apokleos, would you like to take a look at this issue?

@Apokleos
Contributor

Hi @haswelliris, sorry that the cgroup v2 setting confused you.
I think you should read the related issue.

To address your problem, could you please try the setting described in issue #9336:

...
kernel_params = "... systemd.unified_cgroup_hierarchy=true"
...

@haswelliris
Author

@lifupan Thank you for your response. My objective is to utilize cgroupv2 to regulate the resource usage of subprocesses within my code, inside Kata's containers. Essentially, I intend to employ Kata as a container runtime to enhance security.

For clarity, my host operating system is Ubuntu 22.04 with cgroup v2, containerd (v1.7.2), and kata-runtime (3.4.0). I am trying to modify the cgroup v2 files under /sys/fs/cgroup/ inside the Kata container, but there are no files in /sys/fs/cgroup/ there. Please note that I am not passing the host OS cgroup path to the container. I would like cgroups in the container to behave as they would inside a virtual machine.

@Apokleos Thanks for the suggestion. I've updated /opt/kata/share/defaults/kata-containers/configuration-qemu.toml, adding kernel_params = " systemd.unified_cgroup_hierarchy=true" and sandbox_cgroup_only=true. Then I ran a Kata container and checked its dmesg:

[    0.000000] Command line: tsc=reliable no_timer_check rcupdate.rcu_expedited=1 i8042.direct=1 i8042.dumbkbd=1 i8042.nopnp=1 i8042.noaux=1 noreplace-smp reboot=k cryptomgr.notests net.ifnames=0 pci=lastbus=0 root=/dev/pmem0p1 rootflags=dax,data=ordered,errors=remount-ro ro rootfstype=ext4 console=hvc0 console=hvc1 quiet systemd.show_status=false panic=1 nr_cpus=240 selinux=0 systemd.unit=kata-containers.target systemd.mask=systemd-networkd.service systemd.mask=systemd-networkd.socket scsi_mod.scan=none systemd.unified_cgroup_hierarchy=true
[    0.058619] Kernel command line: tsc=reliable no_timer_check rcupdate.rcu_expedited=1 i8042.direct=1 i8042.dumbkbd=1 i8042.nopnp=1 i8042.noaux=1 noreplace-smp reboot=k cryptomgr.notests net.ifnames=0 pci=lastbus=0 root=/dev/pmem0p1 rootflags=dax,data=ordered,errors=remount-ro ro rootfstype=ext4 console=hvc0 console=hvc1 quiet systemd.show_status=false panic=1 nr_cpus=240 selinux=0 systemd.unit=kata-containers.target systemd.mask=systemd-networkd.service systemd.mask=systemd-networkd.socket scsi_mod.scan=none systemd.unified_cgroup_hierarchy=true

This indicates that systemd.unified_cgroup_hierarchy=true is set. However, inside the container I cannot find any cgroup v2 files. Running the container with or without privileges does not affect the result. Any insights on this behavior would be greatly appreciated.
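
Reading /proc/cmdline is a lighter way to confirm the parameter than scanning dmesg. This only shows that the parameter reached the guest kernel. As far as I understand, systemd.unified_cgroup_hierarchy is consumed by systemd in the guest and does not by itself mount anything inside the container's mount namespace, so this is a sketch of the check rather than an explanation of the empty directory. The helper name has_kernel_param is hypothetical.

```shell
#!/bin/sh
# has_kernel_param PARAM [CMDLINE_FILE]
# Succeeds if PARAM appears as a complete, space-separated entry on the
# kernel command line (defaults to /proc/cmdline).
has_kernel_param() {
    param="$1"
    cmdline_file="${2:-/proc/cmdline}"
    # Put each parameter on its own line, then require an exact line match.
    tr ' ' '\n' < "$cmdline_file" | grep -qxF -- "$param"
}

if has_kernel_param "systemd.unified_cgroup_hierarchy=true"; then
    echo "parameter is present on the kernel command line"
else
    echo "parameter is not present on the kernel command line"
fi
```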

@Apokleos
Contributor

However, inside the container I cannot find any cgroup v2 files.

Could you give more detail about what "cannot find any cgroup v2 files" means? What exactly do you see?

@haswelliris
Author

haswelliris commented Apr 28, 2024

Could you give more detail about what "cannot find any cgroup v2 files" means? What exactly do you see?

Normally, on an OS that has cgroup v2 enabled (whether a host node or a VM), mount | grep cgroup reveals where cgroup2 is mounted. For instance:

cgroup2 on /sys/fs/cgroup type cgroup2 (rw,nosuid,nodev,noexec,relatime)

However, when I run a Kata container, the same command inside it returns nothing. For example, when running ls -la /sys/fs/cgroup, I get:

total 0
dr-xr-xr-x 2 root root 0 Apr 28 14:45 .
drwxr-xr-x 9 root root 0 Apr 28 14:45 ..

Despite this, the container's /proc/filesystems suggests that cgroup2 support is available, as the output of grep cgroup /proc/filesystems shows:

nodev   cgroup
nodev   cgroup2

This leads me to believe there might be an issue with how systemd mounts the cgroupfs when the container's kernel starts up.

update

Just now, I ran a Kata container with privileged access and executed the following command within the container:

mount -t cgroup2 none /sys/fs/cgroup/

This allowed me to view the cgroup filesystem at /sys/fs/cgroup/. However, I encountered an issue when trying to modify cgroup attributes.
For instance, when I attempted to move the current process (PID 20) into a cgroup in the Kata container:

echo 20 >  /sys/fs/cgroup/main.scope/main.scope/cgroup.procs

I received the error: Operation not supported (os error 95). Could you provide some insights on this?
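
One thing worth ruling out, offered as a possibility and not as a confirmed diagnosis of this report: the kernel's cgroup v2 documentation describes a "domain invalid" state, reported in the cgroup.type file, in which a cgroup cannot be populated, and to my knowledge migrating a process into such a cgroup fails with EOPNOTSUPP (errno 95). The sketch below reads cgroup.type for a target directory. The helper names cgroup_type and can_host_processes are hypothetical, and the code assumes the directory sits on a cgroup2 mount.

```shell
#!/bin/sh
# cgroup_type DIR
# Print the cgroup v2 type of DIR. The root cgroup has no cgroup.type file,
# so a directory without one is reported as "root" (assumes a cgroup2 mount).
cgroup_type() {
    dir="$1"
    if [ -r "$dir/cgroup.type" ]; then
        cat "$dir/cgroup.type"
    elif [ -d "$dir" ]; then
        echo "root"
    else
        echo "not a directory: $dir" >&2
        return 1
    fi
}

# can_host_processes DIR
# Succeeds unless DIR reports the "domain invalid" type.
can_host_processes() {
    [ "$(cgroup_type "$1")" != "domain invalid" ]
}

# Example (path taken from the command above):
#   cgroup_type /sys/fs/cgroup/main.scope/main.scope
```

If the type turns out to be plain "domain", this explanation does not apply and the cause lies elsewhere.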

@Apokleos
Contributor

However, when I run a Kata container, the same command inside it returns nothing. For example, when running ls -la /sys/fs/cgroup, I get:

total 0
dr-xr-xr-x 2 root root 0 Apr 28 14:45 .
drwxr-xr-x 9 root root 0 Apr 28 14:45 ..

My result differs from yours. Both in the guest and in the container, I see the following:

[root@localhost /]# ls /sys/fs/cgroup/
cgroup.controllers  cgroup.max.descendants  cgroup.stat             cgroup.threads  cpuset.cpus.effective  init.scope  memory.numa_stat  memory.stat
cgroup.max.depth    cgroup.procs            cgroup.subtree_control  cpu.stat        cpuset.mems.effective  io.stat     memory.reclaim    system.slice

Despite this, the container's /proc/filesystems suggests that cgroup2 support is available, as the output of grep cgroup /proc/filesystems shows:

nodev   cgroup
nodev   cgroup2

This leads me to believe there might be an issue with how systemd mounts the cgroupfs when the container's kernel starts up.

update

Just now, I ran a Kata container with privileged access and executed the following command within the container:

mount -t cgroup2 none /sys/fs/cgroup/

This allowed me to view the cgroup filesystem at /sys/fs/cgroup/. However, I encountered an issue when trying to modify cgroup attributes. For instance, when I attempted to move the current process (PID 20) into a cgroup in the Kata container:

echo 20 >  /sys/fs/cgroup/main.scope/main.scope/cgroup.procs

I received the error: Operation not supported (os error 95). Could you provide some insights on this?

IMO, you should first work out why the cgroup files are not found. Could you please try another version of Kata (3.3.0)?

@haswelliris
Author

@Apokleos Here are some recent updates:

containerd's ctr command doesn't mount the cgroup filesystem by default, so running mount -t cgroup2 none /sys/fs/cgroup/ allows me to view the cgroupfs.

However, under Kubernetes (for example, when using crictl runp), the cgroup filesystem is mounted with service limits, which makes /sys/fs/cgroup/ read-only.

After unmounting and mounting it again with the following commands, I can access the cgroup filesystem in Kata containers with privileged access:

umount /sys/fs/cgroup
mount -t cgroup2 none /sys/fs/cgroup/
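
The remount steps above can be sketched as a small script that first checks whether /sys/fs/cgroup is mounted read-only. The helper names cgroup_mount_is_ro and remount_cgroup2 are hypothetical. The remount itself needs a privileged container, so the sketch defines it but does not call it.

```shell
#!/bin/sh
# cgroup_mount_is_ro [MOUNTS_FILE]
# Succeeds if the mount at /sys/fs/cgroup carries the "ro" option.
# When several mounts are stacked on that path, the last entry wins.
cgroup_mount_is_ro() {
    mounts_file="${1:-/proc/mounts}"
    awk '$2 == "/sys/fs/cgroup" {
             n = split($4, opts, ",")
             ro = 0
             for (i = 1; i <= n; i++) if (opts[i] == "ro") ro = 1
         }
         END { exit !ro }' "$mounts_file"
}

# remount_cgroup2
# The two commands from the comment above; requires privileged access.
remount_cgroup2() {
    umount /sys/fs/cgroup &&
    mount -t cgroup2 none /sys/fs/cgroup/
}

if cgroup_mount_is_ro; then
    echo "/sys/fs/cgroup is read-only; remount_cgroup2 would be needed"
fi
```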

New problem

I've encountered a new issue: I'm trying to run a runc container within the Kata container, but I'm experiencing an error. The error message is as follows:

Error initializing the container process: unable to start container process: error during container init: read init-p: connection reset by peer

I suspect that the guest rootfs might be too minimal to support running runc.
Could you provide any insights or suggestions that could help resolve this issue? I would greatly appreciate it.


3 participants