
runtime-rs sandbox_cgroup_only is broken #8245

Open
sepich opened this issue Oct 17, 2023 · 1 comment · May be fixed by #9572
Labels: bug (Incorrect behaviour), needs-review (Needs to be assessed by the team), runtime-rs

Comments


sepich commented Oct 17, 2023

Description of problem

Installed the latest v3.2.0-rc0 and started runtime-rs with the default /opt/kata/share/defaults/kata-containers/configuration-dragonball.toml

Expected result

The cgroup would be created in the correct place (and inherit the container memory limit).

Actual result

  1. With sandbox_cgroup_only=false, the container does not start:
E1017 14:20:17.319380  136650 remote_runtime.go:193] "RunPodSandbox from runtime service failed" err="rpc error: code = Unknown desc = failed to create containerd task: failed to create shim task: Others(\"failed to handler message create container\\n\\nCaused by:\\n    0: create\\n    1: using method in wrong cgroup mode.\"): unknown"
  2. With the default value sandbox_cgroup_only=true, the cgroup is created without a memory limit, and in the wrong place:
# systemd-cgls memory

Control group /:
├─kubepods-burstable-pod7432e27c_9134_4c82_9320_a13e0cf65f62.slice:cri-containerd:b3c5a2685e567bb3c9765138079f9e924540a63695a280e7dd6fc0d166f6ef83
│ └─163071 /opt/kata/runtime-rs/bin/containerd-shim-kata-v2 -id b3c5a2685e567bb…
└─kubepods.slice
  ├─kubepods-burstable.slice
  │ ├─kubepods-burstable-podb9f10f16090c87895b1c288d42df1915.slice
  │ │ ├─cri-containerd-220a5de21396daaeab31eecb8acb0ee7025845eb48df5f373f1b3af9f78e1f0d.scope …
  │ │ │ └─2404 /pause
...
  
# cat /sys/fs/cgroup/kubepods-burstable-pod7432e27c_9134_4c82_9320_a13e0cf65f62.slice\:cri-containerd\:b3c5a2685e567bb3c9765138079f9e924540a63695a280e7dd6fc0d166f6ef83/memory.high
max

This contradicts the docs (and how it works for CLH and QEMU):
https://github.com/kata-containers/kata-containers/blob/main/docs/design/host-cgroups.md#sandbox_cgroup_only--true
So, sandbox_cgroup_only is broken for both values.
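The "max" read back from memory.high above is cgroup v2's way of saying "no limit". A minimal sketch (my own helper, not kata-containers code) of how such a limit file can be interpreted:

```rust
// Sketch (hypothetical helper, not kata-containers code): cgroup v2 limit
// files such as memory.high / memory.max contain either a byte count or the
// literal "max", which means no limit is configured.
fn parse_v2_limit(contents: &str) -> Option<u64> {
    let s = contents.trim();
    if s == "max" {
        None // unlimited -- what the broken sandbox cgroup shows
    } else {
        s.parse::<u64>().ok()
    }
}

fn main() {
    // "max" (as in the output above) means the sandbox got no memory limit.
    assert_eq!(parse_v2_limit("max\n"), None);
    // A correctly constrained sandbox would contain a byte value instead.
    assert_eq!(parse_v2_limit("1073741824\n"), Some(1_073_741_824));
    println!("ok");
}
```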

Maybe this is because the runtime is trying to work with cgroup v1, while the host has already been on cgroup v2 (available since kernel v4):

time="2023-10-17T14:18:03.295983994Z" level=error msg="failed to shutdown shim task and the shim might be leaked" error="Others(\"failed to handler message handler request\\n\\nCaused by:\\n    0: do shutdown\\n    1: do the clean up\\n    2: resource clean up\\n    3: delete cgroup\\n    4: unable to read a control group file /sys/fs/cgroup/cgroup.type caused by: Os { code: 2, kind: NotFound, message: \\\"No such file or directory\\\" }\\n    5: No such file or directory (os error 2)\"): unknown" id=b3c5a2685e567bb3c9765138079f9e924540a63695a280e7dd6fc0d166f6ef83
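The shutdown error above shows the runtime probing /sys/fs/cgroup/cgroup.type at a path that does not exist. One common way to tell the two modes apart (a hypothetical sketch, not runtime-rs's actual detection code) is that only the cgroup v2 unified hierarchy exposes a cgroup.controllers file at its mount root:

```rust
use std::fs;
use std::path::Path;

// Hypothetical sketch, not the runtime-rs implementation: the cgroup v2
// unified hierarchy exposes a `cgroup.controllers` file at the mount root,
// while cgroup v1 controller mounts do not.
fn is_cgroup_v2(mount_point: &Path) -> bool {
    mount_point.join("cgroup.controllers").exists()
}

fn main() -> std::io::Result<()> {
    // Demonstrate against a fake mount point so this runs anywhere.
    let root = std::env::temp_dir().join("fake_cgroup_root");
    fs::create_dir_all(&root)?;
    let _ = fs::remove_file(root.join("cgroup.controllers"));
    assert!(!is_cgroup_v2(&root)); // looks like cgroup v1
    fs::write(root.join("cgroup.controllers"), "cpu io memory\n")?;
    assert!(is_cgroup_v2(&root)); // looks like cgroup v2
    println!("ok");
    Ok(())
}
```

On a real host one would pass Path::new("/sys/fs/cgroup") instead of a temp directory.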

Further information

Host kernel: Ubuntu 22.04.3 LTS 5.15.0-85-generic
cgroups: v2
kubelet: cgroupDriver: systemd

@sepich sepich added bug Incorrect behaviour needs-review Needs to be assessed by the team. labels Oct 17, 2023
@katacontainersbot katacontainersbot moved this from To do to In progress in Issue backlog Apr 30, 2024
@Champ-Goblem (Contributor) commented:

I found that the issue is that constrain_hypervisors uses add_task instead of add_task_by_tgid. I have opened a PR to fix the issue; after this change, pods start correctly.
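The practical difference between the two calls comes down to which cgroup control file the id is written to. The following is a hypothetical sketch of that distinction (my own helpers, not the cgroups crate API that runtime-rs uses):

```rust
use std::fs;
use std::path::Path;

// Hypothetical helpers (not the cgroups crate used by runtime-rs),
// illustrating the distinction behind the fix:
//
//   add_task         -> writes a single thread id to cgroup.threads (v2),
//                       which is only valid in a cgroup switched to
//                       "threaded" mode -- hence the
//                       "using method in wrong cgroup mode" error.
//   add_task_by_tgid -> writes the pid to cgroup.procs, moving the whole
//                       thread group, which is what constraining the
//                       hypervisor process needs.
fn move_process(cgroup_dir: &Path, pid: u32) -> std::io::Result<()> {
    fs::write(cgroup_dir.join("cgroup.procs"), pid.to_string())
}

fn move_single_thread(cgroup_dir: &Path, tid: u32) -> std::io::Result<()> {
    fs::write(cgroup_dir.join("cgroup.threads"), tid.to_string())
}

fn main() -> std::io::Result<()> {
    // Use a fake cgroup directory so the sketch runs without privileges.
    let dir = std::env::temp_dir().join("fake_cgroup");
    fs::create_dir_all(&dir)?;
    move_process(&dir, 1234)?;
    move_single_thread(&dir, 1234)?;
    assert_eq!(fs::read_to_string(dir.join("cgroup.procs"))?, "1234");
    assert_eq!(fs::read_to_string(dir.join("cgroup.threads"))?, "1234");
    println!("ok");
    Ok(())
}
```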
