
Rootless Docker CDI Injection: error modifying OCI spec: failed to inject CDI devices: unresolvable CDI devices nvidia.com/gpu=all: unknown. #434

Open
LukasIAO opened this issue Mar 31, 2024 · 12 comments

LukasIAO commented Mar 31, 2024

Hello everyone,

We have recently set up a rootless Docker instance alongside our existing Docker on one of our servers, but ran into issues mounting host GPUs into the rootless containers. A workaround was described in issue #85 (toggling no-cgroups to switch between rootful and rootless use), along with a mention of a better solution in the form of NVIDIA CDI, arriving as an experimental feature in Docker 25.

After updating to the newest Docker releases and setting up CDI, our regular Docker instance behaved as we expected based on the documentation, but the rootless instance still runs into issues.

Setup to reproduce:

Distributor ID: Ubuntu
Description:    Ubuntu 22.04.4 LTS
Release:        22.04
Codename:       jammy

NVIDIA Container Toolkit CLI version 1.14.6
commit: 5605d191332dcfeea802c4497360d60a65c7887e

rootless: containerd github.com/containerd/containerd v1.7.13 7c3aca7a610df76212171d200ca3811ff6096eb8
rootful: containerd containerd.io 1.6.28 ae07eda36dd25f8a1b98dfbf587313b99c0190bb
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.161.07             Driver Version: 535.161.07   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA A100-SXM4-40GB          On  | 00000000:01:00.0 Off |                    0 |
| N/A   40C    P0              61W / 275W |      0MiB / 40960MiB |      0%      Default |
|                                         |                      |             Disabled |
+-----------------------------------------+----------------------+----------------------+
|   1  NVIDIA A100-SXM4-40GB          On  | 00000000:47:00.0 Off |                    0 |
| N/A   39C    P0              55W / 275W |      0MiB / 40960MiB |      0%      Default |
|                                         |                      |             Disabled |
+-----------------------------------------+----------------------+----------------------+
|   2  NVIDIA A100-SXM4-40GB          On  | 00000000:81:00.0 Off |                    0 |
| N/A   39C    P0              57W / 275W |      0MiB / 40960MiB |      0%      Default |
|                                         |                      |             Disabled |
+-----------------------------------------+----------------------+----------------------+
|   3  NVIDIA DGX Display             On  | 00000000:C1:00.0 Off |                  N/A |
| 34%   41C    P8              N/A /  50W |      1MiB /  4096MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
|   4  NVIDIA A100-SXM4-40GB          On  | 00000000:C2:00.0 Off |                    0 |
| N/A   39C    P0              58W / 275W |      0MiB / 40960MiB |      0%      Default |
|                                         |                      |             Disabled |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|  No running processes found                                                           |
+---------------------------------------------------------------------------------------+
config.toml:
#accept-nvidia-visible-devices-as-volume-mounts = false
#accept-nvidia-visible-devices-envvar-when-unprivileged = true
disable-require = false
#swarm-resource = "DOCKER_RESOURCE_GPU"

[nvidia-container-cli]
#debug = "/var/log/nvidia-container-toolkit.log"
environment = []
#ldcache = "/etc/ld.so.cache"
ldconfig = "@/sbin/ldconfig.real"
load-kmods = true
#no-cgroups = true
#no-cgroups = false
#path = "/usr/bin/nvidia-container-cli"
#root = "/run/nvidia/driver"
#user = "root:video"

[nvidia-container-runtime]
#debug = "/var/log/nvidia-container-runtime.log"
log-level = "info"
mode = "auto"
runtimes = ["docker-runc", "runc"]

[nvidia-container-runtime.modes]

[nvidia-container-runtime.modes.csv]
mount-spec-path = "/etc/nvidia-container-runtime/host-files-for-container.d"
  • sudo nvidia-ctk cdi generate --output=/etc/cdi/nvidia.yaml
nvidia.yaml:
cdiVersion: 0.5.0
containerEdits:
  deviceNodes:
  - path: /dev/nvidia-modeset
  - path: /dev/nvidia-uvm
  - path: /dev/nvidia-uvm-tools
  - path: /dev/nvidiactl
  hooks:
  - args:
    - nvidia-ctk
    - hook
    - create-symlinks
    - --link
    - libglxserver_nvidia.so.535.161.07::/lib/x86_64-linux-gnu/nvidia/xorg/libglxserver_nvidia.so
    hookName: createContainer
    path: /usr/bin/nvidia-ctk
  - args:
    - nvidia-ctk
    - hook
    - update-ldcache
    - --folder
    - /lib/x86_64-linux-gnu
    hookName: createContainer
    path: /usr/bin/nvidia-ctk
  mounts:
  - containerPath: /lib/x86_64-linux-gnu/libEGL_nvidia.so.535.161.07
    hostPath: /lib/x86_64-linux-gnu/libEGL_nvidia.so.535.161.07
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /lib/x86_64-linux-gnu/libGLESv1_CM_nvidia.so.535.161.07
    hostPath: /lib/x86_64-linux-gnu/libGLESv1_CM_nvidia.so.535.161.07
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /lib/x86_64-linux-gnu/libGLESv2_nvidia.so.535.161.07
    hostPath: /lib/x86_64-linux-gnu/libGLESv2_nvidia.so.535.161.07
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /lib/x86_64-linux-gnu/libGLX_nvidia.so.535.161.07
    hostPath: /lib/x86_64-linux-gnu/libGLX_nvidia.so.535.161.07
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /lib/x86_64-linux-gnu/libcuda.so.535.161.07
    hostPath: /lib/x86_64-linux-gnu/libcuda.so.535.161.07
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /lib/x86_64-linux-gnu/libcudadebugger.so.535.161.07
    hostPath: /lib/x86_64-linux-gnu/libcudadebugger.so.535.161.07
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /lib/x86_64-linux-gnu/libnvcuvid.so.535.161.07
    hostPath: /lib/x86_64-linux-gnu/libnvcuvid.so.535.161.07
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /lib/x86_64-linux-gnu/libnvidia-allocator.so.535.161.07
    hostPath: /lib/x86_64-linux-gnu/libnvidia-allocator.so.535.161.07
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /lib/x86_64-linux-gnu/libnvidia-cfg.so.535.161.07
    hostPath: /lib/x86_64-linux-gnu/libnvidia-cfg.so.535.161.07
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /lib/x86_64-linux-gnu/libnvidia-egl-gbm.so.1.1.0
    hostPath: /lib/x86_64-linux-gnu/libnvidia-egl-gbm.so.1.1.0
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /lib/x86_64-linux-gnu/libnvidia-eglcore.so.535.161.07
    hostPath: /lib/x86_64-linux-gnu/libnvidia-eglcore.so.535.161.07
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /lib/x86_64-linux-gnu/libnvidia-encode.so.535.161.07
    hostPath: /lib/x86_64-linux-gnu/libnvidia-encode.so.535.161.07
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /lib/x86_64-linux-gnu/libnvidia-fbc.so.535.161.07
    hostPath: /lib/x86_64-linux-gnu/libnvidia-fbc.so.535.161.07
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /lib/x86_64-linux-gnu/libnvidia-glcore.so.535.161.07
    hostPath: /lib/x86_64-linux-gnu/libnvidia-glcore.so.535.161.07
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /lib/x86_64-linux-gnu/libnvidia-glsi.so.535.161.07
    hostPath: /lib/x86_64-linux-gnu/libnvidia-glsi.so.535.161.07
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /lib/x86_64-linux-gnu/libnvidia-glvkspirv.so.535.161.07
    hostPath: /lib/x86_64-linux-gnu/libnvidia-glvkspirv.so.535.161.07
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /lib/x86_64-linux-gnu/libnvidia-ml.so.535.161.07
    hostPath: /lib/x86_64-linux-gnu/libnvidia-ml.so.535.161.07
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /lib/x86_64-linux-gnu/libnvidia-ngx.so.535.161.07
    hostPath: /lib/x86_64-linux-gnu/libnvidia-ngx.so.535.161.07
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /lib/x86_64-linux-gnu/libnvidia-nscq.so.535.161.07
    hostPath: /lib/x86_64-linux-gnu/libnvidia-nscq.so.535.161.07
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /lib/x86_64-linux-gnu/libnvidia-nvvm.so.535.161.07
    hostPath: /lib/x86_64-linux-gnu/libnvidia-nvvm.so.535.161.07
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /lib/x86_64-linux-gnu/libnvidia-opencl.so.535.161.07
    hostPath: /lib/x86_64-linux-gnu/libnvidia-opencl.so.535.161.07
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /lib/x86_64-linux-gnu/libnvidia-opticalflow.so.535.161.07
    hostPath: /lib/x86_64-linux-gnu/libnvidia-opticalflow.so.535.161.07
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /lib/x86_64-linux-gnu/libnvidia-pkcs11-openssl3.so.535.161.07
    hostPath: /lib/x86_64-linux-gnu/libnvidia-pkcs11-openssl3.so.535.161.07
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /lib/x86_64-linux-gnu/libnvidia-pkcs11.so.535.161.07
    hostPath: /lib/x86_64-linux-gnu/libnvidia-pkcs11.so.535.161.07
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /lib/x86_64-linux-gnu/libnvidia-ptxjitcompiler.so.535.161.07
    hostPath: /lib/x86_64-linux-gnu/libnvidia-ptxjitcompiler.so.535.161.07
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /lib/x86_64-linux-gnu/libnvidia-rtcore.so.535.161.07
    hostPath: /lib/x86_64-linux-gnu/libnvidia-rtcore.so.535.161.07
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /lib/x86_64-linux-gnu/libnvidia-tls.so.535.161.07
    hostPath: /lib/x86_64-linux-gnu/libnvidia-tls.so.535.161.07
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /lib/x86_64-linux-gnu/libnvidia-vulkan-producer.so.535.161.07
    hostPath: /lib/x86_64-linux-gnu/libnvidia-vulkan-producer.so.535.161.07
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /lib/x86_64-linux-gnu/libnvoptix.so.535.161.07
    hostPath: /lib/x86_64-linux-gnu/libnvoptix.so.535.161.07
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /run/nvidia-persistenced/socket
    hostPath: /run/nvidia-persistenced/socket
    options:
    - ro
    - nosuid
    - nodev
    - bind
    - noexec
  - containerPath: /usr/bin/nvidia-cuda-mps-control
    hostPath: /usr/bin/nvidia-cuda-mps-control
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /usr/bin/nvidia-cuda-mps-server
    hostPath: /usr/bin/nvidia-cuda-mps-server
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /usr/bin/nvidia-debugdump
    hostPath: /usr/bin/nvidia-debugdump
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /usr/bin/nvidia-persistenced
    hostPath: /usr/bin/nvidia-persistenced
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /usr/bin/nvidia-smi
    hostPath: /usr/bin/nvidia-smi
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /usr/share/nvidia/nvoptix.bin
    hostPath: /usr/share/nvidia/nvoptix.bin
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /lib/firmware/nvidia/535.161.07/gsp_ga10x.bin
    hostPath: /lib/firmware/nvidia/535.161.07/gsp_ga10x.bin
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /lib/firmware/nvidia/535.161.07/gsp_tu10x.bin
    hostPath: /lib/firmware/nvidia/535.161.07/gsp_tu10x.bin
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /lib/x86_64-linux-gnu/nvidia/xorg/libglxserver_nvidia.so.535.161.07
    hostPath: /lib/x86_64-linux-gnu/nvidia/xorg/libglxserver_nvidia.so.535.161.07
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /lib/x86_64-linux-gnu/nvidia/xorg/nvidia_drv.so
    hostPath: /lib/x86_64-linux-gnu/nvidia/xorg/nvidia_drv.so
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /usr/share/X11/xorg.conf.d/10-nvidia.conf
    hostPath: /usr/share/X11/xorg.conf.d/10-nvidia.conf
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /usr/share/egl/egl_external_platform.d/15_nvidia_gbm.json
    hostPath: /usr/share/egl/egl_external_platform.d/15_nvidia_gbm.json
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /usr/share/glvnd/egl_vendor.d/10_nvidia.json
    hostPath: /usr/share/glvnd/egl_vendor.d/10_nvidia.json
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /usr/share/vulkan/icd.d/nvidia_icd.json
    hostPath: /usr/share/vulkan/icd.d/nvidia_icd.json
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /usr/share/vulkan/implicit_layer.d/nvidia_layers.json
    hostPath: /usr/share/vulkan/implicit_layer.d/nvidia_layers.json
    options:
    - ro
    - nosuid
    - nodev
    - bind
devices:
- containerEdits:
    deviceNodes:
    - path: /dev/nvidia4
    - path: /dev/dri/card5
    - path: /dev/dri/renderD132
    hooks:
    - args:
      - nvidia-ctk
      - hook
      - create-symlinks
      - --link
      - ../card5::/dev/dri/by-path/pci-0000:01:00.0-card
      - --link
      - ../renderD132::/dev/dri/by-path/pci-0000:01:00.0-render
      hookName: createContainer
      path: /usr/bin/nvidia-ctk
    - args:
      - nvidia-ctk
      - hook
      - chmod
      - --mode
      - "755"
      - --path
      - /dev/dri
      hookName: createContainer
      path: /usr/bin/nvidia-ctk
  name: "0"
- containerEdits:
    deviceNodes:
    - path: /dev/nvidia3
    - path: /dev/dri/card4
    - path: /dev/dri/renderD131
    hooks:
    - args:
      - nvidia-ctk
      - hook
      - create-symlinks
      - --link
      - ../card4::/dev/dri/by-path/pci-0000:47:00.0-card
      - --link
      - ../renderD131::/dev/dri/by-path/pci-0000:47:00.0-render
      hookName: createContainer
      path: /usr/bin/nvidia-ctk
    - args:
      - nvidia-ctk
      - hook
      - chmod
      - --mode
      - "755"
      - --path
      - /dev/dri
      hookName: createContainer
      path: /usr/bin/nvidia-ctk
  name: "1"
- containerEdits:
    deviceNodes:
    - path: /dev/nvidia2
    - path: /dev/dri/card3
    - path: /dev/dri/renderD130
    hooks:
    - args:
      - nvidia-ctk
      - hook
      - create-symlinks
      - --link
      - ../card3::/dev/dri/by-path/pci-0000:81:00.0-card
      - --link
      - ../renderD130::/dev/dri/by-path/pci-0000:81:00.0-render
      hookName: createContainer
      path: /usr/bin/nvidia-ctk
    - args:
      - nvidia-ctk
      - hook
      - chmod
      - --mode
      - "755"
      - --path
      - /dev/dri
      hookName: createContainer
      path: /usr/bin/nvidia-ctk
  name: "2"
- containerEdits:
    deviceNodes:
    - path: /dev/nvidia1
    - path: /dev/dri/card2
    - path: /dev/dri/renderD129
    hooks:
    - args:
      - nvidia-ctk
      - hook
      - create-symlinks
      - --link
      - ../card2::/dev/dri/by-path/pci-0000:c2:00.0-card
      - --link
      - ../renderD129::/dev/dri/by-path/pci-0000:c2:00.0-render
      hookName: createContainer
      path: /usr/bin/nvidia-ctk
    - args:
      - nvidia-ctk
      - hook
      - chmod
      - --mode
      - "755"
      - --path
      - /dev/dri
      hookName: createContainer
      path: /usr/bin/nvidia-ctk
  name: "4"
- containerEdits:
    deviceNodes:
    - path: /dev/nvidia1
    - path: /dev/nvidia2
    - path: /dev/nvidia3
    - path: /dev/nvidia4
    - path: /dev/dri/card2
    - path: /dev/dri/card3
    - path: /dev/dri/card4
    - path: /dev/dri/card5
    - path: /dev/dri/renderD129
    - path: /dev/dri/renderD130
    - path: /dev/dri/renderD131
    - path: /dev/dri/renderD132
    hooks:
    - args:
      - nvidia-ctk
      - hook
      - create-symlinks
      - --link
      - ../card5::/dev/dri/by-path/pci-0000:01:00.0-card
      - --link
      - ../renderD132::/dev/dri/by-path/pci-0000:01:00.0-render
      hookName: createContainer
      path: /usr/bin/nvidia-ctk
    - args:
      - nvidia-ctk
      - hook
      - chmod
      - --mode
      - "755"
      - --path
      - /dev/dri
      hookName: createContainer
      path: /usr/bin/nvidia-ctk
    - args:
      - nvidia-ctk
      - hook
      - create-symlinks
      - --link
      - ../card4::/dev/dri/by-path/pci-0000:47:00.0-card
      - --link
      - ../renderD131::/dev/dri/by-path/pci-0000:47:00.0-render
      hookName: createContainer
      path: /usr/bin/nvidia-ctk
    - args:
      - nvidia-ctk
      - hook
      - create-symlinks
      - --link
      - ../card3::/dev/dri/by-path/pci-0000:81:00.0-card
      - --link
      - ../renderD130::/dev/dri/by-path/pci-0000:81:00.0-render
      hookName: createContainer
      path: /usr/bin/nvidia-ctk
    - args:
      - nvidia-ctk
      - hook
      - create-symlinks
      - --link
      - ../card2::/dev/dri/by-path/pci-0000:c2:00.0-card
      - --link
      - ../renderD129::/dev/dri/by-path/pci-0000:c2:00.0-render
      hookName: createContainer
      path: /usr/bin/nvidia-ctk
  name: all
kind: nvidia.com/gpu
CDI device listing:
INFO[0000] Found 5 CDI devices
nvidia.com/gpu=0
nvidia.com/gpu=1
nvidia.com/gpu=2
nvidia.com/gpu=4
nvidia.com/gpu=all
  • Rootful Docker version 26.0.0, build 2ae903e
  • Rootless Docker version 26.0.0, build 2ae903e (install script)

The issue:
With no-cgroups = false, CDI injection works fine for the regular Docker instance:

$ docker run --rm -ti --runtime=nvidia -e NVIDIA_VISIBLE_DEVICES=nvidia.com/gpu=all ubuntu nvidia-smi -L
GPU 0: NVIDIA A100-SXM4-40GB (UUID: GPU-b6022b4d-71db-8f15-15de-26a719f6b3e1)
GPU 1: NVIDIA A100-SXM4-40GB (UUID: GPU-22420f7d-6edb-e44a-c322-4ce539cade19)
GPU 2: NVIDIA A100-SXM4-40GB (UUID: GPU-5e3444e2-8577-0e99-c6ee-72f6eb2bd28c)
GPU 3: NVIDIA A100-SXM4-40GB (UUID: GPU-dd1f811d-a280-7e2e-bf7e-b84f7a977cc1)

but produces the following errors for the rootless version:

$ docker run --rm -ti --runtime=nvidia -e NVIDIA_VISIBLE_DEVICES=nvidia.com/gpu=all ubuntu nvidia-smi -L
docker: Error response from daemon: failed to create task for container: failed to create shim task: OCI runtime create failed: could not apply required modification to OCI specification: error modifying OCI spec: failed to inject CDI devices: unresolvable CDI devices nvidia.com/gpu=all: unknown.

Running docker run --rm --gpus all ubuntu nvidia-smi results in the same error as before CDI was set up. This seems to be consistent across all variations listed on the Specialized Configurations for Docker page:

docker: Error response from daemon: failed to create task for container: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error running hook #0: error running hook: exit status 1, stdout: , stderr: Auto-detected mode as 'legacy'
nvidia-container-cli: mount error: failed to add device rules: unable to find any existing device filters attached to the cgroup: bpf_prog_query(BPF_CGROUP_DEVICE) failed: operation not permitted: unknown.

Interestingly, setting no-cgroups = true disables the regular use of GPUs with rootful Docker:

$ docker run --rm --gpus all ubuntu nvidia-smi
Failed to initialize NVML: Unknown Error

but still allows for CDI injections:

$ docker run --rm -ti --runtime=nvidia -e NVIDIA_VISIBLE_DEVICES=nvidia.com/gpu=all ubuntu nvidia-smi -L
GPU 0: NVIDIA A100-SXM4-40GB (UUID: GPU-b6022b4d-71db-8f15-15de-26a719f6b3e1)
GPU 1: NVIDIA A100-SXM4-40GB (UUID: GPU-22420f7d-6edb-e44a-c322-4ce539cade19)
GPU 2: NVIDIA A100-SXM4-40GB (UUID: GPU-5e3444e2-8577-0e99-c6ee-72f6eb2bd28c)
GPU 3: NVIDIA A100-SXM4-40GB (UUID: GPU-dd1f811d-a280-7e2e-bf7e-b84f7a977cc1)

With control groups disabled, the rootless daemon is able to use exposed GPUs as outlined in the Docker docs:

$ docker run -it --rm --gpus '"device=0,2"' ubuntu nvidia-smi
Mon Apr  1 16:33:52 2024
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.161.07             Driver Version: 535.161.07   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA A100-SXM4-40GB          Off | 00000000:01:00.0 Off |                    0 |
| N/A   37C    P0              60W / 275W |      0MiB / 40960MiB |      0%      Default |
|                                         |                      |             Disabled |
+-----------------------------------------+----------------------+----------------------+
|   1  NVIDIA A100-SXM4-40GB          Off | 00000000:81:00.0 Off |                    0 |
| N/A   36C    P0              56W / 275W |      0MiB / 40960MiB |      0%      Default |
|                                         |                      |             Disabled |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|  No running processes found                                                           |
+---------------------------------------------------------------------------------------+

TL;DR
Disabling cgroups allows rootless containers to use exposed GPUs via the regular docker run --gpus flag, but in turn disables the rootful containers' GPU access. Leaving cgroups enabled reverses the effect, as outlined in #85.

With cgroups disabled and NVIDIA CDI in use, rootful Docker can still use GPU injection even though regular --gpus access is barred, while rootless containers can use the exposed GPUs. CDI injection for rootless fails in either case, however.

This seems like a definite improvement, but I'm not sure it's intended behavior. The fact that CDI injection fails with rootless regardless of the cgroup setting leads me to believe this is unintended, unless rootless is not yet supported by NVIDIA CDI.

Any insights would be greatly appreciated!

LukasIAO changed the title from "Rootless Docker OCI: error modifying OCI spec: failed to inject CDI devices: unresolvable CDI devices nvidia.com/gpu=all: unknown." to "Rootless Docker CDI Injection: error modifying OCI spec: failed to inject CDI devices: unresolvable CDI devices nvidia.com/gpu=all: unknown." on Apr 1, 2024
elezar (Member) commented Apr 2, 2024

The error:

docker: Error response from daemon: failed to create task for container: failed to create shim task: OCI runtime create failed: could not apply required modification to OCI specification: error modifying OCI spec: failed to inject CDI devices: unresolvable CDI devices nvidia.com/gpu=all: unknown.

This indicates that rootless Docker cannot find the CDI specifications that were generated. As far as I am aware, rootless Docker modifies the paths used for /etc (and other locations), and this could be what is causing issues here for the runtime.

Since you're using a Docker version that supports CDI (as an opt-in feature, I believe), could you try the native CDI injection here?

Running:

nvidia-ctk runtime configure --runtime=docker --cdi.enabled

and restarting the docker daemon should enable this feature. (Note that the command may need to be adjusted for rootless mode to specify the config file path explicitly as per https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html#rootless-mode).
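For rootless mode, that would look something like the following sketch (assuming the rootless daemon's config lives at the default $HOME/.config/docker/daemon.json and the daemon runs as a systemd user service):

# rootless: write to the user's daemon.json instead of /etc/docker/daemon.json
nvidia-ctk runtime configure --runtime=docker --cdi.enabled --config=$HOME/.config/docker/daemon.json
# restart the rootless daemon
systemctl --user restart docker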

Then with the CDI feature enabled in docker you should be able to run:

$ docker run --rm -ti --device=nvidia.com/gpu=all ubuntu nvidia-smi -L

and have the devices injected without using the nvidia runtime.

LukasIAO (Author) commented Apr 2, 2024

Hey @elezar, thank you for taking the time!

CDI injection seems to be a mainline feature in Docker 26.0.0. Though it is still experimental, it no longer requires the user to set DOCKER_CLI_EXPERIMENTAL, as was the case in 25.x.

The native injection worked on rootful after configuring the daemon as suggested, though the rootless Docker still runs into issues as listed below.

Before applying the suggested configurations I tested the following on rootless:

$ docker run --rm -ti --device=nvidia.com/gpu=all ubuntu nvidia-smi -L
docker: Error response from daemon: could not select device driver "cdi" with capabilities: [].
ERRO[0000] error waiting for container: context canceled

$ docker run --rm -ti --runtime=nvidia --device=nvidia.com/gpu=all ubuntu nvidia-smi -L
docker: Error response from daemon: could not select device driver "cdi" with capabilities: [].

$ docker run --rm -ti --runtime=nvidia -e NVIDIA_VISIBLE_DEVICES=nvidia.com/gpu=all ubuntu nvidia-smi -L
docker: Error response from daemon: failed to create task for container: failed to create shim task: OCI runtime create failed: could not apply required modification to OCI specification: error modifying OCI spec: failed to inject CDI devices: unresolvable CDI devices nvidia.com/gpu=all: unknown.

After applying the configuration with nvidia-ctk runtime configure --runtime=docker --cdi.enabled --config=$HOME/.config/docker/daemon.json, the daemon.json looks like this:

{
    "features": {
        "cdi": true
    },
    "runtimes": {
        "nvidia": {
            "args": [],
            "path": "nvidia-container-runtime"
        }
    }
}

Restarting Docker and testing the CDI injections again leads to the following, regardless of the cgroup setting:

$ docker run --rm -ti --device=nvidia.com/gpu=all ubuntu nvidia-smi -L
docker: Error response from daemon: CDI device injection failed: unresolvable CDI devices nvidia.com/gpu=all.

$ docker run --rm -ti --runtime=nvidia --device=nvidia.com/gpu=all ubuntu nvidia-smi -L
docker: Error response from daemon: CDI device injection failed: unresolvable CDI devices nvidia.com/gpu=all.

$ docker run --rm -ti --runtime=nvidia -e NVIDIA_VISIBLE_DEVICES=nvidia.com/gpu=all ubuntu nvidia-smi -L
docker: Error response from daemon: failed to create task for container: failed to create shim task: OCI runtime create failed: could not apply required modification to OCI specification: error modifying OCI spec: failed to inject CDI devices: unresolvable CDI devices nvidia.com/gpu=all: unknown.

I checked the configured CDI spec locations for both Docker instances:

rootless docker info:
Client:
 Version:    26.0.0
 Context:    rootless
 Debug Mode: false
 Plugins:
  buildx: Docker Buildx (Docker Inc.)
    Version:  v0.13.1
    Path:     /usr/libexec/docker/cli-plugins/docker-buildx
  compose: Docker Compose (Docker Inc.)
    Version:  v2.5.0
    Path:     /usr/libexec/docker/cli-plugins/docker-compose

Server:
 Containers: 4
  Running: 0
  Paused: 0
  Stopped: 4
 Images: 3
 Server Version: 26.0.0
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Using metacopy: false
  Native Overlay Diff: false
  userxattr: true
 Logging Driver: json-file
 Cgroup Driver: systemd
 Cgroup Version: 2
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local splunk syslog
 CDI spec directories:
  /etc/cdi
  /var/run/cdi
 Swarm: inactive
 Runtimes: nvidia runc io.containerd.runc.v2
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 7c3aca7a610df76212171d200ca3811ff6096eb8
 runc version: v1.1.12-0-g51d5e94
 init version: de40ad0
 Security Options:
  seccomp
   Profile: builtin
  rootless
  cgroupns
 Kernel Version: 5.15.0-1047-nvidia
 Operating System: Ubuntu 22.04.4 LTS
 OSType: linux
 Architecture: x86_64
 CPUs: 128
 Total Memory: 503.5GiB
 Name: DGX-Station-A100-920-23487-2530-0R0
 ID: 48ae789a-3d2d-43d8-841a-9a34c9bdc46e
 Docker Root Dir: /home/ver23371/.local/share/docker
 Debug Mode: false
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false
 Product License: Community Engine

WARNING: No cpu cfs quota support
WARNING: No cpu cfs period support
WARNING: No cpu shares support
WARNING: No cpuset support
WARNING: No io.weight support
WARNING: No io.weight (per device) support
WARNING: No io.max (rbps) support
WARNING: No io.max (wbps) support
WARNING: No io.max (riops) support
WARNING: No io.max (wiops) support
rootful docker info:
Client: Docker Engine - Community
 Version:    26.0.0
 Context:    default
 Debug Mode: false
 Plugins:
  buildx: Docker Buildx (Docker Inc.)
    Version:  v0.13.1
    Path:     /usr/libexec/docker/cli-plugins/docker-buildx
  compose: Docker Compose (Docker Inc.)
    Version:  v2.5.0
    Path:     /usr/libexec/docker/cli-plugins/docker-compose

Server:
 Containers: 8
  Running: 0
  Paused: 0
  Stopped: 8
 Images: 52
 Server Version: 26.0.0
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Using metacopy: false
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: systemd
 Cgroup Version: 2
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local splunk syslog
 CDI spec directories:
  /etc/cdi
  /var/run/cdi
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 nvidia runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: ae07eda36dd25f8a1b98dfbf587313b99c0190bb
 runc version: v1.1.12-0-g51d5e94
 init version: de40ad0
 Security Options:
  apparmor
  seccomp
   Profile: builtin
  cgroupns
 Kernel Version: 5.15.0-1047-nvidia
 Operating System: Ubuntu 22.04.4 LTS
 OSType: linux
 Architecture: x86_64
 CPUs: 128
 Total Memory: 503.5GiB
 Name: DGX-Station-A100-920-23487-2530-0R0
 ID: a59ada2d-f489-4072-9c54-4d7a3efa0906
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false

Both point to:

 CDI spec directories:
  /etc/cdi
  /var/run/cdi

However, it looks like nothing was created under /var/run/cdi. Permissions for nvidia.yaml:

/etc/cdi$ ls -la
total 32
drwxr-xr-x   2 root root  4096 ožu  29 23:22 .
drwxr-xr-x 167 root root 12288 ožu  29 23:22 ..
-rw-r--r--   1 root root 13203 ožu  29 23:22 nvidia.yaml

The Docker docs for enabling CDI devices suggest manually setting the spec location, but it does not seem to make a difference in this case.

{
    "features": {
        "cdi": true
    },
    "cdi-spec-dirs": ["/etc/cdi/", "/var/run/cdi"],
    "runtimes": {
        "nvidia": {
            "args": [],
            "path": "nvidia-container-runtime"
        }
    }
}

elezar self-assigned this Apr 2, 2024
elezar (Member) commented Apr 2, 2024

Could you try generating (or copying) a CDI spec to /var/run/cdi in addition to /etc/cdi and see if this fixes the rootless case?
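Something along these lines should do (a sketch, reusing the generate command from the issue description):

# regenerate the spec directly into /var/run/cdi ...
sudo nvidia-ctk cdi generate --output=/var/run/cdi/nvidia.yaml
# ... or copy the existing one
sudo cp /etc/cdi/nvidia.yaml /var/run/cdi/nvidia.yaml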

LukasIAO (Author) commented Apr 3, 2024

I copied the yaml to /var/run/cdi, restarted both Docker daemons, and tested again. Unfortunately, there was no change in behavior.

/var/run/cdi$ ls -la
total 16
drwxr-xr-x  2 root root    60 tra   3 10:02 .
drwxr-xr-x 51 root root  1580 tra   3 10:02 ..
-rw-r--r--  1 root root 13203 tra   3 10:02 nvidia.yaml

elezar (Member) commented Apr 3, 2024

I think the key is the following: https://github.com/moby/moby/blob/8599f2a3fb884afcbbf1471ec793fbcbc327cd35/cmd/dockerd/docker.go#L65C1-L72C1

I would assume that for the Docker daemon running with RootlessKit, the path where it is trying to resolve the CDI device specifications is not /var/run/cdi or /etc/cdi. It may be good to create an issue (or transfer this one) to https://github.com/moby/moby so that we can get input from the developers there as to where these paths map to.

It may be sufficient to copy the spec file to a location that is readable by the daemon to confirm.
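For what it's worth, the spec directories the daemon reports can be checked via docker info (the same output quoted in the previous comment); note that any rootless path remapping may not show up here:

docker info | grep -A 2 "CDI spec directories"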

Note that plugins are also handled differently for rootless mode: https://github.com/moby/moby/blob/8599f2a3fb884afcbbf1471ec793fbcbc327cd35/pkg/plugins/discovery_unix.go#L11

klueska (Contributor) commented Apr 3, 2024

I wonder if this implies that the "correct" location for rootless is $HOME/.docker/cdi or $HOME/.docker/run/cdi?

LukasIAO (Author) commented Apr 3, 2024

I just tested @klueska's idea by copying the yaml to $HOME/.docker/cdi and $HOME/.docker/run/cdi respectively, and specifying the custom locations in the daemon config:

{
    "features": {
        "cdi": true
    },
    "cdi-spec-dirs": ["/home/username/.docker/cdi/", "/home/username/.docker/run/cdi/"],
    "runtimes": {
        "nvidia": {
            "args": [],
            "path": "nvidia-container-runtime"
        }
    }
}
CDI spec directories:
  /home/username/.docker/cdi/
  /home/username/.docker/run/cdi/

With this change, the native CDI injection does indeed run on rootless.

/.config/docker$ docker run --rm -ti --device=nvidia.com/gpu=all ubuntu nvidia-smi -L
GPU 0: NVIDIA A100-SXM4-40GB (UUID: GPU-b6022b4d-71db-8f15-15de-26a719f6b3e1)
GPU 1: NVIDIA A100-SXM4-40GB (UUID: GPU-22420f7d-6edb-e44a-c322-4ce539cade19)
GPU 2: NVIDIA A100-SXM4-40GB (UUID: GPU-5e3444e2-8577-0e99-c6ee-72f6eb2bd28c)
GPU 3: NVIDIA A100-SXM4-40GB (UUID: GPU-dd1f811d-a280-7e2e-bf7e-b84f7a977cc1)

klueska (Contributor) commented Apr 3, 2024

It's good to know there is a path to making this work. I'd be interested to know if these are the "default" locations if you remove cdi-spec-dirs entirely.

elezar (Member) commented Apr 3, 2024

> It's good to know there is a path to making this work. I'd be interested to know if these are the "default" locations if you remove cdi-spec-dirs entirely.

I would be surprised if this is the case, since IIRC we explicitly set /etc/cdi and /var/run/cdi in the daemon.

LukasIAO (Author) commented Apr 3, 2024

You can see the docker info of the rootless client in my original reply to @elezar. Before specifying it explicitly, I wanted to check where the client was looking for the config. Once CDI is enabled, both rootless and rootful seem to default to:

CDI spec directories:
  /etc/cdi
  /var/run/cdi

The choice of $HOME/.docker/cdi seemed fitting, however.

klueska (Contributor) commented Apr 3, 2024

That seems like a bug that should be filed against moby/docker then.

LukasIAO (Author) commented Apr 3, 2024

It might also be worth noting in the CDI documentation that a rootless Docker daemon requires the yaml to be generated or moved to a location the daemon has access to, wherever that may end up being. A rough summary of what worked here is sketched below.
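For reference, a minimal sketch of the workaround from this thread (assuming the rootless daemon reads $HOME/.config/docker/daemon.json and substituting the real username; generating the spec directly into the directory should be equivalent to copying it there, as was done above):

# put the CDI spec somewhere the rootless daemon can read
mkdir -p $HOME/.docker/cdi
sudo nvidia-ctk cdi generate --output=$HOME/.docker/cdi/nvidia.yaml

# $HOME/.config/docker/daemon.json:
# {
#     "features": { "cdi": true },
#     "cdi-spec-dirs": ["/home/username/.docker/cdi/"],
#     "runtimes": { "nvidia": { "args": [], "path": "nvidia-container-runtime" } }
# }

# restart the rootless daemon and test
systemctl --user restart docker
docker run --rm -ti --device=nvidia.com/gpu=all ubuntu nvidia-smi -L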
