
Rootless Docker CDI Injection: error modifying OCI spec: failed to inject CDI devices: unresolvable CDI devices nvidia.com/gpu=all: unknown. #434

Open
LukasIAO opened this issue Mar 31, 2024 · 12 comments

LukasIAO commented Mar 31, 2024

Hello everyone,

We have recently set up a rootless Docker instance alongside our existing Docker on one of our servers, but ran into issues mounting host GPUs into the rootless containers. A workaround was described in issue #85 (toggling no-cgroups to switch between rootful and rootless use), along with a mention of a better solution in the form of NVIDIA CDI, arriving as an experimental feature in Docker 25.

After updating to the newest Docker releases and setting up CDI, our regular Docker instance behaved as we expected based on the documentation, but the rootless instance still runs into issues.

Setup to reproduce:

Distributor ID: Ubuntu
Description:    Ubuntu 22.04.4 LTS
Release:        22.04
Codename:       jammy

NVIDIA Container Toolkit CLI version 1.14.6
commit: 5605d191332dcfeea802c4497360d60a65c7887e

rootless: containerd github.com/containerd/containerd v1.7.13 7c3aca7a610df76212171d200ca3811ff6096eb8
rootful: containerd containerd.io 1.6.28 ae07eda36dd25f8a1b98dfbf587313b99c0190bb
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.161.07             Driver Version: 535.161.07   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA A100-SXM4-40GB          On  | 00000000:01:00.0 Off |                    0 |
| N/A   40C    P0              61W / 275W |      0MiB / 40960MiB |      0%      Default |
|                                         |                      |             Disabled |
+-----------------------------------------+----------------------+----------------------+
|   1  NVIDIA A100-SXM4-40GB          On  | 00000000:47:00.0 Off |                    0 |
| N/A   39C    P0              55W / 275W |      0MiB / 40960MiB |      0%      Default |
|                                         |                      |             Disabled |
+-----------------------------------------+----------------------+----------------------+
|   2  NVIDIA A100-SXM4-40GB          On  | 00000000:81:00.0 Off |                    0 |
| N/A   39C    P0              57W / 275W |      0MiB / 40960MiB |      0%      Default |
|                                         |                      |             Disabled |
+-----------------------------------------+----------------------+----------------------+
|   3  NVIDIA DGX Display             On  | 00000000:C1:00.0 Off |                  N/A |
| 34%   41C    P8              N/A /  50W |      1MiB /  4096MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
|   4  NVIDIA A100-SXM4-40GB          On  | 00000000:C2:00.0 Off |                    0 |
| N/A   39C    P0              58W / 275W |      0MiB / 40960MiB |      0%      Default |
|                                         |                      |             Disabled |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|  No running processes found                                                           |
+---------------------------------------------------------------------------------------+
config.toml:
#accept-nvidia-visible-devices-as-volume-mounts = false
#accept-nvidia-visible-devices-envvar-when-unprivileged = true
disable-require = false
#swarm-resource = "DOCKER_RESOURCE_GPU"

[nvidia-container-cli]
#debug = "/var/log/nvidia-container-toolkit.log"
environment = []
#ldcache = "/etc/ld.so.cache"
ldconfig = "@/sbin/ldconfig.real"
load-kmods = true
#no-cgroups = true
#no-cgroups = false
#path = "/usr/bin/nvidia-container-cli"
#root = "/run/nvidia/driver"
#user = "root:video"

[nvidia-container-runtime]
#debug = "/var/log/nvidia-container-runtime.log"
log-level = "info"
mode = "auto"
runtimes = ["docker-runc", "runc"]

[nvidia-container-runtime.modes]

[nvidia-container-runtime.modes.csv]
mount-spec-path = "/etc/nvidia-container-runtime/host-files-for-container.d"
  • sudo nvidia-ctk cdi generate --output=/etc/cdi/nvidia.yaml
nvidia.yaml:
cdiVersion: 0.5.0
containerEdits:
  deviceNodes:
  - path: /dev/nvidia-modeset
  - path: /dev/nvidia-uvm
  - path: /dev/nvidia-uvm-tools
  - path: /dev/nvidiactl
  hooks:
  - args:
    - nvidia-ctk
    - hook
    - create-symlinks
    - --link
    - libglxserver_nvidia.so.535.161.07::/lib/x86_64-linux-gnu/nvidia/xorg/libglxserver_nvidia.so
    hookName: createContainer
    path: /usr/bin/nvidia-ctk
  - args:
    - nvidia-ctk
    - hook
    - update-ldcache
    - --folder
    - /lib/x86_64-linux-gnu
    hookName: createContainer
    path: /usr/bin/nvidia-ctk
  mounts:
  - containerPath: /lib/x86_64-linux-gnu/libEGL_nvidia.so.535.161.07
    hostPath: /lib/x86_64-linux-gnu/libEGL_nvidia.so.535.161.07
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /lib/x86_64-linux-gnu/libGLESv1_CM_nvidia.so.535.161.07
    hostPath: /lib/x86_64-linux-gnu/libGLESv1_CM_nvidia.so.535.161.07
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /lib/x86_64-linux-gnu/libGLESv2_nvidia.so.535.161.07
    hostPath: /lib/x86_64-linux-gnu/libGLESv2_nvidia.so.535.161.07
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /lib/x86_64-linux-gnu/libGLX_nvidia.so.535.161.07
    hostPath: /lib/x86_64-linux-gnu/libGLX_nvidia.so.535.161.07
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /lib/x86_64-linux-gnu/libcuda.so.535.161.07
    hostPath: /lib/x86_64-linux-gnu/libcuda.so.535.161.07
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /lib/x86_64-linux-gnu/libcudadebugger.so.535.161.07
    hostPath: /lib/x86_64-linux-gnu/libcudadebugger.so.535.161.07
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /lib/x86_64-linux-gnu/libnvcuvid.so.535.161.07
    hostPath: /lib/x86_64-linux-gnu/libnvcuvid.so.535.161.07
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /lib/x86_64-linux-gnu/libnvidia-allocator.so.535.161.07
    hostPath: /lib/x86_64-linux-gnu/libnvidia-allocator.so.535.161.07
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /lib/x86_64-linux-gnu/libnvidia-cfg.so.535.161.07
    hostPath: /lib/x86_64-linux-gnu/libnvidia-cfg.so.535.161.07
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /lib/x86_64-linux-gnu/libnvidia-egl-gbm.so.1.1.0
    hostPath: /lib/x86_64-linux-gnu/libnvidia-egl-gbm.so.1.1.0
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /lib/x86_64-linux-gnu/libnvidia-eglcore.so.535.161.07
    hostPath: /lib/x86_64-linux-gnu/libnvidia-eglcore.so.535.161.07
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /lib/x86_64-linux-gnu/libnvidia-encode.so.535.161.07
    hostPath: /lib/x86_64-linux-gnu/libnvidia-encode.so.535.161.07
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /lib/x86_64-linux-gnu/libnvidia-fbc.so.535.161.07
    hostPath: /lib/x86_64-linux-gnu/libnvidia-fbc.so.535.161.07
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /lib/x86_64-linux-gnu/libnvidia-glcore.so.535.161.07
    hostPath: /lib/x86_64-linux-gnu/libnvidia-glcore.so.535.161.07
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /lib/x86_64-linux-gnu/libnvidia-glsi.so.535.161.07
    hostPath: /lib/x86_64-linux-gnu/libnvidia-glsi.so.535.161.07
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /lib/x86_64-linux-gnu/libnvidia-glvkspirv.so.535.161.07
    hostPath: /lib/x86_64-linux-gnu/libnvidia-glvkspirv.so.535.161.07
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /lib/x86_64-linux-gnu/libnvidia-ml.so.535.161.07
    hostPath: /lib/x86_64-linux-gnu/libnvidia-ml.so.535.161.07
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /lib/x86_64-linux-gnu/libnvidia-ngx.so.535.161.07
    hostPath: /lib/x86_64-linux-gnu/libnvidia-ngx.so.535.161.07
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /lib/x86_64-linux-gnu/libnvidia-nscq.so.535.161.07
    hostPath: /lib/x86_64-linux-gnu/libnvidia-nscq.so.535.161.07
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /lib/x86_64-linux-gnu/libnvidia-nvvm.so.535.161.07
    hostPath: /lib/x86_64-linux-gnu/libnvidia-nvvm.so.535.161.07
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /lib/x86_64-linux-gnu/libnvidia-opencl.so.535.161.07
    hostPath: /lib/x86_64-linux-gnu/libnvidia-opencl.so.535.161.07
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /lib/x86_64-linux-gnu/libnvidia-opticalflow.so.535.161.07
    hostPath: /lib/x86_64-linux-gnu/libnvidia-opticalflow.so.535.161.07
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /lib/x86_64-linux-gnu/libnvidia-pkcs11-openssl3.so.535.161.07
    hostPath: /lib/x86_64-linux-gnu/libnvidia-pkcs11-openssl3.so.535.161.07
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /lib/x86_64-linux-gnu/libnvidia-pkcs11.so.535.161.07
    hostPath: /lib/x86_64-linux-gnu/libnvidia-pkcs11.so.535.161.07
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /lib/x86_64-linux-gnu/libnvidia-ptxjitcompiler.so.535.161.07
    hostPath: /lib/x86_64-linux-gnu/libnvidia-ptxjitcompiler.so.535.161.07
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /lib/x86_64-linux-gnu/libnvidia-rtcore.so.535.161.07
    hostPath: /lib/x86_64-linux-gnu/libnvidia-rtcore.so.535.161.07
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /lib/x86_64-linux-gnu/libnvidia-tls.so.535.161.07
    hostPath: /lib/x86_64-linux-gnu/libnvidia-tls.so.535.161.07
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /lib/x86_64-linux-gnu/libnvidia-vulkan-producer.so.535.161.07
    hostPath: /lib/x86_64-linux-gnu/libnvidia-vulkan-producer.so.535.161.07
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /lib/x86_64-linux-gnu/libnvoptix.so.535.161.07
    hostPath: /lib/x86_64-linux-gnu/libnvoptix.so.535.161.07
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /run/nvidia-persistenced/socket
    hostPath: /run/nvidia-persistenced/socket
    options:
    - ro
    - nosuid
    - nodev
    - bind
    - noexec
  - containerPath: /usr/bin/nvidia-cuda-mps-control
    hostPath: /usr/bin/nvidia-cuda-mps-control
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /usr/bin/nvidia-cuda-mps-server
    hostPath: /usr/bin/nvidia-cuda-mps-server
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /usr/bin/nvidia-debugdump
    hostPath: /usr/bin/nvidia-debugdump
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /usr/bin/nvidia-persistenced
    hostPath: /usr/bin/nvidia-persistenced
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /usr/bin/nvidia-smi
    hostPath: /usr/bin/nvidia-smi
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /usr/share/nvidia/nvoptix.bin
    hostPath: /usr/share/nvidia/nvoptix.bin
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /lib/firmware/nvidia/535.161.07/gsp_ga10x.bin
    hostPath: /lib/firmware/nvidia/535.161.07/gsp_ga10x.bin
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /lib/firmware/nvidia/535.161.07/gsp_tu10x.bin
    hostPath: /lib/firmware/nvidia/535.161.07/gsp_tu10x.bin
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /lib/x86_64-linux-gnu/nvidia/xorg/libglxserver_nvidia.so.535.161.07
    hostPath: /lib/x86_64-linux-gnu/nvidia/xorg/libglxserver_nvidia.so.535.161.07
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /lib/x86_64-linux-gnu/nvidia/xorg/nvidia_drv.so
    hostPath: /lib/x86_64-linux-gnu/nvidia/xorg/nvidia_drv.so
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /usr/share/X11/xorg.conf.d/10-nvidia.conf
    hostPath: /usr/share/X11/xorg.conf.d/10-nvidia.conf
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /usr/share/egl/egl_external_platform.d/15_nvidia_gbm.json
    hostPath: /usr/share/egl/egl_external_platform.d/15_nvidia_gbm.json
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /usr/share/glvnd/egl_vendor.d/10_nvidia.json
    hostPath: /usr/share/glvnd/egl_vendor.d/10_nvidia.json
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /usr/share/vulkan/icd.d/nvidia_icd.json
    hostPath: /usr/share/vulkan/icd.d/nvidia_icd.json
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /usr/share/vulkan/implicit_layer.d/nvidia_layers.json
    hostPath: /usr/share/vulkan/implicit_layer.d/nvidia_layers.json
    options:
    - ro
    - nosuid
    - nodev
    - bind
devices:
- containerEdits:
    deviceNodes:
    - path: /dev/nvidia4
    - path: /dev/dri/card5
    - path: /dev/dri/renderD132
    hooks:
    - args:
      - nvidia-ctk
      - hook
      - create-symlinks
      - --link
      - ../card5::/dev/dri/by-path/pci-0000:01:00.0-card
      - --link
      - ../renderD132::/dev/dri/by-path/pci-0000:01:00.0-render
      hookName: createContainer
      path: /usr/bin/nvidia-ctk
    - args:
      - nvidia-ctk
      - hook
      - chmod
      - --mode
      - "755"
      - --path
      - /dev/dri
      hookName: createContainer
      path: /usr/bin/nvidia-ctk
  name: "0"
- containerEdits:
    deviceNodes:
    - path: /dev/nvidia3
    - path: /dev/dri/card4
    - path: /dev/dri/renderD131
    hooks:
    - args:
      - nvidia-ctk
      - hook
      - create-symlinks
      - --link
      - ../card4::/dev/dri/by-path/pci-0000:47:00.0-card
      - --link
      - ../renderD131::/dev/dri/by-path/pci-0000:47:00.0-render
      hookName: createContainer
      path: /usr/bin/nvidia-ctk
    - args:
      - nvidia-ctk
      - hook
      - chmod
      - --mode
      - "755"
      - --path
      - /dev/dri
      hookName: createContainer
      path: /usr/bin/nvidia-ctk
  name: "1"
- containerEdits:
    deviceNodes:
    - path: /dev/nvidia2
    - path: /dev/dri/card3
    - path: /dev/dri/renderD130
    hooks:
    - args:
      - nvidia-ctk
      - hook
      - create-symlinks
      - --link
      - ../card3::/dev/dri/by-path/pci-0000:81:00.0-card
      - --link
      - ../renderD130::/dev/dri/by-path/pci-0000:81:00.0-render
      hookName: createContainer
      path: /usr/bin/nvidia-ctk
    - args:
      - nvidia-ctk
      - hook
      - chmod
      - --mode
      - "755"
      - --path
      - /dev/dri
      hookName: createContainer
      path: /usr/bin/nvidia-ctk
  name: "2"
- containerEdits:
    deviceNodes:
    - path: /dev/nvidia1
    - path: /dev/dri/card2
    - path: /dev/dri/renderD129
    hooks:
    - args:
      - nvidia-ctk
      - hook
      - create-symlinks
      - --link
      - ../card2::/dev/dri/by-path/pci-0000:c2:00.0-card
      - --link
      - ../renderD129::/dev/dri/by-path/pci-0000:c2:00.0-render
      hookName: createContainer
      path: /usr/bin/nvidia-ctk
    - args:
      - nvidia-ctk
      - hook
      - chmod
      - --mode
      - "755"
      - --path
      - /dev/dri
      hookName: createContainer
      path: /usr/bin/nvidia-ctk
  name: "4"
- containerEdits:
    deviceNodes:
    - path: /dev/nvidia1
    - path: /dev/nvidia2
    - path: /dev/nvidia3
    - path: /dev/nvidia4
    - path: /dev/dri/card2
    - path: /dev/dri/card3
    - path: /dev/dri/card4
    - path: /dev/dri/card5
    - path: /dev/dri/renderD129
    - path: /dev/dri/renderD130
    - path: /dev/dri/renderD131
    - path: /dev/dri/renderD132
    hooks:
    - args:
      - nvidia-ctk
      - hook
      - create-symlinks
      - --link
      - ../card5::/dev/dri/by-path/pci-0000:01:00.0-card
      - --link
      - ../renderD132::/dev/dri/by-path/pci-0000:01:00.0-render
      hookName: createContainer
      path: /usr/bin/nvidia-ctk
    - args:
      - nvidia-ctk
      - hook
      - chmod
      - --mode
      - "755"
      - --path
      - /dev/dri
      hookName: createContainer
      path: /usr/bin/nvidia-ctk
    - args:
      - nvidia-ctk
      - hook
      - create-symlinks
      - --link
      - ../card4::/dev/dri/by-path/pci-0000:47:00.0-card
      - --link
      - ../renderD131::/dev/dri/by-path/pci-0000:47:00.0-render
      hookName: createContainer
      path: /usr/bin/nvidia-ctk
    - args:
      - nvidia-ctk
      - hook
      - create-symlinks
      - --link
      - ../card3::/dev/dri/by-path/pci-0000:81:00.0-card
      - --link
      - ../renderD130::/dev/dri/by-path/pci-0000:81:00.0-render
      hookName: createContainer
      path: /usr/bin/nvidia-ctk
    - args:
      - nvidia-ctk
      - hook
      - create-symlinks
      - --link
      - ../card2::/dev/dri/by-path/pci-0000:c2:00.0-card
      - --link
      - ../renderD129::/dev/dri/by-path/pci-0000:c2:00.0-render
      hookName: createContainer
      path: /usr/bin/nvidia-ctk
  name: all
kind: nvidia.com/gpu
CDI device listing:
INFO[0000] Found 5 CDI devices
nvidia.com/gpu=0
nvidia.com/gpu=1
nvidia.com/gpu=2
nvidia.com/gpu=4
nvidia.com/gpu=all
  • Rootful Docker version 26.0.0, build 2ae903e
  • Rootless Docker version 26.0.0, build 2ae903e (install script)

The issue:
With no-cgroups = false, CDI injection works fine for the regular Docker instance:

$ docker run --rm -ti --runtime=nvidia -e NVIDIA_VISIBLE_DEVICES=nvidia.com/gpu=all ubuntu nvidia-smi -L
GPU 0: NVIDIA A100-SXM4-40GB (UUID: GPU-b6022b4d-71db-8f15-15de-26a719f6b3e1)
GPU 1: NVIDIA A100-SXM4-40GB (UUID: GPU-22420f7d-6edb-e44a-c322-4ce539cade19)
GPU 2: NVIDIA A100-SXM4-40GB (UUID: GPU-5e3444e2-8577-0e99-c6ee-72f6eb2bd28c)
GPU 3: NVIDIA A100-SXM4-40GB (UUID: GPU-dd1f811d-a280-7e2e-bf7e-b84f7a977cc1)

but produces the following errors for the rootless version:

$ docker run --rm -ti --runtime=nvidia -e NVIDIA_VISIBLE_DEVICES=nvidia.com/gpu=all ubuntu nvidia-smi -L
docker: Error response from daemon: failed to create task for container: failed to create shim task: OCI runtime create failed: could not apply required modification to OCI specification: error modifying OCI spec: failed to inject CDI devices: unresolvable CDI devices nvidia.com/gpu=all: unknown.

Running docker run --rm --gpus all ubuntu nvidia-smi results in the same error as before CDI was set up. This seems to be consistent across all variations listed on the Specialized Configurations for Docker page:

docker: Error response from daemon: failed to create task for container: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error running hook #0: error running hook: exit status 1, stdout: , stderr: Auto-detected mode as 'legacy'
nvidia-container-cli: mount error: failed to add device rules: unable to find any existing device filters attached to the cgroup: bpf_prog_query(BPF_CGROUP_DEVICE) failed: operation not permitted: unknown.

Interestingly, setting no-cgroups = true disables the regular use of GPUs with rootful Docker:

$ docker run --rm --gpus all ubuntu nvidia-smi
Failed to initialize NVML: Unknown Error

but still allows for CDI injections:

$ docker run --rm -ti --runtime=nvidia -e NVIDIA_VISIBLE_DEVICES=nvidia.com/gpu=all ubuntu nvidia-smi -L
GPU 0: NVIDIA A100-SXM4-40GB (UUID: GPU-b6022b4d-71db-8f15-15de-26a719f6b3e1)
GPU 1: NVIDIA A100-SXM4-40GB (UUID: GPU-22420f7d-6edb-e44a-c322-4ce539cade19)
GPU 2: NVIDIA A100-SXM4-40GB (UUID: GPU-5e3444e2-8577-0e99-c6ee-72f6eb2bd28c)
GPU 3: NVIDIA A100-SXM4-40GB (UUID: GPU-dd1f811d-a280-7e2e-bf7e-b84f7a977cc1)

With control groups disabled, the rootless daemon is able to use exposed GPUs as outlined in the Docker docs:

$ docker run -it --rm --gpus '"device=0,2"' ubuntu nvidia-smi
Mon Apr  1 16:33:52 2024
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.161.07             Driver Version: 535.161.07   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA A100-SXM4-40GB          Off | 00000000:01:00.0 Off |                    0 |
| N/A   37C    P0              60W / 275W |      0MiB / 40960MiB |      0%      Default |
|                                         |                      |             Disabled |
+-----------------------------------------+----------------------+----------------------+
|   1  NVIDIA A100-SXM4-40GB          Off | 00000000:81:00.0 Off |                    0 |
| N/A   36C    P0              56W / 275W |      0MiB / 40960MiB |      0%      Default |
|                                         |                      |             Disabled |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|  No running processes found                                                           |
+---------------------------------------------------------------------------------------+

TL;DR
Disabling cgroups allows rootless containers to use exposed GPUs via the regular docker run --gpus flag, but in turn disables the rootful containers' GPU access. Leaving cgroups enabled reverses the effect, as outlined in #85.

With cgroups disabled and NVIDIA CDI in use, rootful Docker can still use GPU injection even though regular --gpus access is barred, while rootless containers can use the exposed GPUs. CDI injection for rootless fails in either case, however.

This seems like a definite improvement, but I'm not sure it's intended behavior. The fact that CDI injection fails with rootless regardless of the cgroup setting leads me to believe this is unintended, unless rootless is not yet supported by NVIDIA CDI.

Any insights would be greatly appreciated!

LukasIAO changed the title from "Rootless Docker OCI: error modifying OCI spec: failed to inject CDI devices: unresolvable CDI devices nvidia.com/gpu=all: unknown." to "Rootless Docker CDI Injection: error modifying OCI spec: failed to inject CDI devices: unresolvable CDI devices nvidia.com/gpu=all: unknown." on Apr 1, 2024
elezar (Member) commented Apr 2, 2024

The error:

docker: Error response from daemon: failed to create task for container: failed to create shim task: OCI runtime create failed: could not apply required modification to OCI specification: error modifying OCI spec: failed to inject CDI devices: unresolvable CDI devices nvidia.com/gpu=all: unknown.

This indicates that rootless Docker cannot find the CDI specifications that were generated. As far as I am aware, rootless Docker modifies the paths used for /etc (and other locations), and this could be what is causing issues here for the runtime.

Since you're using a Docker version that supports CDI (as an opt-in feature, I believe), could you try the native CDI injection here?

Running:

nvidia-ctk runtime configure --runtime=docker --cdi.enabled

and restarting the docker daemon should enable this feature. (Note that the command may need to be adjusted for rootless mode to specify the config file path explicitly as per https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html#rootless-mode).
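For rootless mode, that would look something like the following sketch (assuming the rootless daemon's config lives at the default $HOME/.config/docker/daemon.json and the daemon runs as a systemd user service):

# rootless: write to the user's daemon.json instead of /etc/docker/daemon.json
nvidia-ctk runtime configure --runtime=docker --cdi.enabled --config=$HOME/.config/docker/daemon.json
# restart the rootless daemon
systemctl --user restart docker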

Then with the CDI feature enabled in docker you should be able to run:

$ docker run --rm -ti --device=nvidia.com/gpu=all ubuntu nvidia-smi -L

and have the devices injected without using the nvidia runtime.

LukasIAO (Author) commented Apr 2, 2024

Hey @elezar, thank you for taking the time!

CDI injection seems to be a mainline feature in Docker 26.0.0. Though it is still experimental, it no longer requires the user to set DOCKER_CLI_EXPERIMENTAL, as was the case in 25.x.

The native injection worked on rootful after configuring the daemon as suggested, though the rootless Docker still runs into issues as listed below.

Before applying the suggested configurations I tested the following on rootless:

$ docker run --rm -ti --device=nvidia.com/gpu=all ubuntu nvidia-smi -L
docker: Error response from daemon: could not select device driver "cdi" with capabilities: [].
ERRO[0000] error waiting for container: context canceled

$ docker run --rm -ti --runtime=nvidia --device=nvidia.com/gpu=all ubuntu nvidia-smi -L
docker: Error response from daemon: could not select device driver "cdi" with capabilities: [].

$ docker run --rm -ti --runtime=nvidia -e NVIDIA_VISIBLE_DEVICES=nvidia.com/gpu=all ubuntu nvidia-smi -L
docker: Error response from daemon: failed to create task for container: failed to create shim task: OCI runtime create failed: could not apply required modification to OCI specification: error modifying OCI spec: failed to inject CDI devices: unresolvable CDI devices nvidia.com/gpu=all: unknown.

After applying the configuration with nvidia-ctk runtime configure --runtime=docker --cdi.enabled --config=$HOME/.config/docker/daemon.json, the daemon.json looks like this:

{
    "features": {
        "cdi": true
    },
    "runtimes": {
        "nvidia": {
            "args": [],
            "path": "nvidia-container-runtime"
        }
    }
}

Restarting Docker and testing the CDI injections again leads to the following, regardless of the cgroup setting:

$ docker run --rm -ti --device=nvidia.com/gpu=all ubuntu nvidia-smi -L
docker: Error response from daemon: CDI device injection failed: unresolvable CDI devices nvidia.com/gpu=all.

$ docker run --rm -ti --runtime=nvidia --device=nvidia.com/gpu=all ubuntu nvidia-smi -L
docker: Error response from daemon: CDI device injection failed: unresolvable CDI devices nvidia.com/gpu=all.

$ docker run --rm -ti --runtime=nvidia -e NVIDIA_VISIBLE_DEVICES=nvidia.com/gpu=all ubuntu nvidia-smi -L
docker: Error response from daemon: failed to create task for container: failed to create shim task: OCI runtime create failed: could not apply required modification to OCI specification: error modifying OCI spec: failed to inject CDI devices: unresolvable CDI devices nvidia.com/gpu=all: unknown.

I checked the configured CDI spec locations for both Docker instances:

rootless docker info:
Client:
 Version:    26.0.0
 Context:    rootless
 Debug Mode: false
 Plugins:
  buildx: Docker Buildx (Docker Inc.)
    Version:  v0.13.1
    Path:     /usr/libexec/docker/cli-plugins/docker-buildx
  compose: Docker Compose (Docker Inc.)
    Version:  v2.5.0
    Path:     /usr/libexec/docker/cli-plugins/docker-compose

Server:
 Containers: 4
  Running: 0
  Paused: 0
  Stopped: 4
 Images: 3
 Server Version: 26.0.0
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Using metacopy: false
  Native Overlay Diff: false
  userxattr: true
 Logging Driver: json-file
 Cgroup Driver: systemd
 Cgroup Version: 2
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local splunk syslog
 CDI spec directories:
  /etc/cdi
  /var/run/cdi
 Swarm: inactive
 Runtimes: nvidia runc io.containerd.runc.v2
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 7c3aca7a610df76212171d200ca3811ff6096eb8
 runc version: v1.1.12-0-g51d5e94
 init version: de40ad0
 Security Options:
  seccomp
   Profile: builtin
  rootless
  cgroupns
 Kernel Version: 5.15.0-1047-nvidia
 Operating System: Ubuntu 22.04.4 LTS
 OSType: linux
 Architecture: x86_64
 CPUs: 128
 Total Memory: 503.5GiB
 Name: DGX-Station-A100-920-23487-2530-0R0
 ID: 48ae789a-3d2d-43d8-841a-9a34c9bdc46e
 Docker Root Dir: /home/ver23371/.local/share/docker
 Debug Mode: false
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false
 Product License: Community Engine

WARNING: No cpu cfs quota support
WARNING: No cpu cfs period support
WARNING: No cpu shares support
WARNING: No cpuset support
WARNING: No io.weight support
WARNING: No io.weight (per device) support
WARNING: No io.max (rbps) support
WARNING: No io.max (wbps) support
WARNING: No io.max (riops) support
WARNING: No io.max (wiops) support
rootful docker info:
Client: Docker Engine - Community
 Version:    26.0.0
 Context:    default
 Debug Mode: false
 Plugins:
  buildx: Docker Buildx (Docker Inc.)
    Version:  v0.13.1
    Path:     /usr/libexec/docker/cli-plugins/docker-buildx
  compose: Docker Compose (Docker Inc.)
    Version:  v2.5.0
    Path:     /usr/libexec/docker/cli-plugins/docker-compose

Server:
 Containers: 8
  Running: 0
  Paused: 0
  Stopped: 8
 Images: 52
 Server Version: 26.0.0
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Using metacopy: false
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: systemd
 Cgroup Version: 2
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local splunk syslog
 CDI spec directories:
  /etc/cdi
  /var/run/cdi
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 nvidia runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: ae07eda36dd25f8a1b98dfbf587313b99c0190bb
 runc version: v1.1.12-0-g51d5e94
 init version: de40ad0
 Security Options:
  apparmor
  seccomp
   Profile: builtin
  cgroupns
 Kernel Version: 5.15.0-1047-nvidia
 Operating System: Ubuntu 22.04.4 LTS
 OSType: linux
 Architecture: x86_64
 CPUs: 128
 Total Memory: 503.5GiB
 Name: DGX-Station-A100-920-23487-2530-0R0
 ID: a59ada2d-f489-4072-9c54-4d7a3efa0906
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false

Both point to:

 CDI spec directories:
  /etc/cdi
  /var/run/cdi

However, it looks like nothing was created under /var/run/cdi. Permissions for nvidia.yaml:

/etc/cdi$ ls -la
total 32
drwxr-xr-x   2 root root  4096 ožu  29 23:22 .
drwxr-xr-x 167 root root 12288 ožu  29 23:22 ..
-rw-r--r--   1 root root 13203 ožu  29 23:22 nvidia.yaml

The Docker docs for enabling CDI devices suggest manually setting the spec location, but it does not seem to make a difference in this case.

{
    "features": {
        "cdi": true
    },
    "cdi-spec-dirs": ["/etc/cdi/", "/var/run/cdi"],
    "runtimes": {
        "nvidia": {
            "args": [],
            "path": "nvidia-container-runtime"
        }
    }
}

elezar self-assigned this Apr 2, 2024
elezar (Member) commented Apr 2, 2024

Could you try generating (or copying) a CDI spec to /var/run/cdi in addition to /etc/cdi and see if this fixes the rootless case?
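Something along these lines should do (a sketch, reusing the generate command from the issue description):

# regenerate the spec directly into /var/run/cdi ...
sudo nvidia-ctk cdi generate --output=/var/run/cdi/nvidia.yaml
# ... or copy the existing one
sudo cp /etc/cdi/nvidia.yaml /var/run/cdi/nvidia.yaml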

LukasIAO (Author) commented Apr 3, 2024

I copied the yaml to /var/run/cdi, restarted both Docker daemons, and tested again. Unfortunately, there was no change in behavior.

/var/run/cdi$ ls -la
total 16
drwxr-xr-x  2 root root    60 tra   3 10:02 .
drwxr-xr-x 51 root root  1580 tra   3 10:02 ..
-rw-r--r--  1 root root 13203 tra   3 10:02 nvidia.yaml

elezar (Member) commented Apr 3, 2024

I think the key is the following: https://github.com/moby/moby/blob/8599f2a3fb884afcbbf1471ec793fbcbc327cd35/cmd/dockerd/docker.go#L65C1-L72C1

I would assume that for the Docker daemon running with RootlessKit, the path where it is trying to resolve the CDI device specifications is not /var/run/cdi or /etc/cdi. It may be good to create an issue (or transfer this one) to https://github.com/moby/moby so that we can get input from the developers there as to where these paths map to.

It may be sufficient to copy the spec file to a location that is readable by the daemon to confirm.
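For what it's worth, the spec directories the daemon reports can be checked via docker info (the same output quoted in the previous comment); note that any rootless path remapping may not show up here:

docker info | grep -A 2 "CDI spec directories"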

Note that plugins are also handled differently for rootless mode: https://github.com/moby/moby/blob/8599f2a3fb884afcbbf1471ec793fbcbc327cd35/pkg/plugins/discovery_unix.go#L11

klueska (Contributor) commented Apr 3, 2024

I wonder if this implies that the "correct" location for rootless is $HOME/.docker/cdi or $HOME/.docker/run/cdi?

LukasIAO (Author) commented Apr 3, 2024

I just tested @klueska's idea by copying the yaml to $HOME/.docker/cdi and $HOME/.docker/run/cdi respectively, and specifying the custom locations in the daemon config:

{
    "features": {
        "cdi": true
    },
    "cdi-spec-dirs": ["/home/username/.docker/cdi/", "/home/username/.docker/run/cdi/"],
    "runtimes": {
        "nvidia": {
            "args": [],
            "path": "nvidia-container-runtime"
        }
    }
}
CDI spec directories:
  /home/username/.docker/cdi/
  /home/username/.docker/run/cdi/

With this change, the native CDI injection does indeed run on rootless.

/.config/docker$ docker run --rm -ti --device=nvidia.com/gpu=all ubuntu nvidia-smi -L
GPU 0: NVIDIA A100-SXM4-40GB (UUID: GPU-b6022b4d-71db-8f15-15de-26a719f6b3e1)
GPU 1: NVIDIA A100-SXM4-40GB (UUID: GPU-22420f7d-6edb-e44a-c322-4ce539cade19)
GPU 2: NVIDIA A100-SXM4-40GB (UUID: GPU-5e3444e2-8577-0e99-c6ee-72f6eb2bd28c)
GPU 3: NVIDIA A100-SXM4-40GB (UUID: GPU-dd1f811d-a280-7e2e-bf7e-b84f7a977cc1)

klueska (Contributor) commented Apr 3, 2024

It's good to know there is a path to making this work. I'd be interested to know if these are the "default" locations if you remove cdi-spec-dirs entirely.

elezar (Member) commented Apr 3, 2024

> It's good to know there is a path to making this work. I'd be interested to know if these are the "default" locations if you remove cdi-spec-dirs entirely.

I would be surprised if this is the case, since IIRC we explicitly set /etc/cdi and /var/run/cdi in the daemon.

LukasIAO (Author) commented Apr 3, 2024

You can see the docker info of the rootless client in my original reply to @elezar. Before specifying it explicitly, I wanted to check where the client was looking for the config. Once CDI is enabled, both rootless and rootful seem to default to:

CDI spec directories:
  /etc/cdi
  /var/run/cdi

The choice of $HOME/.docker/cdi seemed fitting, however.

klueska (Contributor) commented Apr 3, 2024

That seems like a bug that should be filed against moby/docker then.

LukasIAO (Author) commented Apr 3, 2024

It might also be worth noting in the CDI documentation that a rootless Docker daemon requires the yaml to be generated or moved to a location the daemon has access to, wherever that may end up being. A rough summary of what worked here is sketched below.
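For reference, a minimal sketch of the workaround from this thread (assuming the rootless daemon reads $HOME/.config/docker/daemon.json and substituting the real username; generating the spec directly into the directory should be equivalent to copying it there, as was done above):

# put the CDI spec somewhere the rootless daemon can read
mkdir -p $HOME/.docker/cdi
sudo nvidia-ctk cdi generate --output=$HOME/.docker/cdi/nvidia.yaml

# $HOME/.config/docker/daemon.json:
# {
#     "features": { "cdi": true },
#     "cdi-spec-dirs": ["/home/username/.docker/cdi/"],
#     "runtimes": { "nvidia": { "args": [], "path": "nvidia-container-runtime" } }
# }

# restart the rootless daemon and test
systemctl --user restart docker
docker run --rm -ti --device=nvidia.com/gpu=all ubuntu nvidia-smi -L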
