TMPDIR isn't used for committing containers when connecting to a remote Podman service #20839

Open
primeos-work opened this issue Nov 29, 2023 · 3 comments
Labels
kind/bug Categorizes issue or PR as related to a bug. remote Problem is in podman-remote

Comments


primeos-work commented Nov 29, 2023

Issue Description

Podman supports the TMPDIR environment variable to "Set the temporary storage location of downloaded container images. Podman defaults to use /var/tmp."

This works as expected for local invocations, but not when connecting to a Podman service (--remote, --connection, $CONTAINER_HOST, etc.), even though $TMPDIR is set for the Podman service.
Most temporary files still end up in $TMPDIR; only(?) the files prefixed with container_images_storage end up in /var/tmp instead.
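
The expected behavior can be summed up as a one-line fallback. The sketch below only models what the documentation quoted above promises; it is not Podman's actual code. The function name expected_tmpdir is hypothetical, and treating an empty TMPDIR like an unset one is an assumption.

```shell
#!/bin/sh
# Hypothetical helper: models the documented behavior, not Podman's code.
# Prints the directory where temporary image data is expected to land.
expected_tmpdir() {
    # Use $TMPDIR when set and non-empty (assumption), else /var/tmp.
    printf '%s\n' "${TMPDIR:-/var/tmp}"
}

expected_tmpdir
```

With the service environment from this report (TMPDIR=/tmp/podman), every temporary file, including container_images_storage*, would be expected below the directory this prints.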

See containers/image#2197 for technical details.
I first reported this at containers/image (containers/image#2197), since it could (and ideally should) be prevented there as well. However, I guess it should mainly be fixed on Podman's side (see containers/image#2197 (comment)) by setting sys.BigFilesTemporaryDir to $TMPDIR for that code path as well.

I've hit this issue on a RHEL9 system and reproduced this on a Fedora 38 system with Podman 4.8.0 using the steps below.

Some relevant output to confirm that the setup should be correct:

[michael@groot ~]$ echo "$TMPDIR"
/tmp/podman
[michael@groot ~]$ systemctl --user cat podman.service | tail -n3
# /home/michael/.config/systemd/user/podman.service.d/override.conf
[Service]
Environment="TMPDIR=/tmp/podman"
[michael@groot ~]$ podman --version
podman version 4.8.0
[michael@groot ~]$ podman --remote info | grep -e imageCopyTmpDir -e "/var/tmp" -e APIVersion
  imageCopyTmpDir: /tmp/podman
  APIVersion: 4.8.0

cc @mtrmac (FYI / due to the other issue in the c/image repo)

Steps to reproduce the issue

  1. Optional: Podman 4.8.0 is currently in testing so I used this command: dnf install podman --enablerepo=updates-testing,updates-testing-modular --best
  2. Start/enable the Podman service/socket: systemctl --user start podman.socket
  3. Set $TMPDIR for the Podman service and in the shell environment (the latter likely shouldn't matter)
  4. Build a simple (ideally somewhat big) image and watch /var/tmp for files (find /var/tmp/ -maxdepth 1 -name "container_images_storage*")

Note: I used a Fedora 38 system for testing.
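
Step 4 can be scripted. This is a minimal sketch that wraps the find command from step 4 in a function; the name list_storage_tmp is made up for this example. It only takes a snapshot, so it has to be run repeatedly while the build is in progress.

```shell
#!/bin/sh
# Hypothetical helper: list container_images_storage* entries directly
# below a directory (default: /var/tmp), using the find call from step 4.
list_storage_tmp() {
    dir="${1:-/var/tmp}"
    find "$dir/" -maxdepth 1 -name 'container_images_storage*' 2>/dev/null
}

# Any output while a remote build is running means the bug was hit.
list_storage_tmp /var/tmp || echo "could not scan /var/tmp" >&2
```

Passing the directory as an argument also makes it easy to confirm that the files are missing from $TMPDIR at the same time.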

Alternatively, one can temporarily prevent Podman from writing to /var/tmp, which triggers an error like the following:

Error: committing container for step {Env:[PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin] Command:run Args:[apt update] Flags:[] Attrs:map[] Message:RUN apt update Heredocs:[] Original:RUN apt update}: copying layers and metadata for container "475cfa0613faf663f0b20fa094f96f6cf4b0c7bf5a154429abc7c785e5b54c1c": initializing destination containers-storage:[overlay@/home/michael/.local/share/containers/storage+/run/user/1000/containers]docker.io/library/cd2cce4951a9345feeeedf577f0e69161821c27b779b798fe9ee9abad11e96ae-tmp:latest: creating a temporary directory: mkdir /var/tmp/container_images_storage141965151: permission denied

Describe the results you received

Podman creates files/directories in /var/tmp. This became an issue on a RHEL9 system where /var/tmp is on a tmpfs and the container images/layers are so large that they do not fit (RAM+swap too small).

Describe the results you expected

All temporary files should end up in $TMPDIR.

podman info output

host:
  arch: amd64
  buildahVersion: 1.33.2
  cgroupControllers:
  - cpu
  - memory
  - pids
  cgroupManager: systemd
  cgroupVersion: v2
  conmon:
    package: conmon-2.1.7-2.fc38.x86_64
    path: /usr/bin/conmon
    version: 'conmon version 2.1.7, commit: '
  cpuUtilization:
    idlePercent: 99.84
    systemPercent: 0.12
    userPercent: 0.04
  cpus: 4
  databaseBackend: boltdb
  distribution:
    distribution: fedora
    version: "38"
  eventLogger: journald
  freeLocks: 2046
  hostname: groot
  idMappings:
    gidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
    uidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
  kernel: 6.5.10-200.fc38.x86_64
  linkmode: dynamic
  logDriver: journald
  memFree: 1229758464
  memTotal: 8273833984
  networkBackend: cni
  networkBackendInfo:
    backend: cni
    dns: {}
    package: containernetworking-plugins-1.3.0-2.fc38.x86_64
    path: /usr/libexec/cni
  ociRuntime:
    name: crun
    package: crun-1.11.2-1.fc38.x86_64
    path: /usr/bin/crun
    version: |-
      crun version 1.11.2
      commit: ab0edeef1c331840b025e8f1d38090cfb8a0509d
      rundir: /run/user/1000/crun
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +LIBKRUN +WASM:wasmedge +YAJL
  os: linux
  pasta:
    executable: /usr/bin/pasta
    package: passt-0^20231107.g56d9f6d-1.fc38.x86_64
    version: |
      pasta 0^20231107.g56d9f6d-1.fc38.x86_64-pasta
      Copyright Red Hat
      GNU General Public License, version 2 or later
        <https://www.gnu.org/licenses/old-licenses/gpl-2.0.html>
      This is free software: you are free to change and redistribute it.
      There is NO WARRANTY, to the extent permitted by law.
  remoteSocket:
    exists: true
    path: /run/user/1000/podman/podman.sock
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: true
    seccompEnabled: true
    seccompProfilePath: /usr/share/containers/seccomp.json
    selinuxEnabled: true
  serviceIsRemote: true
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: slirp4netns-1.2.2-1.fc38.x86_64
    version: |-
      slirp4netns version 1.2.2
      commit: 0ee2d87523e906518d34a6b423271e4826f71faf
      libslirp: 4.7.0
      SLIRP_CONFIG_VERSION_MAX: 4
      libseccomp: 2.5.3
  swapFree: 8272998400
  swapTotal: 8273260544
  uptime: 293h 14m 19.00s (Approximately 12.21 days)
  variant: ""
plugins:
  authorization: null
  log:
  - k8s-file
  - none
  - passthrough
  - journald
  network:
  - bridge
  - macvlan
  - ipvlan
  volume:
  - local
registries:
  search:
  - registry.fedoraproject.org
  - registry.access.redhat.com
  - docker.io
  - quay.io
store:
  configFile: /home/michael/.config/containers/storage.conf
  containerStore:
    number: 0
    paused: 0
    running: 0
    stopped: 0
  graphDriverName: overlay
  graphOptions: {}
  graphRoot: /home/michael/.local/share/containers/storage
  graphRootAllocated: 80520151040
  graphRootUsed: 32686235648
  graphStatus:
    Backing Filesystem: xfs
    Native Overlay Diff: "true"
    Supports d_type: "true"
    Supports shifting: "false"
    Supports volatile: "true"
    Using metacopy: "false"
  imageCopyTmpDir: /tmp/podman
  imageStore:
    number: 24
  runRoot: /run/user/1000/containers
  transientStore: false
  volumePath: /home/michael/.local/share/containers/storage/volumes
version:
  APIVersion: 4.8.0
  Built: 1701165510
  BuiltTime: Tue Nov 28 10:58:30 2023
  GitCommit: ""
  GoVersion: go1.20.11
  Os: linux
  OsArch: linux/amd64
  Version: 4.8.0

Podman in a container

No

Privileged Or Rootless

Rootless

Upstream Latest Release

Yes

Additional environment details

A "normal" test/dev VM with Fedora Linux.

Additional information

This only happens when connecting to a Podman service; it doesn't matter whether Podman runs privileged or rootless.

@primeos-work primeos-work added the kind/bug Categorizes issue or PR as related to a bug. label Nov 29, 2023
@github-actions github-actions bot added the remote Problem is in podman-remote label Nov 29, 2023
flouthoc (Collaborator) commented

@primeos-work I think the issue is that the TMPDIR env var is not actually set for the Podman server running as a service. I suspect the problem is in your podman.service file: I see a similar issue when I run the service with no environment defined, and it works fine when I run the service manually in a separate shell using TMPDIR=/tmp ./podman --log-level debug system service --time 0

Can you share the entire service file?

If it helps, there is an image_copy_tmp_dir field that can be set directly in containers.conf.

primeos-work (Author) commented

@flouthoc thanks a lot for the quick reply and trying to reproduce this issue! :)

That's weird that it seems to work in your case. Are you sure that no "container_images_storage*" files ended up in /var/tmp?
I did carefully check that the TMPDIR env var is set for the Podman service process. I did the following checks to be sure:

  • I observed that the other temporary files (auth.json.*, buildah*, and libpod_builder*) end up in $TMPDIR
  • podman --remote info shows imageCopyTmpDir: /tmp/podman instead of imageCopyTmpDir: /var/tmp
  • I even used this quick and dirty hack to verify that the env var is set: grep -a TMPDIR "/proc/$(pgrep -a podman | grep "system service" | cut -d" " -f1)/environ"
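
The last check can be made a little more robust by splitting the NUL-separated environ data into lines before matching, so that only a variable named exactly TMPDIR is reported. This is a minimal sketch that assumes a Linux /proc filesystem; the function name env_from_environ is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print NAME=value from a /proc/<pid>/environ style
# file (entries are NUL-separated). Returns non-zero if NAME is not set.
env_from_environ() {
    file="$1"
    name="$2"
    tr '\0' '\n' < "$file" | grep -- "^${name}="
}

# Usage (assumes exactly one matching service process):
#   pid=$(pgrep -f 'podman.*system service' | head -n1)
#   env_from_environ "/proc/$pid/environ" TMPDIR
```

For the setup described here, the usage lines would be expected to print TMPDIR=/tmp/podman.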

The best test might be to restrict the user or systemd service from writing to /var/tmp (or use inotify to watch /var/tmp, etc.).

> Can you share the entire service file?

Sure :)

systemctl-cat output (I only added the last file via systemctl-edit)
[michael@groot ~]$ systemctl --user cat podman.socket
# /usr/lib/systemd/user/podman.socket
[Unit]
Description=Podman API Socket
Documentation=man:podman-system-service(1)

[Socket]
ListenStream=%t/podman/podman.sock
SocketMode=0660

[Install]
WantedBy=sockets.target
[michael@groot ~]$ systemctl --user cat podman.service
# /usr/lib/systemd/user/podman.service
[Unit]
Description=Podman API Service
Requires=podman.socket
After=podman.socket
Documentation=man:podman-system-service(1)
StartLimitIntervalSec=0

[Service]
Delegate=true
Type=exec
KillMode=process
Environment=LOGGING="--log-level=info"
ExecStart=/usr/bin/podman $LOGGING system service

[Install]
WantedBy=default.target

# /usr/lib/systemd/user/service.d/10-timeout-abort.conf
# This file is part of the systemd package.
# See https://fedoraproject.org/wiki/Changes/Shorter_Shutdown_Timer.
#
# To facilitate debugging when a service fails to stop cleanly,
# TimeoutStopFailureMode=abort is set to "crash" services that fail to stop in
# the time allotted. This will cause the service to be terminated with SIGABRT
# and a coredump to be generated.
#
# To undo this configuration change, create a mask file:
#   sudo mkdir -p /etc/systemd/user/service.d
#   sudo ln -sv /dev/null /etc/systemd/user/service.d/10-timeout-abort.conf

[Service]
TimeoutStopFailureMode=abort

# /home/michael/.config/systemd/user/podman.service.d/override.conf
[Service]
Environment="TMPDIR=/tmp/podman"
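
For reference, the same drop-in can be created without systemctl edit. This is a sketch under the assumption that user units live below ${XDG_CONFIG_HOME:-$HOME/.config}/systemd/user; the function name write_tmpdir_dropin is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: write a podman.service drop-in that sets TMPDIR.
#   $1: systemd user unit directory, $2: temporary directory to use
write_tmpdir_dropin() {
    unit_dir="$1"
    tmp_dir="$2"
    mkdir -p "$unit_dir/podman.service.d" || return 1
    printf '[Service]\nEnvironment="TMPDIR=%s"\n' "$tmp_dir" \
        > "$unit_dir/podman.service.d/override.conf"
}

# Usage (afterwards, reload systemd and restart the service):
#   write_tmpdir_dropin "${XDG_CONFIG_HOME:-$HOME/.config}/systemd/user" /tmp/podman
#   mkdir -p /tmp/podman
#   systemctl --user daemon-reload
#   systemctl --user restart podman.service
```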

I also added the engine.image_copy_tmp_dir setting to my config to be sure:

[michael@groot ~]$ cat ~/.config/containers/containers.conf
[engine]
image_copy_tmp_dir="/tmp/podman2"

The container_images_storage* directories/files still end up in /var/tmp though.

flouthoc (Collaborator) commented

My bad, I did not actually verify whether the files are being created in a different directory; I only checked the value of imageCopyTmpDir in the output of podman info. I'll have to try again.
