Unable to do podman pull starting with version 4.4.1 #745

Closed
jfgaudreault-p opened this issue Feb 2, 2023 · 22 comments

Comments

@jfgaudreault-p

jfgaudreault-p commented Feb 2, 2023

When I do

podman pull maptiler/tileserver-gl:v4.4.1

as a non-root user (with this or any newer version of tileserver-gl), I get an error from podman:

Error: copying system image from manifest list: writing blob: adding layer with blob "sha256:a4eaa9047bb6face9dee8828bfb8392f692ae204a4f614865a50c0ddd6f0ad78": processing tar file(potentially insufficient UIDs or GIDs available in user namespace (requested 1516583083:0 for /usr/src/app/node_modules/content-type/HISTORY.md): Check /etc/subuid and /etc/subgid if configured locally and run podman-system-migrate: lchown /usr/src/app/node_modules/content-type/HISTORY.md: invalid argument): exit status 1

This wasn't happening with v4.4.0 or before.
I looked through the podman issue list and tried some of the solutions suggested for similar problems, but so far without success.
Any help on that would be appreciated.

I am also posting this here in case it is related to a change in how the container image is built, since it was working fine before.

I am using podman 4.3.1. I was on 3.4.4 before upgrading and got the same error there.
I am on an Ubuntu 22.04-based distro (PopOS).
The issue was also reproduced on a Mac inside a Linux fedora-coreos-37.20230122.2.0 install, where the same problem happens as non-root.
It works as root, but that is not the recommended way to run podman.
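For readers unfamiliar with the numbers in this error, here is a minimal sketch of the check that is effectively failing. It assumes the usual rootless layout, in which container ID 0 maps to the invoking user and container IDs 1 through N map onto the N subordinate IDs listed in /etc/subuid. The helper names `subid_count` and `id_fits`, the sample user name, and the 65536-wide range are illustrative assumptions, not values taken from the reporter's system.

```shell
#!/bin/sh
# Sum the subordinate ID counts listed for a user in a subuid-style file
# (format: name:start:count, one range per line).
subid_count() {
    # $1 = user name, $2 = path to the file (e.g. /etc/subuid)
    awk -F: -v user="$1" '$1 == user { total += $3 } END { print total + 0 }' "$2"
}

# Under the assumed layout, the highest container ID that can be
# represented equals the total number of subordinate IDs.
id_fits() {
    # $1 = requested container UID/GID, $2 = total subordinate ID count
    [ "$1" -le "$2" ]
}

# Hypothetical sample file with a common default-sized allocation.
sample=$(mktemp)
printf 'user:100000:65536\n' > "$sample"

count=$(subid_count user "$sample")
if id_fits 1516583083 "$count"; then
    echo "UID 1516583083 fits within $count subordinate IDs"
else
    echo "UID 1516583083 does not fit within $count subordinate IDs"
fi
rm -f "$sample"
```

With an allocation of that size, the UID from the error message falls far outside the mappable range, which is consistent with the lchown failure reported above.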

@acalcutt
Collaborator

acalcutt commented Feb 2, 2023

Does the same thing happen with v4.4.4?

As far as I know, the docker build process should be the same for 4.4.0 and 4.4.1.

@jfgaudreault-p
Author

Yes, it does. I originally wanted to use v4.4.4 (yesterday...) to test the X-Forwarded-Host
I tracked the start of the problem down to this specific version (v4.4.1).

@jfgaudreault-p
Author

It looks like some npm package may be trying to set a specific chown for /usr/src/app/node_modules/content-type/HISTORY.md but it seems it can't because of non-root user limits/range.
I tried to change the limits/range, but I will probably need to try again, since my first attempt broke something for podman. I am not very familiar with /etc/subuid and /etc/subgid, so I am not sure of the implications of changing them for my user.

However if someone knows exactly what I should run to set these properly I could try that.

@scara

scara commented Feb 2, 2023

It looks like some npm package may be trying to set a specific chown for /usr/src/app/node_modules/content-type/HISTORY.md but it seems it can't because of non-root user limits/range.

It's something that should be fixed at build time and not at runtime, due to the root-less context.

HTH,
Matteo

@acalcutt
Collaborator

acalcutt commented Feb 2, 2023

Did you try running the 'podman-system-migrate' it suggests?

From https://docs.podman.io/en/latest/markdown/podman-system-migrate.1.html

“Rootless Podman uses a pause process to keep the unprivileged namespaces alive. This prevents any change to the /etc/subuid and /etc/subgid files from being propagated to the rootless containers while the pause process is running.

For these changes to be propagated, it is necessary to first stop all running containers associated with the user and to also stop the pause process and delete its pid file. Instead of doing it manually, podman system migrate can be used to stop both the running containers and the pause process. The /etc/subuid and /etc/subgid files can then be edited or changed with usermod to recreate the user namespace with the newly configured mappings.”
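One way to see whether an edited mapping has actually taken effect is to look at the user namespace's uid_map after migrating. The sketch below checks whether a given container ID is covered by a uid_map-format file (each line holds the first ID inside the namespace, the first ID outside, and the length). The helper name `map_covers` and the sample mapping are assumptions made for illustration.

```shell
#!/bin/sh
# Succeed if the given ID falls inside any range of a uid_map-style file.
# Line format: <first ID inside namespace> <first ID outside> <length>
map_covers() {
    # $1 = container ID to look up, $2 = path to a uid_map-format file
    awk -v id="$1" '
        id + 0 >= $1 + 0 && id + 0 < $1 + $3 { found = 1 }
        END { exit(found ? 0 : 1) }
    ' "$2"
}

# Hypothetical mapping for a user with UID 1000 and 65536 subordinate IDs.
map=$(mktemp)
printf '0 1000 1\n1 100000 65536\n' > "$map"

for id in 0 1000 1516583083; do
    if map_covers "$id" "$map"; then
        echo "container ID $id is mapped"
    else
        echo "container ID $id is NOT mapped"
    fi
done
rm -f "$map"
```

On a real system, the live mapping could be captured with something like `podman unshare cat /proc/self/uid_map` and passed to the same helper.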

@acalcutt
Collaborator

acalcutt commented Feb 2, 2023

Just FYI: the docker image creates a node user at build time and tries to run as that user.

@jfgaudreault-p
Author

Did you try running the 'podman-system-migrate' it suggests?

Yes, I tried that, but it did not work. Podman gave an unexpected error (after which nothing worked at all), so I reverted my changes. I'll try again differently and see.

@acalcutt
Collaborator

acalcutt commented Feb 2, 2023

Just wondering; I wasn't sure it would help. It kind of sounds like it has run out of UIDs to assign.

@jfgaudreault-p
Author

OK, this time it worked. I edited my subuid and subgid files directly. Previously I used the usermod --add-subuids and usermod --add-subgids commands, which weren't doing what I thought they would (as mentioned here: containers/podman#12715).
The only catch is that I think I have now given it a very big range, since the UID in the error was very high (1516583083):
user:100000:2147383647

I'm not sure whether changing these limits has other implications or could cause other issues in the future.
Anyway, it would be great if podman just worked out of the box as non-root, but I am not sure whether this is for podman to resolve or for tileserver-gl to work around.
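To put the size of that range in perspective, the arithmetic can be sketched as follows. This assumes the name:start:count format of /etc/subuid and the usual rootless layout where container IDs 1 through count are backed by the range. The function name `range_last_id` is hypothetical.

```shell
#!/bin/sh
# Last host ID covered by a subordinate range that begins at "start"
# and contains "count" IDs.
range_last_id() {
    # $1 = start, $2 = count
    echo $(( $1 + $2 - 1 ))
}

start=100000
count=2147383647
requested=1516583083

# The range runs right up to the signed 32-bit limit (2^31 = 2147483648).
echo "last host ID in range: $(range_last_id "$start" "$count")"  # prints 2147483646

# Assumption: container IDs 1..count are mappable, so the requested
# UID fits whenever it does not exceed count.
if [ "$requested" -le "$count" ]; then
    echo "container UID $requested is now mappable"
fi
```

Under the same assumption, any count of at least 1516583083 would have been enough to cover the UID in the error, so the range chosen here is larger than strictly necessary.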

@acalcutt
Collaborator

acalcutt commented Feb 2, 2023

I wonder if we should be setting a consistent UID for the node user the docker image creates. That UID does seem really high... I wonder if that is the UID of the node user we create.

@jfgaudreault-p
Author

I think this would be great, I was thinking that too. Maybe changing the UID somehow.

@acalcutt
Collaborator

acalcutt commented Feb 2, 2023

My thinking is it would be done where we add the user https://github.com/maptiler/tileserver-gl/blob/master/Dockerfile#L59-L60

I have seen that in similar uses before, like https://github.com/nodejs/docker-node/blob/main/docs/BestPractices.md#non-root-user where a uid for the user and group is set in the adduser and addgroup commands.

@scara

scara commented Feb 2, 2023

I wonder if that is the UID of the node user we create.

My thinking is it would be done where we add the user https://github.com/maptiler/tileserver-gl/blob/master/Dockerfile#L59-L60

It's the one podman "randomly" uses regardless of the actual UID provided by the USER instruction and registered via useradd.
https://blog.christophersmart.com/2021/01/26/user-ids-and-rootless-containers-with-podman/ could give a picture of that.

My guess is that we should chown -R node:node the node_modules directory at build time to keep the UID/GID under control.

HTH,
Matteo

@lifeofguenter

Related: projectkudu/kudu#2512 (comment)

I am experiencing this issue as well when pulling the image inside of dind, e.g. to replicate:

$ docker run --privileged -it --rm --entrypoint sh docker:23-dind-rootless -c "dockerd-entrypoint.sh & sleep 10; DOCKER_HOST=unix:///run/user/1000/docker.sock docker pull maptiler/tileserver-gl"

INFO[2023-02-10T17:37:52.429268691Z] Attempting next endpoint for pull after error: failed to register layer: ApplyLayer exit status 1 stdout: stderr: failed to Lchown "/usr/src/app/node_modules/content-type/HISTORY.md" for UID 1516583083, GID 0 (try increasing the number of subordinate IDs in /etc/subuid and /etc/subgid): lchown /usr/src/app/node_modules/content-type/HISTORY.md: invalid argument

whereas v4.4.0 seems to be working:

docker run --privileged -it --rm --entrypoint sh docker:23-dind-rootless -c "dockerd-entrypoint.sh & sleep 10; DOCKER_HOST=unix:///run/user/1000/docker.sock docker pull maptiler/tileserver-gl:v4.4.0"

Status: Downloaded newer image for maptiler/tileserver-gl:v4.4.0
docker.io/maptiler/tileserver-gl:v4.4.0

@acalcutt
Collaborator

Unfortunately I don't have an answer for this. I'd welcome a PR if someone has ideas on how to fix this.

My feeling is that this issue is specific to the user mapping in your local environment, but this type of situation may be improved if we set a specific UID on the node user we create in the Dockerfile. That is a guess, though, based on some other PRs I have seen.

@lifeofguenter

You cannot reproduce it with the command above?

@acalcutt
Collaborator

I can confirm with the command you gave I get the same error.

I tried to replicate it using my own fork, with wifidb/tileserver-gl:v5.2.1-pre.4, which is basically the same https://github.com/acalcutt/tileserver-gl/compare

However when I try with my fork, it works...hmm
docker run --privileged -it --rm --entrypoint sh docker:23-dind-rootless -c "dockerd-entrypoint.sh & sleep 10; DOCKER_HOST=unix:///run/user/1000/docker.sock docker run wifidb/tileserver-gl:v5.2.1-pre.4"

acalcutt added a commit that referenced this issue Feb 21, 2023
@acalcutt
Collaborator

acalcutt commented Feb 21, 2023

I think I figured out a fix for this in v4.4.7. It seems the ownership of the content-type dependency files (inside the docker image) changed between 4.4.0 and 4.4.1 (through 4.4.6). For some reason those files went from being owned by root to being owned by a high UID, 1516583083. Only the files in this folder seem to have this high UID. At first I thought this might be caused by strangeness in the GitHub workflow; however, in testing I found it happened even when I built the image locally on Ubuntu.

I was looking into the error message
failed to Lchown "/usr/src/app/node_modules/content-type/HISTORY.md" for UID 1516583083, GID 0 (try increasing the number of subordinate IDs in /etc/subuid and /etc/subgid): lchown /usr/src/app/node_modules/content-type/HISTORY.md: invalid argument.

and decided to try and look at the permissions by bypassing the entrypoint.

For v4.4.0 I ran
docker run -it --entrypoint /bin/bash maptiler/tileserver-gl:v4.4.0
and went and looked at the permissions in /usr/src/app/node_modules/content-type/
image
you can see, everything is owned by root.

However, if you look at v4.4.1 (through v4.4.6), you can see everything in that folder is owned by a high UID user
docker run -it --entrypoint /bin/bash maptiler/tileserver-gl:v4.4.1
image

In v4.4.7 I added a chown to the Dockerfile that sets the files in /usr/src/app/ to be owned by root, and this seems to fix the issue.
docker run -it --entrypoint /bin/bash maptiler/tileserver-gl:v4.4.7
image

docker run --privileged -it --rm --entrypoint sh docker:23-dind-rootless -c "dockerd-entrypoint.sh & sleep 10; DOCKER_HOST=unix:///run/user/1000/docker.sock docker run maptiler/tileserver-gl:v4.4.7"
image
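The manual inspection described above could also be scripted. The following is a rough sketch that lists every file under a directory whose numeric owner UID exceeds a threshold. It relies on GNU find's `-printf` option, so it may not work with other find implementations. The function name `files_above_uid` and the 65536 threshold in the usage note are assumptions.

```shell
#!/bin/sh
# List files under a directory whose numeric owner UID exceeds a threshold.
# Requires GNU find (for -printf); %U is the numeric owner, %p the path.
files_above_uid() {
    # $1 = directory to scan, $2 = UID threshold
    find "$1" -printf '%U %p\n' | awk -v max="$2" '
        # Keep entries above the threshold and strip the leading UID field.
        $1 + 0 > max + 0 { sub(/^[0-9]+ /, ""); print }
    '
}
```

Run from a shell inside the image (for example one started with the --entrypoint /bin/bash commands above), `files_above_uid /usr/src/app 65536` should print nothing for an image that is safe to pull with a default-sized subordinate ID range.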

@acalcutt
Collaborator

acalcutt commented Feb 21, 2023

I found this article that mentions this issue
https://azureossd.github.io/2022/06/30/Docker-User-Namespace-remapping-issues/
(see the "NPM based projects causing userns remap exceptions" section)

@acalcutt
Collaborator

@jfgaudreault-p can you confirm this is working again in v4.4.7?

@jfgaudreault-p
Author

I had to revert the extended UID ranges in /etc to undo my earlier workaround, and yes, I can confirm that the new 4.4.7 image no longer gives the error! (I also confirmed that the older one gave the error again after I reverted my workaround.)

Thanks a lot!

@acalcutt
Collaborator

acalcutt commented Mar 1, 2023

Good to hear. I'll close this issue for now.

v4.4.8 was just released, which should fix this issue in the light version too. There is also another SIGTERM fix that is related to podman. If you have any issues with the new version, let us know.

@acalcutt acalcutt closed this as completed Mar 1, 2023