Using mountopt metacopy=on results in build layers that include everything beneath #640

Closed
joelsmith opened this issue Jun 2, 2020 · 12 comments · Fixed by #1314
@joelsmith

When running buildah on my system, new build layers included everything from layers beneath. When I do a podman save and extract the contents of the two layers, diff shows that most of the files in the two layers are identical. The new layer is slightly larger than the old layer.
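The comparison described above can be sketched in Python. This is only an illustration, assuming the two layers are available as tarballs (for example, extracted from a `podman save` archive). The helper names `file_digests`, `duplicated_files`, and `make_layer` are hypothetical, and the in-memory tarballs stand in for real layers.

```python
import hashlib
import io
import tarfile


def file_digests(layer_tar: bytes) -> dict[str, str]:
    """Map each regular file in a layer tarball to the SHA-256 of its content."""
    digests = {}
    with tarfile.open(fileobj=io.BytesIO(layer_tar)) as tar:
        for member in tar:
            if member.isfile():
                data = tar.extractfile(member).read()
                digests[member.name] = hashlib.sha256(data).hexdigest()
    return digests


def duplicated_files(lower_tar: bytes, upper_tar: bytes) -> list[str]:
    """Return paths whose content is byte-identical in both layers."""
    lower = file_digests(lower_tar)
    upper = file_digests(upper_tar)
    return sorted(path for path, digest in upper.items() if lower.get(path) == digest)


def make_layer(files: dict[str, bytes]) -> bytes:
    """Build a small in-memory tarball so the sketch runs without podman."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


if __name__ == "__main__":
    # Hypothetical stand-ins: the upper layer repeats a file from the lower one.
    lower = make_layer({"bigempty": b"\0" * 1024})
    upper = make_layer({"bigempty": b"\0" * 1024, "touchfile": b""})
    print(duplicated_files(lower, upper))  # ['bigempty']
```

A healthy upper layer should report few or no duplicated paths; a long list suggests the layer was diffed against the wrong parent.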

@nalind helped me track it down to the metacopy=on mountopt. Removing it fixed the issue for me. Here's what he had to say on slack:

Looks like something goes wrong when we try to use the metacopy=on option there. It disables the native diff logic in the overlay driver, and the naive diff logic doesn't seem to react well to metacopy.

buildah info output:

{
    "host": {
        "CgroupVersion": "v1",
        "Distribution": {
            "distribution": "fedora",
            "version": "30"
        },
        "MemTotal": 66861326336,
        "MenFree": 4907630592,
        "OCIRuntime": "runc",
        "SwapFree": 0,
        "SwapTotal": 0,
        "arch": "amd64",
        "cpus": 8,
        "hostname": "xanadu.remote.redhat.com",
        "kernel": "5.5.16-100.fc30.x86_64",
        "os": "linux",
        "rootless": false,
        "uptime": "931h 10m 46.36s (Approximately 38.79 days)"
    },
    "store": {
        "ContainerStore": {
            "number": 19
        },
        "GraphDriverName": "overlay",
        "GraphOptions": [
            "overlay.mountopt=nodev,metacopy=on"
        ],
        "GraphRoot": "/var/lib/containers/storage",
        "GraphStatus": {
            "Backing Filesystem": "btrfs",
            "Native Overlay Diff": "false",
            "Supports d_type": "true",
            "Using metacopy": "true"
        },
        "ImageStore": {
            "number": 16
        },
        "RunRoot": "/var/run/containers/storage"
    }
}

I'm on Fedora 30 with podman-1.8.0-4.fc30.x86_64 and buildah-1.12.0-2.fc30.x86_64
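The two parts of the output above that matter here are `GraphOptions` and `GraphStatus`. As a rough sketch, assuming the JSON shape shown by this version of `buildah info` (other versions may differ), they could be checked like this. The function name `metacopy_warnings` and the message strings are hypothetical.

```python
import json


def metacopy_warnings(info_json: str) -> list[str]:
    """Flag the settings from `buildah info` that are associated with this issue."""
    store = json.loads(info_json).get("store", {})
    warnings = []
    # GraphOptions is assumed to be a list of "key=value" strings, as shown above.
    for option in store.get("GraphOptions", []):
        key, _, value = option.partition("=")
        if key == "overlay.mountopt" and "metacopy=on" in value.split(","):
            warnings.append("overlay.mountopt includes metacopy=on")
    status = store.get("GraphStatus", {})
    if status.get("Native Overlay Diff") == "false":
        warnings.append("native overlay diff is disabled")
    return warnings


if __name__ == "__main__":
    sample = json.dumps({
        "store": {
            "GraphOptions": ["overlay.mountopt=nodev,metacopy=on"],
            "GraphStatus": {"Native Overlay Diff": "false"},
        }
    })
    for warning in metacopy_warnings(sample):
        print(warning)
```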

@rhatdan
Member

rhatdan commented Jun 2, 2020

@rhvgoyal FYI

@rhatdan
Member

rhatdan commented Jul 30, 2020

I don't believe this is true? Could you give me an easy reproducer?

I am running podman on a container, then I do a podman commit.

podman image tree shows me something that looks like:

# podman run alpine echo hello
# podman commit 40adb25965e5 test1
# podman image tree test1
Image ID: 2b9341818629
Tags:     [localhost/test1:latest]
Size:     5.852MB
Image Layers
├──  ID: 50644c29ef5a Size: 5.845MB Top Layer of: [docker.io/library/alpine:latest]
└──  ID: 2b59478d0f6b Size:  5.12kB Top Layer of: [localhost/test1:latest]

@joelsmith
Author

joelsmith commented Jul 30, 2020

I don't know why it works for you and doesn't work for me. I'm using btrfs, so maybe that has something to do with it. I tested 4 configs, and for me it was only broken when running as root with metacopy=on.

| run as \ mount opts | nodev,metacopy=on | nodev |
| --- | --- | --- |
| non-root | ✔️ works | ✔️ works |
| root | ✖️ broken | ✔️ works |
  1. Running this as a normal user, with metacopy=on. Works as expected:
$ grep ^mountopt /etc/containers/storage.conf 
mountopt = "nodev,metacopy=on"

$ cat Dockerfile
FROM gcr.io/google-containers/busybox
RUN dd if=/dev/zero of=/bigempty bs=1000000 count=20

$ buildah bud -t layertest .
STEP 1: FROM gcr.io/google-containers/busybox
STEP 2: RUN dd if=/dev/zero of=/bigempty bs=1000000 count=20
20+0 records in
20+0 records out
STEP 3: COMMIT layertest
Getting image source signatures
Copying blob 5f70bf18a086 skipped: already exists
Copying blob 5f70bf18a086 skipped: already exists
Copying blob 44c2569c4504 skipped: already exists
Copying blob 5f70bf18a086 skipped: already exists
Copying blob d85c9040b0fd done
Copying config 30221dd371 done
Writing manifest to image destination
Storing signatures
30221dd37163182f25d5b268a368a9ac4677214d16fa81dca023975e8afb785d
30221dd37163182f25d5b268a368a9ac4677214d16fa81dca023975e8afb785d

$ podman image tree layertest
Image ID: 30221dd37163
Tags:    [localhost/layertest:latest]
Size:    22.65MB
Image Layers
├──  ID: 5f70bf18a086 Size: 1.024kB
├──  ID: 9b8ee3b34fd5 Size: 1.024kB
├──  ID: b110bf48c2ff Size: 2.644MB
├──  ID: 42a413a59099 Size: 1.024kB Top Layer of: [gcr.io/google-containers/busybox:latest]
└──  ID: cb5a1726463b Size:    20MB Top Layer of: [localhost/layertest:latest]

$ cat Dockerfile2 
FROM layertest
RUN touch /touchfile

$ buildah bud -f Dockerfile2 -t layertest2 .
STEP 1: FROM layertest
STEP 2: RUN touch /touchfile
STEP 3: COMMIT layertest2
Getting image source signatures
Copying blob 5f70bf18a086 skipped: already exists
Copying blob 5f70bf18a086 skipped: already exists
Copying blob 44c2569c4504 skipped: already exists
Copying blob 5f70bf18a086 skipped: already exists
Copying blob d85c9040b0fd skipped: already exists
Copying blob 91b2609df3c1 done
Copying config b84d06b6cc done
Writing manifest to image destination
Storing signatures
b84d06b6ccbabb440e9f5a09c5ca5fa5f4b53f8d92781dab60bbe73b93615005
b84d06b6ccbabb440e9f5a09c5ca5fa5f4b53f8d92781dab60bbe73b93615005

$ podman image tree layertest2
Image ID: b84d06b6ccba
Tags:    [localhost/layertest2:latest]
Size:    22.66MB
Image Layers
├──  ID: 5f70bf18a086 Size: 1.024kB
├──  ID: 9b8ee3b34fd5 Size: 1.024kB
├──  ID: b110bf48c2ff Size: 2.644MB
├──  ID: 42a413a59099 Size: 1.024kB Top Layer of: [gcr.io/google-containers/busybox:latest]
├──  ID: cb5a1726463b Size:    20MB Top Layer of: [localhost/layertest:latest]
└──  ID: d3368b08fb42 Size: 1.536kB Top Layer of: [localhost/layertest2:latest]
  2. Running this as a normal user, without metacopy=on. Works as expected, same output as above, except for the mountopt setting:
$ grep ^mountopt /etc/containers/storage.conf 
mountopt = "nodev"
  3. Running this as root, with metacopy=on. Broken -- two layers with 20 MB:
$ grep ^mountopt /etc/containers/storage.conf 
mountopt = "nodev,metacopy=on"

$ cat Dockerfile
FROM gcr.io/google-containers/busybox
RUN dd if=/dev/zero of=/bigempty bs=1000000 count=20

$ sudo buildah bud -t layertest .
STEP 1: FROM gcr.io/google-containers/busybox
STEP 2: RUN dd if=/dev/zero of=/bigempty bs=1000000 count=20
20+0 records in
20+0 records out
STEP 3: COMMIT layertest
Getting image source signatures
Copying blob 5f70bf18a086 skipped: already exists
Copying blob 5f70bf18a086 skipped: already exists
Copying blob 44c2569c4504 skipped: already exists
Copying blob 5f70bf18a086 skipped: already exists
Copying blob de416675a612 done
Copying config 20b991298d done
Writing manifest to image destination
Storing signatures
20b991298d6a5fa88c43c2f305ecc452947384ecf87482d353d5a2904ee610b2
20b991298d6a5fa88c43c2f305ecc452947384ecf87482d353d5a2904ee610b2

$ sudo podman image tree layertest
Image ID: 20b991298d6a
Tags:    [localhost/layertest:latest]
Size:    22.65MB
Image Layers
├──  ID: 5f70bf18a086 Size: 1.024kB
├──  ID: 9b8ee3b34fd5 Size: 1.024kB
├──  ID: b110bf48c2ff Size: 2.644MB
├──  ID: 42a413a59099 Size: 1.024kB Top Layer of: [gcr.io/google-containers/busybox:latest]
└──  ID: 53cb182a02fd Size:    20MB Top Layer of: [localhost/layertest:latest]

$ cat Dockerfile2 
FROM layertest
RUN touch /touchfile

$ sudo buildah bud -f Dockerfile2 -t layertest2 .
STEP 1: FROM layertest
STEP 2: RUN touch /touchfile
STEP 3: COMMIT layertest2
Getting image source signatures
Copying blob 5f70bf18a086 skipped: already exists
Copying blob 5f70bf18a086 skipped: already exists
Copying blob 44c2569c4504 skipped: already exists
Copying blob 5f70bf18a086 skipped: already exists
Copying blob de416675a612 skipped: already exists
Copying blob 198c444e6085 done
Copying config a29677b8fe done
Writing manifest to image destination
Storing signatures
a29677b8fe600e13404f31622d8228aae4266fd05741ed31e14eac3c6b029a4f
a29677b8fe600e13404f31622d8228aae4266fd05741ed31e14eac3c6b029a4f

$ sudo podman image tree layertest2
Image ID: a29677b8fe60
Tags:    [localhost/layertest2:latest]
Size:    42.66MB
Image Layers
├──  ID: 5f70bf18a086 Size: 1.024kB
├──  ID: 9b8ee3b34fd5 Size: 1.024kB
├──  ID: b110bf48c2ff Size: 2.644MB
├──  ID: 42a413a59099 Size: 1.024kB Top Layer of: [gcr.io/google-containers/busybox:latest]
├──  ID: 53cb182a02fd Size:    20MB Top Layer of: [localhost/layertest:latest]
└──  ID: f16ef1ab634f Size:    20MB Top Layer of: [localhost/layertest2:latest]
  4. Running this as root, without metacopy=on. Works as expected:
$ grep ^mountopt /etc/containers/storage.conf 
mountopt = "nodev"

$ cat Dockerfile
FROM gcr.io/google-containers/busybox
RUN dd if=/dev/zero of=/bigempty bs=1000000 count=20

$ sudo buildah bud -t layertest .
STEP 1: FROM gcr.io/google-containers/busybox
STEP 2: RUN dd if=/dev/zero of=/bigempty bs=1000000 count=20
20+0 records in
20+0 records out
STEP 3: COMMIT layertest
Getting image source signatures
Copying blob 5f70bf18a086 skipped: already exists
Copying blob 5f70bf18a086 skipped: already exists
Copying blob 44c2569c4504 skipped: already exists
Copying blob 5f70bf18a086 skipped: already exists
Copying blob 140f20ed64be done
Copying config b2f693eac3 done
Writing manifest to image destination
Storing signatures
b2f693eac367a8adb39737c17fd04c75dea23b51afcffa8ed17a5dc70d10baa6
b2f693eac367a8adb39737c17fd04c75dea23b51afcffa8ed17a5dc70d10baa6

$ sudo podman image tree layertest
Image ID: b2f693eac367
Tags:    [localhost/layertest:latest]
Size:    22.65MB
Image Layers
├──  ID: 5f70bf18a086 Size: 1.024kB
├──  ID: 9b8ee3b34fd5 Size: 1.024kB
├──  ID: b110bf48c2ff Size: 2.644MB
├──  ID: 42a413a59099 Size: 1.024kB Top Layer of: [gcr.io/google-containers/busybox:latest]
└──  ID: 7ad04eef10f2 Size:    20MB Top Layer of: [localhost/layertest:latest]

$ cat Dockerfile2 
FROM layertest
RUN touch /touchfile

$ sudo buildah bud -f Dockerfile2 -t layertest2 .
STEP 1: FROM layertest
STEP 2: RUN touch /touchfile
STEP 3: COMMIT layertest2
Getting image source signatures
Copying blob 5f70bf18a086 skipped: already exists
Copying blob 5f70bf18a086 skipped: already exists
Copying blob 44c2569c4504 skipped: already exists
Copying blob 5f70bf18a086 skipped: already exists
Copying blob 140f20ed64be skipped: already exists
Copying blob bea99bd119e9 done
Copying config 134a1fb08c done
Writing manifest to image destination
Storing signatures
134a1fb08c192c6d4d6d9c36c7eb5bb1264ac6d06a6afae229af05fc80da7d38
134a1fb08c192c6d4d6d9c36c7eb5bb1264ac6d06a6afae229af05fc80da7d38

$ sudo podman image tree layertest2
Image ID: 134a1fb08c19
Tags:    [localhost/layertest2:latest]
Size:    22.66MB
Image Layers
├──  ID: 5f70bf18a086 Size: 1.024kB
├──  ID: 9b8ee3b34fd5 Size: 1.024kB
├──  ID: b110bf48c2ff Size: 2.644MB
├──  ID: 42a413a59099 Size: 1.024kB Top Layer of: [gcr.io/google-containers/busybox:latest]
├──  ID: 7ad04eef10f2 Size:    20MB Top Layer of: [localhost/layertest:latest]
└──  ID: 21f5446badca Size: 1.536kB Top Layer of: [localhost/layertest2:latest]

@rhatdan
Member

rhatdan commented Aug 3, 2020

Just for completeness, could you give us the output of the podman info command?

@rhatdan
Member

rhatdan commented Aug 3, 2020

@rhvgoyal @nalind Is this just the naive diff grabbing all of the contents because the inodes have changed?

@rhatdan
Member

rhatdan commented Aug 3, 2020

Another thought: is this breakage because Btrfs is the underlying file system, rather than xfs or ext4?

@rhatdan
Member

rhatdan commented Oct 7, 2020

@joelsmith Are you still having this problem?

@joelsmith
Author

joelsmith commented Oct 7, 2020

Yes, if I use nodev,metacopy=on running as root. To avoid it, I just leave off metacopy=on. I didn't notice your comments from Aug 3. Here's my podman info output:

host:
  BuildahVersion: 1.13.1
  CgroupVersion: v1
  Conmon:
    package: conmon-2.0.13-1.fc30.x86_64
    path: /usr/bin/conmon
    version: 'conmon version 2.0.13, commit: 0d76d92618b091af3623e9d4a60889b32fe4bff6'
  Distribution:
    distribution: fedora
    version: "30"
  MemFree: 32708448256
  MemTotal: 66861326336
  OCIRuntime:
    name: runc
    package: runc-1.0.0-102.dev.gitdc9208a.fc30.x86_64
    path: /usr/bin/runc
    version: |-
      runc version 1.0.0-rc10
      commit: ffa084d279c26351e6e63bd2c3f28d43fa1f6e57
      spec: 1.0.1-dev
  SwapFree: 0
  SwapTotal: 0
  arch: amd64
  cpus: 8
  eventlogger: journald
  hostname: xanadu.remote.redhat.com
  kernel: 5.5.16-100.fc30.x86_64
  os: linux
  rootless: false
  uptime: 962h 27m 6.47s (Approximately 40.08 days)
registries:
  search:
  - docker.io
  - registry.fedoraproject.org
  - registry.access.redhat.com
  - registry.centos.org
  - quay.io
store:
  ConfigFile: /etc/containers/storage.conf
  ContainerStore:
    number: 25
  GraphDriverName: overlay
  GraphOptions:
    overlay.mountopt: nodev
  GraphRoot: /var/lib/containers/storage
  GraphStatus:
    Backing Filesystem: btrfs
    Native Overlay Diff: "true"
    Supports d_type: "true"
    Using metacopy: "false"
  ImageStore:
    number: 18
  RunRoot: /var/run/containers/storage
  VolumePath: /var/lib/containers/storage/volumes
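The workaround used here, leaving metacopy=on out of mountopt, could be scripted along these lines. This is a minimal sketch that assumes the `mountopt = "..."` line format shown earlier in this thread for /etc/containers/storage.conf. The function name `drop_metacopy` is hypothetical, and the sketch only transforms text; it does not write the file.

```python
import re

# Matches an uncommented line such as: mountopt = "nodev,metacopy=on"
MOUNTOPT_LINE = re.compile(r'^([ \t]*mountopt[ \t]*=[ \t]*")([^"]*)(")', re.MULTILINE)


def drop_metacopy(conf_text: str) -> str:
    """Return storage.conf text with any metacopy=... entry removed from mountopt."""
    def rewrite(match: re.Match) -> str:
        kept = [
            opt for opt in match.group(2).split(",")
            if opt and not opt.startswith("metacopy=")
        ]
        return match.group(1) + ",".join(kept) + match.group(3)

    return MOUNTOPT_LINE.sub(rewrite, conf_text)


if __name__ == "__main__":
    print(drop_metacopy('mountopt = "nodev,metacopy=on"'))  # mountopt = "nodev"
```

Commented-out lines are left untouched because the pattern requires `mountopt` to be the first non-blank token on the line.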

@nalind
Member

nalind commented Oct 23, 2020

I can reproduce the error @joelsmith is seeing.

buildah v1.12.0 uses storage v1.15.3, and at image-commit time it generates a layer diff by mounting both the child layer and the parent layer, and then walking the two mounted directories to compute the diff. When the overlay driver computes the mount options for mounting the parent layer, it omits the parent layer's "diff" directory from the list of directories in those options, because it is being asked to mount the parent layer read-only. This is wrong: the directories that remain amount to mounting the parent layer's own parent instead, and as a result buildah gets a diff of the child layer relative to that grandparent layer.
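To make the effect concrete, here is a toy model in Python. It is not the storage library's code: layers are plain dictionaries, whiteouts and deletions are ignored, and the names `merged_view` and `naive_diff` are hypothetical. It only shows why leaving the parent's own diff directory out of the parent mount makes the parent's files reappear in the child's diff.

```python
def merged_view(diff_dirs):
    """Overlay-style merge: diff_dirs is ordered lowest first; upper entries win."""
    view = {}
    for diff in diff_dirs:
        view.update(diff)
    return view


def naive_diff(child_view, parent_view):
    """Files that are new or changed in the child relative to the parent view."""
    return {
        path: content
        for path, content in child_view.items()
        if parent_view.get(path) != content
    }


# Toy stand-ins for the layers in the reproducer from this thread.
base = {"/bin/busybox": "busybox"}
layertest = {"/bigempty": "20MB of zeros"}
layertest2 = {"/touchfile": ""}

child = merged_view([base, layertest, layertest2])
correct_parent = merged_view([base, layertest])
buggy_parent = merged_view([base])  # the parent's own diff directory is left out

print(sorted(naive_diff(child, correct_parent)))  # ['/touchfile']
print(sorted(naive_diff(child, buggy_parent)))    # ['/bigempty', '/touchfile']
```

The second result mirrors the broken case reported above, where the top layer of layertest2 was 20MB instead of 1.536kB.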

It looks like we broke it in 1.13.1 and accidentally stopped triggering the bug just before v1.15.8: #519 stopped changing which lists of layers we used for the mountpoint by forcing readWrite to always be true, causing us to skip both of the code paths that were setting up incorrect mount options.

#628 added back code to toggle readWrite again, and fixed one of the code paths that had previously been broken but disabled. If I'm reading it right, when the list of mount options requires us to use a child process to do the mount, we still ignore the contents of the requested layer's diff directory, though, and we need to fix that.

@rhatdan
Member

rhatdan commented Jun 14, 2021

@joelsmith @nalind Was this ever fixed? Can we close this?

@StarpTech

Hi, what's the status here?

@nalind
Member

nalind commented Aug 29, 2022

#1314 should fix this.
