-
Notifications
You must be signed in to change notification settings - Fork 18.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
c8d: Just enough Windows support to run the test suite #46539
Conversation
Added the label 👍 |
0d5edcd
to
7d9b386
Compare
daemon/containerd/image_builder.go
Outdated
// TODO: Do we actually need to put it back aterwards? | ||
// The only place that calls this in-tree is (b *Builder) exportImage and | ||
// that is called from the end of (b *Builder) performCopy which has a | ||
// `defer rwLayer.Release()` pending. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thinking about the WCOW snapshotter implementation in containerd, once you Commit a snapshot, any future changes you make to it will leave it in an inconsistent state. This is because (unlike the graphdriver), Commit updates the snapshot in-place from a RWLayer to a ROLayer. The RWLayer data (sandbox.vhdx) is kept around so that 'Diff' works later, but it's actually impossible from the API to get a new read-write reference to that layer, as the old snapshot key is invalidated, and the new snapshot key can only be used as a parent for new layers.
We get around this here by capturing the mount data before Commit, but I realise now that's more hacky than I intended, as due to the difference with the graphdriver behaviour, anything touching that data will make future Diff
calls inconsistent.
So perhaps rwlayer.Commit()
should imply Release()
, matching containerd's API? That's how it's currently used now, anyway. That's not a container image store-specific change though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
According to the interface Godoc that's true for the Commit
method in general, not only WCOW:
// Commit captures the changes between key and its parent into a snapshot
// identified by name. The name can then be used with the snapshotter's other
// methods to create subsequent snapshots.
//
// A committed snapshot will be created under name with the parent of the
// active snapshot.
//
// After commit, the snapshot identified by key is removed.
Commit(ctx context.Context, name, key string, opts ...Opt) error
So the current Linux implementation already assumes that the layer won't be used anymore after Commit
.
IMO, we can just skip the remount. It would be nice if we could keep the API consistent and make the graphdriver's Commit
also release the layer, but given the single usage and the fact that we're probably not going to have any new code that would use this I don't think it's strictly needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I pushed a commit that removes the remount.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was thinking about rebasing this PR the other day, so thank you for this.
If more non-trivial work is needed for this PR, and I'm doing it, do you have any concerns if I squash that "remount removal" commit into the commit it's modifying ("Temporarily unmount RWLayer during Commit")?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nope, I did it as a separate commit to not mix my changes with your original PR and make it more visible was was changed. Will squash it now.
GetLayerFolders(img *image.Image, rwLayer layer.RWLayer) ([]string, error) | ||
GetLayerFolders(img *image.Image, rwLayer layer.RWLayer, containerID string) ([]string, error) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd be thrilled if someone points out a better thing to pass into this API that can provide what's needed for both graphdriver and snapshotter implementations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think passing the containerID makes sense here. I also did the same for removing the RWLayer field from container struct (that's only a draft though).
7d9b386
to
4860d25
Compare
5a63f11
to
14a28b1
Compare
14a28b1
to
b099f42
Compare
606a279
to
14da6a1
Compare
14da6a1
to
ff8840d
Compare
@TBBle I rebased the PR to get a fresh CI run. |
daemon/containerd/image_builder.go
Outdated
@@ -332,32 +332,19 @@ func (rw *rwlayer) Commit() (_ builder.ROLayer, outErr error) { | |||
}() | |||
|
|||
// Temporarily unmount the layer, required by the containerd windows snapshotter. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is no longer temporary. We also might want to mention that the windowsfilter graphdriver also remounts it afterwards, as it justifies the new final comment paragraph.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Whoops, missed that part of the comment. Fixed it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
windowsfilter graphdriver also remounts it afterwards
Hmmm are you sure about that? I looked into the code and couldn't find it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In fact, there's this code in the image service (so level above the graphdriver):
moby/daemon/images/image_commit.go
Lines 85 to 93 in 83de55b
// TODO: this mount call is not necessary as we assume that TarStream() should | |
// mount the layer if needed. But the Diff() function for windows requests that | |
// the layer should be mounted when calling it. So we reserve this mount call | |
// until windows driver can implement Diff() interface correctly. | |
_, err = rwlayer.Mount(mountLabel) | |
if err != nil { | |
return nil, err | |
} | |
suggesting the opposite - that the Diff actually needs the layer to be mounted.
OTOH, this PR works so it looks like that comment might actually be outdated/wrong?
I had a quick look at the test run, and most of the amazingly few integration test failures on Windows mentioned "file in use", e.g., container restart tests, which suggests the logic around the impedance mismatch between graphdriver and snapshotter might still be incorrect or inconsistent. But it might also be unrelated, if we're leaking a file handle or something. My opinion is that we can improve that by moving containerd-storage to use WithSnapshot for executing containers as described in the original post, rather than mounting and passing-in the mount-point as we do now. But that's a wider question, and I think landing this work first is better, because more tests better. |
@TBBle are you near a computer to do a new rebase and to fix up that comment? We just merged #46634, which makes all the tests green on Linux, and (sorry for the long delay on this PR), we want to merge this PR to get Windows tests "functional" I agree that we may want to look at follow-ups, but at least your work on this PR (thank you SO much) brings us much further in the good direction. Let me know if you have time / opportunity to rebase / push, otherwise I can ask someone to do so 👍 |
* conatinerd => containerd * ROLayer => RWLayer Signed-off-by: Paul "TBBle" Hampson <Paul.Hampson@Pobox.com> Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Signed-off-by: Paul "TBBle" Hampson <Paul.Hampson@Pobox.com> Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
This is consistent with layerStore's CreateRWLayer behaviour. Potentially this can be refactored to avoid creating the -init layer, but as noted in layerStore's initMount, this name may be special, and should be cleared-out all-at-once. Signed-off-by: Paul "TBBle" Hampson <Paul.Hampson@Pobox.com> Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Needed for Diff on Windows. Don't remount it afterwards as the layer is going to be released anyway. Signed-off-by: Paul "TBBle" Hampson <Paul.Hampson@Pobox.com> Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
The existing API ImageService.GetLayerFolders didn't have access to the ID of the container, and once we have that, the snapshotter Mounts API provides all the information we need here. Signed-off-by: Paul "TBBle" Hampson <Paul.Hampson@Pobox.com> Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
That means 'null', not that we can call builder-next on Windows. If and when we do get builder-next going, this will need to be solved properly in some way. Signed-off-by: Paul "TBBle" Hampson <Paul.Hampson@Pobox.com> Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
The actual divergence is due to differences in the snapshotter and graphfilter mount behaviour on Windows, but the snapshotter behaviour is better, so we deal with it here rather than changing the snapshotter behaviour. We're relying on the internals of containerd's Windows mount implementation here. Unless this code flow is replaced, future work is to move getBackingDeviceForContainerdMount into containerd's mount implementation. Signed-off-by: Paul "TBBle" Hampson <Paul.Hampson@Pobox.com> Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
ae478a8
to
e8f4bfb
Compare
Looks indeed that there's some remaining failures (filesystem in use);
We (@rumpl @vvoland) discussed this on Slack, and want to;
Waiting for CI to finish, but once it does, we can go ahead and merge this one. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! 😄
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
graph driver tests are still green; failures with snapshotter are currently expected (see above)
LGTM
let's bring this one in 🎉 ❤️
Thanks @TBBle for making this PR! Sorry it had to wait so long! |
Wooohoooo 🎉❤️🎉 |
Awesome, thank you for fixing and merging. Sorry, I posted my comment yesterday and immediately went to bed, but looks like it wasn't a blocker. Yay! |
Ha! Yes, I'm aware you're in a very different timezone than us, so thought "let's wait a few minutes, and otherwise let's make the changes". We decided to take the risk of a possible race condition, and to carry the last few bits. I also saw @vvoland started to have a look at some of the remaining failures in this PR; (review / input welcome!) |
- What I did
Building upon
#46402#46572 to bring the containerd image-store support on Windows up to something that is worth having tests enabled for.- How I did it
Firstly, cherry-picked #46402 onto the mainline branch, once containerd 1.7.6 had been vendored by #45966. Update: Currently based directly on #46402 until it merges, unless something important lands in mainline first.
Secondly, got the snapshotter tests functional on Windows matching the Linux version. (Two-liner, most of the work was done by @rumpl in #46402).
Violently beat my head againstIteratively worked through failures observed in the integration test suite setup, building the busybox image. As it happens, this step does most of the things we seem to care about, so it was a good test-case.Rebased the results of the above after #46572 landed, which integrated the outcomes of the second paragraph already.
Remaining TODOs to undraft this PR:
containerd-integration
(someone needs to do that for me) and drop the last (0d5edcd) commitci: Add conditional CI for the containerd backed image store #46402ci: split and use matrix to test containerd backed image store #46572 is merged, dropping the first couple of commitsb099f42
)- Hack 1 (53777db)
builder-next specifies the
host
network provider, but that fails on Windows, blocking daemon startup even though we don't currently support builder-next.My current hack is to use
auto
, since that meanscni
if configured and if not, on Windows it falls back tonull
, and on Linux it falls back tohost
.I've currently only done this for Windows with a
runtime.GOOS
test, but I'm inclined to think we should useauto
always, since: it does the right thing on Linux, and if we ever do need a more-complex network setup, CNI is the obvious path. If we ever do get builder-next working for Windows, it'll need to use CNI anyway, even for the equivalent of host networking.- Hack 2 (6e48f48 or
b099f42
)The windowfilter graphdriver is a bit weird: When you "mount" an image from it, it actually only does the volume setup (Prepare and Activate in HCS terms) and then returns the
\\?\Volume{GUID}\
formatted volume name, where other graphdrivers (and all the containerd snapshotters) actually mount the image to a host directory and return that directory.However, starting a container as we do now requires us to pass that
\\?\Volume{GUID}\
-formatted volume name in the OCI spec asRoot.Path
.Normally, containerd doesn't expose that volume name to callers, but we can use the underlying Windows API to get the volume name from the mount point, which relies on internal knowledge of the containerd/mount implementation (earlier implementation of the same idea that I'm less confident in).
The BindFilter approach was pretty messy though, and I didn't like it very much. (There's probably a bunch of overlooked edge cases there too. I have no idea why I see four copies of the bind-mount there, which is another vote against the older BindFilter version.) The
GetLayerMountPath
isn't too bad, it's still poking at internal details, but they're details that containerd's mount package could trivially provide an API to expose, if we go this way.Beyond cleaning it up, an alternative approach would be to not populate
Windows.LayerFolders
norRoot.Path
at all (removing the need for 1ad566e as well) and instead use containerd'sWithSnapshot
andWithSnapshotter
config-functor-things when setting up the container.However, this would, at least on Windows, require us to unmount the directory before starting the container, as this flow makes hcsshim responsible for both mounting of the root layer and deriving
Windows.LayerFolders
from the snapshot. Note that in the latter case, there's a bug where hcsshim'll append to anyWindows.LayerFolders
passed in (0.12 onwards rejects this case explicitly instead), so we must not populate it in this case.nil
is fine, even though it's out-of-spec.This should be feasible as we already have the
conditionalMountOnStart
hook, but code flows after that is called need to be checked to ensure they are correctly handling the case of not mounting the container filesystem locally. For example, I noticed code on the Windows side that checks the same conditions as the currentconditionalMountOnStart
check, and if they match, does a temporary mount of the image; clearly there's some scope for refactoring that to not duplicate the condition around.If it was just Windows, I'd have tried to go with this approach. However, I think it'd be cleaner if we use
WithSnapshot
for both Windows and Linux when executing containers. This would be a larger change, and probably has repercussions I'm unaware of at this time, since I haven't really worked to understand the overarching flow and its rationales. (Most other containerd-calling code appears to useWithNewSnapshot
, but moby currently has a pile of stuff it wants to do to the image first, so I'm not considering that here.)- How to verify it
CI.
This PR has the
containerd-integration
label applied, so the new snapshotter test suite introduced in#46402#46572 will run for this PR (and any PR with that label), giving us both Windows and Linux coverage. As long as the failures in the snapshotter test suite are consistent across platforms (and I didn't break anything), we're probably fine.All of the changes made here (except an interface change while the compiler will validate) are only in the "use snapshotter" code-path, so verification should be low-risk.
If someone wants to take the time and try this out with the WASM runtime, that'd be awesome. I have no particular plans to do so, even though that's what started me on this piece of work, months ago.
- Description for the changelog
Containerd image store on Windows is slightly usable.
- A picture of a cute animal (not mandatory but encouraged)