Checkpointing of Wasm container with podman+crun fails : Can't lookup mount #1204

Open
mh4ck-Thales opened this issue May 5, 2023 · 4 comments

Comments

@mh4ck-Thales

This is the same issue as checkpoint-restore/criu#2170. I'm opening it here on the advice of @adrianreber, who thinks this issue is related to the Wasm implementation in crun rather than a problem within criu.

Description

When trying to checkpoint a wasm container started with podman + crun with wasmedge support, the checkpointing fails with an error like:

Error (criu/files-reg.c:1710): Can't lookup mount=476 for fd=-3 path=/
Error (criu/cr-dump.c:1524): Collect mappings (pid: 5571) failed with -1

This happens on both Fedora 38 (btrfs) and Debian 11 (ext4), both up to date. On both OSes the error at the end of the dump.log file is the same, except for the mount number and pid.

Steps to reproduce the issue:

  1. Create a wasm app. The easiest way is to create a Rust app with a simple infinite loop and compile it for wasm:

     cargo new app && cd app
     rustup target add wasm32-wasi
     echo 'fn main() { loop { println!("Hello Wasm");}}' > src/main.rs
     cargo build --target wasm32-wasi

  2. Create the wasm container from this Containerfile:

     FROM scratch
     COPY target/wasm32-wasi/debug/app.wasm /app.wasm
     CMD ["/app.wasm"]

     And build it with:

     podman build -t demo-wasm --platform wasi/wasm .

  3. Start this container in the background:

     podman run --platform wasi/wasm --name demo-wasm-1 -d localhost/demo-wasm

     You can check that it is running with podman logs demo-wasm-1. You should see a lot of "Hello Wasm" printed.

  4. Try to checkpoint this container with:

     podman container checkpoint demo-wasm-1

     And notice that it fails.

Describe the results you received:
The checkpointing of the container fails

Describe the results you expected:
The checkpointing succeeds

Additional information you deem important (e.g. issue happens only occasionally):

The issue happens with even the simplest Wasm container. I was able to checkpoint and restore normal containers (debian and others) on the same machine without any issue.

logs and information:

Output of podman container checkpoint command :

2023-05-05T14:21:43.243762Z: CRIU checkpointing failed -52.  Please check CRIU logfile /var/lib/containers/storage/overlay-containers/ec5ef8e9db19f3840bfc9357687935de4f7610448552a2be9ab611f2cbd3742e/userdata/dump.log
Error: `/usr/bin/crun-wasm checkpoint --image-path /var/lib/containers/storage/overlay-containers/ec5ef8e9db19f3840bfc9357687935de4f7610448552a2be9ab611f2cbd3742e/userdata/checkpoint --work-path /var/lib/containers/storage/overlay-containers/ec5ef8e9db19f3840bfc9357687935de4f7610448552a2be9ab611f2cbd3742e/userdata ec5ef8e9db19f3840bfc9357687935de4f7610448552a2be9ab611f2cbd3742e` failed: exit status 1

dump.log file is attached :

dump.log

Output of `criu --version`:

Version: 3.17.1

Output of `criu check --all`:

Looks good but some kernel features are missing
which, depending on your process tree, may cause
dump or restore failure.

Podman version 4.5.0

crun --version :

crun version 1.8.4
commit: 5a8fa99a5e41facba2eda4af12fa26313918805b
rundir: /run/crun
spec: 1.0.0
+SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +LIBKRUN +WASM:wasmedge +YAJL

Additional environment details:

Tried on both Fedora 38 (btrfs) and Debian 11 (ext4) in VMs. CRIU was installed from the respective package managers. Outputs are from the Fedora machine. Both crun builds were using wasmedge as the wasm runtime, but I'll check whether the issue is also present with other wasm runtimes such as wasmtime and wasmer.

@adrianreber
Contributor

So this does not seem to be trivial. We are not actually trying to checkpoint the wasm application, but the wasm runtime.

Not sure it makes sense to be able to checkpoint the wasm runtime. The easier solution would be if the wasm runtime supports checkpointing because the process we are trying to checkpoint is not running directly on Linux.

One could say we should be able to checkpoint the runtime. That could be possible.

From my understanding there is a mount in the runtime process CRIU cannot handle. The log file says:

(00.020251) Error (criu/files-reg.c:1710): Can't lookup mount=448 for fd=-3 path=/

I am able to reproduce this locally. On my system the last mount ID (from /proc/self/mountinfo) is, in this example, 447. The mountinfo of the process in the container looks something like this:

533 449 0:86 / / rw,relatime - overlay overlay ....

So, somewhere during setup of the wasm runtime, the mount ID 448 gets lost.
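
One rough way to repeat this check by hand is to read the mount ID recorded for an open file and see whether that ID appears in the mountinfo of the same process. The sketch below assumes the usual /proc text formats (a `mnt_id:` line in fdinfo, the mount ID as the first field of each mountinfo line); the helper names `fd_mount_id` and `has_mount_id` are hypothetical and not part of CRIU or crun.

```shell
# Hypothetical helpers; the names are illustrative and not part of CRIU or crun.

# Print the mnt_id recorded in fdinfo-format text read from stdin
# (for example the contents of /proc/<pid>/fdinfo/<fd>).
fd_mount_id() {
    awk '$1 == "mnt_id:" { print $2 }'
}

# Succeed if mount ID $1 is the first field of some line in
# mountinfo-format text read from stdin (for example /proc/<pid>/mountinfo).
has_mount_id() {
    awk -v id="$1" '$1 == id { found = 1 } END { exit !found }'
}
```

For example, `has_mount_id 448 < /proc/<pid>/mountinfo || echo "mount 448 is not visible"` would report the situation described above, with `<pid>` standing for the runtime process.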

But even if we are able to configure the mounts correctly I am not sure checkpoint/restore will be easily possible on this process. All the libraries used by the runtime process are only available on the host system and not in the container.

I do not understand enough of how the wasm runtime is configured or how it works or if there is anything missing during the setup of the runtime.

I tried to checkpoint the wasm runtime (wasmedge /home/app/target/wasm32-wasi/debug/app.wasm) without crun and that worked. So it should be doable, but with the way it is currently set up I am not sure it can work. The most confusing thing for me right now, besides the mount ID, is that the runtime libraries are not part of the container, which means that the restore will depend on the exact version of the host libraries installed and not on the content of the container.

@Snorch do you have any idea why the mount ID is not visible in the container? Any suggestions for what could be done to solve this?

@Snorch

Snorch commented May 8, 2023

> All the libraries used by the runtime process are only available on the host system and not in the container.

This is the source of the problem. CRIU does not support dumping external resources.

If an app has a file mapping in memory, CRIU does not save the memory belonging to that mapping in its images; roughly speaking, it relies on being able to recreate the mapping from files in the container filesystem on restore. So if the backing file of a mapping is not available inside the container filesystem, CRIU cannot restore it (e.g. if the mount is not available inside the container filesystem, CRIU cannot find the file on it).
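
To get a feel for which mappings are affected, one can list the files that back a process's mappings and check which of them are missing from a given root directory. This is only a sketch under assumptions: it reads text in the /proc/<pid>/maps format, treats `$1` of `missing_in_rootfs` as the container root filesystem, and the helper names are hypothetical. How the reported paths relate to the container's view of the filesystem is also an assumption here.

```shell
# Hypothetical helpers, not part of CRIU, crun or podman.

# Print the unique absolute paths that back mappings in maps-format text on stdin.
# Paths containing spaces are truncated at the first space in this simple sketch.
mapped_files() {
    awk '$6 ~ /^\// { print $6 }' | sort -u
}

# Read one absolute path per line on stdin and print those that do not exist
# below the directory given as $1 (assumed to be the container root filesystem).
missing_in_rootfs() {
    while IFS= read -r path; do
        [ -e "$1$path" ] || printf '%s\n' "$path"
    done
}
```

A possible invocation is `mapped_files < /proc/<pid>/maps | missing_in_rootfs <rootfs>`, where `<pid>` and `<rootfs>` are placeholders for the runtime process and the container root directory.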

To support dumping an external resource (file) in a container, each such resource must be specified explicitly via CRIU options: https://github.com/checkpoint-restore/criu/blob/33dd66c6fc93c47213aaa0447a94d97ba1fa56ba/Documentation/criu.txt#L236

@mh4ck-Thales
Author

Thanks for your detailed answers.

@adrianreber :

> Not sure it makes sense to be able to checkpoint the wasm runtime. The easier solution would be if the wasm runtime supports checkpointing because the process we are trying to checkpoint is not running directly on Linux.

Indeed, for the checkpoint / restore use case, only checkpointing the wasm application would suffice. More precisely, we would need to save the binary + its internal state, i.e. the whole content of the wasm runtime virtual memory (some optimizations may be made by saving only parts of the wasm runtime virtual memory, but I do not think the gained performance will be significant, and I do not know the wasm specs well enough to be able to tell if it is even feasible).

The main obstacle I see is that the internal state of the wasm runtime virtual memory may depend on the runtime used. This means that checkpointing and restoring would depend on the wasm runtime embedded within crun, and also on its version (which is, from what I know, not accessible once embedded within crun). It also means that the checkpointing functionality would have to be implemented in all the wasm runtimes that can run in containers (at least those that can be embedded within crun, which are, from what I know, wasmedge, wasmer and wasmtime, with wasmedge being the most actively supported).

> One could say we should be able to checkpoint the runtime. That could be possible.

In fact, I am more interested in checkpointing the whole runtime / container than just the contents of the wasm app. I'm using checkpointing for forensic analysis, and being able to look at the whole runtime instead of only the app seems more useful because it can help us detect a runtime compromise or breakout.

Moreover, when using checkpointing at large scale on containers, as the recent introduction of checkpointing to the Kubernetes world allows, we're looking at automating checkpointing and analysis to help detect compromises within containers. For this kind of automation (and many other use cases that may arise from the democratization of container checkpointing), having the same format to analyze for both classic and wasm containers would be essential. I'm not sure that checkpointing only the wasm application would provide a level of detail and flexibility comparable to checkpointing the whole container or runtime.

Finally, a wasm container can contain more than just the wasm app: configuration files, storage (database or other), other applications or libraries, and so on. This may make checkpointing only the state of the wasm app less relevant.

> The most confusing thing right now for me, besides the mount ID, is the fact that the runtime libraries are not part of the container which means that the restore will depend on the exact version of the host libraries installed and not on the content of the container.

crun is embedding a wasm runtime (only the core runtime). To be able to run a wasm container, the wasm runtime library (typically libwasmedge.so for wasmedge) must be present on the system. Then, when detecting a wasm runtime, crun will delegate the running of the container to this wasm runtime, instead of running it directly on the host system as it is happening with classic containers.

Indeed, this is unlike classic containers, where all the binaries needed to run the container (coreutils and more) are present within the container image and the host only needs to provide access to its kernel. For wasm, an external library (and more?) is used.

I don't know how container checkpointing works internally, whether it is supposed to depend on the crun or libcrun.so versions, or even whether you can checkpoint a crun container and restore it on runc or another container runtime. Answers to these questions may guide us on how to implement checkpointing for a wasm container and whether it needs to be compatible between runtimes and runtime versions. This should also be discussed with those who created the wasm support for crun, as their view of the inner workings of wasm workload delegation may be enlightening.

@Snorch :

Thanks for the highlights. I only tried to checkpoint the container through podman or crun, and the --external flag doesn't seem available for these tools. I'll try to checkpoint a container using criu directly and specify the path to the wasm library in case this is the failing point.

I'm also wondering how the wasm library is loaded from within the container. Is it mounted inside the container? With the right permissions, namespaces, etc.?

@adrianreber
Contributor

> I don't know how container checkpointing works internally, and if it is supposed to depend on the crun or libcrun.so versions, or even if you can checkpoint a crun container and restore it on runc or another container runtime.

In theory it should be possible to restore a checkpoint from runc with crun, there is nothing runtime specific in the checkpoint. I think it does not work currently, but just because nobody looked into making it possible. I do not think there is a real technical problem.

The main problem, from my point of view, is the libraries in use. If you run something like lsof on a non-wasm container you will see that the process only uses resources from inside the container. For a wasm container, all libraries are on the host and the container is more or less empty besides the actual wasm application.

For CRIU to restore a process, all used libraries must be exactly the same. Not just ABI compatible: all open files must be exactly the same. So if even one used resource (most likely a library) is updated between checkpoint and restore, you cannot restore the process. If all files are in the container they will probably not change.
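
One rough way to notice this kind of host-side drift before attempting a restore is to record checksums of the files the process uses at checkpoint time and verify them later. This is an external sanity check sketched under assumptions, not something CRIU, crun or podman does in this form; the helper names are hypothetical and it relies on `sha256sum` from GNU coreutils.

```shell
# Hypothetical helpers; an external sanity check, not part of CRIU or crun.

# Read one file path per line on stdin and print a sha256sum manifest on stdout.
record_hashes() {
    while IFS= read -r path; do
        sha256sum "$path"
    done
}

# Succeed only if every file listed in the manifest $1 still has the recorded hash.
files_unchanged() {
    sha256sum --status -c "$1"
}
```

The manifest would be written next to the checkpoint, and `files_unchanged <manifest>` run on the host before restoring; a non-zero exit status means at least one listed file changed or disappeared.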

From my point of view it makes no sense to implement wasm application checkpointing and restoring.

I understand what you are trying to do, but to make it work I think it would make more sense to have crun set up wasm in such a way that CRIU does not fail. Restoring would still be difficult if anything changed on the host.

Maybe it would make sense to integrate checkpointing in each wasm runtime, just like the JVM tries to do for faster startup.


3 participants