Unix.Unix_error(Unix.EPERM, "io_uring_queue_init", "") failures in Docker #695

Open
kit-ty-kate opened this issue Feb 18, 2024 · 5 comments

@kit-ty-kate
Contributor

Programs that use eio sometimes fail at random when run in a Docker container, with this uncaught exception:

Unix.Unix_error(Unix.EPERM, "io_uring_queue_init", "")

For example, running eio_linux.0.14’s tests gives me:

#=== ERROR while compiling eio_linux.0.14 =====================================#
# context              2.2.0~beta2~dev | linux/x86_64 | ocaml-base-compiler.5.1.1 | file:///home/opam/opam-repository
# path                 ~/.opam/5.1/.opam-switch/build/eio_linux.0.14
# command              ~/.opam/5.1/bin/dune build -p eio_linux -j 1 @install @runtest
# exit-code            1
# env-file             ~/.opam/log/eio_linux-1641-26795d.env
# output-file          ~/.opam/log/eio_linux-1641-26795d.out
### output ###
# File "lib_eio_linux/tests/fd_sharing.md", line 1, characters 0-0:
# /usr/bin/git --no-pager diff --no-index --color=always -u _build/default/lib_eio_linux/tests/fd_sharing.md _build/default/lib_eio_linux/tests/.mdx/fd_sharing.md.corrected
# diff --git a/_build/default/lib_eio_linux/tests/fd_sharing.md b/_build/default/lib_eio_linux/tests/.mdx/fd_sharing.md.corrected
# index bf8230b..af876d3 100644
# --- a/_build/default/lib_eio_linux/tests/fd_sharing.md
# +++ b/_build/default/lib_eio_linux/tests/.mdx/fd_sharing.md.corrected
# @@ -51,11 +51,5 @@ One domain closes an FD after another domain has enqueued a uring operation ment
#         (* Allow the read to complete. *)
#         Eio.Flow.close w
#      );;
# -+Domain 1 enqueuing read on FD
# -+Waiting for domain 0...
# -+Domain 0 closing FD
# -+Domain 0 closed FD; waking domain 1
# -+Domain 1 flushing queue
# -+Read EOF
# -- : unit = ()
# +Exception: Unix.Unix_error(Unix.EPERM, "io_uring_queue_init", "")
#  ```
# File "lib_eio_linux/tests/spawn.md", line 1, characters 0-0:
# /usr/bin/git --no-pager diff --no-index --color=always -u _build/default/lib_eio_linux/tests/spawn.md _build/default/lib_eio_linux/tests/.mdx/spawn.md.corrected
# diff --git a/_build/default/lib_eio_linux/tests/spawn.md b/_build/default/lib_eio_linux/tests/.mdx/spawn.md.corrected
# index 8d45211..bb70712 100644
# --- a/_build/default/lib_eio_linux/tests/spawn.md
# +++ b/_build/default/lib_eio_linux/tests/.mdx/spawn.md.corrected
# @@ -21,8 +21,7 @@ Setting environment variables:
#        ~env:[| "FOO=bar" |];
#    ] in
#    Promise.await (Process.exit_status child);;
# -FOO=bar
# -- : Unix.process_status = Unix.WEXITED 0
# +Exception: Unix.Unix_error(Unix.EPERM, "io_uring_queue_init", "")
#  ```
#  
#  Changing directory:
# @@ -37,8 +36,7 @@ Changing directory:
#        ~env:(Unix.environment ())
#    ] in
#    Promise.await (Process.exit_status child);;
# -/
# -- : Unix.process_status = Unix.WEXITED 0
# +Exception: Unix.Unix_error(Unix.EPERM, "io_uring_queue_init", "")
#  ```
#  
#  Changing directory using a file descriptor:
# @@ -61,8 +59,7 @@ Changing directory using a file descriptor:
#        ~env:(Unix.environment ())
#    ] in
#    Promise.await (Process.exit_status child);;
# -/
# -- : Unix.process_status = Unix.WEXITED 0
# +Exception: Unix.Unix_error(Unix.EPERM, "io_uring_queue_init", "")
#  ```
#  
#  Exit status:
# @@ -76,7 +73,7 @@ Exit status:
#        ~env:(Unix.environment ())
#    ] in
#    Promise.await (Process.exit_status child);;
# -- : Unix.process_status = Unix.WEXITED 1
# +Exception: Unix.Unix_error(Unix.EPERM, "io_uring_queue_init", "")
#  ```
#  
#  Failure starting child:
# @@ -96,7 +93,7 @@ Failure starting child:
#      assert false
#    with Failure ex ->
#      String.sub ex 0 7
# -- : string = "chdir: "
# +Exception: Unix.Unix_error(Unix.EPERM, "io_uring_queue_init", "")
#  ```
#  
#  Signalling a running child:
# @@ -115,8 +112,7 @@ Signalling a running child:
#    match Promise.await (Process.exit_status child) with
#    | Unix.WSIGNALED x when x = Sys.sigkill -> traceln "Child got SIGKILL"
#    | _ -> assert false;;
# -+Child got SIGKILL
# -- : unit = ()
# +Exception: Unix.Unix_error(Unix.EPERM, "io_uring_queue_init", "")
#  ```
#  
#  Signalling an exited child does nothing:
# @@ -133,6 +129,5 @@ Signalling an exited child does nothing:
#    in
#    ignore (Promise.await (Process.exit_status child) : Unix.process_status);
#    Process.signal child Sys.sigkill;;
# -FOO=bar
# -- : unit = ()
# +Exception: Unix.Unix_error(Unix.EPERM, "io_uring_queue_init", "")
#  ```

Is there any chance eio could fall back to another backend when it detects that it lacks permission to use io_uring?


Docker version: 25.0.3
Linux version: 6.6.16
OCaml versions: tested with both 5.1.1 and 5.2.0~alpha1
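
To make the request concrete, here is a minimal sketch of the kind of fallback being asked about. It is not Eio's implementation: `init_uring`, `choose_backend`, and the `uring_allowed` flag are hypothetical stand-ins that only simulate a ring setup failing with `EPERM`, as in the log above.

```ocaml
(* Needs the unix library for the Unix.Unix_error exception. *)

type backend = Uring | Posix

(* Hypothetical stand-in for the real ring setup. When the container
   forbids io_uring it raises the same error that appears in the log. *)
let init_uring ~uring_allowed () =
  if uring_allowed then Uring
  else raise (Unix.Unix_error (Unix.EPERM, "io_uring_queue_init", ""))

(* Probe io_uring once, before any user code runs, and fall back to a
   portable backend only on EPERM. Any other error still propagates. *)
let choose_backend ~uring_allowed () =
  match init_uring ~uring_allowed () with
  | backend -> backend
  | exception Unix.Unix_error (Unix.EPERM, _, _) -> Posix

let backend_name = function Uring -> "io_uring" | Posix -> "posix"

let () =
  Printf.printf "allowed: %s\n"
    (backend_name (choose_backend ~uring_allowed:true ()));
  Printf.printf "denied:  %s\n"
    (backend_name (choose_backend ~uring_allowed:false ()))
```

The probe is kept separate from running the program so that a fallback never re-runs user code that has already started.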

@talex5
Collaborator

talex5 commented Feb 18, 2024

Was probably fixed in #691

Maybe something changed in Docker recently, since we both hit it around the same time.

@kit-ty-kate
Contributor Author

Oh awesome, thanks! Could we have a release with this fix soon?

@talex5
Collaborator

talex5 commented Feb 19, 2024

There will probably be a release in a week or so. As a work-around, you can set EIO_BACKEND=posix when running in a container.
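
For readers wondering how an environment override like this can work, the following is a rough sketch of a program honouring such a variable. It is not Eio's code: `choice_of_setting` and `choice_of_env` are hypothetical names, and the `"linux"` value is an assumption; only `posix` is confirmed by the comment above.

```ocaml
type choice = Auto | Force_posix | Force_uring

(* Interpret the raw setting. Only "posix" comes from this thread;
   "linux" is an assumed value, included for illustration. *)
let choice_of_setting = function
  | None | Some "" -> Ok Auto
  | Some "posix" -> Ok Force_posix
  | Some "linux" -> Ok Force_uring
  | Some other -> Error (Printf.sprintf "Unknown backend %S" other)

(* Read the override from the process environment. *)
let choice_of_env () = choice_of_setting (Sys.getenv_opt "EIO_BACKEND")

let () =
  match choice_of_env () with
  | Ok Auto -> print_endline "no override: pick a backend automatically"
  | Ok Force_posix -> print_endline "override: posix backend"
  | Ok Force_uring -> print_endline "override: io_uring backend"
  | Error msg -> prerr_endline msg
```

Keeping the parsing in a pure function (`choice_of_setting`) makes it easy to check without touching the real environment.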

@kit-ty-kate
Contributor Author

Thanks for the release. 0.15 fixed most issues; however, I'm still seeing some failures in the eio packages themselves (eio_linux, eio_main):

#=== ERROR while compiling eio_linux.0.15 =====================================#
# context              2.2.0~beta2~dev | linux/x86_64 | ocaml-base-compiler.5.1.1 | file:///home/opam/opam-repository
# path                 ~/.opam/5.1/.opam-switch/build/eio_linux.0.15
# command              ~/.opam/5.1/bin/dune build -p eio_linux -j 1 @install @runtest
# exit-code            1
# env-file             ~/.opam/log/eio_linux-1640-047f3e.env
# output-file          ~/.opam/log/eio_linux-1640-047f3e.out
### output ###
# File "lib_eio_linux/tests/fd_sharing.md", line 1, characters 0-0:
# /usr/bin/git --no-pager diff --no-index --color=always -u _build/default/lib_eio_linux/tests/fd_sharing.md _build/default/lib_eio_linux/tests/.mdx/fd_sharing.md.corrected
# diff --git a/_build/default/lib_eio_linux/tests/fd_sharing.md b/_build/default/lib_eio_linux/tests/.mdx/fd_sharing.md.corrected
# index bf8230b..fc5c373 100644
# --- a/_build/default/lib_eio_linux/tests/fd_sharing.md
# +++ b/_build/default/lib_eio_linux/tests/.mdx/fd_sharing.md.corrected
# @@ -51,11 +51,5 @@ One domain closes an FD after another domain has enqueued a uring operation ment
#         (* Allow the read to complete. *)
#         Eio.Flow.close w
#      );;
# -+Domain 1 enqueuing read on FD
# -+Waiting for domain 0...
# -+Domain 0 closing FD
# -+Domain 0 closed FD; waking domain 1
# -+Domain 1 flushing queue
# -+Read EOF
# -- : unit = ()
# +Exception: Failure "io_uring is not available (permission denied)".

There is also an issue in lwt_eio, but that has already been fixed in master and just needs a release: ocaml-multicore/lwt_eio#27

@talex5
Collaborator

talex5 commented Feb 28, 2024

That test is specifically testing uring, so the fallback doesn't apply. We can't test eio_linux on a machine that doesn't support uring, so I'm tempted to say that this is correct behaviour.
