Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

lib/posix-process: Add vfork & execve #1386

Open
wants to merge 11 commits into
base: staging
Choose a base branch
from

Conversation

michpappas
Copy link
Member

Prerequisite checklist

  • Read the contribution guidelines regarding submitting new changes to the project;
  • Tested your changes against relevant architectures and platforms;
  • Ran the checkpatch.uk on your commit series before opening this PR;
  • Updated relevant documentation.

Base target

  • Architecture(s): [N/A]
  • Platform(s): [N/A]
  • Application(s): [N/A]

Additional configuration

  • CONFIG_LIBPOSIX_PROCESS_EXECVE: Enables the execve() syscall
  • CONFIG_LIBPOSIX_PROCESS_VFORK: Enables the vfork() syscall
  • CONFIG_LIBUKBINFMT: Enables the ukbinfmt library

Description of changes

This PR adds support for vfork() / execve() to libposix-process. This provides the foundation for running multiprocess applications on Unikraft.

vfork() creates a child process that shares the same memory with the calling thread, including the stack. Upon vfork() the parent is suspended until the child calls execve() or _exit() to prevent the parent's state from being corrupted. Due to above, the only permissible actions taken by the child are:

  • Update a variable of type pid_t to store its return value.
  • Calling any function other than one of the functions of the exec(2) family, or _exit(2).

Any action besides the above results into undefined behavior.

For a more elaborate description on vfork() see vfork(2).

execve() implements the standard behavior described in execve(2). To provide flexibility on the supported binary formats the implementation relies on libukbinfmt, which is also introducedin this series. Similarly to linux, this provides a minimal framework for registering loaders that handle different binary formats, introducing the ability of loading interpreted scripts. The implementation may be extended with support for linux's binfmt_misc fs, should such a use-case arise. The implementation of the ELF ukbinfmt loader is introduced in app-elfloader.

GitHub-Depends-On: #1316
GitHub-Depends-On: #1322
GitHub-Depends-On: #1346

@michpappas michpappas requested review from a team as code owners April 16, 2024 16:08
@github-actions github-actions bot added arch/arm arch/arm64 arch/x86_64 area/arch Unikraft Architecture area/include Part of include/uk area/lib Internal Unikraft Microlibrary area/plat Unikraft Patform lang/c Issues or PRs to do with C/C++ lib/nolibc Only neccessary subset of libc functionality lib/posix-process Information about system parameters lib/syscall_shim lib/ukboot lib/uksched lib/ukschedcoop lib/vfscore VFS Core Interface plat/common Common to all platforms plat/kvm Unikraft for KVM labels Apr 16, 2024
@nderjung nderjung added the release-note Denotes a PR that will be considered when it comes time to generate release notes. label Apr 16, 2024
mogasergiu and others added 10 commits April 26, 2024 09:17
libukbinfmt provides a minimal framework to register handlers of
executable files. Typical examples include binary executables like
ELF objects, or interpreted files like *nix scripts that use the
sha-bang sequence to specify an interpreter.

This commit only implements the functionality required to register
and execute loaders within the kernel's scope. Additional
functionality incl. application support via Linux's `binfmt_misc`
API shall be added as a future extension.

Checkpatch-Ignore: COMPLEX_MACRO
Checkpatch-Ignore: MACRO_ARG_REUSE

Signed-off-by: Michalis Pappas <michalis@unikraft.io>
Add definition of ARG_MAX to limits.h. POSIX defines ARG_MAX as the
number of bytes available for the combined arguments and env vars
of a new process. Whether that additionally includes NULL terminator,
pointers, or alignment bytes is IMPLEMENTATION DEFINED.

Signed-off-by: Michalis Pappas <michalis@unikraft.io>
Migrate the definitions of tid2pthread() and tid2pprocess() to
the private process.h to make them available to the rest of the
library. This requires to additionally migrate the definitions of
struct posix_process() and struct posix_thread().

Signed-off-by: Michalis Pappas <michalis@unikraft.io>
The state provides information on whether a posix_thread is
running, blocked, or exited.

Notice that posix_thread_state is only updated by operations
at the posix_process / posix_thread level and may not
always be in sync with the state of the underlying uk_thread.
This specifically applies to the POSIX_THREAD_RUNNING state,
which may not be accurate e.g. if the underlying uk_thread
blocks at the scheduler due to a lock.

On the other hand, the variants of POSIX_STATE_BLOCKED always
reflect the state of a posix_thread, as it is certain that the
underlying uk_thread will also be blocked from the scheduler.

Given the above, a check against POSIX_STATE_RUNNING should only be
used to check if the state of a posix-thread is not terminated or
blocked.

Signed-off-by: Michalis Pappas <michalis@unikraft.io>
Add a field to posix_thread to keep track of its parent. This is
populated during the creation of a posix_thread, and it is used
for deriving the parent's state in execve() / exit().

Signed-off-by: Michalis Pappas <michalis@unikraft.io>
Add implementation for execve(). For more info see execve(2).

Requires a binfmt ELF loader. The default loader is provided by
app-bincompat.

Checkpatch-Ignore: AVOID_EXTERNS

Signed-off-by: Michalis Pappas <michalis@unikraft.io>
vfork() sets the CLONE_VM and CLONE_VFORK flags. This triggers an
error in the clone handlers of vfscore as CLONE_FS is not set. Update
the handlers to additionally check against CLONE_VM, as that also
implies that the parent and child share filesystem state.

Signed-off-by: Michalis Pappas <michalis@unikraft.io>
The vfork() syscall is equivalent to calling clone() with the flags
parameter set to CLONE_VM | CLONE_VFORK | SIGCHLD. Update clone() to
support CLONE_VFORK and CLONE_VM. Implement vfork() as a wrapper of
clone().

For more info see vfork(2).

Signed-off-by: Michalis Pappas <michalis@unikraft.io>
@michpappas michpappas force-pushed the michpappas/feature/vfork_execve branch from e2182e2 to 0ec4fed Compare April 26, 2024 07:20
@github-actions github-actions bot added the area/support Support scripts, tools, services. label Apr 26, 2024
@razvand razvand requested a review from andreittr April 28, 2024 05:35
@razvand razvand requested review from StefanJum, mariasfiraiala and RaduNichita and removed request for a team April 28, 2024 05:35
@razvand razvand added this to the v0.17.0 (Calypso) milestone Apr 28, 2024
Copy link
Member

@mogasergiu mogasergiu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi there 👋 👨‍⚕️ . Had a brief go at this. I will have a deeper look once we get the dependency PRs in.

@@ -70,3 +70,4 @@ $(eval $(call import_lib,$(CONFIG_UK_BASE)/lib/ukrust))
$(eval $(call import_lib,$(CONFIG_UK_BASE)/lib/ukreloc))
$(eval $(call import_lib,$(CONFIG_UK_BASE)/lib/ukofw))
$(eval $(call import_lib,$(CONFIG_UK_BASE)/lib/ukallocstack))
$(eval $(call import_lib,$(CONFIG_UK_BASE)/lib/ukbinfmt))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lib: Introduce libukbinfmt

s/sha-bang/shebang in the commit msg

@@ -0,0 +1,102 @@
/* SPDX-License-Identifier: BSD-3-Clause */
/* Copyright (c) 2023, Unikraft GmbH and The Unikraft Authors.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2024

@@ -58,6 +67,7 @@ struct posix_thread {
struct uk_list_head thread_list_entry;
struct uk_thread *thread;
struct uk_alloc *_a;
enum posix_thread_state state;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would also add a comment here that states again the potential state difference between uk_thread and posix_thread

@@ -0,0 +1,20 @@
/* SPDX-License-Identifier: BSD-3-Clause */
/* Copyright (c) 2023, Unikraft GmbH and The Unikraft Authors.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2024


/* Prepare for iret */
execenv->regs.eflags = X86_EFLAGS_IF; /* enable IRQs on ctx switch */
execenv->regs.cs = 8;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GDT_DESC_OFFSET(GDT_DESC_CODE) and GDT_DESC_OFFSET(GDT_DESC_DATA). yes this introduces deps to plat common 🤪

{
execenv->regs.lr = ip;
execenv->regs.sp = sp;

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should also set spsr_el1 as I believe it to have an unsafe value coming from a mallocd stack. I would reuse the one of the parent and, additionally &= ~PSR_I so that IRQs are enblaed. Ditto for x86, let's reuse the parents flags but with IRQs forcefully enabled (although I think in both cases it should be safe to assume they are already enabled IMO).

@@ -0,0 +1,29 @@
/* SPDX-License-Identifier: BSD-3-Clause */
/* Copyright (c) 2023, Unikraft GmbH and The Unikraft Authors.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2️⃣ 0️⃣ 2️⃣ 4️⃣

* still allocate an Unikraft TLS but sets the TLS architecture pointer
* to zero.
*/
static int _clone(struct clone_args *cl_args, size_t cl_args_len,
struct ukarch_execenv *execenv)
int uk_clone(struct clone_args *cl_args, size_t cl_args_len,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

__sz?

if ((flags & CLONE_VM)) {
if (!cl_args->stack && !cl_args->stack_size) {
uk_pr_debug("Using parent's sp @ 0x%lx\n", execenv->regs.rsp);
cl_args->stack = execenv->regs.rsp; /* FIXME */
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume FIXME cus x86 specific, right?

* or exit(). Yield to schedule the child.
*/
if (flags & CLONE_VFORK) {
struct posix_thread *pthread = tid2pthread(ukthread2tid(t));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Subscope declaration? So unlike you 😞

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
arch/arm arch/arm64 arch/x86_64 area/arch Unikraft Architecture area/include Part of include/uk area/lib Internal Unikraft Microlibrary area/plat Unikraft Patform area/support Support scripts, tools, services. lang/c Issues or PRs to do with C/C++ lib/nolibc Only neccessary subset of libc functionality lib/posix-process Information about system parameters lib/syscall_shim lib/ukboot lib/uksched lib/ukschedcoop lib/vfscore VFS Core Interface plat/common Common to all platforms plat/kvm Unikraft for KVM release-note Denotes a PR that will be considered when it comes time to generate release notes.
Projects
Status: 🏗 In Progress
Development

Successfully merging this pull request may close these issues.

None yet

5 participants