lib/posix-process: Add vfork & execve #1386
libukbinfmt provides a minimal framework to register handlers of executable files. Typical examples include binary executables like ELF objects, or interpreted files like *nix scripts that use the sha-bang sequence to specify an interpreter. This commit only implements the functionality required to register and execute loaders within the kernel's scope. Additional functionality incl. application support via Linux's `binfmt_misc` API shall be added as a future extension. Checkpatch-Ignore: COMPLEX_MACRO Checkpatch-Ignore: MACRO_ARG_REUSE Signed-off-by: Michalis Pappas <michalis@unikraft.io>
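The registration idea described in this commit message can be sketched roughly as follows. This is only an illustration of the "register handlers, then let the first matching one take the file" pattern; every name in it (`demo_loader`, `demo_register()`, `demo_find_loader()`, the probe functions) is hypothetical and is not the `ukbinfmt` API, which is not shown on this page.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_MAX_LOADERS 8 /* arbitrary bound for the sketch */

/* Hypothetical loader descriptor: a name plus a probe callback */
struct demo_loader {
	const char *name;
	/* Returns 1 if the loader recognizes the file header, 0 otherwise */
	int (*probe)(const unsigned char *hdr, size_t len);
};

static const struct demo_loader *demo_loaders[DEMO_MAX_LOADERS];
static size_t demo_nr_loaders;

/* Append a loader to the registry; returns 0 on success, -1 on error */
static int demo_register(const struct demo_loader *l)
{
	if (!l || !l->probe || demo_nr_loaders == DEMO_MAX_LOADERS)
		return -1;
	demo_loaders[demo_nr_loaders++] = l;
	return 0;
}

/* Return the name of the first registered loader that accepts the header */
static const char *demo_find_loader(const unsigned char *hdr, size_t len)
{
	assert(hdr != NULL);
	for (size_t i = 0; i < demo_nr_loaders; i++)
		if (demo_loaders[i]->probe(hdr, len))
			return demo_loaders[i]->name;
	return NULL;
}

/* ELF objects start with the magic bytes 0x7f 'E' 'L' 'F' */
static int probe_elf(const unsigned char *hdr, size_t len)
{
	return len >= 4 && memcmp(hdr, "\x7f" "ELF", 4) == 0;
}

/* Interpreted scripts start with "#!" */
static int probe_script(const unsigned char *hdr, size_t len)
{
	return len >= 2 && hdr[0] == '#' && hdr[1] == '!';
}

static const struct demo_loader elf_loader = { "elf", probe_elf };
static const struct demo_loader script_loader = { "script", probe_script };
```

A real loader would go on to map the binary or resolve the interpreter; the sketch stops at selecting the handler.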
Add definition of ARG_MAX to limits.h. POSIX defines ARG_MAX as the number of bytes available for the combined arguments and env vars of a new process. Whether that additionally includes NULL terminator, pointers, or alignment bytes is IMPLEMENTATION DEFINED. Signed-off-by: Michalis Pappas <michalis@unikraft.io>
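As an illustration of how such a limit might be checked, the sketch below sums the bytes of the argument and environment strings and compares the total against a limit. Counting each string's NUL terminator (and nothing else) is an assumption made for this example, since the commit message notes that the exact accounting is implementation defined; the helper names and the fallback value of `ARG_MAX` are hypothetical.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

#ifndef ARG_MAX
#define ARG_MAX 131072 /* assumed fallback, for illustration only */
#endif

/* Bytes used by a NULL-terminated string vector, counting each string's
 * NUL terminator (an assumption of this sketch). */
static size_t vec_bytes(char *const vec[])
{
	size_t total = 0;

	assert(vec != NULL);
	for (size_t i = 0; vec[i]; i++)
		total += strlen(vec[i]) + 1;
	return total;
}

/* Returns 1 if argv and envp together fit within limit bytes, else 0 */
static int args_fit(char *const argv[], char *const envp[], size_t limit)
{
	size_t a = vec_bytes(argv);
	size_t e = vec_bytes(envp);

	if (a > limit)
		return 0;
	return e <= limit - a; /* written this way to avoid overflow in a + e */
}
```

A caller would typically pass `ARG_MAX` as `limit` and fail with `E2BIG` when `args_fit()` returns 0.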
Migrate the definitions of tid2pthread() and tid2pprocess() to the private process.h to make them available to the rest of the library. This additionally requires migrating the definitions of struct posix_process and struct posix_thread. Signed-off-by: Michalis Pappas <michalis@unikraft.io>
The state provides information on whether a posix_thread is running, blocked, or exited. Notice that posix_thread_state is only updated by operations at the posix_process / posix_thread level and may not always be in sync with the state of the underlying uk_thread. This specifically applies to the POSIX_THREAD_RUNNING state, which may not be accurate e.g. if the underlying uk_thread blocks at the scheduler due to a lock. On the other hand, the variants of POSIX_STATE_BLOCKED always reflect the state of a posix_thread, as it is certain that the underlying uk_thread will also be blocked from the scheduler. Given the above, a check against POSIX_STATE_RUNNING should only be used to check if the state of a posix-thread is not terminated or blocked. Signed-off-by: Michalis Pappas <michalis@unikraft.io>
Add a field to posix_thread to keep track of its parent. This is populated during the creation of a posix_thread, and it is used for deriving the parent's state in execve() / exit(). Signed-off-by: Michalis Pappas <michalis@unikraft.io>
Add implementation for execve(). For more info see execve(2). Requires a binfmt ELF loader. The default loader is provided by app-bincompat. Checkpatch-Ignore: AVOID_EXTERNS Signed-off-by: Michalis Pappas <michalis@unikraft.io>
vfork() sets the CLONE_VM and CLONE_VFORK flags. This triggers an error in the clone handlers of vfscore as CLONE_FS is not set. Update the handlers to additionally check against CLONE_VM, as that also implies that the parent and child share filesystem state. Signed-off-by: Michalis Pappas <michalis@unikraft.io>
The vfork() syscall is equivalent to calling clone() with the flags parameter set to CLONE_VM | CLONE_VFORK | SIGCHLD. Update clone() to support CLONE_VFORK and CLONE_VM. Implement vfork() as a wrapper of clone(). For more info see vfork(2). Signed-off-by: Michalis Pappas <michalis@unikraft.io>
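The flag handling described in the last two commit messages can be sketched as below. The function names are hypothetical and do not come from the PR; the flag values are the Linux ones from clone(2), defined locally so the example does not depend on `_GNU_SOURCE`.

```c
#include <signal.h>

/* Linux clone(2) flag values, defined only if not already visible */
#ifndef CLONE_VM
#define CLONE_VM    0x00000100
#endif
#ifndef CLONE_FS
#define CLONE_FS    0x00000200
#endif
#ifndef CLONE_VFORK
#define CLONE_VFORK 0x00004000
#endif

/* Flags that a vfork() wrapper would pass to clone() */
static unsigned long demo_vfork_flags(void)
{
	return CLONE_VM | CLONE_VFORK | SIGCHLD;
}

/* Hypothetical check for a vfscore-style clone handler: parent and child
 * share filesystem state if either CLONE_FS or CLONE_VM is set. */
static int demo_shares_fs_state(unsigned long flags)
{
	return (flags & (CLONE_FS | CLONE_VM)) != 0;
}
```

With a check that tests only `CLONE_FS`, the flags produced by `demo_vfork_flags()` would be rejected; accepting `CLONE_VM` as well lets them through.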
Force-pushed from e2182e2 to 0ec4fed.
Hi there 👋 👨⚕️ . Had a brief go at this. I will have a deeper look once we get the dependency PRs in.
@@ -70,3 +70,4 @@ $(eval $(call import_lib,$(CONFIG_UK_BASE)/lib/ukrust))
$(eval $(call import_lib,$(CONFIG_UK_BASE)/lib/ukreloc))
$(eval $(call import_lib,$(CONFIG_UK_BASE)/lib/ukofw))
$(eval $(call import_lib,$(CONFIG_UK_BASE)/lib/ukallocstack))
$(eval $(call import_lib,$(CONFIG_UK_BASE)/lib/ukbinfmt))
lib: Introduce libukbinfmt
s/sha-bang/shebang in the commit msg
@@ -0,0 +1,102 @@
/* SPDX-License-Identifier: BSD-3-Clause */
/* Copyright (c) 2023, Unikraft GmbH and The Unikraft Authors.
2024
@@ -58,6 +67,7 @@ struct posix_thread {
struct uk_list_head thread_list_entry;
struct uk_thread *thread;
struct uk_alloc *_a;
enum posix_thread_state state;
I would also add a comment here that states again the potential state difference between uk_thread and posix_thread
@@ -0,0 +1,20 @@
/* SPDX-License-Identifier: BSD-3-Clause */
/* Copyright (c) 2023, Unikraft GmbH and The Unikraft Authors.
2024
/* Prepare for iret */
execenv->regs.eflags = X86_EFLAGS_IF; /* enable IRQs on ctx switch */
execenv->regs.cs = 8;
`GDT_DESC_OFFSET(GDT_DESC_CODE)` and `GDT_DESC_OFFSET(GDT_DESC_DATA)`. Yes, this introduces deps to plat common 🤪
{
execenv->regs.lr = ip;
execenv->regs.sp = sp;
I think we should also set `spsr_el1`, as I believe it to have an unsafe value coming from a `malloc`'d stack. I would reuse the one of the parent and, additionally, `&= ~PSR_I` so that IRQs are enabled. Ditto for x86: let's reuse the parent's flags but with IRQs forcefully enabled (although I think in both cases it should be safe to assume they are already enabled IMO).
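A minimal sketch of that suggestion, assuming the parent's flag values are available to the caller. The helper names are hypothetical, and the two bit positions are defined locally from the architecture manuals (bit 9 is `IF` in x86 `EFLAGS`; bit 7 is the IRQ mask `I` in the Arm64 `SPSR`) rather than taken from Unikraft's headers.

```c
#include <stdint.h>

#define X86_EFLAGS_IF (UINT64_C(1) << 9) /* set = IRQs enabled */
#define PSR_I         (UINT64_C(1) << 7) /* set = IRQs masked */

/* x86: inherit the parent's flags, forcing interrupts on */
static uint64_t demo_child_eflags(uint64_t parent_eflags)
{
	return parent_eflags | X86_EFLAGS_IF;
}

/* Arm64: inherit the parent's SPSR, clearing the IRQ mask bit */
static uint64_t demo_child_spsr(uint64_t parent_spsr)
{
	return parent_spsr & ~PSR_I;
}
```

Note the opposite polarity: on x86 the bit is set to enable IRQs, while on Arm64 it is cleared.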
@@ -0,0 +1,29 @@
/* SPDX-License-Identifier: BSD-3-Clause */
/* Copyright (c) 2023, Unikraft GmbH and The Unikraft Authors.
2️⃣ 0️⃣ 2️⃣ 4️⃣
* still allocate an Unikraft TLS but sets the TLS architecture pointer
* to zero.
*/
static int _clone(struct clone_args *cl_args, size_t cl_args_len,
struct ukarch_execenv *execenv)
int uk_clone(struct clone_args *cl_args, size_t cl_args_len,
`__sz`?
if ((flags & CLONE_VM)) {
if (!cl_args->stack && !cl_args->stack_size) {
uk_pr_debug("Using parent's sp @ 0x%lx\n", execenv->regs.rsp);
cl_args->stack = execenv->regs.rsp; /* FIXME */
I assume FIXME cus x86 specific, right?
* or exit(). Yield to schedule the child.
*/
if (flags & CLONE_VFORK) {
struct posix_thread *pthread = tid2pthread(ukthread2tid(t));
Subscope declaration? So unlike you 😞
Prerequisite checklist

- Ran `checkpatch.uk` on your commit series before opening this PR;

Base target

Additional configuration

- `execve()` syscall
- `vfork()` syscall
- `ukbinfmt` library

Description of changes

This PR adds support for `vfork()` / `execve()` to libposix-process. This provides the foundation for running multiprocess applications on Unikraft.

`vfork()` creates a child process that shares the same memory with the calling thread, including the stack. Upon `vfork()` the parent is suspended until the child calls `execve()` or `_exit()`, to prevent the parent's state from being corrupted. Because of this, the only permissible actions taken by the child are:

- Updating a variable of type `pid_t` to store its return value.
- Calling a function of the `exec(2)` family, or `_exit(2)`.

Any action besides the above results in undefined behavior.

For a more elaborate description of `vfork()` see vfork(2).

`execve()` implements the standard behavior described in execve(2). To provide flexibility in the supported binary formats, the implementation relies on `libukbinfmt`, which is also introduced in this series. Similarly to Linux, this provides a minimal framework for registering loaders that handle different binary formats, which makes it possible to load interpreted scripts. The implementation may be extended with support for Linux's `binfmt_misc` fs, should such a use case arise. The implementation of the ELF `ukbinfmt` loader is introduced in `app-elfloader`.

GitHub-Depends-On: #1316
GitHub-Depends-On: #1322
GitHub-Depends-On: #1346
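
For reference, the usage pattern that the description permits looks roughly like the sketch below on a POSIX system: the child only stores the `vfork()` return value in a `pid_t`, calls `execve()`, and falls back to `_exit()`. The helper name `spawn_and_wait()` and the exit code 127 for a failed `execve()` are choices made for this example, not part of the PR.

```c
#define _DEFAULT_SOURCE /* for vfork() on glibc in strict modes */
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;

/* Run path with argv in a vfork()ed child and return its exit status,
 * or -1 on error or abnormal termination. */
static int spawn_and_wait(const char *path, char *const argv[])
{
	int status;
	pid_t pid;

	assert(path != NULL && argv != NULL);

	pid = vfork();
	if (pid < 0)
		return -1;

	if (pid == 0) {
		/* Child: shares the parent's memory and stack, so it must
		 * do nothing except exec or _exit. */
		execve(path, argv, environ);
		_exit(127); /* only reached if execve() failed */
	}

	/* Parent: resumes once the child has called execve() or _exit() */
	if (waitpid(pid, &status, 0) < 0)
		return -1;
	return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling `spawn_and_wait("/bin/sh", argv)` with `argv = {"sh", "-c", "exit 7", NULL}` would return the shell's exit status on a system where `/bin/sh` exists.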