profiles/seccomp: add syscalls for kernel v5.17 - v6.6, match containerd's profile #47341
Merged
…p v2.5.4)

This syscall is gated by CAP_SYS_NICE, matching the profile in containerd.

containerd: containerd/containerd@a6e52c7
libseccomp: seccomp/libseccomp@d83cb7a
kernel: torvalds/linux@c6018b4

    mm/mempolicy: add set_mempolicy_home_node syscall

    This syscall can be used to set a home node for the MPOL_BIND and MPOL_PREFERRED_MANY memory policy. Users should use this syscall after setting up a memory policy for the specified range as shown below.

      mbind(p, nr_pages * page_size, MPOL_BIND, new_nodes->maskp,
            new_nodes->size + 1, 0);
      sys_set_mempolicy_home_node((unsigned long)p, nr_pages * page_size,
            home_node, 0);

    The syscall allows specifying a home node/preferred node from which kernel will fulfill memory allocation requests first.

    ...

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
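The capability gating described in this commit can be sketched as follows. This is a minimal illustration, not the profile code from moby or containerd: the `rule` type and `allowedSyscalls` function are hypothetical names, and treating `cachestat` and `fchmodat2` as unconditional is an assumption made only to give the example a contrast case.

```go
package main

import "fmt"

// rule is a hypothetical, simplified stand-in for one entry of a seccomp
// profile. It is NOT the type used by moby or containerd.
type rule struct {
	names        []string // syscalls this rule allows
	requiredCaps []string // capabilities that must all be held; empty means unconditional
}

// allowedSyscalls returns the syscalls permitted for a container that holds
// the given capabilities, in rule order.
func allowedSyscalls(rules []rule, caps []string) []string {
	held := make(map[string]bool, len(caps))
	for _, c := range caps {
		held[c] = true
	}
	var out []string
	for _, r := range rules {
		ok := true
		for _, need := range r.requiredCaps {
			if !held[need] {
				ok = false
				break
			}
		}
		if ok {
			out = append(out, r.names...)
		}
	}
	return out
}

func main() {
	rules := []rule{
		// Assumed unconditional for the sake of the example.
		{names: []string{"cachestat", "fchmodat2"}},
		// Gated by CAP_SYS_NICE, as stated in the commit message.
		{names: []string{"set_mempolicy_home_node"}, requiredCaps: []string{"CAP_SYS_NICE"}},
	}
	fmt.Println(allowedSyscalls(rules, nil))
	fmt.Println(allowedSyscalls(rules, []string{"CAP_SYS_NICE"}))
}
```

With no capabilities the gated syscall is left out of the result; adding `CAP_SYS_NICE` brings it in.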
Add this syscall to match the profile in containerd

containerd: containerd/containerd@a6e52c7
libseccomp: seccomp/libseccomp@53267af
kernel: torvalds/linux@cf264e1

    NAME
        cachestat - query the page cache statistics of a file.

    SYNOPSIS
        #include <sys/mman.h>

        struct cachestat_range {
            __u64 off;
            __u64 len;
        };

        struct cachestat {
            __u64 nr_cache;
            __u64 nr_dirty;
            __u64 nr_writeback;
            __u64 nr_evicted;
            __u64 nr_recently_evicted;
        };

        int cachestat(unsigned int fd, struct cachestat_range *cstat_range,
            struct cachestat *cstat, unsigned int flags);

    DESCRIPTION
        cachestat() queries the number of cached pages, number of dirty pages, number of pages marked for writeback, number of evicted pages, number of recently evicted pages, in the bytes range given by `off` and `len`.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Add this syscall to match the profile in containerd

containerd: containerd/containerd@a6e52c7
libseccomp: seccomp/libseccomp@53267af
kernel: torvalds/linux@09da082

    fs: Add fchmodat2()

    On the userspace side fchmodat(3) is implemented as a wrapper function which implements the POSIX-specified interface. This interface differs from the underlying kernel system call, which does not have a flags argument. Most implementations require procfs [1][2].

    There doesn't appear to be a good userspace workaround for this issue but the implementation in the kernel is pretty straight-forward.

    The new fchmodat2() syscall allows to pass the AT_SYMLINK_NOFOLLOW flag, unlike existing fchmodat.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Add this syscall to match the profile in containerd

containerd: containerd/containerd@a6e52c7
libseccomp: seccomp/libseccomp@53267af
kernel: torvalds/linux@c35559f

    x86/shstk: Introduce map_shadow_stack syscall

    When operating with shadow stacks enabled, the kernel will automatically allocate shadow stacks for new threads, however in some cases userspace will need additional shadow stacks. The main example of this is the ucontext family of functions, which require userspace allocating and pivoting to userspace managed stacks.

    Unlike most other user memory permissions, shadow stacks need to be provisioned with special data in order to be useful. They need to be setup with a restore token so that userspace can pivot to them via the RSTORSSP instruction. But, the security design of shadow stacks is that they should not be written to except in limited circumstances. This presents a problem for userspace, as to how userspace can provision this special data, without allowing for the shadow stack to be generally writable.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Add this syscall to match the profile in containerd

containerd: containerd/containerd@a6e52c7
libseccomp: seccomp/libseccomp@53267af
kernel: torvalds/linux@0f4b5f9

    futex: Add sys_futex_requeue()

    Finish off the 'simple' futex2 syscall group by adding sys_futex_requeue(). Unlike sys_futex_{wait,wake}() its arguments are too numerous to fit into a regular syscall. As such, use struct futex_waitv to pass the 'source' and 'destination' futexes to the syscall.

    This syscall implements what was previously known as FUTEX_CMP_REQUEUE and uses {val, uaddr, flags} for source and {uaddr, flags} for destination.

    This design explicitly allows requeueing between different types of futex by having a different flags word per uaddr.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Add this syscall to match the profile in containerd

containerd: containerd/containerd@a6e52c7
libseccomp: seccomp/libseccomp@53267af
kernel: torvalds/linux@cb8c431

    futex: Add sys_futex_wait()

    To complement sys_futex_waitv()/wake(), add sys_futex_wait(). This syscall implements what was previously known as FUTEX_WAIT_BITSET except it uses 'unsigned long' for the value and bitmask arguments, takes timespec and clockid_t arguments for the absolute timeout and uses FUTEX2 flags.

    The 'unsigned long' allows FUTEX2_SIZE_U64 on 64bit platforms.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Add this syscall to match the profile in containerd

containerd: containerd/containerd@a6e52c7
libseccomp: seccomp/libseccomp@53267af
kernel: torvalds/linux@9f6c532

    futex: Add sys_futex_wake()

    To complement sys_futex_waitv() add sys_futex_wake(). This syscall implements what was previously known as FUTEX_WAKE_BITSET except it uses 'unsigned long' for the bitmask and takes FUTEX2 flags.

    The 'unsigned long' allows FUTEX2_SIZE_U64 on 64bit platforms.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
thaJeztah
added
status/2-code-review
kind/enhancement
Enhancements are not bugs or new features but can improve usability or performance.
impact/changelog
area/security/seccomp
labels
Feb 6, 2024
AkihiroSuda
approved these changes
Feb 6, 2024
Thanks for providing those descriptions in the containerd PR @AkihiroSuda - would you like me to add you as |
Thanks, either is fine to me |
vvoland
approved these changes
Feb 6, 2024
idodod
added a commit
to earthly/dind
that referenced
this pull request
Apr 22, 2024
[![Mend Renovate](https://app.renovatebot.com/images/banner.svg)](https://renovatebot.com) This PR contains the following updates: | Package | Update | Change | |---|---|---| | [docker/docker](https://togithub.com/docker/docker) | patch | `25.0.1` -> `25.0.5` | --- ### Release Notes <details> <summary>docker/docker (docker/docker)</summary> ### [`v25.0.5`](https://togithub.com/moby/moby/releases/tag/v25.0.5) [Compare Source](https://togithub.com/docker/docker/compare/v25.0.4...v25.0.5) #### 25.0.5 For a full list of pull requests and changes in this release, refer to the relevant GitHub milestones: - [docker/cli, 25.0.5 milestone](https://togithub.com/docker/cli/issues?q=is%3Aclosed+milestone%3A25.0.5) - [moby/moby, 25.0.5 milestone](https://togithub.com/moby/moby/issues?q=is%3Aclosed+milestone%3A25.0.5) - Deprecated and removed features, see [Deprecated Features](https://togithub.com/docker/cli/blob/v25.0.5/docs/deprecated.md). - Changes to the Engine API, see [API version history](https://togithub.com/moby/moby/blob/v25.0.5/docs/api/version-history.md). ##### Security This release contains a security fix for [CVE-2024-29018], a potential data exfiltration from 'internal' networks via authoritative DNS servers. ##### Bug fixes and enhancements - [CVE-2024-29018]: Do not forward requests to external DNS servers for a container that is only connected to an 'internal' network. Previously, requests were forwarded if the host's DNS server was running on a loopback address, like systemd's 127.0.0.53. [moby/moby#47589](https://togithub.com/moby/moby/pull/47589) - plugin: fix mounting /etc/hosts when running in UserNS. [moby/moby#47588](https://togithub.com/moby/moby/pull/47588) - rootless: fix `open /etc/docker/plugins: permission denied`. [moby/moby#47587](https://togithub.com/moby/moby/pull/47587) - Fix multiple parallel `docker build` runs leaking disk space. 
[moby/moby#47527](https://togithub.com/moby/moby/pull/47527) [CVE-2024-29018]: https://togithub.com/moby/moby/security/advisories/GHSA-mq39-4gv4-mvpx ### [`v25.0.4`](https://togithub.com/moby/moby/releases/tag/v25.0.4) [Compare Source](https://togithub.com/docker/docker/compare/v25.0.3...v25.0.4) #### 25.0.4 For a full list of pull requests and changes in this release, refer to the relevant GitHub milestones: - [docker/cli, 25.0.4 milestone](https://togithub.com/docker/cli/issues?q=is%3Aclosed+milestone%3A25.0.4) - [moby/moby, 25.0.4 milestone](https://togithub.com/moby/moby/issues?q=is%3Aclosed+milestone%3A25.0.4) - Deprecated and removed features, see [Deprecated Features](https://togithub.com/docker/cli/blob/v25.0.4/docs/deprecated.md). - Changes to the Engine API, see [API version history](https://togithub.com/moby/moby/blob/v25.0.4/docs/api/version-history.md). ##### Bug fixes and enhancements - Restore DNS names for containers in the default "nat" network on Windows. [moby/moby#47490](https://togithub.com/moby/moby/pull/47490) - Fix `docker start` failing when used with `--checkpoint` [moby/moby#47466](https://togithub.com/moby/moby/pull/47466) - Don't enforce new validation rules for existing swarm networks [moby/moby#47482](https://togithub.com/moby/moby/pull/47482) - Restore IP connectivity between the host and containers on an internal bridge network. [moby/moby#47481](https://togithub.com/moby/moby/pull/47481) - Fix a regression introduced in v25.0 that prevented the classic builder from ADDing a tar archive with xattrs created on a non-Linux OS [moby/moby#47483](https://togithub.com/moby/moby/pull/47483) - containerd image store: Fix image pull not emitting `Pulling fs layer` status [moby/moby#47484](https://togithub.com/moby/moby/pull/47484) ##### API - To preserve backwards compatibility, make read-only mounts not recursive by default when using older clients (API version < v1.44). 
[moby/moby#47393](https://togithub.com/moby/moby/pull/47393) - `GET /images/{id}/json` omits the `Created` field (previously it was `0001-01-01T00:00:00Z`) if the `Created` field is missing from the image config. [moby/moby#47451](https://togithub.com/moby/moby/pull/47451) - Populate a missing `Created` field in `GET /images/{id}/json` with `0001-01-01T00:00:00Z` for API version <= 1.43. [moby/moby#47387](https://togithub.com/moby/moby/pull/47387) - Fix a regression that caused API socket connection failures to report an API version negotiation failure instead. [moby/moby#47470](https://togithub.com/moby/moby/pull/47470) - Preserve supplied endpoint configuration in a container-create API request, when a container-wide MAC address is specified, but `NetworkMode` name-or-id is not the same as the name-or-id used in `NetworkSettings.Networks`. [moby/moby#47510](https://togithub.com/moby/moby/pull/47510) ##### Packaging updates - Upgrade Go runtime to [1.21.8](https://go.dev/doc/devel/release#go1.21.8). [moby/moby#47503](https://togithub.com/moby/moby/pull/47503) - Upgrade RootlessKit to [v2.0.2](https://togithub.com/rootless-containers/rootlesskit/releases/tag/v2.0.2). [moby/moby#47508](https://togithub.com/moby/moby/pull/47508) - Upgrade Compose to [v2.24.7](https://togithub.com/docker/compose/releases/tag/v2.24.7). [docker/docker-ce-packaging#998 - Upgrade Buildx to [v0.13.0](https://togithub.com/docker/buildx/releases/tag/v0.13.0). 
[docker/docker-ce-packaging#997 **Full Changelog**: moby/moby@v25.0.3...v25.0.4 ### [`v25.0.3`](https://togithub.com/moby/moby/releases/tag/v25.0.3) [Compare Source](https://togithub.com/docker/docker/compare/v25.0.2...v25.0.3) #### 25.0.3 For a full list of pull requests and changes in this release, refer to the relevant GitHub milestones: - [docker/cli, 25.0.3 milestone](https://togithub.com/docker/cli/issues?q=is%3Aclosed+milestone%3A25.0.3) - [moby/moby, 25.0.3 milestone](https://togithub.com/moby/moby/issues?q=is%3Aclosed+milestone%3A25.0.3) ##### Bug fixes and enhancements - containerd image store: Fix a bug where `docker image history` would fail if a manifest wasn't found in the content store. [moby/moby#47348](https://togithub.com/moby/moby/pull/47348) - Ensure that a generated MAC address is not restored when a container is restarted, but a configured MAC address is preserved. [moby/moby#47304](https://togithub.com/moby/moby/pull/47304) > **Note** > > - Containers created with Docker Engine version 25.0.0 may have duplicate MAC addresses. > They must be re-created. > - Containers with user-defined MAC addresses created with Docker Engine versions 25.0.0 or 25.0.1 > receive new MAC addresses when started using Docker Engine version 25.0.2. > They must also be re-created. <!----> - Fix `docker save <image>@​<digest>` producing an OCI archive with index without manifests. [moby/moby#47294](https://togithub.com/moby/moby/pull/47294) - Fix a bug preventing bridge networks from being created with an MTU higher than 1500 on RHEL and CentOS 7. [moby/moby#47308](https://togithub.com/moby/moby/issues/47308), [moby/moby#47311](https://togithub.com/moby/moby/pull/47311) - Fix a bug where containers are unable to communicate over an `internal` network. [moby/moby#47303](https://togithub.com/moby/moby/pull/47303) - Fix a bug where the value of the `ipv6` daemon option was ignored. 
[moby/moby#47310](https://togithub.com/moby/moby/pull/47310) - Fix a bug where trying to install a plugin using a digest revision would cause a panic. [moby/moby#47323](https://togithub.com/moby/moby/pull/47323) - Fix a potential race condition in the managed containerd supervisor. [moby/moby#47313](https://togithub.com/moby/moby/pull/47313) - Fix an issue with the `journald` log driver preventing container logs from being followed correctly with systemd version 255. [moby/moby#47243](https://togithub.com/moby/moby/pull/47243) - seccomp: Update the builtin seccomp profile to include syscalls added in kernel v5.17 - v6.7 to align the profile with the profile used by containerd. [moby/moby#47341](https://togithub.com/moby/moby/pull/47341) - Windows: Fix cache not being used when building images based on Windows versions older than the host's version. [moby/moby#47307](https://togithub.com/moby/moby/pull/47307), [moby/moby#47337](https://togithub.com/moby/moby/pull/47337) ##### Packaging updates - Removed support for Ubuntu Lunar (23.04). [docker/ce-packaging#986](https://togithub.com/docker/docker-ce-packaging/pull/986) ### [`v25.0.2`](https://togithub.com/moby/moby/releases/tag/v25.0.2) [Compare Source](https://togithub.com/docker/docker/compare/v25.0.1...v25.0.2) #### 25.0.2 For a full list of pull requests and changes in this release, refer to the relevant GitHub milestones: - [docker/cli, 25.0.2 milestone](https://togithub.com/docker/cli/issues?q=is%3Aclosed+milestone%3A25.0.2) - [moby/moby, 25.0.2 milestone](https://togithub.com/moby/moby/issues?q=is%3Aclosed+milestone%3A25.0.2) ##### Security This release contains security fixes for the following CVEs affecting Docker Engine and its components. 
| CVE | Component | Fix version | Severity | | ----------------------------------------------------------- | ------------- | ----------- | ---------------- | | [CVE-2024-21626](https://scout.docker.com/v/CVE-2024-21626) | runc | 1.1.12 | High, CVSS 8.6 | | [CVE-2024-23651](https://scout.docker.com/v/CVE-2024-23651) | BuildKit | 1.12.5 | High, CVSS 8.7 | | [CVE-2024-23652](https://scout.docker.com/v/CVE-2024-23652) | BuildKit | 1.12.5 | High, CVSS 8.7 | | [CVE-2024-23653](https://scout.docker.com/v/CVE-2024-23653) | BuildKit | 1.12.5 | High, CVSS 7.7 | | [CVE-2024-23650](https://scout.docker.com/v/CVE-2024-23650) | BuildKit | 1.12.5 | Medium, CVSS 5.5 | | [CVE-2024-24557](https://scout.docker.com/v/CVE-2024-24557) | Docker Engine | 25.0.2 | Medium, CVSS 6.9 | The potential impacts of the above vulnerabilities include: - Unauthorized access to the host filesystem - Compromising the integrity of the build cache - In the case of CVE-2024-21626, a scenario that could lead to full container escape For more information about the security issues addressed in this release, refer to the [blog post](https://www.docker.com/blog/docker-security-advisory-multiple-vulnerabilities-in-runc-buildkit-and-moby/). For details about each vulnerability, see the relevant security advisory: - [CVE-2024-21626](https://togithub.com/opencontainers/runc/security/advisories/GHSA-xr7r-f8xq-vfvv) - [CVE-2024-23651](https://togithub.com/moby/buildkit/security/advisories/GHSA-m3r6-h7wv-7xxv) - [CVE-2024-23652](https://togithub.com/moby/buildkit/security/advisories/GHSA-4v98-7qmw-rqr8) - [CVE-2024-23653](https://togithub.com/moby/buildkit/security/advisories/GHSA-wr6v-9f75-vh2g) - [CVE-2024-23650](https://togithub.com/moby/buildkit/security/advisories/GHSA-9p26-698r-w4hx) - [CVE-2024-24557](https://togithub.com/moby/moby/security/advisories/GHSA-xw73-rw38-6vjc) ##### Packaging updates - Upgrade containerd to [v1.6.28](https://togithub.com/containerd/containerd/releases/tag/v1.6.28). 
- Upgrade containerd to v1.7.13 (static binaries only). [moby/moby#47280](https://togithub.com/moby/moby/pull/47280) - Upgrade runc to v1.1.12. [moby/moby#47269](https://togithub.com/moby/moby/pull/47269) - Upgrade Compose to v2.24.5. [docker/docker-ce-packaging#985](https://togithub.com/docker/docker-ce-packaging/pull/985) - Upgrade BuildKit to v0.12.5. [moby/moby#47273](https://togithub.com/moby/moby/pull/47273) </details> --------- Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com> Co-authored-by: idodod <ido@earthly.dev>
Labels
area/security/seccomp
impact/changelog
kind/enhancement
Enhancements are not bugs or new features but can improve usability or performance.
process/cherry-picked
status/2-code-review
seccomp: add set_mempolicy_home_node syscall (kernel v5.17, libseccomp v2.5.4)
This syscall is gated by CAP_SYS_NICE, matching the profile in containerd.
containerd: containerd/containerd@a6e52c7
libseccomp: seccomp/libseccomp@d83cb7a
kernel: torvalds/linux@c6018b4
seccomp: add cachestat syscall (kernel v6.5, libseccomp v2.5.5)
Add this syscall to match the profile in containerd
containerd: containerd/containerd@a6e52c7
libseccomp: seccomp/libseccomp@53267af
kernel: torvalds/linux@cf264e1
seccomp: add fchmodat2 syscall (kernel v6.6, libseccomp v2.5.5)
Add this syscall to match the profile in containerd
containerd: containerd/containerd@a6e52c7
libseccomp: seccomp/libseccomp@53267af
kernel: torvalds/linux@09da082
seccomp: add map_shadow_stack syscall (kernel v6.6, libseccomp v2.5.5)
Add this syscall to match the profile in containerd
containerd: containerd/containerd@a6e52c7
libseccomp: seccomp/libseccomp@53267af
kernel: torvalds/linux@c35559f
seccomp: add futex_requeue syscall (kernel v6.7, libseccomp v2.5.5)
Add this syscall to match the profile in containerd
containerd: containerd/containerd@a6e52c7
libseccomp: seccomp/libseccomp@53267af
kernel: torvalds/linux@0f4b5f9
seccomp: add futex_wait syscall (kernel v6.7, libseccomp v2.5.5)
Add this syscall to match the profile in containerd
containerd: containerd/containerd@a6e52c7
libseccomp: seccomp/libseccomp@53267af
kernel: torvalds/linux@cb8c431
seccomp: add futex_wake syscall (kernel v6.7, libseccomp v2.5.5)
Add this syscall to match the profile in containerd
containerd: containerd/containerd@a6e52c7
libseccomp: seccomp/libseccomp@53267af
kernel: torvalds/linux@9f6c532
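
The commit list above pairs each new syscall with the kernel release that introduced it. The sketch below turns that pairing into a small lookup, which may help when reasoning about which of these profile entries can matter on a given host. The versions are copied from the commit titles; the names `kernelVersion`, `introducedIn`, and `syscallsKnownTo` are hypothetical and do not come from moby or containerd.

```go
package main

import "fmt"

// kernelVersion is a hypothetical helper type for this sketch only.
type kernelVersion struct{ major, minor int }

// atLeast reports whether k is the same as or newer than o.
func (k kernelVersion) atLeast(o kernelVersion) bool {
	if k.major != o.major {
		return k.major > o.major
	}
	return k.minor >= o.minor
}

// introducedIn lists the syscalls added by this PR with the kernel version
// named in each commit title.
var introducedIn = []struct {
	name  string
	since kernelVersion
}{
	{"set_mempolicy_home_node", kernelVersion{5, 17}},
	{"cachestat", kernelVersion{6, 5}},
	{"fchmodat2", kernelVersion{6, 6}},
	{"map_shadow_stack", kernelVersion{6, 6}},
	{"futex_requeue", kernelVersion{6, 7}},
	{"futex_wait", kernelVersion{6, 7}},
	{"futex_wake", kernelVersion{6, 7}},
}

// syscallsKnownTo returns the listed syscalls that exist in kernel k.
func syscallsKnownTo(k kernelVersion) []string {
	var out []string
	for _, s := range introducedIn {
		if k.atLeast(s.since) {
			out = append(out, s.name)
		}
	}
	return out
}

func main() {
	fmt.Println(syscallsKnownTo(kernelVersion{6, 5}))
}
```

For a v6.5 kernel this yields `set_mempolicy_home_node` and `cachestat`; the remaining entries only become relevant from v6.6 and v6.7 onwards.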