Using mmap to zero memory: performance questions and CHERI incompatibility #642

Open
jrtc27 opened this issue Oct 4, 2023 · 2 comments

@jrtc27
Contributor

jrtc27 commented Oct 4, 2023

PALPOSIX::zero uses mmap(MAP_FIXED) to map zero pages over the top as an "optimisation". However, there are two issues here:

  1. This requires VMEM permission on CHERI, but that's not always present here (in particular in the finish_alloc path).
  2. The heuristic seems wrong: it applies to any range with page-aligned boundaries. For small allocations that is likely to hurt performance, since it means going all the way into the kernel, which must at least invalidate the existing mappings and, depending on the architecture, may even need IPIs to shoot them down. And for the really large allocations where remapping is more performant, does snmalloc really chunk them rather than just doing a raw mmap on demand? Absent data showing whether and when this optimisation makes sense, I do not think it should be enabled by default.

The net result is we will likely be #if 0'ing this code out in CheriBSD.
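
For readers following along, the heuristic under discussion can be sketched roughly as below. This is a minimal illustration, not snmalloc's actual `PALPOSIX::zero`: the names `zero_range`, `page_aligned`, `allow_remap`, and `kRemapThreshold` are hypothetical, and the 1 MiB threshold is an arbitrary placeholder, since there is no benchmark data yet to choose one. It only remaps when explicitly allowed and when the range is both page-aligned and large, and otherwise falls back to `memset`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <sys/mman.h>
#include <unistd.h>

// Hypothetical tuning knob: ranges smaller than this are always memset.
// The value is a placeholder, not a measured crossover point.
constexpr std::size_t kRemapThreshold = std::size_t{1} << 20;

// True if both the start address and the length are multiples of the page size.
inline bool page_aligned(const void* p, std::size_t size, std::size_t page)
{
  return (reinterpret_cast<std::uintptr_t>(p) % page == 0) && (size % page == 0);
}

// Zeroes [p, p + size). Returns true if the range was zeroed by mapping fresh
// anonymous pages over it, false if it was zeroed with memset.
// `allow_remap` stands in for a feature flag (and, on CHERI, for "the caller
// holds VMEM permission on this range"); both readings are assumptions.
inline bool zero_range(void* p, std::size_t size, bool allow_remap)
{
  const long page = sysconf(_SC_PAGESIZE);
  assert(page > 0);

  if (allow_remap && size >= kRemapThreshold &&
      page_aligned(p, size, static_cast<std::size_t>(page)))
  {
    void* r = mmap(p, size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
    if (r != MAP_FAILED)
      return true;
    // Fall through to memset if the kernel refused the remap.
  }

  std::memset(p, 0, size);
  return false;
}
```

With `allow_remap` set to `false`, this degenerates to a plain `memset`, which is the behaviour that disabling the optimisation by default would give.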

@mjp41
Member

mjp41 commented Oct 4, 2023

I am happy to disable this "optimisation" and put the code under a feature flag, so we can benchmark it properly at some future point. I have some other work items, so it will take me a while to get to this. I will be able to review a PR sooner, if you submit one.

@davidchisnall
Collaborator

Note that the goal here was not just zeroing; it was also to allow the kernel to reclaim the page if it is no longer used. I believe that the newer notify_not_using paths introduced with the ranges work have largely obsoleted these uses, and the heuristic probably needs tweaking.

The only place where this is really important is when a user calls calloc on a very large allocation and gets fresh memory (we don’t track which pages are fresh, so we zero unconditionally). If you memset this, you will take CoW faults on every page. If you mmap it, you will hit a fairly fast path that notices that you want to replace CoW zero pages with CoW zero pages and does nothing. Ideally, we’d track which pages are guaranteed zeroed through the buddy allocator so that we could skip zeroing in these cases, but I’m not sure if we have a spare bit in the large buddy range. Even that wouldn’t help the case where a user allocates 1 MiB, touches 10 KiB of it, frees it, and then reallocates it (with calloc in both cases): in this case we would still want to use mmap to avoid faulting in almost 1 MiB of memory.
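
The 1 MiB / 10 KiB scenario above can be tried out with a small experiment like the following. This is a rough sketch under stated assumptions, not anything from snmalloc: `minor_faults`, `ZeroCost`, and `rezero_cost` are hypothetical names, and it uses the `ru_minflt` counter from `getrusage` as a proxy for the page faults taken while re-zeroing, which some platforms or sandboxes may not report accurately.

```cpp
#include <cstddef>
#include <cstring>
#include <sys/mman.h>
#include <sys/resource.h>

// Minor page faults taken by this process so far, as reported by getrusage.
inline long minor_faults()
{
  struct rusage ru {};
  getrusage(RUSAGE_SELF, &ru);
  return ru.ru_minflt;
}

struct ZeroCost
{
  long faults;   // minor faults observed while re-zeroing, or -1 on mmap failure
  bool all_zero; // whether the whole range read back as zero afterwards
};

// Maps `size` bytes of fresh anonymous memory, dirties the first `touched`
// bytes, then re-zeroes the whole range either with memset or by mapping
// fresh pages over it with MAP_FIXED, and reports the faults taken by the
// re-zeroing step alone.
inline ZeroCost rezero_cost(std::size_t size, std::size_t touched, bool use_mmap)
{
  void* raw = mmap(nullptr, size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (raw == MAP_FAILED)
    return {-1, false};
  auto* p = static_cast<unsigned char*>(raw);

  std::memset(p, 0xAB, touched); // the "touches 10 KiB of it" step

  const long before = minor_faults();
  if (use_mmap)
  {
    void* r = mmap(p, size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
    if (r == MAP_FAILED)
    {
      munmap(p, size);
      return {-1, false};
    }
  }
  else
  {
    std::memset(p, 0, size); // writes to, and so faults in, every page
  }
  const long faults = minor_faults() - before;

  // Verify the result only after measuring, since reading also touches pages.
  bool all_zero = true;
  for (std::size_t i = 0; i < size; ++i)
    all_zero = all_zero && (p[i] == 0);

  munmap(p, size);
  return {faults, all_zero};
}
```

Calling `rezero_cost(1 << 20, 10 * 1024, false)` and `rezero_cost(1 << 20, 10 * 1024, true)` and comparing the `faults` fields gives a first impression of the difference. The expectation from the argument above is that the memset variant faults in most of the untouched pages while the mmap variant takes few or none, but the exact numbers depend on the kernel, page size, and any huge-page policy, so they should be measured rather than assumed.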

3 participants