Implement slab level cache for remote frees #634

Open
3 tasks
mjp41 opened this issue Sep 14, 2023 · 5 comments

Comments

mjp41 commented Sep 14, 2023

@nwf-msr observed that we could improve the performance of remote deallocation if the producer did more work building per-slab lists before returning objects to the original allocator. This could further improve producer/consumer scenarios.
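
To make the idea concrete, here is a minimal C++ sketch of a producer grouping its pending remote frees by slab, so that each slab receives one batch rather than one message per object. Everything in it is illustrative: `kSlabSize`, `slab_of`, `Batch`, and `batch_by_slab` are hypothetical names, and the fixed power-of-two slab size is an assumption rather than snmalloc's actual layout or API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Assumed slab size (hypothetical); must be a power of two for the mask below.
constexpr std::uintptr_t kSlabSize = 1 << 14;
static_assert((kSlabSize & (kSlabSize - 1)) == 0, "slab size must be a power of two");

// Maps an object address to the base address of its (assumed) slab.
inline std::uintptr_t slab_of(std::uintptr_t addr) {
  return addr & ~(kSlabSize - 1);
}

// One batch carries every pending object that belongs to a single slab.
struct Batch {
  std::uintptr_t slab;
  std::vector<std::uintptr_t> objects;
};

// Groups pending remote frees by slab, preserving first-seen slab order.
inline std::vector<Batch> batch_by_slab(const std::vector<std::uintptr_t>& frees) {
  std::unordered_map<std::uintptr_t, std::size_t> index;  // slab -> position in batches
  std::vector<Batch> batches;
  for (std::uintptr_t p : frees) {
    const std::uintptr_t s = slab_of(p);
    auto [it, inserted] = index.try_emplace(s, batches.size());
    if (inserted) {
      batches.push_back(Batch{s, {}});
    }
    assert(it->second < batches.size());
    batches[it->second].objects.push_back(p);
  }
  return batches;
}
```

With this shape the consumer would handle one list per slab, which is the extra work the producer takes on. Whether that trade pays off is what the benchmarks requested below would need to show.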

Tasks

mjp41 commented Sep 14, 2023

@Licenser @darach are you still using snmalloc? If so, do you have any benchmarks that represent your workload? We have some ideas that would benefit the kind of workload you have in Tremor.

darach commented Sep 14, 2023 via email

mjp41 commented Sep 14, 2023

Awesome, thanks. Any benchmarks we can run would really help us justify the engineering work.

darach commented Sep 14, 2023

Our CI benchmarks are snmalloc-based: https://www.tremor.rs/benchmarks/ (the UI there is relatively boring).
The benchmarks we ran a year or two ago for Tremor (and mimalloc before it, and jemalloc before that) are here:
https://github.com/tremor-rs/tremor-runtime/blob/main/bench/README.md

The benchmarks themselves have changed a fair amount since then, and we rewrote the runtime and connectors, so the benchmarking code now works differently. YMMV.

nwf-msr added a commit to nwf-msr/snmalloc that referenced this issue Sep 21, 2023
- Trace "Handling remote" once per batch, rather than per element

- Remote queue events also log the associated metaslab; we'll use this
  to assess the efficacy of microsoft#634
nwf-msr added a commit to nwf-msr/snmalloc that referenced this issue Sep 21, 2023
Approximate a message-passing application as a set of producers, a set of
consumers, and a set of proxies that do both.  We'll use this for some initial
insight for microsoft#634 but it seems worth
having in general.
nwf-msr commented Sep 25, 2023

I've spent a while playing with various cache strategies (nothing very sophisticated, just seeing what the low-hanging fruit looks like). For the two workloads I've tried, under three interesting caching configurations, I currently have these message counts:

| Workload | No caching | Perfect assembly | 4-way direct hash |
|---|---|---|---|
| msgpass | 7,205,380 | 552,781; 35 rings | 1,076,843 |
| xmalloc-test | 2,388,551 | 317,057; 5 rings | 317,065 |

The hash here is inspired by https://github.com/skeeto/hash-prospector but is sort of "a third" of that:

  hash = slab;
  hash *= 0x7feb352d;
  return (hash >> 16) & 3;

I'm sure it's possible to do better, but that seems to work alright, though it's not sensitive to the upper bits of the slab! Of note, however, changing the shift to hash >> 30 or performing a prospector-esque xor-shift prior to multiplication performs significantly worse on msgpass (and a little worse on xmalloc-test). The full prospector hash also does a little worse than the numbers above and is, obviously, a fair bit more expensive.
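
For anyone wanting to experiment with this, here is a rough C++ sketch of the hash above feeding a four-entry direct-mapped cache that counts outgoing messages. It is an interpretation, not the code used to produce the numbers in the table. The 32-bit width of the slab identifier, the `DirectCache` class, and the rule "send one message whenever an entry is evicted or flushed" are all assumptions made for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// The multiply-and-shift hash from the comment above.
// Assumption: the slab identifier is treated as a 32-bit value.
inline unsigned slab_hash(std::uint32_t slab) {
  std::uint32_t hash = slab;
  hash *= 0x7feb352dU;      // wraps modulo 2^32
  return (hash >> 16) & 3;  // bits 16..17 select one of four entries
}

// Hypothetical four-entry direct-mapped cache. Each entry accumulates frees
// for one slab; a conflicting slab evicts the entry, costing one message.
class DirectCache {
  struct Entry {
    std::uint32_t slab = 0;
    bool live = false;
  };
  std::array<Entry, 4> entries_{};
  std::size_t messages_ = 0;

public:
  // Records a remote free of an object belonging to `slab`.
  void free_remote(std::uint32_t slab) {
    const unsigned idx = slab_hash(slab);
    assert(idx < entries_.size());
    Entry& e = entries_[idx];
    if (e.live && e.slab != slab) {
      ++messages_;  // evict: the accumulated batch for the old slab is sent
    }
    e.slab = slab;
    e.live = true;
  }

  // Sends every remaining batch and returns the total message count so far.
  std::size_t flush() {
    for (Entry& e : entries_) {
      if (e.live) {
        ++messages_;
        e.live = false;
      }
    }
    return messages_;
  }
};
```

Because only bits 16 and 17 of the wrapped product are used, two identifiers that differ by a multiple of 2^18 always land in the same entry, which is the insensitivity to upper bits mentioned above. Swapping the body of `slab_hash` is enough to try the `hash >> 30` or xor-shift variants against a recorded trace.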

nwf-msr added a commit to nwf-msr/snmalloc that referenced this issue Sep 25, 2023
While working on microsoft#634, it's useful
to be able to simulate caching policies without having to write all the C++ to
actually run them.  Here's a terrible little Perl script that can probably do
most of what you might want.