
Switch the default allocation policy to best-fit. #10188

Merged
4 commits merged into ocaml:trunk from damiendoligez:enable-best-fit on Feb 10, 2021

Conversation

@damiendoligez (Member)

Switching to best-fit avoids most compactions and makes the worst-case behavior much better than next-fit in terms of time and memory.

The code is now sufficiently well-tested (at Jane Street and by Coq, see coq/coq#13040) so we can make the change with confidence.

@alainfrisch (Contributor)

I fully support this. We (LexiFi) have also been happily using the new strategy in production for more than 6 months now.

@@ -240,7 +240,7 @@ typedef uint64_t uintnat;
 /* Default speed setting for the major GC. The heap will grow until
    the dead objects and the free list represent this percentage of the
    total size of live objects. */
-#define Percent_free_def 80
+#define Percent_free_def 100
Contributor:

Just as a data point: in the systems I have tested, 120 has been notably faster without a measurable increase in memory usage.

Contributor:

To concur: Coq sets 200 with best-fit, and when I asked they seemed happy with it, so 120 is not shocking.

Contributor:

At 200, memory use can increase significantly, but 120 seems like a good trade-off.

According to some measurements of the two (see https://blog.janestreet.com/memory-allocator-showdown/), best-fit at 100 has similar performance to next-fit at 80 while using less memory. This means that the gains from switching to best-fit are entirely allocated to reducing memory use, while at 120 the gains are split between reducing memory use and reducing time spent.
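The trade-off discussed above can be exercised from within a program as well as at startup. A minimal sketch, assuming the OCaml 4.x `Gc` API, applying the 120 value suggested here:

```ocaml
(* Sketch (OCaml 4.x Gc API): raise space_overhead to the 120 value
   discussed above.  A larger value lets more dead space accumulate
   before the GC works harder, trading memory for time. *)
let () =
  let ctrl = Gc.get () in
  Gc.set { ctrl with Gc.space_overhead = 120 };
  Printf.printf "space_overhead = %d\n" (Gc.get ()).Gc.space_overhead
```

The same setting is available at startup via `OCAMLRUNPARAM="o=120"`.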

@gasche (Member):

(Personally I think that the setting of 200 for Coq was a bit too aggressive. It is a very memory-hungry program, and while everyone is able to wait 10% longer to get their proof to complete, your machine may not necessarily be able to give 10% more memory to the process. I think this hard-limit behavior of memory would suggest a more conservative default, but I guess heavy Coq-users don't have the same perspective as they bought large-memory machines anyway.)

Contributor:

I agree with @gasche: let's have default values that don't waste too much memory, because running out of memory and starting to swap is very costly indeed.

@xavierleroy (Contributor) left a comment:

It would be good to hear your story concerning the "100" default value for Percent_free_def, since opinions differ.

Regardless, this is a fine PR and I'm confident it's going to improve performance on many OCaml programs.

@avsm (Member)

avsm commented Feb 9, 2021

All of the Solo5 MirageOS unikernels we can find are basically faster and better using best fit, so I'm fully supportive of this change. Keeping the defaults at 100 for roughly the same memory utilisation as next-fit at 80 is helpful for us (as we don't have swap and compact on running out of the physmem freelist).

@damiendoligez (Member, Author)

@avsm This is interesting because Jane Street has reported getting about the same timing and less memory with the setting at 100. I'll change the default to 120 as it seems more popular as a general-purpose setting.

Also, if memory utilisation is very important, I would suggest switching the heap increases back to a fixed amount (instead of the current 15% exponential increase). The performance problem that happens with next-fit and fixed increases doesn't exist with best-fit.

In fact, what do people think of changing the default heap increase in this PR along with the other settings?

@gasche (Member)

gasche commented Feb 10, 2021

In fact, what do people think of changing the default heap increase in this PR along with the other settings?

Has the switch-back received any testing? If not, I would suggest asking the Coq people to switch to a fixed increase (can one do that using Gc.set?) and looking at their benchmark results. What we should expect, if I understand your description correctly, is a decrease in peak memory usage at a negligible performance cost.
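On the parenthetical question: a fixed increment can indeed be requested through `Gc.set`. A sketch, assuming the OCaml 4.x `Gc` API, where a `major_heap_increment` above 1000 is interpreted as a fixed number of words and a value of 1000 or below as a percentage of the current heap size (the 15% exponential growth mentioned above corresponds to the default of 15):

```ocaml
(* Sketch (OCaml 4.x Gc API): switch from percentage growth to a fixed
   major-heap increment.  Values of major_heap_increment above 1000 are
   a fixed word count; 1000 or below is a percentage (default 15). *)
let () =
  (* about 1 MiB worth of words: 131_072 on a 64-bit system *)
  let words_per_mib = 1024 * 1024 / (Sys.word_size / 8) in
  let ctrl = Gc.get () in
  Gc.set { ctrl with Gc.major_heap_increment = words_per_mib }
```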

@stedolan (Contributor)

Also, if memory utilisation is very important, I would suggest switching the heap increases back to a fixed amount (instead of the current 15% exponential increase)

I'm not enthusiastic about this one. There are a small number of operations in the runtime that are O(number of chunks), and I like this being O(log heap) rather than O(heap).

In particular, there's a linear scan to find where to place the new chunk in sweep order when the heap expands. (This could be made sublinear with skiplists, but right now it's fine because the list is short)

@xavierleroy (Contributor)

I'm not enthusiastic about this one. There are a small number of operations in the runtime that are O(number of chunks), and I like this being O(log heap) rather than O(heap).

My reaction too. It's nice that the number of heap chunks grows slowly.

Plus, the more parameters we change at once, the harder it becomes to interpret feedback from the field. Why don't we make one big change (to the best-fit allocator), see what happens, and then make further changes?

@damiendoligez damiendoligez merged commit 8ef5723 into ocaml:trunk Feb 10, 2021
@damiendoligez damiendoligez deleted the enable-best-fit branch February 10, 2021 15:06
@@ -108,7 +108,7 @@ type control =
     percentage of the memory used for live data.
     The GC will work more (use more CPU time and collect
     blocks more eagerly) if [space_overhead] is smaller.
-    Default: 80. *)
+    Default: 100. *)

@Bannerets:

Should this also be changed from 100 to 120? @damiendoligez?

harder, unless you relax it by increasing [space_overhead].
Note: If you change to next-fit, you may need to reduce
the [space_overhead] setting, for example using [80] instead
of the default [100] which is tuned for best-fit. Otherwise,

@Bannerets:

And this
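As a sketch of the advice in the excerpt above, assuming the 4.x runtime's integer encoding of allocation policies (0 = next-fit, 1 = first-fit, 2 = best-fit), reverting to next-fit while lowering `space_overhead` accordingly:

```ocaml
(* Sketch (OCaml 4.x Gc API): revert to next-fit and, per the doc
   comment above, lower space_overhead from the best-fit-tuned 100
   back to 80.  Policy encoding assumed: 0 = next-fit, 2 = best-fit. *)
let () =
  let ctrl = Gc.get () in
  Gc.set { ctrl with Gc.allocation_policy = 0; space_overhead = 80 }
```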

@damiendoligez (Member, Author)

@Bannerets you're right. I've pushed a fix directly to trunk (5661bbf).

damiendoligez added a commit to damiendoligez/ocaml that referenced this pull request Feb 11, 2021
Setting the default policy is not enough, we also have to set
the initial value of the freelist function pointers...
damiendoligez added a commit to damiendoligez/ocaml that referenced this pull request Feb 12, 2021
Setting the default policy is not enough, we also have to set
the initial value of the freelist function pointers...
gasche added a commit that referenced this pull request Mar 2, 2021
garrigue pushed a commit to garrigue/ocaml that referenced this pull request Mar 3, 2021
* switch the default allocation policy to best-fit and adjust the overhead parameter
* change the default overhead to 120
garrigue pushed a commit to garrigue/ocaml that referenced this pull request Mar 3, 2021
garrigue pushed a commit to garrigue/ocaml that referenced this pull request Mar 3, 2021
Setting the default policy is not enough, we also have to set
the initial value of the freelist function pointers...
smuenzel pushed a commit to smuenzel/ocaml that referenced this pull request Mar 30, 2021
* switch the default allocation policy to best-fit and adjust the overhead parameter
* change the default overhead to 120
smuenzel pushed a commit to smuenzel/ocaml that referenced this pull request Mar 30, 2021
smuenzel pushed a commit to smuenzel/ocaml that referenced this pull request Mar 30, 2021
Setting the default policy is not enough, we also have to set
the initial value of the freelist function pointers...