[GR-18214] Compacting garbage collection (non-default). #8870

graalvmbot · 2024-05-01T20:06:13Z

This PR adds a mark&compact GC for the old generation of the Serial GC. The primary intention is to reduce worst-case memory usage compared to the copying GC, which can use 2x the current heap size when all objects survive.

Typical mark&compact collectors do four passes over the heap to:

mark live objects
assign new locations to them
update their references for the new locations
move them to their new locations
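
For illustration, the four passes listed above can be sketched on a toy heap. This is a minimal model and not the Native Image code: the class `FourPassCompaction`, the `Obj` type, and its per-object `newAddress` field are assumptions made for the example. The `newAddress` field is exactly the kind of per-object forwarding slot that the PR goes on to avoid.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model of a classic four-pass mark-compact collection (illustrative only). */
public class FourPassCompaction {
    /** A heap object; references are stored as addresses of the target objects. */
    static final class Obj {
        int address;         // current start offset in the heap
        final int size;      // size in bytes
        final int[] refs;    // addresses of referenced objects
        boolean marked;      // set in pass 1
        int newAddress = -1; // hypothetical per-object forwarding slot, set in pass 2

        Obj(int address, int size, int... refs) {
            this.address = address;
            this.size = size;
            this.refs = refs;
        }
    }

    /** Compacts a heap given in address order; root addresses are updated in place. */
    static List<Obj> collect(List<Obj> heap, int[] roots) {
        Map<Integer, Obj> byAddress = new HashMap<>();
        for (Obj o : heap) byAddress.put(o.address, o);

        // Pass 1: mark everything reachable from the roots (depth-first, explicit stack).
        Deque<Obj> stack = new ArrayDeque<>();
        for (int r : roots) stack.push(byAddress.get(r));
        while (!stack.isEmpty()) {
            Obj o = stack.pop();
            if (o.marked) continue;
            o.marked = true;
            for (int r : o.refs) stack.push(byAddress.get(r));
        }

        // Pass 2: assign new locations by sliding live objects toward address 0.
        int free = 0;
        for (Obj o : heap) {
            if (o.marked) {
                o.newAddress = free;
                free += o.size;
            }
        }

        // Pass 3: rewrite every reference (and root) to the target's new location.
        for (Obj o : heap) {
            if (!o.marked) continue;
            for (int i = 0; i < o.refs.length; i++) {
                o.refs[i] = byAddress.get(o.refs[i]).newAddress;
            }
        }
        for (int i = 0; i < roots.length; i++) {
            roots[i] = byAddress.get(roots[i]).newAddress;
        }

        // Pass 4: move the survivors; object order is preserved.
        List<Obj> survivors = new ArrayList<>();
        for (Obj o : heap) {
            if (!o.marked) continue;
            o.address = o.newAddress;
            o.marked = false;
            survivors.add(o);
        }
        return survivors;
    }

    public static void main(String[] args) {
        Obj a = new Obj(0, 16, 48); // a -> c
        Obj b = new Obj(16, 32);    // unreachable
        Obj c = new Obj(48, 8, 0);  // c -> a (cycle)
        Obj d = new Obj(56, 24);    // unreachable
        int[] roots = {48};
        for (Obj o : collect(List.of(a, b, c, d), roots)) {
            System.out.println(o.address + " size " + o.size);
        }
        System.out.println("root -> " + roots[0]);
    }
}
```

In this example, `a` stays at address 0, `c` slides down from 48 to 16, and the root and the two references are rewritten to match.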

New locations are often stored in each individual object. This would significantly enlarge our current 4/8-byte object headers and add memory overhead even outside of GC. Using side tables would require this memory only during GC (and could also collect with fewer passes), but it requires allocating these tables precisely when memory can be scarce.

Our implementation uses the object header as it is and stores new locations of contiguous sequences of surviving objects in records in the gaps between them (made up of dead objects). In order to find an object's new location when updating a reference to it, we first identify the referenced object's aligned chunk. In the chunk's card table, we temporarily keep an index, which we use to find a record near the object. The records also form a singly-linked list, which we can follow further to find the exact record that applies to the object. With this record, we can then compute the object's new location and update references accordingly.
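
The lookup described above might be modeled roughly as follows. This is only a sketch under stated assumptions: `RelocationRecords`, `Record`, `BLOCK_SIZE`, `buildIndex`, and `newLocation` are hypothetical names, the records are plain Java objects rather than data written into the gaps, and the index is a plain array rather than a reused card table.

```java
/** Toy model of finding an object's new location via per-sequence records (illustrative only). */
public class RelocationRecords {
    /** Hypothetical record describing one contiguous sequence of surviving objects. */
    static final class Record {
        final int seqStart; // old start offset of the sequence within the chunk
        final int seqSize;  // size of the sequence in bytes
        final int newStart; // offset at which the sequence begins after compaction
        Record next;        // next record in address order (singly-linked list)

        Record(int seqStart, int seqSize, int newStart) {
            this.seqStart = seqStart;
            this.seqSize = seqSize;
            this.newStart = newStart;
        }
    }

    static final int BLOCK_SIZE = 64; // assumed granularity of the index

    /**
     * Builds the records for one chunk from {start, size} runs sorted by address.
     * index[b] is the first record whose sequence ends after the start of block b.
     */
    static Record[] buildIndex(int chunkSize, int[][] liveRuns) {
        Record head = null;
        Record prev = null;
        int newStart = 0; // sequences slide toward offset 0, keeping their order
        for (int[] run : liveRuns) {
            Record r = new Record(run[0], run[1], newStart);
            newStart += run[1];
            if (prev == null) head = r; else prev.next = r;
            prev = r;
        }
        Record[] index = new Record[(chunkSize + BLOCK_SIZE - 1) / BLOCK_SIZE];
        Record cur = head;
        for (int b = 0; b < index.length; b++) {
            int blockStart = b * BLOCK_SIZE;
            while (cur != null && cur.seqStart + cur.seqSize <= blockStart) cur = cur.next;
            index[b] = cur;
        }
        return index;
    }

    /** Finds a nearby record via the index, follows the list, then computes the new offset. */
    static int newLocation(Record[] index, int oldOffset) {
        Record r = index[oldOffset / BLOCK_SIZE];
        while (r != null && oldOffset >= r.seqStart + r.seqSize) r = r.next;
        if (r == null || oldOffset < r.seqStart) {
            throw new IllegalArgumentException("not a surviving object: " + oldOffset);
        }
        return r.newStart + (oldOffset - r.seqStart);
    }

    public static void main(String[] args) {
        // Surviving sequences at [16,48), [80,96), [100,160) and [200,240) in a 256-byte chunk.
        int[][] runs = {{16, 32}, {80, 16}, {100, 60}, {200, 40}};
        Record[] index = buildIndex(256, runs);
        System.out.println(newLocation(index, 120)); // index gives [80,96); one hop reaches [100,160)
        System.out.println(newLocation(index, 200));
    }
}
```

Here the object at offset 120 lies 20 bytes into the third sequence, which moves to offset 48, so its new location is 68. The sketch keeps sequences in their original order, matching the PR's requirement that the order of objects stays the same.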

When encountering a chunk with pinned objects, we sweep the gaps in it instead. Unfortunately, we cannot fill them with other surviving objects because our design requires that the order of objects stays the same (or records would be overwritten prematurely). Currently, we also copy only entire object sequences and do not split them to fit smaller areas of unused memory at the end of chunks (but this does not seem to be an issue in practice). We also still copy objects whose identity hash code is based on their current address and therefore needs to be stored in an additional field, which enlarges the objects.

The performance of the mark&compact GC is currently somewhat comparable to that of the current copying GC, sometimes better and sometimes worse, both in terms of CPU usage and memory usage. Depending on the application's heap usage, complete collections can become more expensive, which can cause the GC policy to use more memory so that fewer collections are needed.

The initial implementation also isn't heavily tuned, so there should still be room for improvement. For example, the object order is currently determined by the copying collector in the young generation which often places related objects apart from each other. The GC policy can likely also be tweaked for the different characteristics.

Points of interest in the implementation:

option CompactingOldGen in SerialGCOptions
subclass CompactingOldGeneration of the now abstract OldGeneration class
classes in package com.oracle.svm.core.genscavenge.compacting
object header bits in ObjectHeaderImpl
checks for SerialGCOptions.useCompactingOldGen() in several places

This work is based on a prototype by JKU student Christian Aistleitner for his master's thesis.

SergejIsbrecht · 2024-05-02T11:02:28Z

@peter-hofer, I found the master's thesis^0 for this commit, but the measurements are quite thin. Did you run internal measurements?

^0 https://epub.jku.at/obvulihs/download/pdf/9649056?originalFilename=true

peter-hofer · 2024-05-02T11:18:59Z

@SergejIsbrecht, yes, and the gist of it is that it is somewhat comparable to the current copying GC, sometimes better and sometimes worse, both in terms of CPU usage and memory usage. We avoided adding a forwarding pointer to every object and instead store the new object locations in the gaps between sequences of live objects, which requires extra passes over the heap and more complex lookups for determining an object's new location and updating references to it. This in turn can make full collections more expensive, which can cause the GC policy to use more memory to do fewer collections. On the other hand, you get an (almost) hard heap size limit by collecting in-place. This initial implementation also isn't heavily tuned, so I believe there is still some low-hanging fruit to harvest.

galderz · 2024-05-06T07:52:24Z

@peter-hofer Thanks for this work and the presentation last week during the native image meeting. I'm giving this a go and wondering whether the following native image build line should indicate if mark-copy or mark-compact is in use:

 Garbage collector: Serial GC (max heap size: 80% of RAM)

As it stands, there's no indication of which old gen algorithm is in use.

peter-hofer · 2024-05-06T08:01:55Z

@galderz, thanks for giving it a try. In this branch, the CompactingOldGen option is on by default, but it won't be for merging. I believe that when it gets enabled explicitly, it will show up in the builder output as a custom option. Indicating it in the GC line is a possibility though.

galderz · 2024-05-06T08:14:10Z

@peter-hofer Thanks. I would also add some information to the log so that when -XX:+PrintGC is enabled, you see a message right away (even before GC has kicked in) on what GC parameters you are using at runtime.

Related to this, if you are tweaking the GC policy, the one used at runtime is only shown when -XX:+VerboseGC is passed in, which is very verbose. I think that the GC policy should also be logged on startup when -XX:+PrintGC is passed in.

So, when tackling the old gen configuration info reporting at runtime, you could also tackle the gc policy configuration reporting.

galderz · 2024-05-06T08:18:44Z

To be more precise about GC policy, I'm talking about these messages:

[7.525s] GC(0) Using Serial GC
[7.525s] GC(0)   Memory: 32049M
[7.525s] GC(0)   Heap policy: adaptive
[7.525s] GC(0)   Maximum young generation size: 21M
[7.525s] GC(0)   Maximum heap size: 64M
[7.525s] GC(0)   Minimum heap size: 16M
[7.525s] GC(0)   Aligned chunk size: 512K
[7.525s] GC(0)   Large array threshold: 128K

These only appear when VerboseGC is on, and they only come up when GC kicks in, say when you start sending HTTP requests. PrintGC should be enough to see them, and they should appear as soon as the native image starts, before any GC cycles kick in.

galderz · 2024-05-06T12:56:37Z

@peter-hofer I ran some quick tests locally with a basic Quarkus app and didn't see any major failures with it. Throughput seems to be about the same as with the copying one, but latency at high percentiles is ~10% worse. This is understandable since, as you said, the implementation has not yet been optimized. I will be exploring additional angles in an upcoming blog post.

Christian Aistleitner and others added 9 commits April 30, 2024 17:49

Introduce new compacting GC

4ac1d2e

Merge remote-tracking branch 'origin/master' into ca/compacting-gc

624af25

Mend uninterruptible stack reference fixup.

5419cea

Move UseRememberedSet to SerialGCOptions.

68e7fa1

Abstract OldGeneration, reintroduce CopyingOldGeneration, reduce diff to master.

ca79665

Refactor marking.

5468dd9

Refactor heap space classification.

71d61a9

Use stack for depth-first marking in GC.

e43052c

Rename compacting old gen package.

ab5ea70

graalvmbot assigned peter-hofer May 1, 2024

oracle-contributor-agreement bot added the OCA Verified All contributors have signed the Oracle Contributor Agreement. label May 1, 2024

graalvmbot force-pushed the ca/compacting-gc branch 2 times, most recently from 3426504 to 07212ef Compare May 4, 2024 20:10

graalvmbot force-pushed the ca/compacting-gc branch 2 times, most recently from c5bc94e to 63dd758 Compare May 8, 2024 19:29

peter-hofer added 8 commits May 10, 2024 14:23

Revisit newly added files.

f83ca9b

Avoid a second pass over pinned chunks to prepare sweeping.

e145910

Iterate objects for updating references more efficiently.

43660c0

Revisit how CompactingVisitor updates chunk top pointers.

89f71cd

Store size of object sequences rather than preceding gaps.

c5c68c4

Revert GC policy changes (other than collecting the young gen separately).

9991e95

Disable sweeping due to low fragmentation.

0d692b0

Fix Epsilon GC when CompactingOldGen defaults to true.

34f2556

graalvmbot force-pushed the ca/compacting-gc branch 4 times, most recently from 2d89f1a to d00bd3e Compare May 15, 2024 11:29

graalvmbot force-pushed the ca/compacting-gc branch 2 times, most recently from 14e9d0e to cb1baa2 Compare May 22, 2024 07:52

Improvements and minor fixes.

4cb51af

graalvmbot force-pushed the ca/compacting-gc branch from 782474a to d6c86f0 Compare May 22, 2024 15:01

peter-hofer added 2 commits May 23, 2024 13:33

Add changelog entry and default to -H:-CompactingOldGen.

8fe991b

Merge remote-tracking branch 'origin/master' into ca/compacting-gc

42cd023

graalvmbot force-pushed the ca/compacting-gc branch from d6c86f0 to 42cd023 Compare May 23, 2024 11:53

graalvmbot closed this May 24, 2024

graalvmbot deleted the ca/compacting-gc branch May 24, 2024 01:29

graalvmbot merged commit 1db13d4 into master May 24, 2024
13 checks passed
