[GR-18214] Compacting garbage collection (non-default). #8870

Merged
merged 20 commits into from May 24, 2024

Conversation

graalvmbot
Collaborator

@graalvmbot graalvmbot commented May 1, 2024

This PR adds a mark&compact GC for the old generation of the Serial GC. The primary intention is to reduce worst-case memory usage compared to the copying GC, which can use 2x the current heap size when all objects survive.

Typical mark&compact collectors do four passes over the heap to:

  • mark live objects
  • assign new locations to them
  • update their references for the new locations
  • move them to their new locations
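The four passes listed above can be sketched on a toy heap as follows. This is only an illustrative model under stated assumptions, not code from this PR: the `ToyMarkCompact` and `Obj` names are hypothetical, references are modeled as plain integer addresses, and each object carries an explicit `forward` slot, which is exactly the per-object overhead that the implementation in this PR avoids.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy four-pass mark-compact over a simulated heap (hypothetical model, not GraalVM code). */
final class ToyMarkCompact {
    static final class Obj {
        int address;        // current address in the simulated heap
        final int size;
        final int[] refs;   // addresses of the objects this object references
        boolean marked;
        int forward = -1;   // per-object forwarding slot (assumption of this sketch)

        Obj(int address, int size, int... refs) {
            this.address = address;
            this.size = size;
            this.refs = refs;
        }
    }

    /** Compacts a heap given in address order; updates roots in place and returns the survivors. */
    static List<Obj> collect(List<Obj> heap, int[] roots) {
        Map<Integer, Obj> byAddress = new HashMap<>();
        for (Obj o : heap) {
            byAddress.put(o.address, o);
        }

        // Pass 1: mark everything reachable from the roots.
        Deque<Obj> work = new ArrayDeque<>();
        for (int root : roots) {
            work.push(byAddress.get(root));
        }
        while (!work.isEmpty()) {
            Obj o = work.pop();
            if (o.marked) {
                continue;
            }
            o.marked = true;
            for (int ref : o.refs) {
                work.push(byAddress.get(ref));
            }
        }

        // Pass 2: assign new locations by sliding survivors toward address 0.
        int free = 0;
        for (Obj o : heap) {
            if (o.marked) {
                o.forward = free;
                free += o.size;
            }
        }

        // Pass 3: rewrite roots and references to the new locations.
        for (int i = 0; i < roots.length; i++) {
            roots[i] = byAddress.get(roots[i]).forward;
        }
        for (Obj o : heap) {
            if (o.marked) {
                for (int i = 0; i < o.refs.length; i++) {
                    o.refs[i] = byAddress.get(o.refs[i]).forward;
                }
            }
        }

        // Pass 4: "move" the survivors and reset their GC state.
        List<Obj> live = new ArrayList<>();
        for (Obj o : heap) {
            if (o.marked) {
                o.address = o.forward;
                o.marked = false;
                o.forward = -1;
                live.add(o);
            }
        }
        return live;
    }

    public static void main(String[] args) {
        List<Obj> heap = List.of(
                new Obj(0, 16, 32),   // live, references the object at 32
                new Obj(16, 16),      // dead
                new Obj(32, 24),      // live
                new Obj(56, 8),       // dead
                new Obj(64, 16, 0));  // live root, references the object at 0
        int[] roots = {64};
        for (Obj o : collect(heap, roots)) {
            System.out.println("object of size " + o.size + " now at " + o.address);
        }
        System.out.println("root now points to " + roots[0]);
    }
}
```

In this model, the three survivors end up at addresses 0, 16 and 40, and the root is rewritten from 64 to 40.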

New locations are often stored in each individual object. This would significantly enlarge our current 4/8-byte object headers and add memory overhead even outside of GC. Using side tables would require this memory only during GC (and could also collect with fewer passes), but it requires allocating these tables precisely when memory can be scarce.

Our implementation uses the object header as it is and stores new locations of contiguous sequences of surviving objects in records in the gaps between them (made up of dead objects). In order to find an object's new location when updating a reference to it, we first identify the referenced object's aligned chunk. In the chunk's card table, we temporarily keep an index, which we use to find a record near the object. The records also form a singly-linked list, which we can follow further to find the exact record that applies to the object. With this record, we can then compute the object's new location and update references accordingly.
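The lookup described above can be illustrated with a heavily simplified model. Everything in this sketch is an assumption made for illustration: the `GapRecordLookup` and `Record` names, the `BLOCK_SIZE` granularity, and the use of a plain Java array in place of the chunk's card table. Records are ordinary Java objects here, whereas the PR stores them in the heap memory of the gaps themselves.

```java
/** Simplified model of per-run relocation records with a coarse index (hypothetical, not GraalVM code). */
final class GapRecordLookup {
    /** One record per run of surviving objects; conceptually stored in the gap before that run. */
    static final class Record {
        final int runStart;  // old offset of the first surviving object in the run
        final int runEnd;    // old offset just past the run
        final int newStart;  // offset the run is moved to
        Record next;         // next record in address order (singly-linked list)

        Record(int runStart, int runEnd, int newStart) {
            this.runStart = runStart;
            this.runEnd = runEnd;
            this.newStart = newStart;
        }
    }

    static final int BLOCK_SIZE = 64; // assumed index granularity, stands in for a card

    /** Builds the records for sorted, non-overlapping runs and an index with one entry per block. */
    static Record[] buildIndex(int[][] runs, int chunkSize) {
        Record head = null;
        Record prev = null;
        int free = 0;
        for (int[] run : runs) {
            Record r = new Record(run[0], run[1], free);
            free += run[1] - run[0];
            if (prev == null) {
                head = r;
            } else {
                prev.next = r;
            }
            prev = r;
        }
        // index[b] is the last record starting at or before block b, or the first record if none does.
        Record[] index = new Record[chunkSize / BLOCK_SIZE];
        Record cur = head;
        for (int b = 0; b < index.length; b++) {
            int blockStart = b * BLOCK_SIZE;
            while (cur != null && cur.next != null && cur.next.runStart <= blockStart) {
                cur = cur.next;
            }
            index[b] = cur;
        }
        return index;
    }

    /** Finds the new offset of a surviving object from its old offset within the chunk. */
    static int newAddress(Record[] index, int oldAddress) {
        Record r = index[oldAddress / BLOCK_SIZE]; // a record near the object
        if (r == null) {
            throw new IllegalArgumentException("chunk has no surviving objects");
        }
        while (r.next != null && r.next.runStart <= oldAddress) {
            r = r.next; // follow the list to the exact record for this object
        }
        if (oldAddress < r.runStart || oldAddress >= r.runEnd) {
            throw new IllegalArgumentException("offset " + oldAddress + " is not in a surviving run");
        }
        return r.newStart + (oldAddress - r.runStart);
    }

    public static void main(String[] args) {
        int[][] runs = {{16, 48}, {80, 96}, {100, 120}, {200, 240}};
        Record[] index = buildIndex(runs, 256);
        for (int old : new int[] {16, 40, 80, 104, 232}) {
            System.out.println(old + " -> " + newAddress(index, old));
        }
    }
}
```

The index bounds how far the list has to be followed: a lookup starts at a record close to the object instead of at the start of the chunk.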

When encountering a chunk with pinned objects, we sweep the gaps in it instead. Unfortunately, we cannot fill them with other surviving objects because our design requires that the order of objects stays the same (or records would be overwritten prematurely). Currently, we also copy only entire object sequences and do not split them to fit smaller areas of unused memory at the end of chunks (but this does not seem to be an issue in practice). We also still copy objects which have an identity hash code that is based on their current address and needs to be stored in an additional field, enlarging the objects.
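The order constraint can be illustrated with a minimal sliding move over a byte array. This is a sketch under simplifying assumptions, not the PR's code: the `SlidingMove` name and the run representation are hypothetical, and the heap is a single array rather than a set of chunks. Because runs are moved in address order, each destination lies at or below its source, so nothing further up the heap (in the PR's design, that includes the records in the gaps) is overwritten before it has been read.

```java
import java.nio.charset.StandardCharsets;

/** Order-preserving sliding move of surviving runs (hypothetical model, not GraalVM code). */
final class SlidingMove {
    /** Slides each run [start, end) down to the lowest free offset; returns the new top of the heap. */
    static int compact(byte[] heap, int[][] runs) {
        int free = 0;
        for (int[] run : runs) {
            int length = run[1] - run[0];
            if (free > run[0]) {
                // Would mean the destination lies above the source, which breaks the invariant.
                throw new IllegalStateException("runs must be sorted and non-overlapping");
            }
            // System.arraycopy handles overlapping source and destination ranges correctly.
            System.arraycopy(heap, run[0], heap, free, length);
            free += length;
        }
        return free;
    }

    public static void main(String[] args) {
        byte[] heap = "..AB...C.DE.".getBytes(StandardCharsets.US_ASCII);
        int top = compact(heap, new int[][] {{2, 4}, {7, 8}, {9, 11}});
        System.out.println(new String(heap, 0, top, StandardCharsets.US_ASCII)); // prints "ABCDE"
    }
}
```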

The performance of the mark&compact GC is currently somewhat comparable to the current copying GC, sometimes better and sometimes worse, both in terms of CPU usage and memory usage. Depending on the application's heap usage, complete collections can become more expensive, which can cause the GC policy to use more memory to need fewer collections.

The initial implementation also isn't heavily tuned, so there should still be room for improvement. For example, the object order is currently determined by the copying collector in the young generation which often places related objects apart from each other. The GC policy can likely also be tweaked for the different characteristics.

Points of interest in the implementation:

  • option CompactingOldGen in SerialGCOptions
  • subclass CompactingOldGeneration of the now abstract OldGeneration class
  • classes in package com.oracle.svm.core.genscavenge.compacting
  • object header bits in ObjectHeaderImpl
  • checks for SerialGCOptions.useCompactingOldGen() in several places

This work is based on a prototype by JKU student Christian Aistleitner for his master's thesis.

@oracle-contributor-agreement oracle-contributor-agreement bot added the OCA Verified All contributors have signed the Oracle Contributor Agreement. label May 1, 2024
@SergejIsbrecht

@peter-hofer, I found the master's thesis^0 for this commit, but the measurements are quite thin. Did you run internal measurements?

^0 https://epub.jku.at/obvulihs/download/pdf/9649056?originalFilename=true

@peter-hofer
Member

@SergejIsbrecht, yes, and the gist of it is that it is somewhat comparable to the current copying GC, sometimes better and sometimes worse, both in terms of CPU usage and memory usage. We avoided adding a forwarding pointer to every object and instead store the new object locations in the gaps between sequences of live objects, which requires extra passes over the heap and more complex lookups for determining an object's new location and updating references to it. This in turn can make full collections more expensive, which can cause the GC policy to use more memory to do fewer collections. On the other hand, you get an (almost) hard heap size limit by collecting in-place. This initial implementation also isn't heavily tuned, so I believe there is still some low-hanging fruit to harvest.

@graalvmbot graalvmbot force-pushed the ca/compacting-gc branch 2 times, most recently from 3426504 to 07212ef Compare May 4, 2024 20:10
@galderz
Contributor

galderz commented May 6, 2024

@peter-hofer Thanks for this work and the presentation last week during the native image meeting. I'm giving this a go and wondering whether the following native image build line should indicate if mark-copy or mark-compact is in use:

 Garbage collector: Serial GC (max heap size: 80% of RAM)

As it stands, there's no indication of which old gen algorithm is in use.

@peter-hofer
Member

@galderz, thanks for giving it a try. In this branch, the CompactingOldGen option is on by default, but it won't be for merging. I believe that when it gets enabled explicitly, it will show up in the builder output as a custom option. Indicating it in the GC line is a possibility though.

@galderz
Contributor

galderz commented May 6, 2024

@peter-hofer Thanks. I would also add some information to the log so that when -XX:+PrintGC is enabled, you see a message right away (even before GC has kicked in) on what GC parameters you are using at runtime.

Related to this, if you are tweaking the GC policy, the one used at runtime is only shown when -XX:+VerboseGC is passed in, which is very verbose. I think that the GC policy should also be logged on startup when -XX:+PrintGC is passed in.

So, when tackling the old gen configuration info reporting at runtime, you could also tackle the GC policy configuration reporting.

@galderz
Contributor

galderz commented May 6, 2024

To be more precise about GC policy, I'm talking about these messages:

[7.525s] GC(0) Using Serial GC
[7.525s] GC(0)   Memory: 32049M
[7.525s] GC(0)   Heap policy: adaptive
[7.525s] GC(0)   Maximum young generation size: 21M
[7.525s] GC(0)   Maximum heap size: 64M
[7.525s] GC(0)   Minimum heap size: 16M
[7.525s] GC(0)   Aligned chunk size: 512K
[7.525s] GC(0)   Large array threshold: 128K

These only appear when VerboseGC is on, and they only come up when GC kicks in, say when you start sending HTTP requests. PrintGC should be enough to see these, and they should appear as soon as the native image starts, before any GC cycles kick in.

@galderz
Contributor

galderz commented May 6, 2024

@peter-hofer I ran some quick tests locally with a basic Quarkus app and didn't see any major failures with it. Throughput seems to be about the same as with the copying one, but latency at high percentiles is ~10% worse. This is understandable since, as you said, the implementation has not yet been optimized. I will be exploring additional angles in an upcoming blog post.

@graalvmbot graalvmbot force-pushed the ca/compacting-gc branch 2 times, most recently from c5bc94e to 63dd758 Compare May 8, 2024 19:29
@graalvmbot graalvmbot force-pushed the ca/compacting-gc branch 4 times, most recently from 2d89f1a to d00bd3e Compare May 15, 2024 11:29
@graalvmbot graalvmbot force-pushed the ca/compacting-gc branch 2 times, most recently from 14e9d0e to cb1baa2 Compare May 22, 2024 07:52
@graalvmbot graalvmbot closed this May 24, 2024
@graalvmbot graalvmbot deleted the ca/compacting-gc branch May 24, 2024 01:29
@graalvmbot graalvmbot merged commit 1db13d4 into master May 24, 2024
13 checks passed
4 participants