Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[GR-51654] Add support for JFR jdk.ObjectCount and jdk.ObjectCountAfterGC events. #7433

Open
wants to merge 8 commits into
base: master
Choose a base branch
from

Conversation

roberttoyonaga
Copy link
Collaborator

@roberttoyonaga roberttoyonaga commented Sep 15, 2023

Summary

These events provide aggregate data about objects on the heap. They also useful for inspecting how the population of objects on the heap changes over time. For each class, the number of instances and total size is reported. The only difference between these two events is that JFR jdk.ObjectCount is periodic and guaranteed to run once per JFR chunk (not on every GC), while JFR jdk.ObjectCountAfterGC runs upon every GC.

Details

After a GC is completed, the heap is walked and live object data is gathered. This all happens during the GC safepoint.

Performance

This is expected to slightly slow down garbage collection and so is off-by-default in JIT mode. I ran an experiment comparing with/without these new events enabled. Hyperfoil was used to make 50 requests/s over 5s to a simple plaintext quarkus app running JFR. See here for the code. The results of 14 paired trials is shown below:

Increase in mean response time after new events enabled

  Mean response time increase (us)
Mean 292.4
Median 226
STDV 371.092
MAX 1238

Increase in 99th % response time after new events enabled

  p99 response time increase (us)
Mean 6477.2
Median 6783
STDV 5112.92
MAX 13566

(P99 means 99% of responses were faster)

As you can see, the performance hit is much worse for p99. This is expected because the change only happens during a GC. RSS usage and start up time were unaffected by enabling these new events.

I also tried explicitly timing how long the ObjectCountEventSupport.emitEvents() takes to complete. It appears to take about 2.7-3.1 ms on my machine (based on results from only 4 GCs).

Related issue: #7402

@oracle-contributor-agreement oracle-contributor-agreement bot added the OCA Verified All contributors have signed the Oracle Contributor Agreement. label Sep 15, 2023
@roberttoyonaga roberttoyonaga self-assigned this Sep 15, 2023
@roberttoyonaga roberttoyonaga force-pushed the object-count branch 2 times, most recently from 15c4df0 to d873806 Compare September 18, 2023 14:37
Copy link
Member

@christianhaeubl christianhaeubl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR, I added a few comments and questions.

@Uninterruptible(reason = "Set and unset should be atomic with invoked GC to avoid races.")
private static void emitObjectCount() {
if (JfrEvent.ObjectCount.shouldEmit()) {
ObjectCountEventSupport.setShouldSendRequestableEvent(true);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not atomic enough (another GC could already be enqueued, or it could get enqueued in the meanwhile). I think you can completely remove setShouldSendRequestableEvent as it should be sufficient if you check for GCCause.JfrObjectCount in ObjectCountEventSupport.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good idea. I've adopted your suggestion.

@@ -177,7 +178,7 @@ private void collect(GCCause cause, boolean forceFullGC) {

@Uninterruptible(reason = "Avoid races with other threads that also try to trigger a GC")
@RestrictHeapAccess(access = RestrictHeapAccess.Access.NO_ALLOCATION, reason = "Must not allocate in the implementation of garbage collection.")
boolean collectWithoutAllocating(GCCause cause, boolean forceFullGC) {
public boolean collectWithoutAllocating(GCCause cause, boolean forceFullGC) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This shouldn't be necessary, you can call GC.collectCompletely(...) instead.

@@ -244,6 +245,7 @@ private boolean collectImpl(GCCause cause, long requestingNanoTime, boolean forc
}
} finally {
JfrGCEvents.emitGarbageCollectionEvent(getCollectionEpoch(), cause, startTicks);
ObjectCountEventSupport.emitEvents((int) getCollectionEpoch().rawValue(), startTicks);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be better to do the conversion to int as late as possible (i.e., in ObjectCountEventSupport.emitEvents0, right before you emit the event).

@@ -49,6 +49,7 @@ public class GCCause {
@DuplicatedInNativeCode public static final GCCause HintedGC = new GCCause("Hinted GC", 3);
@DuplicatedInNativeCode public static final GCCause JvmtiForceGC = new GCCause("JvmtiEnv ForceGarbageCollection", 4);
@DuplicatedInNativeCode public static final GCCause HeapDump = new GCCause("Heap Dump Initiated GC ", 5);
@DuplicatedInNativeCode public static final GCCause JfrObjectCount = new GCCause("Required for JFR object counting", 6);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

... new GCCause("JFR object counting", 6);

Please also add a comment that this is a GC cause that HotSpot doesn't have.

@@ -102,15 +102,20 @@ private TypeInfo collectTypeInfo(boolean flushpoint) {
return typeInfo;
}

boolean isClassGenerated(Class<?> clazz) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can be private.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've removed this now. Please see responses below.

if (isPackageVisited(typeInfo, pkg)) {
assert module == (flushedPackages.containsKey(pkg.getName()) ? flushedPackages.get(pkg.getName()).module : typeInfo.packages.get(pkg.getName()).module);
return false;
Module cached = (flushedPackages.containsKey(pkg.getName()) ? flushedPackages.get(pkg.getName()).module : typeInfo.packages.get(pkg.getName()).module);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a comment why this is necessary.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The logic here was a bit confusing and has been removed now in favor of what's described in the above response.

Previously, this logic was so that we would only replace the cached module if it is the unnamed module. And we only want to bother replacing it with a named module. This effectively would result in only serializing named modules, or the unnamed module if there was no named module for the given package.

}

private static void emitEvents0(int gcId, long startTicks) {
PointerArray objectCounts = StackValue.get(PointerArray.class);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should be possible to replace PointerArray and PointerArrayAccess with a NonmovableArray (see NonMovableArrays.createWordArray()).

private static long visitObjects(PointerArray objectCounts) {
assert VMOperation.isGCInProgress();
objectCountVisitor.initialize(objectCounts);
Heap.getHeap().walkImageHeapObjects(objectCountVisitor);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you want to visit all Java heap objects (and not only the image heap objects).

}

/**
* It's ok to ge the trace ID here because JFR epoch will not change before jdk.ObjectCount
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

... ok to get ...

objectCountData.setCount(objectCountData.getCount() + 1);

// Get size
long additionalSize = uninterruptibleGetSize(obj);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can use LayoutEncoding.getSizeFromObjectInGC(...) instead.

@@ -77,8 +76,6 @@ public long getSymbolId(String imageHeapString, boolean previousEpoch, boolean r
return 0;
}

assert Heap.getHeap().isInImageHeap(imageHeapString);
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This assertion was failing due to com.sun.proxy.jdk.proxy1. I think this assertion is incorrect because its possible we may sometimes need to serialize symbols not in the image heap. This was exposed now since the ObjectCount event causes many more symbols to be required. What do you think?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would be bad, then we would need to change JfrSymbolRepository significantly. Do you have a stack trace?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the string is not in the image heap, would it be possible to derive the hash from the symbol string itself rather than from its memory address?

I added an assertion to JfrTypeRepository.writePackage. Then with mx native-unittest:

21) com.oracle.svm.test.jfr.TestVirtualThreadsJfrStreaming#test
java.lang.AssertionError: Not in image heapcom.sun.proxy.jdk.proxy1
        at org.graalvm.nativeimage.builder/com.oracle.svm.core.jfr.JfrTypeRepository.writePackage(JfrTypeRepository.java:179)
        at org.graalvm.nativeimage.builder/com.oracle.svm.core.jfr.JfrTypeRepository.writePackages(JfrTypeRepository.java:172)
        at org.graalvm.nativeimage.builder/com.oracle.svm.core.jfr.JfrTypeRepository.write(JfrTypeRepository.java:68)
        at org.graalvm.nativeimage.builder/com.oracle.svm.core.jfr.JfrChunkWriter.writeConstantPools(JfrChunkWriter.java:354)
        at org.graalvm.nativeimage.builder/com.oracle.svm.core.jfr.JfrChunkWriter.writeCheckpointEvent(JfrChunkWriter.java:324)
        at org.graalvm.nativeimage.builder/com.oracle.svm.core.jfr.JfrChunkWriter.writeFlushCheckpoint(JfrChunkWriter.java:296)
        at org.graalvm.nativeimage.builder/com.oracle.svm.core.jfr.JfrChunkWriter.closeFile(JfrChunkWriter.java:222)
        at org.graalvm.nativeimage.builder/com.oracle.svm.core.jfr.SubstrateJVM.setOutput(SubstrateJVM.java:363)
        at jdk.jfr@21/jdk.jfr.internal.JVM.setOutput(JVM.java:167)
        at jdk.jfr@21/jdk.jfr.internal.MetadataRepository.setOutput(MetadataRepository.java:295)
        at jdk.jfr@21/jdk.jfr.internal.PlatformRecorder.start(PlatformRecorder.java:267)
        at jdk.jfr@21/jdk.jfr.internal.PlatformRecording.start(PlatformRecording.java:123)
        at jdk.jfr@21/jdk.jfr.consumer.RecordingStream.startAsync(RecordingStream.java:379)
        at com.oracle.svm.test.jfr.JfrStreamingTest.startStream(JfrStreamingTest.java:87)
        at com.oracle.svm.test.jfr.JfrStreamingTest.startStream(JfrStreamingTest.java:73)
        at com.oracle.svm.test.jfr.TestVirtualThreadsJfrStreaming.test(TestVirtualThreadsJfrStreaming.java:68)
        at java.base@21/java.lang.reflect.Method.invoke(Method.java:580)
        at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
        at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
        at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
        at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
        at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
        at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
        at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
        at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
        at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
        at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
        at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
        at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
        at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
        at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
        at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
        at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
        at org.junit.runners.Suite.runChild(Suite.java:128)
        at org.junit.runners.Suite.runChild(Suite.java:27)
        at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
        at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
        at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
        at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
        at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
        at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
        at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
        at org.junit.runner.JUnitCore.run(JUnitCore.java:115)
        at com.oracle.mxtool.junit.MxJUnitWrapper.runRequest(MxJUnitWrapper.java:376)
        at org.graalvm.nativeimage.junitsupport/com.oracle.svm.junit.SVMJUnitRunner.run(SVMJUnitRunner.java:164)
        at org.graalvm.nativeimage.junitsupport/com.oracle.svm.junit.SVMJUnitRunner.main(SVMJUnitRunner.java:184)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@roberttoyonaga : please try to figure out where the problematic Package object is created, for which package.getName() returns an object that doesn't live in the image heap. I suppose that Package object also doesn't live in the image heap?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please try to figure out where the problematic Package object is created

It gets created the first time we try to get the package. In this case, that's in JfrTypeRepository while we're trying to write out the type data.
Class.getPackage() --> BootLoader.definePackage(Class<?> c) --> Target_jdk_internal_loader_BootLoader.getDefinedPackage --> Target_java_lang_Package()

I suppose that Package object also doesn't live in the image heap?

That's correct. It seems like when DynamicHubInitializer.registerPackage is called at build time, javaClass.getPackage() returns null for all packages with name com.sun.proxy.jdk.proxy1, preventing the package from being registered. Maybe this is why the package is not in the image heap.

Is something wrong with the behavior for handling proxy classes? Otherwise, should we modify the JfrSymbolRepository (I can do that in a separate PR)?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This sounds like a difference in behavior between Native Image and HotSpot. If HotSpot directly returns null for the package of proxy classes, then Native Image should do the same (i.e., it should not allocate a package at run-time in Target_jdk_internal_loader_BootLoader.getDefinedPackage).

Please try to implement a smaller reproducer that shows this difference in behavior and open a separate issue for this problem.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @christianhaeubl, I've opened an issue here: #8796
It includes a small reproducer too.

@christianhaeubl christianhaeubl changed the title Add support for JFR jdk.ObjectCount and jdk.ObjectCountAfterGC events [GR-51654] Add support for JFR jdk.ObjectCount and jdk.ObjectCountAfterGC events. Jan 26, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
native-image native-image-jfr OCA Verified All contributors have signed the Oracle Contributor Agreement. redhat-interest
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants