[GR-52483] Native Image call graph imprecision #8496

mv02 · 2024-03-01T18:54:32Z

Describe the issue

When generating a call graph with Native Image using the -H:+PrintAnalysisCallTree and -H:PrintAnalysisCallTreeType=CSV options, each virtual invoke is represented by a "virtual node" in the graph, which makes the call graph imprecise.

I created a reproducer script that parses the CSV reports from compiling a simple program and outputs the call graph in the DOT language. The resulting graph contains an Animal.makeSound virtual node, which creates a path from Main.foo to Cow.makeSound and, similarly, a path from Main.bar to Dog.makeSound. Neither method can ever call that specific implementation of the virtual method, so these paths should not exist in the call graph.
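
The effect can be illustrated with a minimal sketch. This is not the reproducer's issue.py: the edge lists are hand-written for illustration, and the per-call-site targets (including Cat.makeSound being reachable from both call sites) are an assumption about the example program. Routing both call sites through one shared Animal.makeSound node merges their target sets, while keeping targets per call site does not.

```python
from collections import defaultdict


def reachable(edges, start):
    """Return every node reachable from start by following directed edges."""
    graph = defaultdict(set)
    for src, dst in edges:
        graph[src].add(dst)
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for nxt in graph[node]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


# Shared virtual node: both call sites point at Animal.makeSound, which
# fans out to the union of all implementations (hand-written, illustrative).
virtual_node_edges = [
    ("Main.foo", "Animal.makeSound"),
    ("Main.bar", "Animal.makeSound"),
    ("Animal.makeSound", "Cat.makeSound"),
    ("Animal.makeSound", "Dog.makeSound"),
    ("Animal.makeSound", "Cow.makeSound"),
]

# Per-call-site targets: each caller is linked only to the implementations
# its own invoke can reach (assumed target sets).
per_site_edges = [
    ("Main.foo", "Cat.makeSound"),
    ("Main.foo", "Dog.makeSound"),
    ("Main.bar", "Cat.makeSound"),
    ("Main.bar", "Cow.makeSound"),
]

imprecise = reachable(virtual_node_edges, "Main.foo")
precise = reachable(per_site_edges, "Main.foo")

# Spurious callees of Main.foo introduced by the shared node.
print(sorted(imprecise - precise - {"Animal.makeSound"}))
```

Any client analysis that walks the first graph would treat Cow.makeSound as reachable from Main.foo, which is the imprecision described above.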

Steps to reproduce the issue

1. Clone the reproducer: git clone https://gist.github.com/758955c93fa3f6eaed8d90057182eaad.git
2. Compile the example program: javac Main.java
3. Run the example through Native Image:
   native-image Main -H:+PrintAnalysisCallTree -H:PrintAnalysisCallTreeType=CSV
4. Run the included script that parses the report and draw its output:
   python3 issue.py | dot -Tpdf -o graph.pdf

Describe GraalVM and your environment:

GraalVM version: 2334a13
JDK major version: 23
OS: Linux 6.7.6
Architecture: x86_64

More details

Picture of the generated call graph
Reproducer gist

fernando-valdez · 2024-03-05T03:48:50Z

Thanks for reporting this issue. I created an internal ticket to track this effort: GR-52483

d-kozak · 2024-03-05T10:36:57Z

@cstancu I believe this issue is a potential problem for anyone trying to extract the call graph and run some client analysis on it, as it unnecessarily decreases the precision of the call graph. Would you be okay with either reimplementing the existing CSV mode or creating a third one if the authors of the current approach would like it to stay unchanged? I've discussed it with @mv02 and he is willing to implement it.

d-kozak · 2024-03-05T10:47:08Z

I would prefer a simpler output format that creates three files connected by one-to-many mappings: method 1-n invoke 1-n call-target (possibly with a fourth file describing entry points, which could alternatively be a column in the method CSV file).
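
As a rough sketch of that shape, the two one-to-many mappings could be modelled in memory as follows. The class and field names here are hypothetical and chosen only for illustration; they are not taken from Native Image.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Invoke:
    """One call site inside a method (hypothetical model)."""
    bci: int
    targets: list[Method] = field(default_factory=list)  # invoke 1-n call-target


@dataclass
class Method:
    """One analysed method (hypothetical model)."""
    name: str
    is_entry_point: bool = False  # the "column instead of a fourth file" variant
    invokes: list[Invoke] = field(default_factory=list)  # method 1-n invoke


def callees(method: Method) -> set[str]:
    """Names of all methods that any call site of `method` may dispatch to."""
    return {target.name for invoke in method.invokes for target in invoke.targets}


cat, dog, cow = (Method(f"{t}.makeSound") for t in ("Cat", "Dog", "Cow"))
foo = Method("Main.foo", invokes=[Invoke(bci=29, targets=[cat, dog])])
bar = Method("Main.bar", invokes=[Invoke(bci=29, targets=[cat, cow])])

print(sorted(callees(foo)))
print(sorted(callees(bar)))
```

Because targets hang off the individual invoke rather than off a shared virtual method node, two call sites of the same virtual method keep separate target sets.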

d-kozak · 2024-03-14T09:22:25Z

Hello @galderz, if I am not mistaken, you are the author of the CSV format. What is your opinion on this issue? Have we missed something, or is the imprecision indeed there?

galderz · 2024-03-22T07:27:38Z

@d-kozak It's indeed an issue in the CSV output format.

Looking at the text format it all looks as expected:

│       │   ├── directly calls Main.main(java.lang.String[]):void id=4557 @bci=73 
│       │   │   ├── directly calls Main.foo():void id=5638 @bci=0 
│       │   │   │   ├── directly calls java.lang.Math.random():double id=6851 @bci=0 
│       │   │   │   │           └── virtually calls java.util.Random.next(int):int @bci=13
│       │   │   │   │               └── is overridden by java.util.Random.next(int):int id-ref=9238 
│       │   │   │   └── virtually calls Animal.makeSound():void @bci=29
│       │   │   │       ├── is overridden by Cat.makeSound():void id=6852 
│       │   │   │       └── is overridden by Dog.makeSound():void id=6853 
│       │   │   └── directly calls Main.bar():void id=5639 @bci=3 
│       │   │       └── virtually calls Animal.makeSound():void @bci=29
│       │   │           ├── is overridden by Cat.makeSound():void id-ref=6852 
│       │   │           └── is overridden by Cow.makeSound():void id=6854

@fernando-valdez you have assigned this to yourself. Are you already working on a fix? If not, just assign it to me and I'll look into one.

mv02 · 2024-03-22T09:02:10Z

I've already implemented the format described by @d-kozak, which eliminates the issue. The report consists of 3 files.

call_tree_methods.csv

MethodId,Name,Type,Parameters,Return,Display,Flags,IsEntryPoint
...
6327,foo,Main,empty,void,M.foo,s,false
6328,bar,Main,empty,void,M.bar,s,false
...
7720,makeSound,Cat,empty,void,C.makeSound,p,false
7721,makeSound,Dog,empty,void,D.makeSound,p,false
7722,makeSound,Cow,empty,void,C.makeSound,p,false
...
13228,makeSound,Animal,empty,void,A.makeSound,pa,false
...

call_tree_invokes.csv

InvokeId,MethodId,BytecodeIndexes,TargetId,IsDirect
...
16024,6327,29,13228,false
...
16026,6328,29,13228,false
...

call_tree_targets.csv

InvokeId,TargetId
...
16024,7720
16024,7721
...
16026,7720
16026,7722
...
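
A minimal sketch of how a client could join the three files into caller-to-callee edges and emit DOT. It embeds only the sample rows shown above, and the column semantics are inferred from those rows rather than from documentation: TargetId in the invokes file is assumed to be the declared method, and the targets file is assumed to list the resolved implementations per invoke.

```python
import csv
import io

# Sample rows copied from the excerpts above (elided rows omitted).
METHODS_CSV = """MethodId,Name,Type,Parameters,Return,Display,Flags,IsEntryPoint
6327,foo,Main,empty,void,M.foo,s,false
6328,bar,Main,empty,void,M.bar,s,false
7720,makeSound,Cat,empty,void,C.makeSound,p,false
7721,makeSound,Dog,empty,void,D.makeSound,p,false
7722,makeSound,Cow,empty,void,C.makeSound,p,false
13228,makeSound,Animal,empty,void,A.makeSound,pa,false
"""

INVOKES_CSV = """InvokeId,MethodId,BytecodeIndexes,TargetId,IsDirect
16024,6327,29,13228,false
16026,6328,29,13228,false
"""

TARGETS_CSV = """InvokeId,TargetId
16024,7720
16024,7721
16026,7720
16026,7722
"""


def rows(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


def call_edges(methods_csv, invokes_csv, targets_csv):
    """Join invoke -> caller and invoke -> target into (caller, callee) pairs."""
    names = {r["MethodId"]: f'{r["Type"]}.{r["Name"]}' for r in rows(methods_csv)}
    caller_of = {r["InvokeId"]: r["MethodId"] for r in rows(invokes_csv)}
    edges = set()
    for r in rows(targets_csv):
        edges.add((names[caller_of[r["InvokeId"]]], names[r["TargetId"]]))
    return sorted(edges)


def to_dot(edges):
    """Render the edge list as a DOT digraph."""
    lines = ["digraph callgraph {"]
    lines += [f'  "{src}" -> "{dst}";' for src, dst in edges]
    lines.append("}")
    return "\n".join(lines)


edges = call_edges(METHODS_CSV, INVOKES_CSV, TARGETS_CSV)
print(to_dot(edges))
```

With these rows, Main.foo is linked only to Cat.makeSound and Dog.makeSound, and Main.bar only to Cat.makeSound and Cow.makeSound, because the join goes through InvokeId rather than through the shared Animal.makeSound method.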

Resulting call graph (vs the previous one)

If no one needs the current CSV format to stay unchanged, I can create a PR.

galderz · 2024-03-25T11:49:49Z

Quoting @mv02: "If no one needs the current CSV format to stay unchanged, I can create a PR."

I think it's fine to change it. I'm not aware of specific usages of the information that has been left out. I originally kept the same kinds of edges found in the text format, which is why a virtual call was represented differently from a monomorphic call and why the override edges were there. I blogged on how to use the CSV output here.

d-kozak · 2024-05-10T10:07:36Z

Resolved in #8774

mv02 added the bug label Mar 1, 2024

fernando-valdez self-assigned this Mar 5, 2024

fernando-valdez added the native-image label Mar 5, 2024

fernando-valdez changed the title ~~Native Image call graph imprecision~~ [GR-52483] Native Image call graph imprecision Mar 5, 2024

wirthi assigned d-kozak and unassigned fernando-valdez Mar 28, 2024

mv02 mentioned this issue Apr 15, 2024

Change CSV format of call tree report #8774

Merged

d-kozak closed this as completed May 10, 2024
