In this part of the series, we benchmark the performance of our EasyScript interpreter, and fix some performance issues that we find.
The benchmark is written using the JMH library.
We introduce a superclass, TruffleBenchmark,
that holds the common configuration of the benchmark
(how long should the warmup and measurement phases be,
what units to measure the performance in,
the options to pass to the JVM the benchmarks execute in, etc.).
The actual measured code extends TruffleBenchmark,
and is kept in the FibonacciBenchmark class.
We use a naive implementation of the Fibonacci function as the measured code.
In addition to EasyScript, we also implement benchmarks for Java,
the GraalVM JavaScript Truffle implementation
(which used to come bundled with GraalVM,
but has been a separate library since version 22),
and SimpleLanguage,
for comparison.
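For reference, a naive recursive Fibonacci in plain Java might look like the sketch below. The class name, the base case, and the input value are assumptions made for illustration; the exact program used in FibonacciBenchmark may differ. Note that the recursion relies on subtraction.

```java
public class NaiveFibonacci {
    // Deliberately naive: exponential time, no memoization,
    // which makes it a convenient stress test for function calls.
    static int fib(int n) {
        if (n < 2) {
            return n; // assumed base case: fib(0) = 0, fib(1) = 1
        }
        return fib(n - 1) + fib(n - 2);
    }

    public static void main(String[] args) {
        System.out.println(fib(20)); // prints 6765
    }
}
```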
The initial numbers I get on my laptop when executing the benchmark command
(./gradlew :part-09:jmh) with the interpreter code from the previous part:
Benchmark                              Mode  Cnt     Score     Error  Units
FibonacciBenchmark.recursive_eval_ezs  avgt    5  6028.256 ± 421.844  us/op
FibonacciBenchmark.recursive_eval_js   avgt    5    78.143 ±   3.453  us/op
FibonacciBenchmark.recursive_eval_sl   avgt    5    55.662 ±   3.395  us/op
FibonacciBenchmark.recursive_java      avgt    5    38.383 ±   1.046  us/op
Clearly, our interpreter needs some work to catch up to the performance of the JavaScript and SimpleLanguage implementations.
The naive Fibonacci function implementation uses subtraction, which the interpreter from part 8 does not support. Given that, we need to add support for it to our language.
The change is relatively straightforward:
we change the existing rule in the grammar for addition
to also allow subtraction.
We handle the two possible symbols that we can have in that rule in the
EasyScriptTruffleParser class.
The implementation of the subtraction Truffle Node itself
is very similar to the addition Node.
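To give a feel for what such a Node computes, here is a minimal sketch in plain Java, without the Truffle DSL annotations. It assumes the common pattern of trying exact 32-bit integer arithmetic first and falling back to double on overflow; the class and method names are hypothetical and are not taken from the EasyScript sources.

```java
public class SubtractionSketch {
    // Hypothetical stand-in for the two specializations of a subtraction Node:
    // a fast int path, and a double path used on overflow or for non-int operands.
    static Object subtract(Object left, Object right) {
        if (left instanceof Integer && right instanceof Integer) {
            int l = (Integer) left;
            int r = (Integer) right;
            try {
                // Throws ArithmeticException if the result overflows an int
                return Math.subtractExact(l, r);
            } catch (ArithmeticException e) {
                // Fall through to the double path below
            }
        }
        return ((Number) left).doubleValue() - ((Number) right).doubleValue();
    }

    public static void main(String[] args) {
        System.out.println(subtract(5, 3));                 // prints 2
        System.out.println(subtract(Integer.MIN_VALUE, 1)); // overflows, so a double is returned
    }
}
```

In a real Truffle Node, the fallback would typically be expressed as a separate specialization rather than a try/catch inside one method.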
In order to improve the performance of our interpreter, we make the following changes to it:
- We refactor the implementation of executeStatement() in the block statement Node to avoid redundant assignments.
- We make the FunctionObject class mutable with the redefine() method. We also add an Assumption that confirms a function was not redefined since its CallTarget was last referenced.
- We change the function dispatch code to check this Assumption in the dispatchDirectly() specialization.
- We change the API of defining a global function to call the FunctionObject.redefine() method if a function already exists, instead of always creating a new FunctionObject instance.
- We add caching to the function declaration Node, so that it doesn't create a new CallTarget every time it's executed (as they all would have the same behavior anyway).
- Since we now always return the same FunctionObject instance from our GlobalScopeObject class, we add caching to the global reference Node in case it resolves to a function.
- We only check if a variable with the same name already exists on the first execution of its declaration; otherwise, executing the same Truffle AST multiple times (which is how Context.eval() works, as it caches the AST to avoid re-parsing your code) would fail.
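The idea behind the redefine() and Assumption changes can be modeled in plain Java, as in the sketch below. This is not the Truffle API: SimpleAssumption is a hand-written stand-in for Truffle's Assumption, an IntUnaryOperator plays the role of the CallTarget, and the CallSite class and lookups counter are hypothetical names used only for illustration.

```java
import java.util.function.IntUnaryOperator;

public class AssumptionSketch {
    /** Stand-in for an assumption: valid until invalidate() is called. */
    static final class SimpleAssumption {
        private volatile boolean valid = true;
        boolean isValid() { return valid; }
        void invalidate() { valid = false; }
    }

    /** Mutable function object: redefine() swaps the target and invalidates the old assumption. */
    static final class FunctionObject {
        private IntUnaryOperator target;
        private SimpleAssumption notRedefined = new SimpleAssumption();

        FunctionObject(IntUnaryOperator target) { this.target = target; }

        void redefine(IntUnaryOperator newTarget) {
            this.target = newTarget;
            this.notRedefined.invalidate();            // tell existing call sites their cache is stale
            this.notRedefined = new SimpleAssumption(); // fresh assumption for the new target
        }

        IntUnaryOperator getTarget() { return target; }
        SimpleAssumption getNotRedefined() { return notRedefined; }
    }

    /** Call site that reuses its cached target for as long as the assumption holds. */
    static final class CallSite {
        private final FunctionObject function;
        private IntUnaryOperator cachedTarget;
        private SimpleAssumption cachedAssumption;
        int lookups = 0; // counts how often the slow path ran

        CallSite(FunctionObject function) { this.function = function; }

        int call(int arg) {
            if (cachedAssumption == null || !cachedAssumption.isValid()) {
                // Slow path: re-read the target and remember the current assumption
                cachedTarget = function.getTarget();
                cachedAssumption = function.getNotRedefined();
                lookups++;
            }
            return cachedTarget.applyAsInt(arg);
        }
    }

    public static void main(String[] args) {
        FunctionObject f = new FunctionObject(x -> x + 1);
        CallSite site = new CallSite(f);
        System.out.println(site.call(1)); // prints 2
        System.out.println(site.call(2)); // prints 3, served from the cache
        f.redefine(x -> x * 10);
        System.out.println(site.call(3)); // prints 30, cache refreshed after redefinition
        System.out.println(site.lookups); // prints 2
    }
}
```

In Truffle itself, the benefit is larger than this model suggests, because the compiler can treat a valid Assumption as a constant and deoptimize compiled code when it is invalidated.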
With these changes, re-running the benchmark produces the following numbers:
Benchmark                              Mode  Cnt    Score   Error  Units
FibonacciBenchmark.recursive_eval_ezs  avgt    5  102.190 ± 1.099  us/op
We have achieved a speedup of roughly 59x (from about 6028 us/op down to about 102 us/op) compared to the version from part 8.
When diagnosing performance issues, the Ideal Graph Visualizer (IGV) tool is very helpful. It’s a project maintained by the same team that maintains GraalVM and Truffle, and it renders as graphs the many debug trees that Truffle and Graal produce in the process of interpreting your language.
To have your Truffle interpreter emit graphs to consume in IGV,
you need to add the -Dgraal.Dump=:1 JVM argument when starting it.
You can send the dumps directly to the program with the -Dgraal.PrintGraph=Network argument,
or to a specific file by passing the -Dgraal.DumpPath argument
(the default is to save them in the graal_dumps folder in the current directory if they cannot be delivered to a running instance of IGV).
For example, to dump the data from the EasyScript benchmark, you can add the appropriate JVM arguments in the benchmark configuration:
@Fork(jvmArgsPrepend = {
        "-Dgraal.Dump=:1",
        "-Dgraal.PrintGraph=Network"
})
@Benchmark
public int recursive_eval_ezs() {
    return this.truffleContext.eval("ezs", FIBONACCI_JS_PROGRAM).asInt();
}
After downloading IGV
(it's under the "Archived Enterprise Releases" tab),
run it by executing ./idealgraphvisualizer in the bin directory of the downloaded and uncompressed program.
Now, when running the benchmark, you should see in the output:
[Use -Dgraal.LogFile=<path> to redirect Graal log output to a file.]
Connected to the IGV on 127.0.0.1:4445
And a bunch of graphs should appear in the "Outline" menu on the left: