JIT produces different asm from IL emit than from source #89685

timcassell · 2023-07-29T22:21:36Z

While refactoring BenchmarkDotNet to call benchmark methods directly instead of through a delegate (dotnet/BenchmarkDotNet#2334), I ran into an issue where the InProcessEmitToolchain is producing different results than the default toolchain. I disassembled it to try to figure out why it was different, and found the only difference is the call instruction.

Default toolchain

call      qword ptr [BenchmarkDotNet.Autogenerated.Runnable_0.__Overhead()]

InProcessEmit

call      BenchmarkDotNet.Autogenerated.Runnable_0.__Overhead()

It wouldn't really be an issue if the workload call also used the same call instruction, but it doesn't, so the overhead measurement is off.

call      qword ptr [ActualWork.IncrementField()]

Is there any way I can make the asm match so we can get correct measurements?

call-direct-default-asm.md
call-direct-inprocess-asm.md

ghost · 2023-07-29T22:21:44Z

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

Author:	timcassell
Assignees:	-
Labels:	`area-CodeGen-coreclr`
Milestone:	-

EgorBo · 2023-07-29T22:40:38Z

Managed calls are always expected to be indirect (square brackets), so it's not clear to me what produced the direct calls. Perhaps those are direct calls to jump stubs?

MichalPetryka · 2023-07-29T23:46:57Z

> Managed calls are always expected to be indirect (square brackets), so it's not clear to me what produced the direct calls. Perhaps those are direct calls to jump stubs?

Maybe it's related to the fact that ILEmit is not tiered?

timcassell · 2023-07-29T23:53:39Z

> Maybe it's related to the fact that ILEmit is not tiered?

Would that matter here, though? The OverheadActionUnroll etc. methods are annotated with AggressiveOptimization, and the __OverheadWrapper and __WorkloadWrapper methods are annotated with NoOptimization, so there should be no tiering involved.

EgorBo · 2023-07-30T17:48:19Z

> > Managed calls are always expected to be indirect (square brackets), so it's not clear to me what produced the direct calls. Perhaps those are direct calls to jump stubs?
>
> Maybe it's related to the fact that ILEmit is not tiered?

They're indirect not because of tiering, but because of stubs and potential ReJIT profiler sessions.
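
As a rough analogy only (plain C, not CoreCLR's actual mechanism), the sketch below shows why a call through an indirection cell can be retargeted after the caller is compiled, while a direct call cannot. All names here (`increment_field`, `increment_field_stub`, `increment_field_cell`, `run_calls`) are hypothetical and chosen for illustration.

```c
#include <assert.h>

static int field = 0;

/* Stand-in for a compiled method body (hypothetical name). */
static void increment_field(void) { field++; }

static void increment_field_stub(void);

/* Indirection cell: every indirect call loads its target from here,
   much like `call qword ptr [...]` loads it from memory. */
static void (*increment_field_cell)(void) = increment_field_stub;

/* Hypothetical stub: on first use it repoints the cell at the real body,
   then forwards the call. */
static void increment_field_stub(void) {
    increment_field_cell = increment_field;
    increment_field();
}

/* Makes two indirect calls and one direct call; returns the field value. */
static int run_calls(void) {
    increment_field_cell();   /* indirect: reaches the stub first */
    assert(increment_field_cell == increment_field);  /* cell was repointed */
    increment_field_cell();   /* indirect: now reaches the body */
    increment_field();        /* direct: target fixed at compile time */
    return field;
}
```

Calling `run_calls()` from a `main` exercises all three calls. Only the indirect call sites could be redirected later without recompiling the caller, which is the property stubs and re-jitting rely on.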

timcassell · 2023-08-01T02:02:41Z

I tried making the wrapper method static and passing in the instance for a virtual call.
I tried making a separate class between the benchmark class and generated class in the hierarchy.
I tried making them completely separate classes (no hierarchical relationship).

No matter what I tried, I could not get the overhead and workload calls to have the same assembly call.

This is only an issue in net7.0+; net6.0 has matching assembly for both methods (it uses direct calls without qword ptr).

timcassell · 2023-08-09T21:47:37Z

@EgorBo This issue impacts #89940 (it's part of the fix in my PR).

EgorBo · 2023-08-09T22:12:31Z

> I tried making the wrapper method static and passing in the instance for a virtual call. I tried making a separate class between the benchmark class and generated class in the hierarchy. I tried making them completely separate classes (no hierarchical relationship).
>
> No matter what I tried, I could not get the overhead and workload calls to have the same assembly call.
>
> This is only an issue in net7.0+; net6.0 has matching assembly for both methods (it uses direct calls without qword ptr).

Are there any steps on how to reproduce this locally?

timcassell · 2023-08-09T22:15:35Z

> Are there any steps on how to reproduce this locally?

Are you able to pull my fork/branch and check it? If not, I can try to create a simple repro.

EgorBo · 2023-08-09T22:18:00Z

> > Are there any steps on how to reproduce this locally?
>
> Are you able to pull my fork/branch and check it? If not, I can try to create a simple repro.

I can clone it, but it'd be nice to have exact steps on how to build it and reproduce 🙂

timcassell · 2023-08-09T22:22:45Z

> exact steps on how to build it and reproduce

In BenchmarkDotNet.IntegrationTests.ManualRunning there is a test named NonEmptyBenchmarksReportsNonZeroTimeAndZeroAllocated_InProcess. Remove the Skip reason and run it on net7.0 with typeof(ActualWork).

dotnet-issue-labeler bot added the `area-CodeGen-coreclr` label on Jul 29, 2023

ghost added the `untriaged` label on Jul 29, 2023

JulieLeeMSFT added the `question` label on Jul 31, 2023

JulieLeeMSFT added this to the Future milestone on Jul 31, 2023

ghost removed the `untriaged` label on Jul 31, 2023
