Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JIT: Allow moving handler regions in fgRelocateEHRegions #101613

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

amanasifkhalid
Copy link
Member

@amanasifkhalid amanasifkhalid commented Apr 26, 2024

Follow-up to #101611. fgRelocateEHRegions previously would not try to move handler regions due to implementation restrictions in fgDetermineFirstColdBlock that have since been lifted. This should only affect behavior on Windows x86 when we aren't using funclets.

Also, I noticed we never set Compiler::fgCanRelocateEHRegions to false anywhere, so it seems safe to remove this member.

@dotnet-issue-labeler dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Apr 26, 2024
Copy link
Contributor

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

@amanasifkhalid
Copy link
Member Author

cc @dotnet/jit-contrib, @AndyAyersMS PTAL. Diffs look like an improvement: A lot of the size savings seem to come from shorter jumps on non-exceptional paths when we decide to move handlers further down the method, while a lot of the size regressions stem from new jumps now that handlers don't fall through to the non-exceptional successor, as well as larger jumps to the handlers. Since we only move handlers if they're marked as rarely run, I think we want to do this movement as a matter of principle.

@amanasifkhalid
Copy link
Member Author

SPMI replay failure is a timeout -- this change shouldn't affect x64, anyway.

@AndyAyersMS
Copy link
Member

Probably worth running jitstress. Unfortunately, that is blocked by #101628

@amanasifkhalid
Copy link
Member Author

Looks like jitstress is no longer blocked (at least not on x86). I'll run it now

@amanasifkhalid
Copy link
Member Author

/azp run runtime-coreclr jitstress, runtime-coreclr libraries-jitstress

Copy link

Azure Pipelines successfully started running 2 pipeline(s).

@amanasifkhalid
Copy link
Member Author

@AndyAyersMS this is the only test to fail in the outerloop pipelines: The NullReferenceException triggered in the try block is never caught by the catch block. I'm able to repro the failure without any stress modes locally. If I replace the offending delegate func(null, 1, 2, 3, 4, 5, 6, 7, 8) with a normal method call that throws, the failure doesn't repro, so it seems specific to exception handling with delegates. The codegen looks fine... Here's without this change:

; Assembly listing for method Test35000:TestEntryPoint():int (Tier0-FullOpts)
; Emitting BLENDED_CODE for X86 with AVX - Windows
; Tier-0 switched to FullOpts code
; optimized code
; ebp based frame
; partially interruptible
; No PGO data
; 2 inlinees with PGO data; 8 single block inlinees; 0 inlinees without PGO data
; Final local variable assignments
;
;  V00 loc0         [V00,T09] (  2,  2   )     ref  ->  eax         class-hnd single-def <System.Reflection.MethodInfo>
;  V01 loc1         [V01,T00] (  7,136   )     ref  ->  [ebp-0x28]  class-hnd EH-live single-def <System.Func`10[Test35000+TestData0,int,int,int,int,int,int,int,int,System.Object]>
;  V02 loc2         [V02,T02] (  2, 32   )     ref  ->  [ebp-0x2C]  class-hnd exact EH-live spill-single-def <Test35000+TestData0>
;  V03 loc3         [V03,T03] (  2, 32   )     ref  ->  [ebp-0x30]  class-hnd exact EH-live spill-single-def <Test35000+TestData1>
;  V04 loc4         [V04,T08] (  4,  2   )     int  ->  [ebp-0x20]  do-not-enreg[Z] EH-live
;  V05 loc5         [V05,T04] (  4, 25   )     int  ->  [ebp-0x24]  do-not-enreg[Z] EH-live
;  V06 loc6         [V06,T01] (  4,104   )     int  ->  edi        
;* V07 loc7         [V07    ] (  0,  0   )     ref  ->  zero-ref    class-hnd <System.NullReferenceException>
;* V08 tmp0         [V08    ] (  0,  0   )     ref  ->  zero-ref    class-hnd exact "NewObj constructor temp" <Test35000+TestData0>
;* V09 tmp1         [V09    ] (  0,  0   )     ref  ->  zero-ref    class-hnd exact "NewObj constructor temp" <Test35000+TestData1>
;  V10 tmp2         [V10,T10] (  2,  0   )     ref  ->  ecx         class-hnd "impSpillSpecialSideEff" <System.NullReferenceException>
;  V11 tmp3         [V11,T06] (  3,  4.50)     ref  ->  eax         single-def "argument with side effect"
;  V12 EHSlots      [V12    ] (  1,  1   )  struct (16) [ebp-0x1C]  do-not-enreg[XS] must-init addr-exposed "lvaShadowSPslotsVar"
;  V13 rat0         [V13,T05] (  5,  7.50)     ref  ->  esi         "replacement local"
;  V14 rat1         [V14,T07] (  3,  2.50)     int  ->  ecx         "CSE for expectedClsNode"
;
; Lcl frame size = 40

G_M22950_IG01:  ;; offset=0x0000
       push     ebp
       mov      ebp, esp
       push     edi
       push     esi
       sub      esp, 40
       xor      eax, eax
       mov      dword ptr [ebp-0x1C], eax
       mov      dword ptr [ebp-0x18], eax
       mov      dword ptr [ebp-0x14], eax
       mov      dword ptr [ebp-0x10], eax
						;; size=22 bbWeight=1 PerfScore 7.75
G_M22950_IG02:  ;; offset=0x0016
       push     -1
       push     28
       push     0
       push     3
       push     0
       push     0
       mov      ecx, 0x872781C      ; 'TestData0'
       mov      edx, 0x8727830      ; 'MyMethod'
       call     [System.RuntimeType:GetMethodImplCommon(System.String,int,int,System.Reflection.Binder,int,System.Type[],System.Reflection.ParameterModifier[]):System.Reflection.MethodInfo:this]
       push     eax
       push     1
       mov      ecx, 0x8727850      ; 'System.Func`10[Test35000+TestData0,System.Int32,System.Int32,Sy'
       xor      edx, edx
       call     [System.Delegate:CreateDelegate(System.Type,System.Object,System.Reflection.MethodInfo,ubyte):System.Delegate]
       mov      esi, eax
       test     esi, esi
       je       SHORT G_M22950_IG05
						;; size=50 bbWeight=1 PerfScore 16.50
G_M22950_IG03:  ;; offset=0x0048
       mov      ecx, 0x96AD8F0      ; System.Func`10[Test35000+TestData0,int,int,int,int,int,int,int,int,System.Object]
       cmp      dword ptr [esi], ecx
       je       SHORT G_M22950_IG05
						;; size=9 bbWeight=0.50 PerfScore 2.12
G_M22950_IG04:  ;; offset=0x0051
       mov      edx, eax
       call     [CORINFO_HELP_CHKCASTANY]
       mov      esi, eax
						;; size=10 bbWeight=0.25 PerfScore 0.88
G_M22950_IG05:  ;; offset=0x005B
       mov      gword ptr [ebp-0x28], esi
       mov      ecx, 0x96AD688      ; Test35000+TestData0
       call     CORINFO_HELP_NEWSFAST
       mov      gword ptr [ebp-0x2C], eax
       mov      ecx, 0x96ADB78      ; Test35000+TestData1
       call     CORINFO_HELP_NEWSFAST
       mov      gword ptr [ebp-0x30], eax
       xor      edx, edx
       mov      dword ptr [ebp-0x20], edx
						;; size=34 bbWeight=1 PerfScore 6.75
G_M22950_IG06:  ;; offset=0x007D
       mov      dword ptr [ebp-0x24], edx
						;; size=3 bbWeight=1 PerfScore 1.00
G_M22950_IG07:  ;; offset=0x0080
       xor      edi, edi
						;; size=2 bbWeight=8 PerfScore 2.00
G_M22950_IG08:  ;; offset=0x0082
       push     1
       push     2
       push     3
       push     4
       push     5
       push     6
       push     7
       push     8
       mov      edx, gword ptr [ebp-0x2C]
       mov      ecx, gword ptr [esi+0x04]
       call     [esi+0x0C]System.Func`10[System.__Canon,int,int,int,int,int,int,int,int,System.__Canon]:Invoke(System.__Canon,int,int,int,int,int,int,int,int):System.__Canon:this
       push     1
       push     2
       push     3
       push     4
       push     5
       push     6
       push     7
       push     8
       mov      edx, gword ptr [ebp-0x30]
       mov      ecx, gword ptr [esi+0x04]
       call     [esi+0x0C]System.Func`10[System.__Canon,int,int,int,int,int,int,int,int,System.__Canon]:Invoke(System.__Canon,int,int,int,int,int,int,int,int):System.__Canon:this
       inc      edi
       cmp      edi, 50
       jl       SHORT G_M22950_IG08
						;; size=56 bbWeight=32 PerfScore 944.00
G_M22950_IG09:  ;; offset=0x00BA
       push     1
       push     2
       push     3
       push     4
       push     5
       push     6
       push     7
       push     8
       xor      edx, edx
       mov      ecx, gword ptr [esi+0x04]
       call     [esi+0x0C]System.Func`10[System.__Canon,int,int,int,int,int,int,int,int,System.__Canon]:Invoke(System.__Canon,int,int,int,int,int,int,int,int):System.__Canon:this
       jmp      SHORT G_M22950_IG11
						;; size=26 bbWeight=4 PerfScore 61.00
G_M22950_IG10:  ;; offset=0x00D4
       mov      ecx, eax
       mov      eax, dword ptr [ebp-0x20]
       inc      eax
       mov      dword ptr [ebp-0x20], eax
       call     [System.Console:WriteLine(System.Object)]
       call     CORINFO_HELP_ENDCATCH
       mov      esi, gword ptr [ebp-0x28]
						;; size=23 bbWeight=0 PerfScore 0.00
G_M22950_IG11:  ;; offset=0x00EB
       mov      eax, dword ptr [ebp-0x24]
       inc      eax
       mov      dword ptr [ebp-0x24], eax
       cmp      dword ptr [ebp-0x24], 10
       jl       SHORT G_M22950_IG07
						;; size=13 bbWeight=8 PerfScore 42.00
G_M22950_IG12:  ;; offset=0x00F8
       mov      eax, 100
       mov      ecx, 101
       cmp      dword ptr [ebp-0x20], 10
       cmovne   eax, ecx
						;; size=17 bbWeight=1 PerfScore 2.75
G_M22950_IG13:  ;; offset=0x0109
       lea      esp, [ebp-0x08]
       pop      esi
       pop      edi
       pop      ebp
       ret      
						;; size=7 bbWeight=1 PerfScore 3.00

; Total bytes of code 272, prolog size 22, PerfScore 1089.75, instruction count 102, allocated bytes for code 272 (MethodHash=0103a659) for method Test35000:TestEntryPoint():int (Tier0-FullOpts)
; ============================================================

and with:

; Assembly listing for method Test35000:TestEntryPoint():int (Tier0-FullOpts)
; Emitting BLENDED_CODE for X86 with AVX - Windows
; Tier-0 switched to FullOpts code
; optimized code
; ebp based frame
; partially interruptible
; No PGO data
; 2 inlinees with PGO data; 8 single block inlinees; 0 inlinees without PGO data
; Final local variable assignments
;
;  V00 loc0         [V00,T09] (  2,  2   )     ref  ->  eax         class-hnd single-def <System.Reflection.MethodInfo>
;  V01 loc1         [V01,T00] (  7,136   )     ref  ->  [ebp-0x28]  class-hnd EH-live single-def <System.Func`10[Test35000+TestData0,int,int,int,int,int,int,int,int,System.Object]>
;  V02 loc2         [V02,T02] (  2, 32   )     ref  ->  [ebp-0x2C]  class-hnd exact EH-live spill-single-def <Test35000+TestData0>
;  V03 loc3         [V03,T03] (  2, 32   )     ref  ->  [ebp-0x30]  class-hnd exact EH-live spill-single-def <Test35000+TestData1>
;  V04 loc4         [V04,T08] (  4,  2   )     int  ->  [ebp-0x20]  do-not-enreg[Z] EH-live
;  V05 loc5         [V05,T04] (  4, 25   )     int  ->  [ebp-0x24]  do-not-enreg[Z] EH-live
;  V06 loc6         [V06,T01] (  4,104   )     int  ->  edi        
;* V07 loc7         [V07    ] (  0,  0   )     ref  ->  zero-ref    class-hnd <System.NullReferenceException>
;* V08 tmp0         [V08    ] (  0,  0   )     ref  ->  zero-ref    class-hnd exact "NewObj constructor temp" <Test35000+TestData0>
;* V09 tmp1         [V09    ] (  0,  0   )     ref  ->  zero-ref    class-hnd exact "NewObj constructor temp" <Test35000+TestData1>
;  V10 tmp2         [V10,T10] (  2,  0   )     ref  ->  ecx         class-hnd "impSpillSpecialSideEff" <System.NullReferenceException>
;  V11 tmp3         [V11,T06] (  3,  4.50)     ref  ->  eax         single-def "argument with side effect"
;  V12 EHSlots      [V12    ] (  1,  1   )  struct (16) [ebp-0x1C]  do-not-enreg[XS] must-init addr-exposed "lvaShadowSPslotsVar"
;  V13 rat0         [V13,T05] (  5,  7.50)     ref  ->  esi         "replacement local"
;  V14 rat1         [V14,T07] (  3,  2.50)     int  ->  ecx         "CSE for expectedClsNode"
;
; Lcl frame size = 40

G_M22950_IG01:  ;; offset=0x0000
       push     ebp
       mov      ebp, esp
       push     edi
       push     esi
       sub      esp, 40
       xor      eax, eax
       mov      dword ptr [ebp-0x1C], eax
       mov      dword ptr [ebp-0x18], eax
       mov      dword ptr [ebp-0x14], eax
       mov      dword ptr [ebp-0x10], eax
						;; size=22 bbWeight=1 PerfScore 7.75
G_M22950_IG02:  ;; offset=0x0016
       push     -1
       push     28
       push     0
       push     3
       push     0
       push     0
       mov      ecx, 0x8FF7878      ; 'TestData0'
       mov      edx, 0x8FF788C      ; 'MyMethod'
       call     [System.RuntimeType:GetMethodImplCommon(System.String,int,int,System.Reflection.Binder,int,System.Type[],System.Reflection.ParameterModifier[]):System.Reflection.MethodInfo:this]
       push     eax
       push     1
       mov      ecx, 0x8FF78AC      ; 'System.Func`10[Test35000+TestData0,System.Int32,System.Int32,Sy'
       xor      edx, edx
       call     [System.Delegate:CreateDelegate(System.Type,System.Object,System.Reflection.MethodInfo,ubyte):System.Delegate]
       mov      esi, eax
       test     esi, esi
       je       SHORT G_M22950_IG05
						;; size=50 bbWeight=1 PerfScore 16.50
G_M22950_IG03:  ;; offset=0x0048
       mov      ecx, 0x9F7D5F0      ; System.Func`10[Test35000+TestData0,int,int,int,int,int,int,int,int,System.Object]
       cmp      dword ptr [esi], ecx
       je       SHORT G_M22950_IG05
						;; size=9 bbWeight=0.50 PerfScore 2.12
G_M22950_IG04:  ;; offset=0x0051
       mov      edx, eax
       call     [CORINFO_HELP_CHKCASTANY]
       mov      esi, eax
						;; size=10 bbWeight=0.25 PerfScore 0.88
G_M22950_IG05:  ;; offset=0x005B
       mov      gword ptr [ebp-0x28], esi
       mov      ecx, 0x9F7D388      ; Test35000+TestData0
       call     CORINFO_HELP_NEWSFAST
       mov      gword ptr [ebp-0x2C], eax
       mov      ecx, 0x9F7D878      ; Test35000+TestData1
       call     CORINFO_HELP_NEWSFAST
       mov      gword ptr [ebp-0x30], eax
       xor      edx, edx
       mov      dword ptr [ebp-0x20], edx
						;; size=34 bbWeight=1 PerfScore 6.75
G_M22950_IG06:  ;; offset=0x007D
       mov      dword ptr [ebp-0x24], edx
						;; size=3 bbWeight=1 PerfScore 1.00
G_M22950_IG07:  ;; offset=0x0080
       xor      edi, edi
						;; size=2 bbWeight=8 PerfScore 2.00
G_M22950_IG08:  ;; offset=0x0082
       push     1
       push     2
       push     3
       push     4
       push     5
       push     6
       push     7
       push     8
       mov      edx, gword ptr [ebp-0x2C]
       mov      ecx, gword ptr [esi+0x04]
       call     [esi+0x0C]System.Func`10[System.__Canon,int,int,int,int,int,int,int,int,System.__Canon]:Invoke(System.__Canon,int,int,int,int,int,int,int,int):System.__Canon:this
       push     1
       push     2
       push     3
       push     4
       push     5
       push     6
       push     7
       push     8
       mov      edx, gword ptr [ebp-0x30]
       mov      ecx, gword ptr [esi+0x04]
       call     [esi+0x0C]System.Func`10[System.__Canon,int,int,int,int,int,int,int,int,System.__Canon]:Invoke(System.__Canon,int,int,int,int,int,int,int,int):System.__Canon:this
       inc      edi
       cmp      edi, 50
       jl       SHORT G_M22950_IG08
						;; size=56 bbWeight=32 PerfScore 944.00
G_M22950_IG09:  ;; offset=0x00BA
       push     1
       push     2
       push     3
       push     4
       push     5
       push     6
       push     7
       push     8
       xor      edx, edx
       mov      ecx, gword ptr [esi+0x04]
       call     [esi+0x0C]System.Func`10[System.__Canon,int,int,int,int,int,int,int,int,System.__Canon]:Invoke(System.__Canon,int,int,int,int,int,int,int,int):System.__Canon:this
						;; size=24 bbWeight=4 PerfScore 53.00
G_M22950_IG10:  ;; offset=0x00D2
       mov      eax, dword ptr [ebp-0x24]
       inc      eax
       mov      dword ptr [ebp-0x24], eax
       cmp      dword ptr [ebp-0x24], 10
       jl       SHORT G_M22950_IG07
						;; size=13 bbWeight=8 PerfScore 42.00
G_M22950_IG11:  ;; offset=0x00DF
       mov      eax, 100
       mov      ecx, 101
       cmp      dword ptr [ebp-0x20], 10
       cmovne   eax, ecx
						;; size=17 bbWeight=1 PerfScore 2.75
G_M22950_IG12:  ;; offset=0x00F0
       lea      esp, [ebp-0x08]
       pop      esi
       pop      edi
       pop      ebp
       ret      
						;; size=7 bbWeight=1 PerfScore 3.00
G_M22950_IG13:  ;; offset=0x00F7
       mov      ecx, eax
       mov      eax, dword ptr [ebp-0x20]
       inc      eax
       mov      dword ptr [ebp-0x20], eax
       call     [System.Console:WriteLine(System.Object)]
       call     CORINFO_HELP_ENDCATCH
       mov      esi, gword ptr [ebp-0x28]
       jmp      SHORT G_M22950_IG10
						;; size=25 bbWeight=0 PerfScore 0.00

; Total bytes of code 272, prolog size 22, PerfScore 1081.75, instruction count 102, allocated bytes for code 272 (MethodHash=0103a659) for method Test35000:TestEntryPoint():int (Tier0-FullOpts)
; ============================================================

We decided to move the catch block since it's cold, but we kept the try block in the loop. The only difference in codegen is the try block now falls into the bottom loop block, and the catch block has to jump to the bottom loop block, rather than the other way around. This looks like a runtime issue, considering the failure is specific to delegates, right?

@AndyAyersMS
Copy link
Member

I wonder if this is one of those cases where we have to put a NOP after the call, if it's at the end of the try range, or else we can't figure out if we were in the try or not.

When you did your experiment of calling a method directly, did the call end the try or was there another instruction afterwards?

@amanasifkhalid
Copy link
Member Author

When you did your experiment of calling a method directly, did the call end the try or was there another instruction afterwards?

The try block still ended with a call, which led me to think the codegen is fine. Though let me try disabling the jump-to-next removal optimization, and see if that fixes the failure locally...

@AndyAyersMS
Copy link
Member

When you did your experiment of calling a method directly, did the call end the try or was there another instruction afterwards?

The try block still ended with a call, which led me to think the codegen is fine. Though let me try disabling the jump-to-next removal optimization, and see if that fixes the failure locally...

The clr-abi.md and code say this NOP is only needed for x64. I wonder if it is also needed for x86 but since historically for x86 the catch was always just after the try, the try would always end with something other than a call, so we just never noticed?

@amanasifkhalid
Copy link
Member Author

The clr-abi.md and code say this NOP is only needed for x64. I wonder if it is also needed for x86 but since historically for x86 the catch was always just after the try, the try would always end with something other than a call, so we just never noticed?

I think you're right. Enabling the NOP check for x86 fixes this failure. New codegen:

; Assembly listing for method Test35000:TestEntryPoint():int (Tier0-FullOpts)
; Emitting BLENDED_CODE for X86 with AVX - Windows
; Tier-0 switched to FullOpts code
; optimized code
; ebp based frame
; partially interruptible
; No PGO data
; 2 inlinees with PGO data; 8 single block inlinees; 0 inlinees without PGO data
; Final local variable assignments
;
;  V00 loc0         [V00,T09] (  2,  2   )     ref  ->  eax         class-hnd single-def <System.Reflection.MethodInfo>
;  V01 loc1         [V01,T00] (  7,136   )     ref  ->  [ebp-0x28]  class-hnd EH-live single-def <System.Func`10[Test35000+TestData0,int,int,int,int,int,int,int,int,System.Object]>
;  V02 loc2         [V02,T02] (  2, 32   )     ref  ->  [ebp-0x2C]  class-hnd exact EH-live spill-single-def <Test35000+TestData0>
;  V03 loc3         [V03,T03] (  2, 32   )     ref  ->  [ebp-0x30]  class-hnd exact EH-live spill-single-def <Test35000+TestData1>
;  V04 loc4         [V04,T08] (  4,  2   )     int  ->  [ebp-0x20]  do-not-enreg[Z] EH-live
;  V05 loc5         [V05,T04] (  4, 25   )     int  ->  [ebp-0x24]  do-not-enreg[Z] EH-live
;  V06 loc6         [V06,T01] (  4,104   )     int  ->  edi        
;* V07 loc7         [V07    ] (  0,  0   )     ref  ->  zero-ref    class-hnd <System.NullReferenceException>
;* V08 tmp0         [V08    ] (  0,  0   )     ref  ->  zero-ref    class-hnd exact "NewObj constructor temp" <Test35000+TestData0>
;* V09 tmp1         [V09    ] (  0,  0   )     ref  ->  zero-ref    class-hnd exact "NewObj constructor temp" <Test35000+TestData1>
;  V10 tmp2         [V10,T10] (  2,  0   )     ref  ->  ecx         class-hnd "impSpillSpecialSideEff" <System.NullReferenceException>
;  V11 tmp3         [V11,T06] (  3,  4.50)     ref  ->  eax         single-def "argument with side effect"
;  V12 EHSlots      [V12    ] (  1,  1   )  struct (16) [ebp-0x1C]  do-not-enreg[XS] must-init addr-exposed "lvaShadowSPslotsVar"
;  V13 rat0         [V13,T05] (  5,  7.50)     ref  ->  esi         "replacement local"
;  V14 rat1         [V14,T07] (  3,  2.50)     int  ->  ecx         "CSE for expectedClsNode"
;
; Lcl frame size = 40

G_M22950_IG01:  ;; offset=0x0000
       push     ebp
       mov      ebp, esp
       push     edi
       push     esi
       sub      esp, 40
       xor      eax, eax
       mov      dword ptr [ebp-0x1C], eax
       mov      dword ptr [ebp-0x18], eax
       mov      dword ptr [ebp-0x14], eax
       mov      dword ptr [ebp-0x10], eax
						;; size=22 bbWeight=1 PerfScore 7.75
G_M22950_IG02:  ;; offset=0x0016
       push     -1
       push     28
       push     0
       push     3
       push     0
       push     0
       mov      ecx, 0x810781C      ; 'TestData0'
       mov      edx, 0x8107830      ; 'MyMethod'
       call     [System.RuntimeType:GetMethodImplCommon(System.String,int,int,System.Reflection.Binder,int,System.Type[],System.Reflection.ParameterModifier[]):System.Reflection.MethodInfo:this]
       push     eax
       push     1
       mov      ecx, 0x8107850      ; 'System.Func`10[Test35000+TestData0,System.Int32,System.Int32,Sy'
       xor      edx, edx
       call     [System.Delegate:CreateDelegate(System.Type,System.Object,System.Reflection.MethodInfo,ubyte):System.Delegate]
       mov      esi, eax
       test     esi, esi
       je       SHORT G_M22950_IG05
						;; size=50 bbWeight=1 PerfScore 16.50
G_M22950_IG03:  ;; offset=0x0048
       mov      ecx, 0x908D8F0      ; System.Func`10[Test35000+TestData0,int,int,int,int,int,int,int,int,System.Object]
       cmp      dword ptr [esi], ecx
       je       SHORT G_M22950_IG05
						;; size=9 bbWeight=0.50 PerfScore 2.12
G_M22950_IG04:  ;; offset=0x0051
       mov      edx, eax
       call     [CORINFO_HELP_CHKCASTANY]
       mov      esi, eax
						;; size=10 bbWeight=0.25 PerfScore 0.88
G_M22950_IG05:  ;; offset=0x005B
       mov      gword ptr [ebp-0x28], esi
       mov      ecx, 0x908D688      ; Test35000+TestData0
       call     CORINFO_HELP_NEWSFAST
       mov      gword ptr [ebp-0x2C], eax
       mov      ecx, 0x908DB78      ; Test35000+TestData1
       call     CORINFO_HELP_NEWSFAST
       mov      gword ptr [ebp-0x30], eax
       xor      edx, edx
       mov      dword ptr [ebp-0x20], edx
						;; size=34 bbWeight=1 PerfScore 6.75
G_M22950_IG06:  ;; offset=0x007D
       mov      dword ptr [ebp-0x24], edx
						;; size=3 bbWeight=1 PerfScore 1.00
G_M22950_IG07:  ;; offset=0x0080
       xor      edi, edi
						;; size=2 bbWeight=8 PerfScore 2.00
G_M22950_IG08:  ;; offset=0x0082
       push     1
       push     2
       push     3
       push     4
       push     5
       push     6
       push     7
       push     8
       mov      edx, gword ptr [ebp-0x2C]
       mov      ecx, gword ptr [esi+0x04]
       call     [esi+0x0C]System.Func`10[System.__Canon,int,int,int,int,int,int,int,int,System.__Canon]:Invoke(System.__Canon,int,int,int,int,int,int,int,int):System.__Canon:this
       push     1
       push     2
       push     3
       push     4
       push     5
       push     6
       push     7
       push     8
       mov      edx, gword ptr [ebp-0x30]
       mov      ecx, gword ptr [esi+0x04]
       call     [esi+0x0C]System.Func`10[System.__Canon,int,int,int,int,int,int,int,int,System.__Canon]:Invoke(System.__Canon,int,int,int,int,int,int,int,int):System.__Canon:this
       inc      edi
       cmp      edi, 50
       jl       SHORT G_M22950_IG08
						;; size=56 bbWeight=32 PerfScore 944.00
G_M22950_IG09:  ;; offset=0x00BA
       push     1
       push     2
       push     3
       push     4
       push     5
       push     6
       push     7
       push     8
       xor      edx, edx
       mov      ecx, gword ptr [esi+0x04]
       call     [esi+0x0C]System.Func`10[System.__Canon,int,int,int,int,int,int,int,int,System.__Canon]:Invoke(System.__Canon,int,int,int,int,int,int,int,int):System.__Canon:this
       nop      
						;; size=25 bbWeight=4 PerfScore 54.00
G_M22950_IG10:  ;; offset=0x00D3
       mov      eax, dword ptr [ebp-0x24]
       inc      eax
       mov      dword ptr [ebp-0x24], eax
       cmp      dword ptr [ebp-0x24], 10
       jl       SHORT G_M22950_IG07
						;; size=13 bbWeight=8 PerfScore 42.00
G_M22950_IG11:  ;; offset=0x00E0
       mov      eax, 100
       mov      ecx, 101
       cmp      dword ptr [ebp-0x20], 10
       cmovne   eax, ecx
						;; size=17 bbWeight=1 PerfScore 2.75
G_M22950_IG12:  ;; offset=0x00F1
       lea      esp, [ebp-0x08]
       pop      esi
       pop      edi
       pop      ebp
       ret      
						;; size=7 bbWeight=1 PerfScore 3.00
G_M22950_IG13:  ;; offset=0x00F8
       mov      ecx, eax
       mov      eax, dword ptr [ebp-0x20]
       inc      eax
       mov      dword ptr [ebp-0x20], eax
       call     [System.Console:WriteLine(System.Object)]
       call     CORINFO_HELP_ENDCATCH
       mov      esi, gword ptr [ebp-0x28]
       jmp      SHORT G_M22950_IG10
						;; size=25 bbWeight=0 PerfScore 0.00

; Total bytes of code 273, prolog size 22, PerfScore 1082.75, instruction count 103, allocated bytes for code 273 (MethodHash=0103a659) for method Test35000:TestEntryPoint():int (Tier0-FullOpts)
; ============================================================

Would you like me to turn it on in this PR, or a separate one? It might be interesting to see how many cases this fires for.

@amanasifkhalid
Copy link
Member Author

I wonder if it is also needed for x86 but since historically for x86 the catch was always just after the try, the try would always end with something other than a call, so we just never noticed?

I'm surprised we haven't run into the other cases yet, like a call instruction preceding the start of a try region. Since we haven't faced issues for those, should we be conservative in our fix by only emitting the NOP for the "call before EH region end" case for now?

@AndyAyersMS
Copy link
Member

I wonder if it is also needed for x86 but since historically for x86 the catch was always just after the try, the try would always end with something other than a call, so we just never noticed?

I'm surprised we haven't run into the other cases yet, like a call instruction preceding the start of a try region. Since we haven't faced issues for those, should we be conservative in our fix by only emitting the NOP for the "call before EH region end" case for now?

Good question. Seems like if the try-entering side was a problem we'd have already run into it somehow, since there is nothing preventing us from falling into a try?

@AndyAyersMS
Copy link
Member

Would you like me to turn it on in this PR, or a separate one? It might be interesting to see how many cases this fires for.

Our guess is that it won't ever fire without this PR. If you verify that I don't see why a separate PR would be interesting.

@amanasifkhalid
Copy link
Member Author

Our guess is that it won't ever fire without this PR. If you verify that I don't see why a separate PR would be interesting.

When I experimented with placing a call to a normal method that throws at the end of the try region, there wasn't any code after the call instruction, but it didn't fail, so I suspect there might be other methods that didn't fail, but we'll end up inserting NOPs for with the fix.

@amanasifkhalid
Copy link
Member Author

amanasifkhalid commented May 3, 2024

I added a check to genCodeForBBList to emit a NOP on x86 if a try region with a catch handler ends with a call, and falls into its non-EH successor. Without this PR's changes to fgRelocateEHRegions this produces 40 diffs in the coreclr_tests collection. I'm guessing we were either getting (un)lucky and not needing to handle exceptions during the call from the try region in those cases, or for some reason, the call's return address wasn't being seen as outside the EH region by the runtime when handling exceptions.

I think this is the minimally impactful solution we can use to unblock this for now.

@amanasifkhalid
Copy link
Member Author

/azp run runtime-coreclr outerloop

Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@amanasifkhalid
Copy link
Member Author

@AndyAyersMS emitting a NOP before the end of the try region seemed to fix the failure. Linux ARM64 failure is unrelated. Should we update the ABI documentation to cover this case?

@AndyAyersMS
Copy link
Member

I would like to get confirmation from the runtime side that NOP padding for x86 EH makes sense.

@mangod9 who can we ask about this?

@mangod9
Copy link
Member

mangod9 commented May 6, 2024

@janvorli.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants