New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
JIT: Allow moving handler regions in fgRelocateEHRegions #101613
base: main
Are you sure you want to change the base?
Conversation
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch |
cc @dotnet/jit-contrib, @AndyAyersMS PTAL. Diffs look like an improvement: A lot of the size savings seem to come from shorter jumps on non-exceptional paths when we decide to move handlers further down the method, while a lot of the size regressions stem from new jumps now that handlers don't fall through to the non-exceptional successor, as well as larger jumps to the handlers. Since we only move handlers if they're marked as rarely run, I think we want to do this movement as a matter of principle. |
SPMI replay failure is a timeout -- this change shouldn't affect x64, anyway. |
Probably worth running jitstress. Unfortunately, that is blocked by #101628 |
Looks like jitstress is no longer blocked (at least not on x86). I'll run it now |
/azp run runtime-coreclr jitstress, runtime-coreclr libraries-jitstress |
Azure Pipelines successfully started running 2 pipeline(s). |
@AndyAyersMS this is the only test to fail in the outerloop pipelines: The
and with:
We decided to move the catch block since it's cold, but we kept the try block in the loop. The only difference in codegen is the try block now falls into the bottom loop block, and the catch block has to jump to the bottom loop block, rather than the other way around. This looks like a runtime issue, considering the failure is specific to delegates, right? |
I wonder if this is one of those cases where we have to put a NOP after the call, if it's at the end of the try range, or else we can't figure out if we were in the try or not. When you did your experiment of calling a method directly, did the call end the try or was there another instruction afterwards? |
The try block still ended with a call, which led me to think the codegen is fine. Though let me try disabling the jump-to-next removal optimization, and see if that fixes the failure locally... |
The clr-abi.md and code say this NOP is only needed for x64. I wonder if it is also needed for x86 but since historically for x86 the catch was always just after the try, the try would always end with something other than a call, so we just never noticed? |
I think you're right. Enabling the NOP check for x86 fixes this failure. New codegen:
Would you like me to turn it on in this PR, or a separate one? It might be interesting to see how many cases this fires for. |
I'm surprised we haven't run into the other cases yet, like a call instruction preceding the start of a try region. Since we haven't faced issues for those, should we be conservative in our fix by only emitting the NOP for the "call before EH region end" case for now? |
Good question. Seems like if the try-entering side was a problem we'd have already run into it somehow, since there is nothing preventing us from falling into a try? |
Our guess is that it won't ever fire without this PR. If you verify that I don't see why a separate PR would be interesting. |
When I experimented with placing a call to a normal method that throws at the end of the try region, there wasn't any code after the call instruction, but it didn't fail, so I suspect there might be other methods that didn't fail, but we'll end up inserting NOPs for with the fix. |
I added a check to I think this is the minimally impactful solution we can use to unblock this for now. |
/azp run runtime-coreclr outerloop |
Azure Pipelines successfully started running 1 pipeline(s). |
@AndyAyersMS emitting a NOP before the end of the try region seemed to fix the failure. Linux ARM64 failure is unrelated. Should we update the ABI documentation to cover this case? |
I would like to get confirmation from the runtime side that NOP padding for x86 EH makes sense. @mangod9 who can we ask about this? |
Follow-up to #101611.
fgRelocateEHRegions
previously would not try to move handler regions due to implementation restrictions infgDetermineFirstColdBlock
that have since been lifted. This should only affect behavior on Windows x86 when we aren't using funclets.Also, I noticed we never set
Compiler::fgCanRelocateEHRegions
to false anywhere, so it seems safe to remove this member.