Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[RISC-V] Use common CodeGen::genHomeRegisterParams #101288
[RISC-V] Use common CodeGen::genHomeRegisterParams #101288
Changes from 4 commits
25ab569
6c01d95
5423299
06e9d65
85c2217
2a97285
5a0aede
4806478
123fcee
9c565ac
c86cb09
7989c18
5225a43
f7b0bcc
bfaf9ce
2d2717e
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't this going to result in a lot of extra stack stores for RISCV?
At this point it seems like just factoring parts of this helper out and calling it from the same place as
genHomeRegisterParams
on the split stack segment will end up being a simpler change overall.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It didn't make unnecessary stores from what I examined the diffs but you're right, a RISC-V specific routine (I tried to phrase it so LoongArch can also reuse) turned out much handier.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Calling genHomeSplitStructParameters only with handleStack=true feels a bit hackish but it does solve the problem and generates the same instructions as previously.
@jakobbotsch Let me know if you think 06e9d65 is the right approach for split structs in general.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Calling it only for the stack segments makes sense to me. We should be able to do that for Swift as well now that
genHomeRegisterParams
is sufficiently general (no need to include that change as part of this PR).I'm surprised you need this very general treatment for RISC-V. This handling is needed for the Swift ABI because we cannot rebuild the struct layout simply by storing parts passed in registers below the parts passed on stack. The layout does not match and there can be intervening return address on the stack for x64.
For our other ABIs (windows arm64 with varargs, arm32, my impression is LA64 too) the stack segments of the split parameters will always be at SP on entry to the function. Spilling the register parts right below that will result in the correct struct layout. So there is no need to move the stack segments around at all. Is it not also the case for RISC-V? The diff in #101340 has some more context that may shed some light.
For example: this test has a split parameter
V04
on windows-arm64:To reassemble the struct we allocate
V04
at the very top of the frame and store just the register part there:There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I may not need it that general for RISC-V, I went for sharing the Swift code because I'm not that familiar with the common parts of the codegen and it did sth similar to what we had for handling of split structs here in genHomeRegisterParams. For prolog of method_1I4S in test struct16args it used to generate a sequence like this:
The common genHomeRegisterParams doesn't generate the last load-store.
Hm, that would be reconstructing the struct cross-frame, interesting. If I understand correctly, "Spilling the register parts right below" in the above asm example would be
sd a7 216(sp)
instead ofsd a7, 144(fp)
(and the following load-store)? If so, I need to check if it's possible to have a contiguous space for the struct that way. Where are the spilling instructions normally generated for other archs?Varargs are now unsupported on RISC-V but the above quoted test is similar.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, exactly. It would be possible if the stack segment is always guaranteed to start at
sp+0
when the split happens. I think the calling conventions are generally designed to allow this because it makes handling ofvarargs
possible (the callee can spill all parameters and know that everything is laid out contiguously on stack).arm64 handles this in
genHomeRegisterParams
; it falls out naturally due to how things work out. arm32 has its own mechanism calls "prespills" where it will push registers on entry to the function to make this the case. It is effectively the same thing but represented a different way in the JIT, so I would like to unify it with arm64 at some point (arm32 was the first non-x86-x64 target, so it has a lot of special behavior. We have been working to get rid of it.)lvaAssignVirtualFrameOffsetsToArgs
is responsible for assigning these offsets. It allocates offsets relative to the "virtual 0", which is the first offset of space passed on the stack by the caller. For your split parameter I think it should already be allocating it at address -8, which is what is expected to handle the split properly without moving the stack segment. I am not sure what ends up overwriting this assignment. (FWIW I am rewriting it in #101340 to be based on the new ABI info, so it may be simpler to read the diff there. That PR will definitely allocate your split parameter at -8.)Right, I didn't mean to imply that this was a varargs thing, it was just an example (win-arm64 splits parameters, but only if there is also varargs. arm32 splits parameters always, so I could have used some cases from there as an example too).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lvaAssignVirtualFrameOffsetsToArgs
looks healthy, V04 is at -8 like you expected. But then looks likelvaAllocLocalAndSetVirtualOffset
called here is setting the stack offset to -0x48(-72) which ends up being 144 after delta bump.Not sure if V04 shouldn't fall into one of the
continue
statements beforelvaAllocLocalAndSetVirtualOffset
is called...EDIT: ... maybe here?
runtime/src/coreclr/jit/lclvars.cpp
Lines 7120 to 7131 in 0a220d5
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do see other stuff allocated on the frame first, before the parameters, so the frame bump can probably not be done in the loop like I suggest and instead needs handling before we go into the loop over the locals.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense, thank you. I'll try to do that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Putting the split argument at the top of the frame requires moving everything down the stack, including stack pointer and return address, basically what arm does for varargs (for RISC-V we could limit that only to the last argument register). This is a bit more involved than I anticipated. Since this is just an optimization, I think it's better to revisit it after #101340 is merged to have a cleaner slate to work on.
Meanwhile, could we review & merge this PR as is (temporarily using Swift code for split arg reconstruction)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If it's more involved than anticipated then I think it's fine to just go with this approach permanently. No need to spend more effort to make the split parameters contiguous on my behalf.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No worries. I like the idea but there's still a bit work to do on RISC-V before we get to optimizations like this.