Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tail Recursion Modulo Cons for record payloads #6071

Draft
wants to merge 23 commits into
base: main
Choose a base branch
from

Conversation

HajagosNorbert
Copy link
Contributor

Step towards #5979

Status quo: nested structs inside recursive tag union are not recognized yet as trmc candidates. The trmc transformation is only implemented for 0 levels of struct nesting as well. So it works for IR like this

        let `#UserApp.27` : {Str, [<rnu><null>, C {Str, *self}]} =
            Struct {`#UserApp.string_value`, `#UserApp.recursive_call_ret_value`};
        let `#UserApp.26` : [<rnu><null>, C {Str, *self}] = TagId(0) `#UserApp.27`;
        ret `#UserApp.26`;

but not for (pseudo) IR like this (I don't know the types)

        let `#UserApp.struct_containing_rec_call` : {Str, [<rnu><null>, C {Str, *self}]} = 
            Struct {`#UserApp.string_value1`, `#UserApp.recursive_call_ret_value`};
        let `#UserApp.27` : {Str, {Str, [<rnu><null>, C {Str, *self}]} } = 
            Struct {`#UserApp.string_value2`, `#UserApp.struct_containing_rec_call`};
        let `#UserApp.26` : [<rnu><null>, C {Str, C { Str, *self}}] = TagId(0) `#UserApp.27`;
        ret `#UserApp.26`;

Some measures are in place for extending it to arbitrary nested levels (like storing an slice of indices, not a single number for the index of the recursive struct field) .

dev backed:
Getting the pointer from a recursive tag union is prepared for payload combinations of struct and non recursive tag unions, with no limit on nesting. Or at least that's what I read from the new test running successfully.

cargo test-gen-dev trmc passes with the new linked_list_with_record_trmc test case, but running all the test-gen-dev tests end with a free(): invalid pointer error, signal 6

@folkertdev folkertdev self-requested a review November 23, 2023 22:58
Copy link
Contributor

@folkertdev folkertdev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking pretty good. I left some (mostly style) comments.

I'd also like to see a mono test (those in the test_mono crate) for this feature

Comment on lines 3437 to 3438
tag_id: TagIdIntType,
index: u64,
_todo_delete_me_later_tag_id: TagIdIntType,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

later is after this PR is merged?

Comment on lines +5411 to +5416
//TODO: can any of these be a valid target for getting a pointer out of their inner layouts?
//(so not them being pointers)
Union(_) | Builtin(_) | Ptr(_) | RecursivePointer(_) | LambdaSet(_) | FunctionPointer(_)
| Erased(_) => unreachable!(
"Following the path of the indices, a layout has been reached where the address of the desired pointer is behind another pointer"
),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lambda set probably could (it should just recurse on its inner type). Also the error message is not correct for Builtin. The remainder (other union values, ptr, recursiveptr, functionpointer and erased) are all pointer types. I'd say just give builtin its own branch with a custom error message.

Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the lambda set case, I believe the access path should be at most 1 right? Since I believe we will never explicitly TR-optimize a lambda set. If so, let's add a debug assertion for the size of the indices in that case.

#[derive(Debug, Clone)]
enum RecCallLocation<'a> {
DirectlyInTag,
MemberOfStruct(Symbol, &'a [u64]),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this &'a [] means we get more lifetimes in the code and need to pass the arena. Maybe it's better to also make this an arrayvec? very deep paths are unlikely, I'd think 4, or maybe 8 indices at most is fine. I think that simplifies the code quite a bit.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Haven't read that before the new commits. Not good at rust. That change you mentioned, and adding more Vec-s to the mix for the CandidateChain did cause me to eventually insert a drop() fn in there later, which you can see in that file. I know that is not pretty. It is working though :)
Do you think it is worth refactoring to not include Vec-s for the CandidateChain but instead arrayvecs with the same capacity as what a RecCallLocation::MemberOfStruct's arrayvecs would hold? right now, arbitrary nesting is allowed, which is an overkill, but would it be better to pre-allocate maybe 8 indices and symbols (see def of CandidateChain) worth of space for all candidates?

Comment on lines 626 to 630
if call == *value {
Some((i, value))
} else {
None
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can be (call == *value).then_some((i, value))

let_tag(arena.alloc(
//
let_new_hole(arena.alloc(
match call_location {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this nesting is getting pretty severe, maybe this logic can be factored out into its own function?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do

Comment on lines 1445 to 1447
//comment written in: 2023.11.22; indices.len() can only be >2 after trmc happens in
//monomorphization, tail_recursion, but that happens after alias analysis
debug_assert!(indices.len() == 2);
Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is right. Alias analysis only ever happens after all passes in mono, including monomorphization and tail recursion.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I must have read somewhere that alias analysis comes before generally. I don't even know what that means tbh. But I know I found that assertion fail on me later :)

let ptr_reg = self
.storage_manager
.load_to_general_reg(&mut self.buf, structure);

let sym_reg = self.storage_manager.claim_general_reg(&mut self.buf, sym);

let index = &index[1..];
Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider renaming to indices for consistency in the call below.

Comment on lines +5411 to +5416
//TODO: can any of these be a valid target for getting a pointer out of their inner layouts?
//(so not them being pointers)
Union(_) | Builtin(_) | Ptr(_) | RecursivePointer(_) | LambdaSet(_) | FunctionPointer(_)
| Erased(_) => unreachable!(
"Following the path of the indices, a layout has been reached where the address of the desired pointer is behind another pointer"
),
Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the lambda set case, I believe the access path should be at most 1 right? Since I believe we will never explicitly TR-optimize a lambda set. If so, let's add a debug assertion for the size of the indices in that case.

let enclosing_layout = layouts[idx as usize];
return match layout_interner.get_repr(enclosing_layout) {
Struct(layouts) => {
load_union_field_ptr_at_index_help(layout_interner, &indices[1..], layouts, offset)
Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't need to happen now, but it would be nice if we could transform this into a loop instead of performing recursive calls. I think it would also simplify the logic. Something like

let mut indices = &indices[..];
let mut offset = 0;
while !indicies.is_empty() {
  let idx = indices[0];
  for field in &layouts[..idx as usize] {
      offset += layout_interner.stack_size(*field);
  }
  let enclosing_layout = layouts[idx as usize];
  match layout_interner.get_repr(enclosing_layout) {
    Struct(layouts) => { indices = &indices[1..]; }
    Union(UnionLayout::NonRecursive(tags)) => {
      // ...
    }
  }
}
offset

interner: arrayvec::ArrayVec<Symbol, 64>,
call_localtions: arrayvec::ArrayVec<RecCallLocation<'a>, 64>,
Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
call_localtions: arrayvec::ArrayVec<RecCallLocation<'a>, 64>,
call_locations: arrayvec::ArrayVec<RecCallLocation<'a>, 64>,

@HajagosNorbert
Copy link
Contributor Author

Wow, thank you all for all the comments! I haven't checked them before doing more work on the branch and committing. Will definitely go through them all. This is still not the final version. The cargo test-gen-dev trmc tests pass, but other things are broken, related to the use of GetElementPointer.

The local goal was to implement trmc for arbitrary nesting of structs as payloads, which just happened! Yes, arbitrary, so millions of nesting... You are right about it being an overkill.
Now I want to make it actually work when building the whole compiler from source.

Just a heads-up; the deadline of my thesis is approaching terrifyingly fast, so I'll prioritize writing that, even if that means I'll write about a change I made that is only perceivable at the level of tests. I'm not abandoning the code, I'll just need time. And maybe a few help from time-to-time about parts I'm not familiar with (basically everywhere where GetElementPointer is referenced outside of what I worked on). But I'll ask for help when I have the time to act upon it.

@folkertdev
Copy link
Contributor

Just a heads-up; the deadline of my thesis is approaching terrifyingly fast,

I know the feeling! The paper/grade is ultimately what matters, prioritizing that makes total sense

Let us know when you've made a pass at the comments, and we're always happy to help out when you're stuck somewhere (e.g. please just ask us to take a look at a specific failing test if you are not making progress)

Copy link

Thank you for your contribution! Sometimes PRs end up staying open for a long time without activity, which can make the list of open PRs get long and time-consuming to review. To keep things manageable for reviewers, this bot automatically closes PRs that haven’t had activity in 60 days. This PR hasn’t had activity in 30 days, so it will be automatically closed if there is no more activity in the next 30 days. Keep in mind that PRs marked Closed are not deleted, so no matter what, the PR will still be right here in the repo. You can always access it and reopen it anytime you like!

@HajagosNorbert
Copy link
Contributor Author

Graduated from university, so I can get back to this! Will allocate some time for Roc outside work.

@folkertdev
Copy link
Contributor

Congrats! let us know when you need something from us.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants