Explain why Copy is required for a type specified %parse-param #404

Open
FranklinChen opened this issue May 20, 2023 · 7 comments

Comments

@FranklinChen

It might be useful for the book to explain briefly why Copy is required for the type specified in %parse-param. This restriction has led me to have to pass in mutable state using a layer of indirection with &RefCell<State> and repeatedly using borrow_mut() in actions, so I wonder if this is really necessary.
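For readers unfamiliar with the workaround described here, the following is a minimal sketch of the pattern, not lrpar's generated code. The `State` type and the `action_a`/`action_b` functions are hypothetical stand-ins for a user's state and grammar actions. The point it illustrates is that a shared reference `&RefCell<State>` is `Copy`, so it can be handed to each action in turn, while the mutation happens through `borrow_mut()`.

```rust
use std::cell::RefCell;

// Hypothetical state type; stands in for whatever the parser collects.
#[derive(Debug, Default)]
pub struct State {
    pub seen: Vec<u8>,
}

// Compile-time check: this only accepts types that implement `Copy`.
fn assert_copy<T: Copy>(_: T) {}

// Hypothetical stand-ins for two grammar actions. Each receives its own
// copy of the shared reference and borrows mutably only while pushing.
fn action_a(state: &RefCell<State>, byte: u8) {
    state.borrow_mut().seen.push(byte);
}

fn action_b(state: &RefCell<State>, byte: u8) {
    state.borrow_mut().seen.push(byte);
}

pub fn run_actions() -> Vec<u8> {
    let state = RefCell::new(State::default());
    let param: &RefCell<State> = &state;
    assert_copy(param); // `&T` is `Copy` for every `T`
    action_a(param, 1);
    action_b(param, 2);
    state.into_inner().seen
}
```

The cost is that borrow checking moves to run time: two overlapping `borrow_mut()` calls would panic rather than fail to compile.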

@ratmice
Collaborator

ratmice commented May 20, 2023

It is theoretically possible that the restriction could be relaxed so that %parse-param allows an &mut reference, but it is exceedingly difficult. The main reason is the unique ownership of an &mut: the type of the action functions would need to change to a kind of function that cannot capture ownership of the &mut (so that when the reference passed to one action falls out of scope, it can be passed to the next action). Rust can type such functions with higher-ranked trait bounds (HRTBs), i.e. higher-ranked lifetimes, using for<'a>.
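To make the higher-ranked typing concrete, here is a small sketch in plain Rust. It is an illustration only, not how lrpar types its actions; `State`, `Action`, `push`, `push_twice`, and `drive` are all hypothetical names. Because `Action` must accept a reference of any lifetime `'a`, each call only borrows the state for the duration of that call, and the driver can then lend the same `&mut` to the next action.

```rust
// Hypothetical state type.
pub struct State {
    pub log: Vec<u8>,
}

// Higher-ranked function pointer type: an `Action` must work for *every*
// lifetime 'a, so it cannot hold on to the reference after it returns.
// (`fn(&mut State, u8)` is shorthand for the same type.)
type Action = for<'a> fn(&'a mut State, u8);

fn push(state: &mut State, b: u8) {
    state.log.push(b);
}

fn push_twice(state: &mut State, b: u8) {
    state.log.push(b);
    state.log.push(b);
}

// Hypothetical driver: lends the unique reference to one action at a time.
pub fn drive(state: &mut State, input: &[u8]) {
    let actions: [Action; 2] = [push, push_twice];
    for (i, &b) in input.iter().enumerate() {
        // Implicit reborrow: `*state` is borrowed only for this call,
        // so `state` is usable again on the next iteration.
        actions[i % 2](state, b);
    }
}

pub fn drive_example() -> Vec<u8> {
    let mut s = State { log: Vec::new() };
    drive(&mut s, &[1, 2, 3]);
    s.log
}
```

Unlike a shared reference, `&mut State` is not `Copy`, which is why the driver has to reborrow on every call instead of passing the same value around.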

One of the things that makes this difficult may be the pattern matching involved in the generated code, and the difference between FunctionParamPattern and Patterns.

Copy patterns are fairly easy to implement without actually parsing the %parse-param, by using the argument verbatim in the generated code. There were a lot of attempts to get mutable references working in #214, but I kind of ran out of steam on that, and the RefCell workaround.

@ltratt
Member

ltratt commented May 20, 2023

It would be nice if it supported &mut but as @ratmice said, I'm not sure how practical that is. One of the challenges of very-clever type systems is that it can be hard to experiment on existing codebases, and we've definitely encountered that challenge!

Another possibility for your case might be to use interior mutability: you could pass around a type T: Copy which internally has a RefCell or whatnot, hiding some of the horrors of borrow_mut. It's still not completely ideal, of course.
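One way to read this suggestion is sketched below. The `Sink` type and its methods are hypothetical, invented for illustration. The handle is `Copy` because it only wraps a shared reference, and the `borrow_mut()` calls are confined to its methods so that actions never see them.

```rust
use std::cell::RefCell;

// Hypothetical handle type: a shared reference wrapped in a struct.
// It is `Copy` because `&'a RefCell<Vec<u8>>` is `Copy`.
#[derive(Clone, Copy)]
pub struct Sink<'a> {
    inner: &'a RefCell<Vec<u8>>,
}

impl<'a> Sink<'a> {
    pub fn new(inner: &'a RefCell<Vec<u8>>) -> Self {
        Sink { inner }
    }

    // The only place `borrow_mut` appears; callers just call `push`.
    pub fn push(self, b: u8) {
        self.inner.borrow_mut().push(b);
    }

    pub fn count(self) -> usize {
        self.inner.borrow().len()
    }
}

pub fn sink_example() -> (usize, Vec<u8>) {
    let buf = RefCell::new(Vec::new());
    let sink = Sink::new(&buf);
    let copy = sink; // copied, not moved: `sink` stays usable
    sink.push(10);
    copy.push(20);
    let n = sink.count();
    (n, buf.into_inner())
}
```

Under this assumption the directive would name the handle type rather than the `RefCell` itself; the run-time borrow checks are still there, just out of sight.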

@FranklinChen
Author

I guess that lalrpop (which I was using before) doesn't have this problem with its parameter-passing mechanics (https://lalrpop.github.io/lalrpop/tutorial/009_state_parameter.html) because it generates a parser struct for each exported rule, whereas lrpar generates static functions. In lalrpop, I passed mutable state using the directive

grammar<'extra>(state: &'extra mut Vec<u8>);

whereas for lrpar I am using

%parse-param state: &RefCell<Vec<u8>>

@ltratt
Member

ltratt commented May 20, 2023

To some extent, the current design reflects the fact that I always write parsers that bubble state up rather than mutate state: honestly, it didn't really occur to me to deal with mutable state! Could lrpar be adapted to deal with mutable state? I guess it probably could. I must admit that I don't think I'll be the person who does that though :/ Sorry, that's not a very satisfactory answer on my part!

@FranklinChen
Author

I personally prefer to program purely functionally, but for performance I've been collecting some things mutably during parsing, which is very cheap when using a Vec and simply pushing to it. I've considered passing everything upward instead, at the expense of basically threading state manually through everything, and using Rust's doubly linked LinkedList to collect things recursively without paying a quadratic concatenation penalty. I haven't done that yet for comparative benchmarking purposes.
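A rough sketch of the pass-everything-upward alternative follows, using `std::collections::LinkedList`. The `leaf` and `join` functions are hypothetical stand-ins for grammar actions. `LinkedList::append` moves all nodes of one list onto the end of another in constant time, which is what avoids the repeated copying that concatenating `Vec`s at every level would incur.

```rust
use std::collections::LinkedList;

// Hypothetical action for a leaf rule: returns its own one-element list.
fn leaf(b: u8) -> LinkedList<u8> {
    let mut l = LinkedList::new();
    l.push_back(b);
    l
}

// Hypothetical action for a rule with two children: joins their results.
// `append` relinks the nodes of `right` onto `left` in O(1) and leaves
// `right` empty; no elements are copied.
fn join(mut left: LinkedList<u8>, mut right: LinkedList<u8>) -> LinkedList<u8> {
    left.append(&mut right);
    left
}

pub fn collect_bottom_up() -> Vec<u8> {
    // Results bubble up from the leaves; no shared mutable state is needed.
    let joined = join(join(leaf(1), leaf(2)), join(leaf(3), leaf(4)));
    joined.into_iter().collect()
}
```

Whether this beats pushing to a single `Vec` is exactly the benchmarking question left open above; a linked list allocates per node, so the answer is not obvious.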

The other thing I want to do, where mutable state seems particularly sensible, is to catch semantic errors during parsing and log error objects into a Vec before doing recovery and going on, since I don't want to fail fast in parsing but want to generate as many errors as possible for the end user.
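The error-collection idea can be sketched the same way with the `&RefCell<...>` workaround. Everything here is hypothetical (`SemanticError`, `digit_action`, `sum_digits`) and stands in for real grammar actions: an action that detects a semantic problem records it, returns a placeholder value, and lets processing continue so that later errors are reported too.

```rust
use std::cell::RefCell;

// Hypothetical semantic error record; `pos` is a character index.
#[derive(Debug, PartialEq)]
pub struct SemanticError {
    pub pos: usize,
    pub msg: String,
}

// Hypothetical action: validates one character. On failure it logs an
// error and returns a placeholder (0) instead of aborting.
fn digit_action(errors: &RefCell<Vec<SemanticError>>, pos: usize, c: char) -> u32 {
    match c.to_digit(10) {
        Some(d) => d,
        None => {
            errors.borrow_mut().push(SemanticError {
                pos,
                msg: format!("not a digit: {:?}", c),
            });
            0
        }
    }
}

// Runs the action over every character and returns both the result and
// all errors collected along the way.
pub fn sum_digits(input: &str) -> (u32, Vec<SemanticError>) {
    let errors = RefCell::new(Vec::new());
    let total: u32 = input
        .chars()
        .enumerate()
        .map(|(i, c)| digit_action(&errors, i, c))
        .sum();
    (total, errors.into_inner())
}
```

For example, `sum_digits("1a2b")` yields a total of 3 together with two recorded errors, one for each non-digit character.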

@ratmice
Collaborator

ratmice commented May 20, 2023

I don't mind having another look at it, though I can't promise I'll be any more successful than last time I attempted to do so.
I think that the %parse-param work that made its way in-tree may have simplified things.

The primary difference is in these two commits, where we went from accepting multiple named arguments to a single named argument.
3298c50
#216

So, if we can in fact go with the simpler single named argument approach, I think there is reason to hope that much of the difficulty I encountered in my previous attempt can be avoided.

@ltratt
Member

ltratt commented May 21, 2023

If it is doable, that would be great!


3 participants