Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Apparently infinite recursive rule #290

Open
ratmice opened this issue May 12, 2022 · 5 comments
Open

Apparently infinite recursive rule #290

ratmice opened this issue May 12, 2022 · 5 comments

Comments

@ratmice
Copy link
Collaborator

ratmice commented May 12, 2022

One of the "fun" things about my project is running the parser on strange, half edited/incomplete changes.
Here is one such case I encountered that way, and have minimized.

given the input character a, this will cause an infinite loop pushing to pstack between
https://github.com/softdevteam/grmtools/blob/master/lrpar/src/lib/parser.rs#L297
https://github.com/softdevteam/grmtools/blob/master/lrtable/src/lib/statetable.rs#L461-L466
adding a case like: Some(i) if i == usize::from(stidx) + 1 => None, to goto fixes it, (i.e. the return value of goto == prior).

Filing this as a bug report rather than sending a PR though, because I haven't yet tested it against valid parsers, or
as of yet tried to surmise if this case can only and always lead to infinite recursion or if it ever actually comes up in a valid way.

%%
a "a"
[\t\n ] ;
%%
Start: Bar;
Foo: "a" | ;
Bar: Foo | Foo Bar;
@ltratt
Copy link
Member

ltratt commented May 12, 2022

From memory, yacc (or Bison maybe?) statically detects such cases and refuses to compile them (or, at least, issues a warning). If that memory is correct, we should consider doing something similar IMHO.

ratmice added a commit to ratmice/grmtools that referenced this issue May 15, 2022
StateTableError previously required a PIdx.
At the point we discover this self loop we don't have one.

In order to convey this error change StateTableError into an enum,
combining the `pidx` field with its kind.
ratmice added a commit to ratmice/grmtools that referenced this issue Aug 17, 2022
ratmice added a commit to ratmice/grmtools that referenced this issue Aug 17, 2022
ratmice added a commit to ratmice/grmtools that referenced this issue Aug 17, 2022
@ratmice
Copy link
Collaborator Author

ratmice commented Aug 18, 2022

I wrote a gist that contains, some graphs and a breakdown of the issue, some potential ways we could go about fixing it.
the potential impacts of trying to do exactly what yacc/bison do here. I think the last potential fix mentioned (providing a mechanism to produce a pruned StateGraph), seems like a nice way to maintain the existing relation shared between StateTable and StateGraph's StIdx.

https://gist.github.com/ratmice/180cc7daa6e28a285885affd2e623f71

I mostly wanted to think through the semver ramifications of fixing this, and I think there are 2 options for fixing it with low impact to the code or semver, however changing the goto table the way yacc appears to seems to be quite impactful.

If you think that one of these low impact options is viable, I'm kind of inclined not to be in any rush to fix this,
since running a parser with conflicts, without %expect, %expect-rr is probably unwise, and I'm fairly convinced
this can only happen in the presence of conflicts.

@ltratt
Copy link
Member

ltratt commented Aug 19, 2022

Broadly speaking I'm in favour of making these errors (with the minor caveat that I'd like https://softdevteam.github.io/grmtools/master/book/errorrecovery.html#turning-lexing-errors-into-parsing-errors (where we use an otherwise-unused rule) to remain possible, but there might be a better way of doing it!). I find it hard to imagine anyone writes a grammar intending to leave such problems in.

@ratmice
Copy link
Collaborator Author

ratmice commented Aug 19, 2022

I would think that perhaps a new declaration like %allow-unused Unmatched could be used to disable such an error
(Or perhaps something which indicates its usage by error recovery?).

I'm mildly cognizant though of where it seems likely when you are initially writing a rule, you are less likely to have referenced within the rest of your grammar, which is kind of why I was thinking of the same way that conflicts() and missing_from_lexer/missing_from_parser returned from set_rule_ids, in that those don't actually produce a StateTableError, but are treated as warnings/errors by the individual tools.

One last thing, is I'm getting the impression that unused (due to conflicts) it seems like is just a way of deriving which state of the conflict was dropped (And I think in yacc just Shift/Reduce ones), I think it is basically able to be derived already from the information within Conflicts, since I believe it returns the PIdx of the production which was discarded during conflict resolution.
So it seems like we might have everything to produce that within the tools already.

@ltratt
Copy link
Member

ltratt commented Aug 19, 2022

I would think that perhaps a new declaration like %allow-unused Unmatched could be used to disable such an error

This feels quite yacc-y to me! I can certainly live with that.

[I must admit that the other parts of statetable generation / analysis have receded so far into the back of my brain that I can't remember how this edge-case stuff might work!]

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants