Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement handling of left recursion #47

Open
zevv opened this issue Nov 4, 2022 · 5 comments
Open

Implement handling of left recursion #47

zevv opened this issue Nov 4, 2022 · 5 comments
Labels
help wanted Extra attention is needed

Comments

@zevv
Copy link
Owner

zevv commented Nov 4, 2022

Handling left recursion is problematic in PEGs, see https://en.wikipedia.org/wiki/Parsing_expression_grammar#Indirect_left_recursion for a nice explanation of the problem.

@zevv zevv added the help wanted Extra attention is needed label Nov 4, 2022
@zevv
Copy link
Owner Author

zevv commented Nov 7, 2022

People smarter than me (@sqmedeiros) have thought hard about this problem, and there seems to be a viable solution for this, described in this paper: https://github.com/zevv/npeg/blob/master/doc/papers/Left_recursion_in_parsing_expression_grammars.pdf

I've been trying to get a good grasp of this paper, but the formal description seems to be quite over my head and I'm not sure how this model would fit into NPeg.

@sqmedeiros
Copy link

Hi, @zevv . Thanks for mentioning my paper.

I will try to explain the idea without the formalism. Essentially, you should have a map where the keys have the form (A, i), where A is left-recursive variable and i is a position of the input. The initial value associated with this key should be a value as fail, indicating that it's not possible to match A at position i. The value stored in this map will be the result of matching A at input position i.

When we try to match the right-hand side of A, the matching of the left-recursive invocation of A will fail, but the matching of the right-hand side o A can still succeed, usually using a non-left-recursive alternative. See the example below:
A -> A a / b

Input: "bac"
Position: 123

When matching the first letter of the input, the alternative A a will fail, as the result of matching A at position 1 is fail. Fortunately, the second alternative succeeds. In this case, we need to try to increment the input prefix matched by A. Thus, we should update our map to something as mymap[(A, 1)] = 2 and try to match the right-hand side of A again at input position 1. In case it is possible to match a larger input prefix, we update our map again and try another matching of the right-hand side of A at input position 1, otherwise we stop the matching and give 2 as the result of matching A at position 1. In this example,

I hope this helps. Feel free to ask anything else.

By the way, this extended version https://arxiv.org/pdf/1207.0443.pdf also discusses (see Section 4) the problem of operator precedence.

@zevv
Copy link
Owner Author

zevv commented Nov 9, 2022

Hi @sqmedeiros, thanks for taking the time to respond, much appreciated.

After chewing some more time on the paper, rereading a few times and and spending the evening at the kitchen table with pencil and notepad I was able to get a better grasp of the process; my current understanding more or less matches your above description, so it seems I'm on the right track! I will let this sink in for a bit and see if I can prototype an implementation one of these days.

Thanks for the updated paper, I'll include this one as well. Operator precedence is - I think - mostly a solved issue in NPeg; the NPeg language has been extended with some constructs to indicate precedence for specific rules, which NPeg then uses for precedence-climbing in the parser. (https://github.com/zevv/npeg#precedence-operators)

@zevv
Copy link
Owner Author

zevv commented Nov 9, 2022

Ah, what a shame I did not find the updated paper before: I think chapter 5 of the updated version (A Parsing Machine for Left-recursive PEGs) is going to help me out a lot.

Thanks again!

@sqmedeiros
Copy link

Yes, chapter 5 is another way to describe how left-recursion could work in PEGs. You could use a similar idea without having to follow an approach based on a parsing machine. By the way, when implementing left-recursion, I think it is important to distinguish left-recursive rules from non-left recursive ones, so we only try to increase the matching of left-recursive rules.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
help wanted Extra attention is needed
Projects
None yet
Development

No branches or pull requests

2 participants