Struct capture wrongly applying previous captures in a failed branch #216

Open
petee-d opened this issue Mar 1, 2022 · 2 comments

Comments

@petee-d
Contributor

petee-d commented Mar 1, 2022

Hey, while working on parser code generation and implementing lookahead error recovery, I noticed a bug. Consider this example:

type BugStructCapturesInBadBranch struct {
	Bad bool                `parser:"(@'!'"`
	A   BugFirstAlternative `parser:" @@) |"`
	B   int                 `parser:"('!' '#' @Int)"`
}

type BugFirstAlternative struct {
	Value string `parser:"'#' @Ident"`
}

func TestBug_GroupCapturesInBadBranch(t *testing.T) {
	var out BugStructCapturesInBadBranch
	require.NoError(t, MustBuild(&BugStructCapturesInBadBranch{}, UseLookahead(2)).ParseString("", "!#4", &out))
	assert.Equal(t, BugStructCapturesInBadBranch{B: 4}, out)
}

I tried to keep the example as minimal as reasonable. It's quite an obscure bug that's unlikely to bother anyone, but I thought I'd report it anyway. strct.Parse will call ctx.Apply even if s.expr.Parse returned an error. The purpose of that is apparently to provide a partial AST in case the entire parse fails, but it has an unwanted side effect: any captures added to parseContext.apply by the branch so far will be applied, even though the error may later be caught by a disjunction or a ?/*/+ group and recovered. I think this can only happen if lookahead is at least 2, as it requires one token for that unwanted capture and a second token for the strct to return an error instead of nil out.

In the example above, the input is constructed to match the second disjunction alternative, but the first tokens will initially lead the parser into the first alternative and into the BugFirstAlternative struct. When attempting to match Ident for the Value field, the sequence will fail and return an error, but ctx.apply will already contain a capture for BugStructCapturesInBadBranch.Bad, which will be applied in strct.Parse, even though the disjunction recovers from the error and matches the second alternative.

I don't think it's super important that this is fixed, but my code generation parser's behavior will differ from this, because I'm trying to take a different approach to recovering from failed branches: restoring a backup of the parsed struct when a branch fails, instead of delaying the application of captures.

@petee-d
Contributor Author

petee-d commented Aug 13, 2022

@alecthomas ping about this. The generated parser will differ from the reflective parser's behavior here and I would like to make sure that the reported behavior is indeed incorrect - definitely seems so.

@alecthomas
Owner

Definitely a bug. I'm pretty surprised this hasn't come up before TBH.
