Support lookaround assertions? #134

alecthomas · 2021-03-22T19:02:25Z

Should be fairly straightforward to implement, but is it useful?

Look ahead assertions.
Look behind assertions.

petee-d · 2021-05-06T09:04:37Z

I'm not 100% sure we mean the same thing, but it sounds like I have a use case for something like that when writing a parser for Jinja (specifically, implicit test arguments). In the parser, I need to match some complex struct term only if the next token isn't one of certain 3 keywords. Here is a grossly oversimplified example:

type Condition struct {
	True  Expression `@@`
	Cond  Test       `'if' @@ 'else'`
	False Expression `@@`
}

type Test struct {
	Left     Expression  `@@`
	Name     string      `'is' @Ident`
	Argument *Expression `@@?` // Allowed to match only if next token isn't 'else' or a few other keywords
}

type Expression struct {
	// Would be much more complex in reality (operators with 10+ precedence groups, a huge recursive mess)
	Variable string `@Ident`
}

func TestNegativeLookahead(t *testing.T) {
	parser := participle.MustBuild(&Condition{})
	result := &Condition{}
	assert.NoError(t, parser.ParseString("", "one if x is odd else two", result))
	assert.NoError(t, parser.ParseString("", "small if x is below y else large", result))
}

I basically would need some way of modifying the optional Test.Argument (some "tests" may take arguments, some may not, it's unknown to the parser) so that the rule matches only if the next token isn't the keyword else (or 2 other keywords). In real Jinja, they did this with a hand-written recursive descent parser that peeks at the next token and doesn't try to match an expression if it sees one of those keywords. As you can imagine, the real grammar is much more complex (45 structs, ~3400 characters of generated EBNF, much more complex than https://github.com/alecthomas/participle/blob/master/_examples/sql/main.go), so hacks like a separate structure of Expression structs that won't match those keywords for Variable isn't an option.

Do you think this can somehow be done with existing participle grammar options? I've tried !<expr> but that apparently can't be used with a struct (naively, I tried something like @!('else'|'and'|'or')@, to no avail), plus I guess it would match only one token anyway, the expression could in reality be any number of tokens. Any hacks that wouldn't be too nasty? Or do you think participle could be extended somehow to support this use case? Some kind of negative lookahead?

If you look into this, thank you very much in advance! This seems to be about the only thing that my Jinja parser can't handle yet.

petee-d · 2021-05-07T08:35:26Z

@alecthomas is this something you would be able to look at any time soon? Please let me know, as I could give it a try myself, but would like to know what kind of a syntax you would like for the negative lookahead. I think the ideal choice would be the negative lookahead syntax in https://pythonhosted.org/pyrser/dsl.html, so for example:

	Argument *Expression `( !('else'|'and'|'or')  @@ )?` // Match Expression only if the next token isn't else/and/or

I.e. within a sequence, any expression prefixed by ! would be treated as negative lookahead and the sequence will fail before matching anything else if the next token matches the rule, but the negative lookahead itself will not consume any tokens. This is of course in conflict with the existing negation syntax, which behaves the same except it does consume a token. I think it's somewhat unfortunate as with the behavior I'm suggesting the existing use case for negation

	EverythingUntilSemicolon *[]string `@!';'* @';'`

could be written as

	EverythingUntilSemicolon *[]string `@(!';' Ident)* @';'` // Ident could be another token or union of tokens, of course

But since breaking backwards compatibility would suck, I think we could do something like this for my first example:

	Argument *Expression `( !?('else'|'and'|'or')  @@ )?` // Match Expression only if the next token isn't else/and/or

I.e. ! followed by ? will tell the negation not to consume any token. Since it would be using the existing negation code and just work with a boolean flag, I think it wouldn't even be that hard to implement.

What do you think? Do you have another idea for what the syntax should be? Would you like to do it yourself, or should I take a crack at it?

petee-d · 2021-05-07T12:35:16Z

Was kind of pressing me, so I just did it. :) Chose !?<expr> as it's kind of similar to the (?!<expr>) syntax of negative lookahead in PERL-like regular expressions. Changes to the syntax are of course welcome, I just need the functionality.

alecthomas · 2021-05-07T23:14:05Z

Mmm actually I did merge it, but I think I'd prefer it to be ?! to match regex. Would you mind updating it to that?

alecthomas · 2021-05-08T02:04:33Z

I think, regarding syntax, that it would be preferable to emulate Perl regex syntax. It should be part of the grouping construct, which will probably make this a bit more of a complex change than your one, but will be more consistent.

alecthomas · 2021-05-08T02:04:48Z

If you'd like to take a crack at it, I would be most appreciative.

petee-d · 2021-05-08T08:30:32Z

Could try. How about I still use &negation{consume: false, node: ...} in the parse tree, just have an if branch in parseGroup that creates it? Would still be quite a simple change and work correctly, but would help to avoid duplicating a lot of negation.Parse code. The only problem is it would potentially be a bit weird parseGroup can create negation.

petee-d · 2021-05-08T08:40:34Z

Actually, just tried it and it wasn't much good. I'll try something a bit different... :)

petee-d · 2021-05-08T10:56:33Z

I changed it up by separating the code from negation completely (as negation had some unnecessary restrictions anyway) and took the opportunity to also implement positive lookahead with the PERL-like (?= ...) syntax. 🙂
#145

alecthomas added enhancement question proposal and removed question labels Mar 22, 2021

petee-d mentioned this issue May 7, 2021

Add support for parser negative lookahead #134 #144

Merged

petee-d mentioned this issue May 8, 2021

Lookahead groups #145

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Support lookaround assertions? #134

Support lookaround assertions? #134

alecthomas commented Mar 22, 2021 •

edited

petee-d commented May 6, 2021 •

edited

petee-d commented May 7, 2021 •

edited

petee-d commented May 7, 2021 •

edited

alecthomas commented May 7, 2021

alecthomas commented May 8, 2021

alecthomas commented May 8, 2021

petee-d commented May 8, 2021 •

edited

petee-d commented May 8, 2021

petee-d commented May 8, 2021

Support lookaround assertions? #134

Support lookaround assertions? #134

Comments

alecthomas commented Mar 22, 2021 • edited

petee-d commented May 6, 2021 • edited

petee-d commented May 7, 2021 • edited

petee-d commented May 7, 2021 • edited

alecthomas commented May 7, 2021

alecthomas commented May 8, 2021

alecthomas commented May 8, 2021

petee-d commented May 8, 2021 • edited

petee-d commented May 8, 2021

petee-d commented May 8, 2021

alecthomas commented Mar 22, 2021 •

edited

petee-d commented May 6, 2021 •

edited

petee-d commented May 7, 2021 •

edited

petee-d commented May 7, 2021 •

edited

petee-d commented May 8, 2021 •

edited