Recovery at end of input not working? #1116

Open
tlrobinson opened this issue Feb 14, 2020 · 2 comments

tlrobinson commented Feb 14, 2020

I'm trying to get recovery to insert a token at the end of the input but it doesn't appear to be working. Here's a really simple grammar to parse function calls like foo(). I want it to also parse foo( using token insertion. Am I doing something wrong?

(function expressionExample() {
  // ----------------- Lexer -----------------
  const createToken = chevrotain.createToken;
  const Lexer = chevrotain.Lexer;

  const Identifier = createToken({name: "Identifier", pattern: /[a-zA-Z]+/});
  const LCurly = createToken({name: "LCurly", pattern: /\(/});
  const RCurly = createToken({name: "RCurly", pattern: /\)/});

  const expressionTokens = [Identifier, LCurly, RCurly];

  const ExpressionLexer = new Lexer(expressionTokens, {
    positionTracking: "onlyStart"
  });

  // Labels only affect error messages and diagrams.
  // The tokens match parentheses, so the labels show parentheses too.
  LCurly.LABEL = "'('";
  RCurly.LABEL = "')'";

  // ----------------- parser -----------------
  const Parser = chevrotain.Parser;

  class ExpressionParser extends Parser {
    constructor() {
      super(expressionTokens, {
        recoveryEnabled: true
      })

      const $ = this;

      $.RULE("expression", () => {
        $.CONSUME(Identifier);
        $.CONSUME(LCurly);
        $.CONSUME(RCurly);
      });

      // very important to call this after all the rules have been setup.
      // otherwise the parser may not work correctly as it will lack information
      // derived from the self analysis.
      this.performSelfAnalysis();
    }
  }

  // for the playground to work the returned object must contain these fields
  return {
    lexer: ExpressionLexer,
    parser: ExpressionParser,
    defaultRule: "expression"
  };
}())

bd82 (Member) commented Feb 14, 2020

Hi @tlrobinson

The simple example certainly helps. 👍

I don't think the recovery logic handles the edge case of EOI.

  canRecoverWithSingleTokenInsertion(
    this: MixedInParser,
    expectedTokType: TokenType,
    follows: TokenType[]
  ): boolean {
    if (!this.canTokenTypeBeInsertedInRecovery(expectedTokType)) {
      return false
    }

    // must know the possible following tokens to perform single token insertion
    if (isEmpty(follows)) {
      return false
    }

    let mismatchedTok = this.LA(1)
    let isMisMatchedTokInFollows =
      find(follows, (possibleFollowsTokType: TokenType) => {
        return this.tokenMatcher(mismatchedTok, possibleFollowsTokType)
      }) !== undefined

    return isMisMatchedTokInFollows
  }

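So to perform single token insertion, the encountered (mismatched) token must match a possible NEXT
token. In principle this condition is met in your scenario, where the first line is the actual input and the second is the expected one:

  • foo ( EOF
  • foo ( ) EOF

The mismatched token is EOF, which is exactly what would follow the inserted ')'.
However, I do not believe EOF is counted among the possible next tokens, because it is an implicit EOF.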
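
To make the reasoning above concrete, here is a minimal, library-free sketch of the follows check. It is not Chevrotain's code: token types are modelled as plain strings, and the two `follows` arrays are assumptions that stand in for "EOF is not recorded as a possible next token" and "EOF is consumed explicitly in the rule".

```javascript
// Simplified model of the follows check quoted above (hypothetical, not library code).
// Token types are plain strings here; the real parser uses TokenType objects.
const EOF = "EOF";

// Returns true when a missing expected token may be inserted:
// the token actually seen must be one of the tokens allowed to follow it.
function canRecoverWithSingleTokenInsertion(mismatchedTok, follows) {
  // must know the possible following tokens to perform single token insertion
  if (follows.length === 0) {
    return false;
  }
  return follows.includes(mismatchedTok);
}

// Input `foo (`: the parser expects RCurly but sees EOF.
const seen = EOF;

// Assumption: with an implicit EOF, nothing is recorded as able to follow RCurly.
const followsImplicitEof = [];
// Assumption: with an explicit CONSUME(EOF) after RCurly, EOF is in the follows.
const followsExplicitEof = [EOF];

const implicitResult = canRecoverWithSingleTokenInsertion(seen, followsImplicitEof);
const explicitResult = canRecoverWithSingleTokenInsertion(seen, followsExplicitEof);

console.log(implicitResult); // false
console.log(explicitResult); // true
```

Under these assumptions, insertion is rejected only because the follows list is empty, not because the input itself is unrecoverable.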

I've tried to explicitly add a CONSUME(chevrotain.EOF) at the end of the rule but without luck.
I guess I need to debug this in more depth, I'll update when I find out more.

bd82 (Member) commented Feb 22, 2020

Alright, I've debugged this again, this time using a full dev environment instead of the playground.

Adding an EOF token explicitly seems to resolve the problem.

    $.RULE("expression", () => {
      $.CONSUME(Identifier)
      $.CONSUME(LCurly)
      $.CONSUME(RCurly)
      $.CONSUME(chevrotain.EOF)
    })
  • Note that the EOF should be consumed in the top-level rule (the entry point) of your grammar.

It would be possible to write a patch that infers the existence of EOF as a "possible next token" in such a case. However, because EOF is implicit, this is a bit complicated and may not be warranted or high priority when a simple workaround is available...


2 participants