Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Syntax error recovery #14695

Merged
merged 11 commits into from Mar 21, 2022
Merged

Syntax error recovery #14695

merged 11 commits into from Mar 21, 2022

Conversation

odersky
Copy link
Contributor

@odersky odersky commented Mar 16, 2022

This is a PR to improve syntax error recovery. It's meant to be integrated with one of the PRs by @adampauls to do trailing comma detection in Parser instead of Scanner.

I fixed the two CB issues that had illegal trailing commas. Here's the transcript of what I did. @anatoliykmetyuk might be able to point to a more worked out tutorial for this.

For specs2:

> cd community-build/community-projects/specs2
> git checkout -b fix-trailing-comma

Fix problem in editor

> git add -u
> git status
On branch fix-trailing-comma
Changes to be committed:
  (use "git reset HEAD <file>..." to unstage)

	modified:   core/shared/src/main/scala/org/specs2/matcher/ActionMatchers.scala

> git commit -m "fix trailing comma error"
[fix-trailing-comma e1ae96e7a] fix trailing comma error
 1 file changed, 1 insertion(+), 1 deletion(-)
> git push --set-upstream origin fix-trailing-comma
Enumerating objects: 21, done.
Counting objects: 100% (21/21), done.
Delta compression using up to 16 threads
Compressing objects: 100% (7/7), done.
Writing objects: 100% (11/11), 732 bytes | 732.00 KiB/s, done.
Total 11 (delta 4), reused 0 (delta 0)
remote: Resolving deltas: 100% (4/4), completed with 4 local objects.
remote: 
remote: Create a pull request for 'fix-trailing-comma' on GitHub by visiting:
remote:      https://github.com/dotty-staging/specs2/pull/new/fix-trailing-comma
remote: 
To https://github.com/dotty-staging/specs2.git
 * [new branch]          fix-trailing-comma -> fix-trailing-comma
Branch 'fix-trailing-comma' set up to track remote branch 'fix-tr

Then, in the dotty directory:

> git add ../../community-build/community-projects/specs2
> git commit -m "fix trailing commas in community build"
> git push staging HEAD

Fixes #14507

It turns out it's not needed. We can safely stop at a comma when in a statement sequence. It's not materially
different from stopping a closing parenthesis.
 - Drop Indented region only after seeing an OUTDENT node. This aligns
   region stack popping with the other closing tokens `]`, `)`, and `}`
   and thereby fixes a problem in `skip`.

 - When skipping, allow OUTDENT insertions even in n on-indent regions
   if the outdent matches a previous indent. This improves recovert when
   closing `]`, `)`, or `}` are missing.

 - When skipping an OUTDENT that matches a previous indent can discard
   nested regions.

Fixes scala#14507
Treat them as stop tokens only if there is a region that expects them
As for the other errors, don't issue an incomplete input error if the
last error offset is the current offset.
We now have the following handling of stop symbols in skip:

 - `)`, `]`, `}`: stop if there is a corresponding opening parenthesis
 - `,`          : stop if we are in a `commaSeparated` production
 - undent       : stop if indent width matches a previously seen indent width
 - `;`, nl      : stop unconditionally, the parser will sync after the current statement
@odersky
Copy link
Contributor Author

odersky commented Mar 19, 2022

I also had to disable two scalatest tests: ExpectationsSpec and DirectExpectationsSpec. These were test suites containing tests that checked the precise error message output of the dotty compiler. IMO such tests should never be in a regression test suite for the community build, since they essentially lock in current behavior. We do want tests about error messages of course, but these should be compiler tests that can easily be changed if the compiler's error diagnostics improve.

Not knowing scalatest, I was not able to figure out from the diagnostics which particular tests broke in the two suites, so I disabled them wholesale. I'll open a separate issue to bring back tests that are OK.

@odersky
Copy link
Contributor Author

odersky commented Mar 19, 2022

We should do much better now with error recovery which should mean many fewer spurious errors in compiles and in the IDE. Previously, a mismatch of some kind of parenthesis could lead to a large part of the source being skipped without parsing. This in turn could lead to many follow-on type errors errors since definitions were not recognized.

The new improvements are:

  • When skipping we are more robust when dealing with mismatched parentheses. A missing closing parenthesis is now implicitly assumed when skipping determines that the parenthesis is probably missing. It does so by taking the indentation level into account. This means that outdents are recognized and inserted when skipping in a region opened by (. Also, that the region is discarded once an outdent goes back to at most of the indention level of the region enclosing the (.

    Before, such a missing parenthesis would have often triggered skipping until the next enclosing }. Which is particularly bad if your code uses indentation instead of braces; then it would skip to the end.

  • A closing parenthesis (or }, or ]) is treated as a stop symbol for skipping only if there is a matching open parentheses. Previously, the stop symbol was unconditional, which means that recovery after skipping would tend to unwind the stack too far before advancing.

  • A closing OUTDENT is treated as a stop symbol if it has the right indentation width.

  • A comma is treated as a stop symbol if a comma is expected by some enclosing production. Previously, commas were never stop symbol, which meant we always skipped pass them.

Copy link
Contributor

@tgodzik tgodzik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just one minor nitpick, otherwise looks great!

val offset = in.offset
if (in.token != token)
def assumedOutdent = token == OUTDENT && in.token == EOF
if in.token != token /*&& !assumedOutdent*/ then
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like assumedOutdent is not needed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh yes, that was a remainder of something else I tried that led nowhere.

@odersky odersky merged commit 0aafca1 into scala:main Mar 21, 2022
@odersky odersky deleted the syntax-error-recovery branch March 21, 2022 16:15
@Kordyjan Kordyjan added this to the 3.1.3 milestone Aug 1, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

No recovery from empty if
4 participants