Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SI-12290: support JDK15 text blocks in Java parser #9548

Merged
merged 1 commit into from Apr 23, 2021

Conversation

harpocrates
Copy link
Contributor

JDK15 introduced text blocks (JEP 378) for writing multiline strings.
This adds support for parsing these strings in the Java parser.

The logic for interpretting the literals is a little complicated, but
follows from the "3.10.6. Text Blocks" of the Java language specification.
The test cases include examples from there and from the JEP.

Fixes scala/bug#12290

@scala-jenkins scala-jenkins added this to the 2.13.6 milestone Mar 23, 2021
@SethTisue SethTisue added release-notes worth highlighting in next release notes java interop labels Mar 24, 2021
Copy link
Member

@lrytz lrytz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you and welcome! :)

src/compiler/scala/tools/nsc/javac/JavaScanners.scala Outdated Show resolved Hide resolved
src/compiler/scala/tools/nsc/javac/JavaScanners.scala Outdated Show resolved Hide resolved
test/files/run/t12290/TextBlocks.java Show resolved Hide resolved
@harpocrates harpocrates force-pushed the alec/t12290 branch 2 times, most recently from 8cdbbe4 to af3e6fd Compare March 24, 2021 15:40
@lrytz
Copy link
Member

lrytz commented Mar 26, 2021

Oh another detail: \s is a valid escape sequence now in Java (not only for text blocks, also for strings and char literals). So it should be added to getlitch.
https://docs.oracle.com/javase/specs/jls/se16/html/jls-3.html#jls-3.10.7

@harpocrates
Copy link
Contributor Author

Oh another detail: \s is a valid escape sequence now in Java (not only for text blocks, also for strings and char literals).

I added support for \s and also for escaping newlines in text blocks (trailing \). All escape sequences should be covered now.


I've been force-pushing updates (since I read that Scala prefers to keep a clean commit log), but I can also just add commits if it is easier to review and squash them at the end.

@SethTisue
Copy link
Member

SethTisue commented Mar 26, 2021

I've been force-pushing updates (since I read that Scala prefers to keep a clean commit log), but I can also just add commits if it is easier to review and squash them at the end.

Either way is fine. We do it either way depending on the nature of the PR and on individual preference, rather than trying to establish a firm policy.

If in doubt, I would say default to squashing at the end, especially on anything where the reviewing gets complicated or appears likely to get complicated.

Copy link
Member

@lrytz lrytz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gave it another pass... But we're close!

I'll talk to Seth next week about testing. It would be good if we could mark tests as Java N+ and have some CI job that runs those regularly.

}
if (in.ch != LF && in.ch != CR) { // CR-LF is already normalized into LF by `JavaCharArrayReader`
syntaxError("illegal text block open delimiter sequence, missing line terminator")
return
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of / before returning, maybe we try to advance reader to the end of the text block? Otherwise there are a lot of errors issued when continuing to parse.

Although javac is not any better :-) So I'm fine if you leave it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe just not returning would do the trick? I'll try this out after work.

src/compiler/scala/tools/nsc/javac/JavaScanners.scala Outdated Show resolved Hide resolved
src/compiler/scala/tools/nsc/javac/JavaScanners.scala Outdated Show resolved Hide resolved
src/compiler/scala/tools/nsc/javac/JavaScanners.scala Outdated Show resolved Hide resolved
src/compiler/scala/tools/nsc/javac/JavaScanners.scala Outdated Show resolved Hide resolved
src/compiler/scala/tools/nsc/javac/JavaScanners.scala Outdated Show resolved Hide resolved
Copy link
Member

@lrytz lrytz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good now!

We'll get back to you about the testing situation.

A few more tests could be added

  • mixed indentation with tabs and spaces (each count 1 character of indentation)
  • escaped new line \
  • \uXXXX escapes are handled before the text block (\u0020 is leading whitespace, \s is not)
  • invalid text blocks
    • non-whitespace after initial """
    • difference to scala, which just uses the last 3 " to close a multiline string """this is "fine""""
  • maybe CRLF, FF, exoctic unicode whitespaces...

@harpocrates
Copy link
Contributor Author

I've added some extra tests to cover the cases you mentioned. I imagine that depending on how testing plays out there may be some modification needed to indicate the tests only run on JDK 14+. I'll be looking out for those updates to testing.

Thanks for the patient reviews!

@SethTisue
Copy link
Member

WIP PR adding JDK 16 to Travis-CI: #9579

@SethTisue
Copy link
Member

#9579 has landed, but now this is waiting on #9587

@som-snytt
Copy link
Contributor

I don't think lrytz wanted to wait.

@lrytz lrytz force-pushed the alec/t12290 branch 2 times, most recently from 3aa3d68 to a428770 Compare April 23, 2021 15:22
JDK15 introduced text blocks (JEP 378) for writing multiline strings.
This adds support for parsing these strings in the Java parser.

The logic for interpretting the literals is a little complicated, but
follows from the "3.10.6. Text Blocks" of the Java language specification.
The test cases include examples from there and from the JEP.
@lrytz lrytz merged commit ff9cea1 into scala:2.13.x Apr 23, 2021
@harpocrates harpocrates deleted the alec/t12290 branch April 24, 2021 13:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
java interop release-notes worth highlighting in next release notes
Projects
None yet
5 participants