New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Unicode escapes are ordinary escape sequences #8480
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hello, and thank you for opening this PR! 🎉
All contributors have signed the CLA, thank you! ❤️
Have an awesome day! ☀️
} | ||
Some(escapedStrs, elems) | ||
} catch { | ||
case _: StringContext.InvalidEscapeException => None | ||
case iee: StringContext.InvalidEscapeException => { | ||
ctx.error(iee.getMessage() + "\n", stringPosition) //should be positioned at str.po |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Before, this didn't emit an error at compile time, only at runtime.
I could use a hand in positioning this properly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I could use a hand in positioning this properly.
Are you looking for tree.sourcePos
or something more specific ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd like to narrow it to a position of the actual error within the sourcePos.
Also, the transcript looks off (with the position too far "left") here, but I'm not sure anymore whether it just looks wrong or is wrong.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This proved a challenge, position of the first part of the interpolation was off, and in test output the second line of error output is indented differently than the first, giving the impression the caret is in a different position than it's really in, but I think I got it.
ctx.error(iee.getMessage() + "\n", stringPosition) //should be positioned at str.po | ||
None | ||
} | ||
case iuee: IllegalArgumentException => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
StringContext.InvalidUnicodeEscapeException
has access to the positioning stuff, but it's not publicly accessible for bincompat reasons. What's the best course of action?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the best course of action?
Good question, this is tricky. I guess we could hack the Scala2Unpickler to disregard the protected[scala] on this definition but that's pretty ugly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Workarounds with java reflection, going through Java and extending StringContext$
(or is it final?), parsing the error message (the index is indicated in there) or not refining the position are also possible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
parsing the error message
I suggest doing that for now if it's not too much trouble, and we should leave a TODO for whenever we get to break forward compatibility of scala-library.
Not sure offhand how this change interacts with https://github.com/lampepfl/dotty/pull/8282/files which includes handling |
From the looks of it, it might not interact, or it might become redundant. I'll double-check if that's correct (and which it is). EDIT: I should to look better. |
@som-snytt merging that patch into this branch keeps the test passing. If I understand that patch correctly primarily forbids \r, \n and FF char literals, and the test tests some other situations where those, or their escape sequences are accepted or not, right? A strict reading of the scala2 spec allows only printable characters in char and string literals. That rules out tab. I suspect that's not intended. What the intent is exactly I don't know. I never understood why we shouldn't have form feed char literals. |
One purpose of the linked PR is to disallow the control char between single quotes, as shown in the test; not to disallow the escape
but to allow the unicode escape in that position, which historically was translated early to the control char. I guess |
d5cabee
to
dd41a2f
Compare
CI fails because of missing credentials. @anatoliykmetyuk can you take a look? |
rebased with 2.13.2 lib |
compiler/src/dotty/tools/dotc/transform/localopt/StringInterpolatorOpt.scala
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks a lot!
} catch { | ||
case exception: Throwable => Some(s" raised exception $exception") | ||
} | ||
for (e <- res) println(s"test $name $e") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was following up that the unicodes on line 11 were edited out, when I noticed that this (edit: partially) reverts my last change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The test has to assert because there is no longer a check file -- vulpix only checks the output if there is a check file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll see if I can restore those changes. Does vulpix detect a test failing an assertion? I guess I'll jiggle with it a bit and let you know.
As for the identifier on line 11, the variant that is still supported is in tests/run/unicodeEscapes.scala line 13-14, which makes good on the promise of temporarily.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NBD, I PR'd it, thanks. I was about to futz with more parsing. It's amazing how quickly I forgot like how do I even run a test?
Scala 3 changes compared to the existing Scala 2 spec: - Reusing alphaid in the definition of plainid (this does not change its meaning) - Addition of quoteId and spliceId - Correctly specifying the use of _ in numeric literals. - Dropping symbolLiteral Scala 2 changes compared to the existing Scala 3 spec: - Various refactorings - Specifying the new Unicode escape handling stuff, this was already implemented in Scala 3 but not part of syntax.md (see scala#8480).
Scala 3 changes compared to the existing Scala 2 spec: - Reusing alphaid in the definition of plainid (this does not change its meaning) - Addition of quoteId and spliceId - Correctly specifying the use of _ in numeric literals. - Dropping symbolLiteral Scala 2 changes compared to the existing Scala 3 spec: - Various refactorings - Specifying the new Unicode escape handling stuff, this was already implemented in Scala 3 but not part of syntax.md (see scala#8480).
port of scala/scala#8282
s/replaceAllLiterally/replace because it's deprecated in scala2
The implementation is a bit simpler overall because it no longer needs to allow unicode escapes in triple quoted strings and raw escapes that scala2 needed to be grandfathered in for a deprecation cycle.