Incorrect handle of unicode escapes in triple-quoted string #11640

Closed
lihaoyi opened this issue Mar 7, 2021 · 7 comments

Comments

@lihaoyi
Contributor

lihaoyi commented Mar 7, 2021

Compiler version

3.0.0-RC1

Minimized code

object Bar{
  def main(args: Array[String]): Unit = {
    println(""""\\\uCAFE"""".getBytes.toList)
  }
}

Output

This is what it prints in 3.0.0-RC1

List(34, 92, 92, 92, 117, 67, 65, 70, 69, 34)

Expectation

This is what it prints in 2.13.4 and 2.12.13

List(34, 92, 92, -20, -85, -66, 34)
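
To make the two byte lists easier to compare, here is a small sketch that decodes them back into strings. It assumes the bytes were produced with UTF-8 as the default charset (the report calls `getBytes` without naming one); the object and method names are made up for illustration.

```scala
import java.nio.charset.StandardCharsets.UTF_8

object DecodeOutputs {
  // Byte lists copied from the report above.
  val scala3Bytes: List[Byte] =
    List(34, 92, 92, 92, 117, 67, 65, 70, 69, 34).map(_.toByte)
  val scala2Bytes: List[Byte] =
    List(34, 92, 92, -20, -85, -66, 34).map(_.toByte)

  // Assumption: UTF-8, which may differ from the platform default charset.
  def decode(bytes: List[Byte]): String = new String(bytes.toArray, UTF_8)

  def main(args: Array[String]): Unit = {
    // 3.0.0-RC1: the backslashes and the letters u, C, A, F, E are kept as-is.
    println(decode(scala3Bytes))
    // 2.x: two backslashes followed by the single character U+CAFE,
    // whose UTF-8 encoding is the three bytes EC AB BE (-20, -85, -66 as signed bytes).
    println(decode(scala2Bytes))
    println(decode(scala2Bytes).map(_.toInt).toList)
  }
}
```

In other words, 2.x processes the unicode escape inside the triple-quoted string and 3.0.0-RC1 leaves it as ten literal characters.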
@smarter
Member

smarter commented Mar 7, 2021

/cc @martijnhoekstra for his escaping expertise :)

@martijnhoekstra
Contributor

martijnhoekstra commented Mar 7, 2021

2.13 emits a deprecation warning for the escape in a triple-quoted string; Scala 3 no longer processes the escape there. I believe that behaviour follows the principle of least astonishment: Unicode escapes are treated like any other escape, that is to say, they're processed in single-quoted strings, backquoted identifiers and char literals, and by the f and s interpolators.

The three options that will work with any version are

  • Use a literal (which is usually the best idea)
  • Use a raw interpolation and interpolate it in, which can be useful for non-printable characters
  • Use an interpolation or single-quoted string and escape the other things too, when the unicode escape is the only thing you need to escape (i.e. not here)

For this particular string, those are respectively """"\\쫾"""", raw"$"\\${'\uCAFE'}$"" and "\"\\\\\uCAFE\""
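
As a quick check, the three spellings above can be compared side by side. This is only a sketch: the object and value names are hypothetical, and the `raw` line relies on the `$"` quote escape, so it needs a compiler that supports that escape (Scala 3 does).

```scala
object ThreeSpellings {
  // Option 1: the character itself inside the triple-quoted string.
  val literal: String = """"\\쫾""""

  // Option 2: raw interpolation; $" is the quote escape and the
  // char literal supplies U+CAFE.
  val viaRaw: String = raw"$"\\${'\uCAFE'}$""

  // Option 3: ordinary single-quoted string with every special character escaped.
  val escaped: String = "\"\\\\\uCAFE\""

  def main(args: Array[String]): Unit = {
    // All three should denote the same five characters.
    println(literal == viaRaw && viaRaw == escaped)
    println(literal.map(_.toInt).toList)
  }
}
```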

Reference #8480 and scala/scala#8282

@martijnhoekstra
Contributor

martijnhoekstra commented Mar 7, 2021

I just realized the quote escaping needed for the interpolation is not in Scala 2. I don't have the determination to get that through SIP. If you go that route and want to cross-compile, you need to interpolate the quotes in too.
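
One way to read "interpolate those in too" is the sketch below, which keeps the double quote out of the string syntax entirely. The `quote` value and the object name are hypothetical, not something from the compiler or the standard library.

```scala
object CrossCompiledRaw {
  // Hypothetical helper: the double-quote character as a plain value.
  val quote: Char = '"'

  // No quote escape is needed: the quotes and U+CAFE are all interpolated in,
  // and raw leaves the two backslashes untouched.
  val viaRaw: String = raw"$quote\\${'\uCAFE'}$quote"

  def main(args: Array[String]): Unit = {
    println(viaRaw.map(_.toInt).toList)
  }
}
```

This avoids both the `$"` escape and a unicode escape inside a triple-quoted string, which are the two constructs that differ between versions in this thread.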

@SethTisue
Member

fyi @dwijnand

@som-snytt
Contributor

Maybe there is room in Dr Odersky's research budget to hire a PhD specializing in backslashes.

Usually the academic life is characterized by backstabbing rather than backslashing.

I was intrigued by the last comment, which I don't understand yet. I see there is StringContext.processUnicode which is guarded by -Xsource:3 and which should be private[scala] and not protected[scala].

scala> StringContext.processEscapes
def processEscapes(str: String): String
scala> StringContext.processEscapes("\\"*3 + "uCAFE")
val res6: String = \쫾

Sorry, I got pulled away from my experiment. I'll post this comment anyway for the joke. Hopefully worth it. I think -Xsource:3 ought to be as seamless as possible, and everyone who cares about migration should use it.

@lrytz
Member

lrytz commented Mar 8, 2021

raw""""\\${'\uCAFE'}"""" also works

@smarter
Member

smarter commented Mar 8, 2021

> 2.13 emits a deprecation warning for the escape in a triple quoted string

That's enough for me to close the issue then, sounds like we're not doing anything wrong.
