New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Matching strings makes switches in bytecode #8451
Conversation
Darn, I missed the chance to use the shiny "WIP" feature. @sjrd: the positions test is implicitly a scala.js-compatibility test. The trees coming out of patmat are hopefully not too difficult for the SJS backend to handle. Maybe this is a good time for me to get started over there. |
Cool :) Thanks for the separation in cleanup.
Sure, why not? :) If you do, you'll want to work in the 0.6.x branch. We forward-merge 0.6.x into master every now and then. |
The tests fail when the compiler has been bootstrapped through this change. A pattern match in Minimized: import annotation.switch
object Test {
def test(s: String): Int = {
(s : @switch) match {
case null | "" => 0
case _ => s.toInt
}
}
def main(args: Array[String]): Unit = {
println(test("2"))
}
}
|
There appear to be two separate problems. First, if we want to make this work for The
Also fails. |
I push-f-ed fixes for these problems. |
Thanks! I'm also not correct in my handling of |
*/ | ||
|
||
val stats = mutable.ListBuffer.empty[Tree] | ||
var failureBody = Throw(New(definitions.MatchErrorClass.tpe_*, REF(sel))) : Tree |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can't remember why I thought I couldn't use the synth default case from patmat. This should never be used (the Ident(WILDCARD, _, body)
case should always be hit but it seems safer.
case cd@CaseDef(StringsPattern(strs), _, body) => | ||
val jump = currentOwner.newLabel(unit.freshTermName("case"), swPos).setInfo(MethodType(Nil, restpe)) | ||
stats += LabelDef(jump, Nil, succeed(body)) | ||
strs.map((_, jump, cd.pat.pos)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I used to copy the body of a case if there were alternatives, so
x match { case "a" | "b" => body }
would introduce tree sharing.
@@ -498,6 +595,9 @@ abstract class CleanUp extends Statics with Transform with ast.TreeDSL { | |||
super.transform(localTyper.typedPos(tree.pos)(consed)) | |||
} | |||
|
|||
case switch: Match => | |||
super.transform(transformSwitch(switch)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks... I can't believe I forgot this
Probably worth a test to show that the scrutinee is only evaluated once. I've manually tested with the following. object Test {
def test(s: String, expected: Int): Unit = {
val it = Iterator(s)
val r = (it.next(): @annotation.switch) match {
// collision
case "AaAa" => 1
case "BBBB" => 2
// collision with "FB"
case "Ea" => 3
case _ => 4
}
assert(expected == r, (expected, r))
}
def main(args: Array[String]): Unit = {
test("AaAa", 1)
test("BBBB", 2)
test("Ea", 3)
test("FB", 4)
test("XXX", 4)
test(null, 4)
}
} |
/* From this: | ||
* string match { case "AaAa" => 1 case "BBBB" | "c" => 2 case _ => 3} | ||
* Generate this: | ||
* goto matchSuccess (string.## match { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why "goto matchSuccess" here? All cases of the match branch so it seems like a no-op to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice catch. I had it before I realized the problem with Alternative
s
Comparing javac and scalac: retronym/string-switch@c2820bb class Test {
int test(String s) {
return switch(s) {
case "AaAa" -> 1;
case "BBBB" -> 2;
case "Ea" -> 3;
case "xx" -> 4;
default -> 5;
};
}
} class Test {
def test(s: String, expected: Int) = {
s match {
case "AaAa" => 1
case "BBBB" => 2
case "Ea" => 3
case "xx" => 4
case _ => 5
}
}
} |
That's interesting. Seems like Java emits two switch instructions: one to get a number representing what code block to run, and one to run it. I'm not sure I see a benefit there compared to this. |
I'm honored as always, @SethTisue |
There is a comment in the javac source that they do it to keep fall through simpler, so that doesn't apply to match. |
Nice find; thanks! |
Another advantage the double-lookup would be the opportunity to use an invokedynamic based implementation like is being added to JDK 14 (https://github.com/openjdk/amber/blob/pattern-runtime/src/java.base/share/classes/java/lang/runtime/PatternHandles.java). That's not something we're targetting right now as we obviously can't depend on JDK14, but something for later on. |
I guess the |
The way I see it, match-on-string is desugared into match-on-int, but the latter is compiled as is to a JVM instruction. Match-on-int is never really desugared. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
@hrhino Are you interested in trying to make the required changes in the Scala.js compiler backend, or should I do them? |
@sjrd I'll give it a go. |
Switchable matches with string-typed scrutinee survive the pattern matcher in the same way as those on integer types do: as a series of `CaseDef`s with empty guard and literal pattern. Cleanup collates them by hash code and emits a switch on that. No sooner, so scala.js can emit a more JS-friendly implementation. Labels were used to avoid a proliferation of `throw new MatchError`. Works with nulls. Works with Unit. Enclosed "pos" test stands for positions, not positivity. Fixes scala/bug#11740 Co-Authored-By: "Jason Zaugg" <jzaugg@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work! @sjrd should we wait with the merge until Scala.js is ready?
Yes please, otherwise the community build will break. @hrhino told me he intended to do it, but he needs some approval. |
So ... it turns out that in Scala.js 0.6.x, this actually just works out of the box 😆 The thing is that in 0.6.x, our IR supports any kind of literal in our Now, ironically, Scala.js 1.x crashes and burns with this change, because in 1.x we restricted our |
Since Scala 2.13.2, the pattern matcher will keep `Match` nodes that match on `String`s and `null`s as is, to be desugared later in `cleanup`. This was implemented upstream in scala/scala#8451 We implement a more general translation that will accept any kind of scalac `Literal`. If the scrutinee is an integer and all cases are int literals, we emit a `js.Match` as before. Otherwise, we emit an `if..else` chain with `===` comparisons.
OK we've got a PR for Scala.js with green CI and review, ready to be merged. So feel free to merge this PR in Scala, and we'll follow suit in Scala.js. |
The pattern matcher will now emit `Match` with `String` scrutinee as well as the existing `Int` scrutinee. The JVM backend handles this case by emitting bytecode that switches on the String's `hashCode` (this matches what Java does). The SJS already handles `String` matches. The approach is similar to scala/scala#8451 (see scala/bug#11740 too), except that instead of doing a transformation on the AST, we just emit the right bytecode straight away. This is desirable since it means that Scala.js (and any other backend) can choose their own optimised strategy for compiling a match on strings.
The pattern matcher will now emit `Match` with `String` scrutinee as well as the existing `Int` scrutinee. The JVM backend handles this case by emitting bytecode that switches on the String's `hashCode` (this matches what Java does). The SJS already handles `String` matches. The approach is similar to scala/scala#8451 (see scala/bug#11740 too), except that instead of doing a transformation on the AST, we just emit the right bytecode straight away. This is desirable since it means that Scala.js (and any other backend) can choose their own optimised strategy for compiling a match on strings.
The pattern matcher will now emit `Match` with `String` scrutinee as well as the existing `Int` scrutinee. The JVM backend handles this case by emitting bytecode that switches on the String's `hashCode` (this matches what Java does). The SJS already handles `String` matches. The approach is similar to scala/scala#8451 (see scala/bug#11740 too), except that instead of doing a transformation on the AST, we just emit the right bytecode straight away. This is desirable since it means that Scala.js (and any other backend) can choose their own optimised strategy for compiling a match on strings.
The pattern matcher will now emit `Match` with `String` scrutinee as well as the existing `Int` scrutinee. The JVM backend handles this case by emitting bytecode that switches on the String's `hashCode` (this matches what Java does). The SJS already handles `String` matches. The approach is similar to scala/scala#8451 (see scala/bug#11740 too), except that instead of doing a transformation on the AST, we just emit the right bytecode straight away. This is desirable since it means that Scala.js (and any other backend) can choose their own optimised strategy for compiling a match on strings. Fixes scala#11923
The pattern matcher will now emit `Match` with `String` scrutinee as well as the existing `Int` scrutinee. The JVM backend handles this case by emitting bytecode that switches on the String's `hashCode` (this matches what Java does). The SJS already handles `String` matches. The approach is similar to scala/scala#8451 (see scala/bug#11740 too), except that instead of doing a transformation on the AST, we just emit the right bytecode straight away. This is desirable since it means that Scala.js (and any other backend) can choose their own optimised strategy for compiling a match on strings. Fixes scala#11923
Switchable matches with string-typed scrutinee survive the pattern matcher in the same way as those on integer types do: as a series of
CaseDef
s with empty guard and literal pattern.Cleanup collates them by hash code and emits a switch on that. No sooner, so scala.js can emit a more JS-friendly implementation.
Labels were used to avoid a proliferation of
throw new MatchError
.Works with nulls. Works with Unit.
Enclosed "pos" test stands for positions, not positivity.
Fixes scala/bug#11740.