Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Experiment with lookupswitch based pattern matching on Strings or type tests #285

Closed
retronym opened this issue Dec 6, 2016 · 13 comments
Closed
Assignees

Comments

@retronym
Copy link
Member

retronym commented Dec 6, 2016

Large pattern matches on string literals would benefit from the javac translation, which implements a switch on the hashcode.

We could extend this approach to type tests on final classes, by switching on the hascode of the class fully qualified name. @martijnhoekstra prototyped this in https://github.com/martijnhoekstra/switchbench (although at the time of writing I'm not sure he's pushed the latest commits)

In gittter, he states:

if elseif chaining is faster than this abomination for 4 cases by single digit percentages
this abomination is faster 8 cases up
I haven't looked at the exact point where they cross
but that's not for the specialized workloads yet, if the majority is the first case, if-elseif chaining is going to win more I suppose

@retronym
Copy link
Member Author

retronym commented Dec 6, 2016

Implementation wise, we could create a dummy (or macro?) method like constantHashOf(literal: String) that computes a compile time constant of the hashcode of literal argument.

constantHashOfFQN(literal: Class[_]) could do the same, but one challenge would be to know what the JVM name of the class will be during this computation for local classes that get renamed by lambdalift. Maybe we could restrict this classes that don't have any transitive term owners to make life simple. We could also exclude specialized classes where final doesn't actually mean final.

@retronym retronym self-assigned this Dec 6, 2016
@retronym
Copy link
Member Author

retronym commented Dec 6, 2016

First, we can try to manually perform this translation in a few of the hottest pattern matches (Traverser.traverse* /Transformer#transform / TypeMap#mapOver / Typer#typed*) in the compiler to see if we get any benefits.

@martijnhoekstra
Copy link

I've tried experimenting with TypeMap#mapOver, but matching over Type is more involved than matching over their fqn, due to the hierachy not being completely flat, and there seem to be non-trivial unapply methods on some companion objects

@retronym
Copy link
Member Author

retronym commented Dec 8, 2016

Yeah, that's not the easiest hierarchy to work with. I think this patch will do it:

https://github.com/scala/scala/compare/2.12.x...retronym:topic/switcheroo?expand=1

@sjrd
Copy link
Member

sjrd commented Dec 8, 2016

That encoding would be a disaster for Scala.js. In fact it would even break code for people who use scalaJSSemantics.withRuntimeClassName. May I request that this rewriting be done in cleanup or in the back-end, please, so that it doesn't affect Scala.js?

@retronym
Copy link
Member Author

retronym commented Dec 8, 2016

This is a far fetched experiment at this stage, but thanks for the heads up :)

@retronym
Copy link
Member Author

retronym commented Dec 8, 2016

The change to mapOver doesn't improve or degrade compiler performance in my benchmark. Unfortunately, the compiler doesn't really have obvious hotspots, profiles show that no method reports more than 5% of time.

@DarkDimius
Copy link
Member

DarkDimius commented Dec 8, 2016

I think scala\dotty are really bad benchmarks for this, they are memory bound and speedups through better local instructions can be hidden by this.

@martijnhoekstra
Copy link

I have a working-but-not-too-pretty implementation of generating a switch over strings in https://github.com/martijnhoekstra/scala/tree/switchstring - this is the first time I touched the compiler, so I probably did everything wrong and broke everything

@retronym
Copy link
Member Author

retronym commented Aug 23, 2017

We should think about this in the context of the Java 10 pattern matching implementation. Eventually, we could share bootstrap methods that expand the switches. Even before then, we could copy the approach into our own bootstrap.

An advantage of expanding the switch at link time is that you don't need to bake knowledge of string hashcodes or enum ordinals into the compiled code. The former is specced and can'
t change (although that is somewhat limiting as there are security arguments to avoid well known hashcodes). The latter requires some fairly heroic code gen by Javac to validate its assumptions at runtime. The Oracle Java team seem to like the idea of using bootstrap methods to defer the details of translation.

@martijnhoekstra
Copy link

@retronym Is there any implementation of the Java 10 pattern matching available yet? I saw JEP 305 (http://openjdk.java.net/jeps/305) but it didn't give me much to hold on to.

Trying to generate the bodies of the cases of string hash matching - particularly the if/else chain needed to handle collisions - proved too hard for me.

@retronym
Copy link
Member Author

Development if happening in http://hg.openjdk.java.net/amber/amber/langtools/branches. I haven't built it myself yet.

@dwijnand
Copy link
Member

Landed in scala/scala#8451 / 2.13.2.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants