Compiler integrated scala-async #88

Closed
wants to merge 76 commits into from

retronym
Owner

@retronym retronym commented Mar 13, 2020

scala-async (SIP-22) is currently implemented as a macro.

This pull request incorporates async into a fully blown compiler phase of the standard compiler.

Why abandon the macro approach?

  • Typer is a somewhat unnatural place in the compiler pipeline for the expansion. The ANF transform is markedly simpler once it runs after uncurry, and simpler again after erasure. Custom integrations of async that use async pattern extractors and guards are only possible when running after patmat.
  • Async translated blocks generate larger ASTs than the original source code. Such a translation is better deferred to later in the pipeline for performance reasons.
  • Async leans heavily on compiler APIs, and some of these are unavailable in the macro API facade.

Integrate Async into the compiler

  • Import the scala-async code base into scala.tools.nsc.transform.async
  • A small macro runs during typer to expand async(expr)(execCtx) into (approximately) {val temp = execCtx; class stateMachine { def apply(tr: Try) = { expr; () } }; new stateMachine.start()}. Wrapping expr in a class lets intervening compiler phases do their job (e.g. captured enclosing values give rise to outer pointer usage).
  • The async phase proper runs after erasure. As before, it consists of a) a selective ANF transform of the expression; b) a translation of control flow that crosses async boundaries into separate states of a state machine; c) lifting locals that are referred to from multiple states into members of the state machine.
  • Refactor to be an idiomatic compiler phase (use the normal Transformers/Traversers APIs, use other internal APIs rather than c.internal_).
  • This functionality is opt-in with -Xasync. Under that mode, it is optional to include scala-async.jar on the compiler classpath. If it is absent, the compiler will create synthetic symbols so source code may still refer to scala.async.Async.{await,async}.
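
To make the three steps above concrete, here is a hand-written sketch of roughly what `async { val a = await(f1); val b = await(f2); a + b }` could be lowered to. It is not the compiler's actual output: the names `SumMachine`, `sum`, `state` and `result` are invented for illustration, and the real transform works on post-erasure trees rather than source code.

```scala
import scala.concurrent.{Await, ExecutionContext, Future, Promise}
import scala.concurrent.duration._
import scala.util.{Failure, Success, Try}
import scala.util.control.NonFatal

object HandWrittenStateMachine {
  // Hypothetical hand-written equivalent of:
  //   async { val a = await(f1); val b = await(f2); a + b }
  final class SumMachine(f1: Future[Int], f2: Future[Int])(implicit ec: ExecutionContext)
      extends (Try[Any] => Unit) {
    private var state = 0              // which segment of the block runs next
    private var a = 0                  // local `a` is used across states, so it becomes a field
    private val result = Promise[Int]()

    def future: Future[Int] = result.future

    // Enter state 0; the argument is ignored there.
    def start(): Unit = apply(Success(()))

    def apply(tr: Try[Any]): Unit =
      try {
        state match {
          case 0 =>
            state = 1
            f1.onComplete(this)        // async boundary: resume in state 1
          case 1 =>
            tr match {
              case Success(v) =>
                a = v.asInstanceOf[Int]
                state = 2
                f2.onComplete(this)    // async boundary: resume in state 2
              case Failure(t) => result.failure(t)
            }
          case 2 =>
            tr match {
              case Success(v) => result.success(a + v.asInstanceOf[Int])
              case Failure(t) => result.failure(t)
            }
          case other =>
            throw new IllegalStateException("unexpected state " + other)
        }
      } catch { case NonFatal(t) => result.tryFailure(t) }
  }

  def sum(f1: Future[Int], f2: Future[Int])(implicit ec: ExecutionContext): Future[Int] = {
    val machine = new SumMachine(f1, f2)
    machine.start()
    machine.future
  }
}
```

Each `await` ends one state and the completion callback re-enters `apply` in the next one; `a` has to be a field because it is written in state 1 and read in state 2.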

Improve the implementation of the ANF transform

  • Refactor the code for better performance and maintainability
  • Be more selective about when to transform code.
  • Simplify with the assumption that only post-erasure Tree shapes will be seen.

Improve the implementation of the State Machine Transform

  • Refactor the code for better performance and maintainability
  • Drastically reduce the number of states needed to represent pattern matches and if/else constructs with "sparse" usages of await among the branches. Blocks prior to async boundaries are now kept in the same state as the If or LabelDef itself.
  • Clean up the generated code to factor repeated code in each state that inspects the completed/failed status of futures into shared code.
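
As an illustration of the "sparse await" case, the sketch below hand-writes one plausible lowering of `async { val x = if (cond) await(f) else 0; x + 1 }`. The class and field names are hypothetical and the exact state layout the compiler emits may differ; the point is only that the `if` and its await-free branch run inside a single state instead of each getting a state of their own.

```scala
import scala.concurrent.{Await, ExecutionContext, Future, Promise}
import scala.concurrent.duration._
import scala.util.{Failure, Success, Try}

object SparseAwaitSketch {
  // Hypothetical hand-written equivalent of:
  //   async { val x = if (cond) await(f) else 0; x + 1 }
  final class Machine(cond: Boolean, f: Future[Int])(implicit ec: ExecutionContext)
      extends (Try[Any] => Unit) {
    private var state = 0
    private var x = 0
    private val result = Promise[Int]()

    def future: Future[Int] = result.future

    def apply(tr: Try[Any]): Unit = {
      var running = true
      while (running) {
        state match {
          case 0 =>
            // The `if` and its await-free `else` branch stay in state 0.
            if (cond) { state = 1; f.onComplete(this); running = false } // suspend
            else { x = 0; state = 2 }                                    // no suspension
          case 1 =>
            // Resumed after the await in the `then` branch.
            tr match {
              case Success(v) => x = v.asInstanceOf[Int]; state = 2
              case Failure(t) => result.failure(t); running = false
            }
          case 2 =>
            // Join point after the `if`.
            result.success(x + 1); running = false
          case other =>
            throw new IllegalStateException("unexpected state " + other)
        }
      }
    }
  }

  def run(cond: Boolean, f: Future[Int])(implicit ec: ExecutionContext): Future[Int] = {
    val machine = new Machine(cond, f)
    machine(Success(()))
    machine.future
  }
}
```

When `cond` is false the machine never registers a callback: it moves from state 0 to the join state within the same call to `apply`.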

Improve performance

  • Compile time performance is improved markedly. This is due to emitting fewer states, performing less bookkeeping, and fixing inefficiencies in the way we interact with compiler facilities such as tree attachments and tree transformers.
  • Runtime performance is likely improved for certain use cases, although the savings may be negligible compared to the cost of async boundaries for many realistic workloads. For each state transition that we avoid, we save one tableswitch, some null field assignments and reads/writes of the state variable. Another benefit is that fewer values need to be lifted into fields (saving space in the state machine and the put/get field cost).

@retronym retronym force-pushed the topic/scala-integrate-async-review2 branch 9 times, most recently from 64c3ae5 to e453980 Compare March 14, 2020 14:01
adriaanm and others added 18 commits March 15, 2020 09:05
The macro wraps the (untransformed) expression in the skeleton of the
state machine class. Compiler phases between this point and the late
async transform will correctly add outer pointers, lift captured vars
into Ref objects etc. The state machine class itself will also undergo
any transforms that it needs, such as specialization.

The macro is wired up using the FastTrack facility in line with other
macros that are quasi-intrinsic (i.e., implemented in the scala-compiler.jar).

After post-erasure, we run the ANF and state machine transform.

The old code used the macro reflection API -- these usages are
now being replaced with direct use of the richer Global API.
Passes the correct owner symbol to `useFields` to fix a bug
with local modules.

Refactors to store state about the current async block in a var
which allows us to mix in all the implementation once into
AsyncPhase.

Remove some testing FutureSystems. We'll refactor the tests
to use ScalaConcurrentFutureSystem.

Allow a custom front end to supply an alternative FutureSystem
attached to the block defining the state machine wrapper class.
Demonstrate this with a test case for an annotation driven
front end that targets an alternative Future implementation.

Remove some now-unused customization points from FutureSystem.
For example, we don't need to abstract over the choice of
base class for the state machine as a custom front end now
creates the state machine tree directly.

FutureSystem now deals directly in Trees, rather than in
`c.Expr`. I've kept a lightweight option (a type alias
`type Expr[T] = Tree`) so we can have a little compiler support
for getting the types right and documentation for
`FutureSystem` implementors.
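
A minimal sketch of the type alias idea, using a toy `Tree` hierarchy instead of the compiler's. `ToyFutureSystem` and its `onComplete` method are hypothetical stand-ins, not the real `FutureSystem` API. Because the type parameter of `Expr` is unused, every `Expr[T]` is just a `Tree`, and the parameter mostly documents what type the tree is expected to have.

```scala
import scala.concurrent.Future
import scala.util.Try

object ExprAliasSketch {
  // Toy trees standing in for the compiler's Tree type.
  sealed trait Tree
  final case class Ident(name: String) extends Tree
  final case class Select(qualifier: Tree, name: String) extends Tree
  final case class Apply(fun: Tree, args: List[Tree]) extends Tree

  // The phantom parameter T records the intended type of the tree.
  type Expr[T] = Tree

  // Hypothetical tree factory in the style of a FutureSystem.
  object ToyFutureSystem {
    def onComplete[A](future: Expr[Future[A]], callback: Expr[Try[A] => Unit]): Expr[Unit] =
      Apply(Select(future, "onComplete"), List(callback))
  }
}
```

Calling `ToyFutureSystem.onComplete(Ident("f"), Ident("stateMachine"))` yields a plain `Apply` tree; no wrapper object is allocated.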

Refactor the ANF transform (and others) to directly be a
TypingTransformer.

Remove the need to subclass Function0 in the state machine
by kicking things off with `Future.unit.onComplete(stateMachine)`.
This reduces bloat in the generated code due to Function0
specialization.
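
The kick-off can be sketched with the standard library alone. In this illustration the `Machine` class and its `done` promise are hypothetical; the only real API used is `Future.unit.onComplete`. The state machine is just a `Try => Unit` callback, so no `Function0` is needed to start it.

```scala
import scala.concurrent.{Await, ExecutionContext, Future, Promise}
import scala.concurrent.duration._
import scala.util.Try

object KickOffSketch {
  // Hypothetical state machine: it only implements Function1, not Function0.
  final class Machine(done: Promise[String]) extends (Try[Any] => Unit) {
    def apply(tr: Try[Any]): Unit = {
      // The first invocation receives the already-successful result of Future.unit.
      done.tryComplete(tr.map(_ => "state 0 entered"))
      ()
    }
  }

  def run()(implicit ec: ExecutionContext): Future[String] = {
    val done = Promise[String]()
    // Schedules the first call to apply() on the execution context.
    Future.unit.onComplete(new Machine(done))
    done.future
  }
}
```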

I've also eliminated the macro application in the early transform
in favour of wrapping the argument in `Block(<arg> :: Nil, q"()")`.
This ran into some "pure expression has no effect" warnings which
I've suppressed with a new tree annotation.

Most of the `auto_` test cases are converted to use the
standard scala.concurrent version of async. Now that the
transform is always happening post-erasure, there is no need to
keep those tests under the annotation driven async test (which
used to be the only way to get post-typer expansion).

A few tests can only be expressed with the annotation driven
version of async, namely those that have an async boundary
in an extractor. These have been moved into the
`AnnotationDrivenAsyncTest`.
  - most regular scala.async.Async tests pass
  - AsyncId converted to ScalaConcurrentAsync
  - non-standard tests not properly integrated yet
  - actual test failure in `run/futures.scala`
  - neg/NakedAwait still missing (alternative first)
  - remove late.scala (converted to AnnotationDrivenAsync)
  - Remove uncheckedBounds.scala (the problem no longer exists post-erasure)
UseFields needs to do what explicit outer would have done:
inner classes must access fields via the outer
parameter, and we must ensure fields accessed from inner
classes are not JVM private.
This code path was only relevant when async was run as a macro
in the presentation compiler.
As a late-running ANF transform must do.
Deal with the representation of local modules at async's
new place in the compiler pipeline.

Use more efficient/idiomatic collection of local def trees.
(`Tree#children` creates temporary lists.)
A long overdue overhaul.

Originally, we defined this in terms of a pair of mutually recursive
transforms (`anf` and `linearize`). We can simplify the implementation
by collapsing these into the cases of a single Transformer.

This transformer is designed for "thicket" based transforms, where
a single input tree can map to a list of output trees which will
be flattened into the enclosing `Block`. An enclosing block will be
automatically added if there was none before.
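
The thicket idea can be sketched on a toy expression language. Everything here (`AwaitCall`, `AnfSketch`, the `tmpN` naming) is invented for illustration and is far simpler than the real transformer; it also assumes every `Ident` refers to an immutable val, so the residual expression is safe to evaluate after the hoisted statements.

```scala
object ThicketSketch {
  sealed trait Tree
  final case class Lit(value: Int) extends Tree
  final case class Ident(name: String) extends Tree
  final case class AwaitCall(arg: Tree) extends Tree
  final case class Add(lhs: Tree, rhs: Tree) extends Tree
  final case class ValDef(name: String, rhs: Tree) extends Tree
  final case class Block(stats: List[Tree], expr: Tree) extends Tree

  final class AnfSketch {
    private var counter = 0
    private def freshName(): String = { counter += 1; "tmp" + counter }

    // One input tree maps to a non-empty list of output trees (a "thicket"):
    // hoisted statements first, the residual tree last.
    private def thicket(tree: Tree): List[Tree] = tree match {
      case AwaitCall(arg) =>
        val inner = thicket(arg)
        val name = freshName()
        inner.init ++ List(ValDef(name, AwaitCall(inner.last)), Ident(name))
      case Add(lhs, rhs) =>
        val l = thicket(lhs)
        val r = thicket(rhs)
        l.init ++ r.init ++ List(Add(l.last, r.last))
      case ValDef(name, rhs) =>
        val inner = thicket(rhs)
        inner.init :+ ValDef(name, inner.last)
      case Block(stats, expr) =>
        // Nested blocks are flattened into the enclosing list.
        stats.flatMap(thicket) ++ thicket(expr)
      case leaf => List(leaf)
    }

    // Wraps the flattened result in a Block, adding one if there was none.
    def transform(tree: Tree): Block = {
      val flat = thicket(tree)
      Block(flat.init, flat.last)
    }
  }
}
```

For example, `Add(AwaitCall(Ident("f")), AwaitCall(Ident("g")))` becomes a `Block` with two `ValDef`s binding the awaited values, followed by `Add(Ident("tmp1"), Ident("tmp2"))`.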

Make the transform more idiomatic by more widespread use of
utilities to generate attributed trees rather than passing
untyped trees through `localTyper.typed`.

Rework special casing of Unit/Nothing typed expressions to
avoid temporary vals of these types, which would otherwise litter
the resulting trees with `BoxedUnit` references.

Avoid unnecessary temporaries in:
  - the suffix of the args of an Apply that follows the final `await`
  - the result of If that only awaits the condition
  - similarly, the result of a Match that only awaits the scrutinee
  - "safe to inline" expressions like simple Idents or constants.

Remove special cases for derived value classes which are
eliminated now before the ANF transform.

Furthermore:

  - Deal with ArrayValue tree
  - Remove cast which was used to suppress typer warning
  - Don't need to deal with nested Applys after uncurry
  - Remove a now-unneeded adjustment from the entry point to the ANF
    and from MatchResultTransformer.
@retronym retronym force-pushed the topic/scala-integrate-async-review2 branch from 9f72fa3 to c2ffe69 Compare March 19, 2020 00:54
@retronym
Owner Author

retronym commented Mar 19, 2020

"Proof of concept integration with j.u.c.CompletableFuture" demonstrates that it is now relatively straightforward to adapt async to j.u.c.CompletableFuture.

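For readers unfamiliar with the target API, here is a hand-written sketch of a state machine driven by `CompletableFuture.whenComplete`. It is not the code from the proof-of-concept commit: `ConcatMachine` and `concat` are hypothetical names, and the machine is written out by hand rather than generated.

```scala
import java.util.concurrent.CompletableFuture
import java.util.function.BiConsumer

object CompletableFutureSketch {
  // Hypothetical hand-written equivalent of:
  //   async { val a = await(f1); val b = await(f2); a + b }
  final class ConcatMachine(f1: CompletableFuture[String], f2: CompletableFuture[String])
      extends BiConsumer[String, Throwable] {
    private var state = 0
    private var a: String = null       // local used across states, so it becomes a field
    val result = new CompletableFuture[String]()

    // Enter state 0; both arguments are ignored there.
    def start(): Unit = accept(null, null)

    override def accept(value: String, error: Throwable): Unit =
      if (error != null) result.completeExceptionally(error)
      else state match {
        case 0 => state = 1; f1.whenComplete(this)
        case 1 => a = value; state = 2; f2.whenComplete(this)
        case 2 => result.complete(a + value)
        case other => throw new IllegalStateException("unexpected state " + other)
      }
  }

  def concat(f1: CompletableFuture[String], f2: CompletableFuture[String]): CompletableFuture[String] = {
    val machine = new ConcatMachine(f1, f2)
    machine.start()
    machine.result
  }
}
```

The machine itself serves as the `BiConsumer` callback, mirroring how a `scala.concurrent` based machine would pass itself to `onComplete`.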
@retronym retronym force-pushed the topic/scala-integrate-async-review2 branch 3 times, most recently from 795434e to 27cd02b Compare March 19, 2020 05:30
@retronym
Owner Author

I've now removed the magic synthesis of scala.async.Async._ and instead changed the tests to use scala.partest.async.Async._, which shows what the new implementation of scala-async will look like.

@retronym retronym force-pushed the topic/scala-integrate-async-review2 branch from 27cd02b to 92334f9 Compare March 19, 2020 05:35
@retronym
Owner Author

retronym commented Mar 20, 2020

... and here's how scala-async itself would change to use the compiler integrated transform when running on a capable compiler with -Xasync enabled: https://github.com/scala/scala-async/pull/237/files.

A new major version of scala-async could drop the fallback to the legacy implementation. It could also ship a base class for the state machine to reduce generated code size and simplify the macro implementation.

Now that we require custom integrations of async to
create their own state machine wrapper in an early phase,
require that the state machine implements a set of
methods to deal with the Try and Future types being
used.

As shown in the test `AnnotationDrivenAsync`, these
can be inherited from a base class.
For third party async integrations, rather than just ScalaConcurrentFutureSystem.
Extract base class, so far just in test sources, for the
contract of state machine implementations.

Show how to implement scala.concurrent._'s async
frontend as a regular macro rather than a compiler
intrinsic.
@retronym retronym force-pushed the topic/scala-integrate-async-review2 branch from 6de268e to bf95c73 Compare March 20, 2020 04:39
@retronym retronym closed this Mar 20, 2020
@retronym
Owner Author

This is getting close to the finish line. Moving over to scala#8816

4 participants