
Shape improvements #10808

Closed
wants to merge 10 commits into from

Conversation

@gasche
Member

gasche commented Dec 3, 2021

This is an alternative to #10796 -- a change to the way "shapes" are computed to avoid time or space blowups in the compiler.

In essence, "shapes" are lambda-terms representing the module structure of OCaml programs, and Merlin has to reduce shapes to normal forms to produce an effective "jump to definition" tool. The design chosen by @voodoos and @trefis is to perform some reductions of shapes in the compiler, and some in Merlin; but in the original PR, #10718, this was done too naively, and it would blow up on programs using modules and functors heavily, such as Irmin (see report #10779).

The current shape-reduction engine is very naive (it performs substitutions!). @trefis's proposal in #10796 is a series of elaborate optimizations to avoid some of the downsides of naive reduction -- for example, grouping n-ary applications together to perform a single substitution. The approach in the present PR is to implement a better reduction strategy from the start, to avoid the need for such optimizations.

With this PR, we perform strong reduction on shapes (that is, as much reduction as we can do without accessing shapes of other compilation units), in negligible time and memory. (I only measured performance on irmin.cmt, reported as a worst case.) #10796 ends up producing a 95Mio irmin.cmt file in 2s; we produce a 2.5Mio cmt file in 0.5s, the same as before shapes were introduced.

The PR has not been tested outside the compiler codebase + Irmin. I hope that @trefis or @voodoos could possibly test that it works well for Merlin, and/or use their cmt-measurement tools to check that there is no issue with other .cmt files.

Notes:

  • The previous work on shapes already implemented some form of strong reduction (reducing under lambdas), by doing some reductions during type-checking (deeply inside terms).
  • Implementing strong reduction properly (using an environment instead of substitutions) gives good space behavior, but irmin.cmt still sees a lot of redundant computations (I think they come from projections corresponding to module-signature coercions). I had to implement memoization to get good time behavior; a generic sketch of the memoization scheme is included after these notes. I also tried to add hash-consing on top of memoization, but did not observe any gain on irmin.cmt.
  • Note that memoization reduces the in-memory size of structures by sharing identical subresults, but the -dshape output can still blow up.
  • I kept two of @trefis's commits from Module shapes for easier tooling #10718; we checked that they are not needed for the performance of this PR, but they still produce better / more readable -dshape outputs.
  • I believe that with more clever strong-reduction strategies (substitution only for "useful redexes"), we may not need memoization and could remove the redundancy of the -dshape output as well.
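
For reference, here is a generic sketch of the memoization scheme used by the reducer (illustrative only, not the actual Shape code; the real implementation memoizes on the (strategy, env, t) triple, and presumably needs some care about how environments are hashed):

let memoize f =
  (* [f] is written in open-recursion style: it receives the memoized
     version of itself to use for recursive calls *)
  let table = Hashtbl.create 42 in
  let rec memo_f key =
    match Hashtbl.find_opt table key with
    | Some result -> result
    | None ->
      let result = f memo_f key in
      Hashtbl.add table key result;
      result
  in
  memo_f

(* hypothetical usage, keyed on the (strategy, env, t) triple:
   let reduce_memo = memoize (fun self (strategy, env, t) -> ...) *)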

@xavierleroy
Contributor

Neat idea! I admit I didn't read the code, but: which strategy / abstract machine do you use for strong reduction?

@gasche
Member Author

gasche commented Dec 3, 2021

It's a home-grown strategy that I got to by trying to minimally adapt a simple (weak) call-by-value evaluator to strong reduction:

The core looks like this (simplified):

type strategy = Weak | Strong

let rec reduce env t =
  reduce_ Strong env t

and reduce_ strategy env t =
  (* the full implementation memoizes on (strategy, env, t) *)
  let reduce env t = reduce_ strategy env t in
  match t with
  | Abs(var, body) ->
    begin match strategy with
    | Weak -> t
    | Strong ->
      (* reduce under the abstraction, with the bound variable
         mapped to itself (a neutral term) *)
      let env = bind env var (Var var) in
      let body = reduce env body in
      Abs(var, body)
    end
  | App(f, arg) ->
    let f = reduce_ Weak env f in
    let arg = reduce_ Strong env arg in
    begin match f with
    | Abs(var, body) ->
      (* we only add Strong normal forms to the environment *)
      reduce (bind env var arg) body
    | _ -> App(f, arg)
    end
  | Var id ->
    (* already in Strong normal form *)
    Ident.find_same id env

The use of two strategies (as opposed to using the Strong mode everywhere) avoids traversing the body of each lambda-abstraction twice, first when computing reduce env f and second when computing reduce (bind env var arg) body.

reduce_ Weak does not exactly perform weak reduction: it will still strongly reduce function arguments. If I remember your work with Benjamin Grégoire correctly, you perform standard weak reduction and then iteratively reduce under abstraction bodies; but I thought that this may duplicate reduction work by substituting terms in weak normal form (rather than strong normal form) during beta-reduction.

Ideally I think one would have an abstract machine that only performs "needed" substitutions: I would prefer for the normal form of (fun x -> { foo = x.foo; bar = x.bar; }) (f(a)(b)(c)) (terms of the form { foo = x.foo; bar = x.bar } come from the translation of module coercions (X : sig type foo type bar end)) to be let x = f(a)(b)(c) in { foo = x.foo; bar = x.bar } rather than the current { foo = f(a)(b)(c).foo; bar = f(a)(b)(c).bar }. But I didn't think of a simple way to implement this (I guess I should ask @Ekdohibs...), and the current naive approach works well: the two occurrences of f(a)(b)(c) in the normal form are in fact shared in memory, so the -dshape output is large but the .cmt is small.
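
(To make the sharing point concrete with a self-contained toy example, unrelated to the actual Shape code: OCaml's Marshal preserves sharing within a value, so a structure whose printed form is exponential can still serialize compactly.)

type tree = Leaf | Node of tree * tree

let rec shared n =
  if n = 0 then Leaf
  else let t = shared (n - 1) in Node (t, t)    (* both children are the same block *)

let rec copied n =
  if n = 0 then Leaf
  else Node (copied (n - 1), copied (n - 1))    (* two distinct copies at each level *)

let () =
  let size v = String.length (Marshal.to_string v []) in
  (* the shared tree marshals in O(n) bytes, the copied one in O(2^n) bytes,
     even though both print identically *)
  Printf.printf "shared: %d bytes, copied: %d bytes\n"
    (size (shared 18)) (size (copied 18))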

(Another thing to try would be to not do strong reduction at all, stick with weak reduction. My intuition tells me that the normal-form size blowup problems that we see are due to neutral terms occurring in strong reduction; in the example above, if we wait to reduce until the f(a)(b)(c) are known and reduce to a structure, there is no duplication anymore. The question is whether those strong reductions save some work that Merlin would have to do; I think it would be fine, but again the current approach works well enough that I didn't try.)

irmin.cmt back to 2.5Mio

This changes the test output for shapes that are not printed in normal
form anymore. (Following commits go back to the original outputs.)
The previous approach to shape reduction, which used the Shape.app
smart constructor during module type-checking, would perform shape
reduction under functor abstractions -- strong reductions.

Strong reductions are the case that blows up on some interesting
examples, including Irmin: under functor abstractions, shapes of
module types may not reduce to structures (but to neutral terms
containing free variables), and the projections
  {f = S.f; g = S.g; ...}
that correspond to the translation of module signature coercions
  (M : sig type f type g end)
may generate arbitrarily many copies of the non-trivial shape S.

The approach in this commit is to reduce shapes for the whole
module (treating references to other compilation units as free
variables; this remains a "local" reduction), but very weakly: without
going under functor abstractions, and in fact without reducing functor
applications either.

In particular, the production of irmin.cmt, a known-bad case, does not
blow up.

We added a new example in the testsuite that produces an
exponentially-large shape, but can still be computed in a short amount
of time.

Note: existing shape tests do *not* pass with this commit, as we
reduce less aggressively than before. Follow-up commits are going to
implement a stronger reduction strategy.
In the cases where we see a shape size blowup, repeated substitutions
can result in a memory usage blowup. When using an environment instead,
memory usage to build irmin.cmt remains constant (instead of growing
continuously to gigabytes of live RAM), even under stronger reduction
strategies.
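
For context, a minimal sketch of what "repeated substitutions" means on the plain lambda-calculus (illustrative only, not the compiler's Shape code): each beta step rebuilds a copy of the function body, so nested redexes can duplicate large subterms, which is exactly what an environment-based evaluator avoids.

type term = Var of string | Abs of string * term | App of term * term

(* naive capture-unaware substitution: copies the whole body at each beta step
   (a real implementation must also avoid variable capture) *)
let rec subst x v = function
  | Var y -> if y = x then v else Var y
  | Abs (y, _) as t when y = x -> t
  | Abs (y, body) -> Abs (y, subst x v body)
  | App (f, a) -> App (subst x v f, subst x v a)

(* weak reduction by repeated substitution *)
let rec reduce t =
  match t with
  | App (f, a) ->
    let a' = reduce a in
    begin match reduce f with
    | Abs (x, body) -> reduce (subst x a' body)
    | f' -> App (f', a')
    end
  | _ -> t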
@xavierleroy added this to the 4.14.0 milestone Dec 6, 2021
@xavierleroy
Contributor

The Grégoire-Leroy strategy is (possibly) interesting in conjunction with a compiled implementation of weak symbolic evaluation, but I agree that it is useless for an interpreter, and that you don't want to do any compilation here.

Yes, it would be good to have @Ekdohibs's expert opinion, and perhaps also that of your team members in Saclay who work on exactly this kind of thing :-)

Naive question: the normal form of a lambda-term can be very very much bigger than the original term, so why is it appropriate to store shapes that are in normal form?

@gasche
Member Author

gasche commented Dec 6, 2021

I have been hacking on more stuff in the direction of the present PR (including fixing a bug, oops), and I have yet to get the commit history in a "fully satisfying" state. I will finish this before I bang on more doors for reviews. (But of course volunteers are always welcome.)

I thought indeed of using one of Beniamino Accattoli's abstract machines for strong reduction, but abstract machines (especially when they involve explicit substitutions) tend to require fairly invasive changes when starting from a standard big-step evaluator, so I haven't crossed that bridge yet.

@gasche
Member Author

gasche commented Dec 6, 2021

Naive question: the normal form of a lambda-term can be very very much bigger than the original term, so why is it appropriate to store shapes that are in normal form?

Currently there are known cases where shape computation seems to behave badly for some module-heavy programs (Irmin). If we start from the premise that shape computation is a computationally hard problem, and that shapes can benefit all tools consuming -bin-annot information, then doing the reduction (well) in the compiler makes sense.

I am not sure that this is in fact correct: I suspect that with more massaging (see the next PR :-), shape reduction should be instantaneous for all OCaml programs (and normal forms fairly small), so both strategies would be equally reasonable.

@Ekdohibs
Contributor

Ekdohibs commented Dec 6, 2021

From what I see, the potential problem with reduce Weak reducing function arguments strongly is that these arguments may be reduced to strong normal form, but that strong normal form may be unused (the argument itself could even be unused!). I suspect this is not the case for standard OCaml programs, but if a change is desired to ensure no useless computations are done, the interpreter would need to be changed from strong call-by-value to strong call-by-need, with both the (weak) function arguments and the strong forms of those arguments being computed lazily. (I can show an implementation of this on the standard lambda-calculus if you want.)

@gasche
Member Author

gasche commented Dec 6, 2021

I can show an implementation of this with standard lambda-calculus if you want.

Sure :-) I must say that the idea hadn't crossed my mind that I would want to solve a practical problem using a strong call-by-need evaluation strategy, but it sounds like fun.

(I believe that the "useless work" of evaluating then-unused subterms is not a problem for shapes in OCaml programs, because people tend not to create complex module expressions that are mostly useless. So the call-by-need aspect is probably not worth the complexity. But I'm still curious to see the code.)

@Ekdohibs
Contributor

Ekdohibs commented Dec 7, 2021

I have several versions of it:

  • A full version able to handle lambda-calculus + constructors and fix
  • A simpler version able to handle only the lambda-calculus
  • And an even simpler version which handles laziness using OCaml's lazy, thus avoiding the need for a deep/shallow reduction flag, at the cost of the output structure being partially lazy: it needs to be fully evaluated if we really want to compute the normal form (this is the force_deep function at the end). A rough sketch of this variant is given below.
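
(Not the code referenced above, but for readers of this thread, here is a minimal sketch of what such a Lazy-based variant can look like on the plain lambda-calculus: function arguments are evaluated at most once thanks to Lazy, and force_deep turns the partially lazy result into a full normal form.)

type term = Var of string | Abs of string * term | App of term * term

type value =
  | VClos of string * term * env             (* function body not yet normalized *)
  | VNeutral of string * value Lazy.t list   (* x applied to still-suspended arguments *)
and env = (string * value Lazy.t) list

let fresh = let r = ref 0 in fun x -> incr r; x ^ "/" ^ string_of_int !r

let rec eval env = function
  | Var x ->
    (match List.assoc_opt x env with
     | Some v -> Lazy.force v
     | None -> VNeutral (x, []))              (* free variable: a neutral term *)
  | Abs (x, body) -> VClos (x, body, env)
  | App (f, a) ->
    let va = lazy (eval env a) in             (* suspended, evaluated at most once *)
    (match eval env f with
     | VClos (x, body, env') -> eval ((x, va) :: env') body
     | VNeutral (h, args) -> VNeutral (h, args @ [va]))

(* force everything still suspended and read the result back as a term in normal form *)
let rec force_deep = function
  | VNeutral (h, args) ->
    List.fold_left (fun acc a -> App (acc, force_deep (Lazy.force a))) (Var h) args
  | VClos (x, body, env) ->
    let x' = fresh x in
    Abs (x', force_deep (eval ((x, lazy (VNeutral (x', []))) :: env) body))

let normal_form t = force_deep (eval [] t)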

Any reduction work not done by the compiler will have to be done by
Merlin. We believe that strong reduction (including under functors) is
preferable to weak reduction in the compiler, if we can avoid time and
space blowups.

At this point, producing irmin.cmt (a 2.5Mio .cmt file) takes 5 minutes;
this is too slow, but memory usage is fine.
This change has a radical effect on performance: it completely
eliminates the redundancies inherent in some examples, both in
time (avoiding duplicated computations) and
space (reusing physically identical results instead of copies of the
same results).

At this point, producing irmin.cmt (2.5Mio) takes 0.5s; this is fast
and small.
@gasche
Member Author

gasche commented Dec 16, 2021

I added documentation comments to my implementation and rebased the history (it's not perfect, but good enough).

I haven't tried to reuse @Ekdohibs's approach. It's surprisingly nice (in the last version, I think it would be nicer and clearer to present force_deep as a read-back function from normal forms to terms); I could give it a try.

@gasche
Member Author

gasche commented Dec 16, 2021

I proposed a version using strong call-by-need in #10825.

@gasche force-pushed the shape-improvements branch 2 times, most recently from a7bc83b to caecad8 on January 4, 2022 17:12
@gasche
Member Author

gasche commented Jan 4, 2022

I realized that the design of the strong call-by-need evaluator, using an explicit type of normal forms with a read_back : nf -> t function, can also be used to simplify the strong call-by-value evaluator (and get the two very close to each other).

I have updated the present PR to use this design. The code is simpler than before, and it is also smaller than the strong call-by-need evaluator, thanks to not having to work around the non-hashability of lazy thunks (there are no lazy thunks here).
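
For concreteness, here is the same skeleton as the Lazy-based sketch earlier in the thread, but in call-by-value style with an explicit type of normal forms and a read_back function (illustrative only; the actual Shape code also handles structures and projections, memoizes both functions, and differs in details such as when function bodies get normalized):

type term = Var of string | Abs of string * term | App of term * term

type nf =
  | NClos of string * term * env     (* a closure carrying its environment *)
  | NNeutral of string * nf list
and env = (string * nf) list

let fresh = let r = ref 0 in fun x -> incr r; x ^ "/" ^ string_of_int !r

let rec eval env = function
  | Var x ->
    (match List.assoc_opt x env with Some v -> v | None -> NNeutral (x, []))
  | Abs (x, body) -> NClos (x, body, env)
  | App (f, a) ->
    let va = eval env a in             (* call-by-value: arguments are evaluated eagerly *)
    (match eval env f with
     | NClos (x, body, env') -> eval ((x, va) :: env') body
     | NNeutral (h, args) -> NNeutral (h, args @ [va]))

(* read a normal form back into a term, normalizing closure bodies on the way *)
let rec read_back = function
  | NNeutral (h, args) ->
    List.fold_left (fun acc a -> App (acc, read_back a)) (Var h) args
  | NClos (x, body, env) ->
    let x' = fresh x in
    Abs (x', read_back (eval ((x, NNeutral (x', [])) :: env) body))

let normalize t = read_back (eval [] t)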

Given that we expect strong CBV and strong CBN to have the same performance profiles (as long as they are memoized) on this problem (with an advantage for CBN in corner cases), I think that this version could be considered better (shorter) than the other.

@Ekdohibs
Contributor

Ekdohibs commented Jan 4, 2022

It seems to me that with this type of normal forms, the normal forms of functions are always computed, while this would not be necessary for functions which are later applied to an argument. For functions with large normal forms which are only ever applied to arguments, the difference can be quite significant. Of course, this case may not happen in practice, so I am not sure whether the change is worth it.

@gasche
Member Author

gasche commented Jan 4, 2022

This is a good (technical) point. Could I fix it by using nf option ref (no nf Lazy.t!) to evaluate it only on demand within read_back? This feels very ad-hoc.

@Ekdohibs
Contributor

Ekdohibs commented Jan 4, 2022

That would indeed work! Also, you could completely avoid the use of Lazy.t in the strong call-by-need evaluator as well, by using the type (nf, t * env) result ref wherever you would have a lazy value: the code pointer used for the closure is always the same (reduce), and the important part is the value of the variables bound by this closure, i.e. a term and an environment. You could also add an NLazy of t * env constructor to the type nf, and simply use nf ref where a lazy value is needed, taking care to evaluate it.
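
A rough, generic sketch of that suggestion (names are hypothetical, not from either PR): a hand-rolled thunk is just a mutable cell holding either the computed result (Ok) or the suspended payload (Error), e.g. a term together with its environment.

(* create a suspended computation from its payload, e.g. a (term, environment) pair *)
let suspend payload = ref (Error payload)

(* force the cell: compute at most once, then cache the result (like Lazy.force) *)
let force reduce cell =
  match !cell with
  | Ok v -> v
  | Error payload ->
    let v = reduce payload in
    cell := Ok v;
    v

(* e.g. instantiated at the suggested type (nf, t * env) result ref, with
   [reduce] applied as [fun (t, env) -> reduce env t] *)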

@gasche
Member Author

gasche commented Jan 5, 2022

I'll discuss the call-by-value aspects here and your call-by-need remark in the other PR.

Thinking more about this, my opinion is that no change is required to this PR in reaction to your remark:

  1. The case of "useless computation" that you point out is not a new issue coming from the nf + read_back version of the strong call-by-value evaluator; it was already present with my previous design.

  2. My understanding of strong call-by-value is that this useless computation is part of the specification (not a problem with my implementations). Your suggestion goes in the direction of adding more "by-need" aspects to the reduction strategy. I would rather stick to standard systems (as "standard" as we can get in a strong setting...).

  3. I believe that this useless-computation case will not be a problem in practice in OCaml programs. This specific form of useless computation comes from a lambda-expression that is not returned as part of the normal form, only used for applications. But this is relatively uncommon for OCaml module shapes. It would naturally arise when a functor is defined locally, but not exported through the module interface.

(Finally: there is something that I don't fully understand with shapes, which is whether we can really erase any part of the input program. If a functor is only used locally and not exported by a compilation unit, we don't need to normalize its body to compute the compilation unit shape. But what happens when the user is editing the file locally, and asking information about names defined by this local functor? My intuition would be that we in fact need to normalize the shape of any sub-term that occurs in the program, as it can be edited by Merlin; but neither of the two PRs currently does this, nor does the original shape implementation.)

@trefis
Contributor

trefis commented Jan 5, 2022

If a functor is only used locally and not exported by a compilation unit, we don't need to normalize its body to compute the compilation unit shape. But what happens when the user is editing the file locally, and asking information about names defined by this local functor? My intuition would be that we in fact need to normalize the shape of any sub-term that occurs in the program, as it can be edited by Merlin; but neither of the two PRs currently does this, nor does the original shape implementation

Merlin will do that, but not by going through the compilation unit shape. The way Merlin intends to use shapes is as follows: get the Path.t under the user's cursor, call Env.shape_of_path, then call reduce on the shape it just got. The reduce function used will load shapes for external compilation units, but it will also perform the appropriate reductions on any local shapes.

Note: having shapes stored in the environment makes it really easy to implement this procedure, because Merlin has easy access to the environment at the cursor position. It would be a lot more annoying if shapes were computed after the fact from the typedtree (and so weren't stored in the environment). I plan to answer your other questions here and in the other PRs soon, but you can view this as a sort of preliminary reply to at least one of those questions.

@Ekdohibs
Contributor

Ekdohibs commented Jan 5, 2022

Concerning my understanding of strong CBV, it is indeed not necessary to have laziness on the normal form of functions -- I think what I had in mind was rather a weak call-by-value / strong call-by-need hybrid strategy. The main point was originally to have the property that a program terminating in weak CBV with a strong value (like a variable) would terminate as well in this strategy, unlike in strong CBV (for instance, the use of a Y combinator for recursion is enough to break termination in a strong setting). However, I realized it does not work either: a term such as (fun y -> x) (x (fun z -> [non-terminating term])), where x is an open variable (bound by an enclosing lambda), terminates in weak CBV (with the value x), but will not terminate with most definitions of strong CBV, as x (fun z -> ...) needs to be reduced to a value, so the strong normal form under the function will be needed.

@gasche
Member Author

gasche commented Jan 5, 2022

But currently the shapes stored in the environment (let's call them sub-shapes) are not in normal form. (With either of my PRs they are not reduced at all; with your original implementation they were only partly normalized.) So in theory there is a risk of computation blowup there, the same as when we proposed to not normalize shapes at all in the compiler. And the memoization in my PRs is only local to a given reduce call, so computation is not shared across those sub-shapes. We could:

  • decide that this is fine in practice and change nothing
  • try to memoize shape reduction globally (by also indexing on the global environment?)
  • strongly normalize all sub-shapes
  • not reduce shapes at all in the compiler, and leave the work to Merlin

@voodoos
Contributor

voodoos commented Jan 6, 2022

I think the first option is a good and pragmatic choice:

  • decide that this is fine in practice and change nothing

@trefis's benchmarks on the strong call-by-need version show that, in practice, having to reduce local shapes does not cause noticeable latency on locate queries (he has not run them on the CBV version yet, but results should be close).

The current compromise of having the compiler reduce whole compilation-unit shapes and Merlin only local shapes, coupled with an efficient reduction algorithm (memoized CBV or CBN), seems like a reasonable choice performance-wise.

However, I just tried the latest version of this PR (strong CBV), as I wanted to have a last round of benchmarks comparing it to the strong-CBN one, and it hangs indefinitely while building Irmin.

But the latest CBN version performs very well: total build time on Irmin is 48.572s ±0.138, instead of 48.505s ±0.208 without any shapes. My take is that we should focus on reviewing the CBN version, as it works right now and performs quite well. The only drawback is the increased complexity of the implementation.

We use a proper type of normal forms, where functions are represented
by closures carrying an environment, and a read-back function from
normal forms to terms. This simplifies the
environment-capture shenanigans of the previous implementation.
@gasche
Member Author

gasche commented Jan 6, 2022

@voodoos thanks!

I didn't reproduce the issue when I tried locally to rebuild irmin.cmt, but I think there was an issue with my test setup, apologies.

I think the issue was a silly mistake in how read_back is memoized; which should now be fixed (I updated the PR).

@voodoos
Contributor

voodoos commented Jan 6, 2022

@voodoos thanks!

I didn't reproduce the issue when I tried locally to build irmin.cmt, but I think there was an issue with my test setup, apologies!

I think the issue was a silly mistake in how read_back is memoized; which should now be fixed (I updated the PR).

OK, I'll run my benchmarks again with the latest versions of both PRs.

@voodoos
Contributor

voodoos commented Jan 6, 2022

@gasche I think it still hangs?
Running dune build @install --verbose, it stops when building irmin_git.cmo/x.

@gasche
Member Author

gasche commented Jan 6, 2022

Ah, I was testing on irmin.cmt, thanks.

@Octachron
Member

Closed in favor of #10825.

@Octachron closed this Jan 20, 2022