-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Shapes: bugfix and performance improvement #10796
Conversation
Can you provide repro steps to build irmin on a 4.14 switch? I just tried and it complains that |
@gasche using https://github.com/kit-ty-kate/opam-alpha-repository should do:
|
And also we did our testing on the latest |
Shape.fresh_var Uid.internal_not_actually_unique | ||
in | ||
var, Shape.app orig_shape ~arg:shape_var | ||
in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would make sense to move this code to the Shape
module in order to keep the Shape.desc
type morally abstract.
if final_res_shape == res_shape | ||
then orig_shape | ||
else Shape.abs var final_res_shape | ||
in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would rather deduplicate this optimisation, in order to ensure that the two instances are kept in sync in the future.
cross-ref: the original Shapes PR is #10718. |
Closed in favor of #10825. |
The first commit in this PR fixes the excessive memory usage reported in #10779. Following that fix we decided to build irmin (and mesure the time it takes), which appears to be a heavy user of functors. A characteristic which our previous test codebase was lacking, allowing the issue to remain undetected.
Thus we discovered a pretty big regression in compilation time on projects which make aggressive uses of functors, as well as bigger shape size than we'd previously seen.
Apart from the fix mentioned above, this PR includes a few commits aimed at mitigating those:
Includemod
by returning the input shape when it is not modified in any way.Abs (X, res_shape)
, we would substitute the shape of the argument forX
inres_shape
. For functors with N-parameters, their applications implied going over the result shape N-times. With this commit we first collect all the arguments, and then go over the result shape only once.Some measurements are available below to see the impact of those patches.
N.B. an alternative solution was suggested by @gasche: to not reduce shapes in the compiler, leave it all to merlin. This does fix the compilation time increases, as well the growth in cmt size. However, with a straightforward (some could say naïve) reduction strategy as is currently implemented in the compiler (and exposed by compiler-libs), this implies recomputing the same reduction over and over again, taking several minutes (if not hours) to fully reduce the shape of
Irmin
.This could obviously be mitigated with some amount of memoization (though we haven't tried), but overall it doesn't seem like it'd be much simpler than the patches proposed here. Furthermore, I personally would rather spend a few more seconds compiling my project, than have merlin hang for a second or two when I ask it something.
And now, for some numbers:
Irmin
(This is a build of https://github.com/mirage/irmin/, which includes several packages)
Compilation time
dune build @install -j 1
, 10 runsNote: the noticeable performance degradation only impacts a few calls to the compiler:
src/irmin/.../irmin.cmo
src/irmin/.../irmin.cmx
src/irmin-git/.../irmin_git.cmo
src/irmin-git/.../irmin_git.cmx
src/irmin-mirage/git/.../irmin_mirage_git.cmo
src/irmin-mirage/git/.../irmin_mirage_git.cmx
For all the other invocations of the compiler (of which there are 361), the change in compilation time is less than 50ms, and often lost in the noise.
(We can make the raw logs available if people are curious)
Cmt sizes:
Relatedly, there is a noticeable increase in
.cmt
size onIrmin
, but once again very localized:Monorepo
To not be entirely biaised towards functors, we have also reproduced the measurements on a more diverse "monorepo", which includes the following packages:
containers
,base
,react
,astring
,cmdliner
,cppo
,dune
,fpath
,result
,fmt
,logs
,re
,ppxlib
,ctypes
and all their dependencies.Timings
dune build @install -j 1
, 10 runsCmt sizes:
414-no-shapes
414-shape-bugfix
414-shape-nary-shared