Compile time performance #110

Open
kcsongor opened this issue Feb 12, 2020 · 5 comments

@kcsongor
Owner

I took @arybczak's benchmarks and started experimenting with generic-optics a little.
It seems that optimisations (unsurprisingly) are responsible for most of the compile-time overhead. I don't think much can be done to speed that up without changing GHC itself. However, it should be possible to speed up the -O0 compile times at least, which I think a lot of people use during development.

With -O0, some experimental changes in the internals yield the following results:

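| multiple | generic | memory | time |
|----------|---------|--------|------|
| 0        | 0       | 37M    | 1.1s |
| 0        | 1       | 58M    | 1.4s |
| 1        | 0       | 47M    | 1.5s |
| 1        | 1       | 65M    | 1.6s |
| 2        | 0       | 50M    | 1.7s |
| 2        | 1       | 67M    | 1.9s |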

The "experimental changes" are essentially flattening and specialising the class hierarchies, which improves compile times at the cost of some duplication in the library internals.
Before these changes, the last row (multiple=2 and generic=1) allocated 246M and took 5.4s to compile at -O0, so the improvement is quite significant.
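
To illustrate the general trade-off (not the actual generic-lens internals), here is a minimal sketch. The classes `Inner`, `Middle`, `Outer` and `OuterFlat` are hypothetical names made up for this example. The layered version makes the constraint solver walk three instances, while the flattened, specialised version resolves in one step but repeats the logic of the lower layers.

```haskell
{-# LANGUAGE FlexibleInstances #-}
module Main where

-- Layered hierarchy (hypothetical): resolving Outer [Maybe Int]
-- requires Middle (Maybe Int), which in turn requires Inner Int.
class Inner a where
  inner :: a -> String

instance Inner Int where
  inner = show

class Middle a where
  middle :: a -> String

instance Inner a => Middle (Maybe a) where
  middle = maybe "-" inner

class Outer a where
  outer :: a -> String

instance Middle a => Outer [a] where
  outer = concatMap middle

-- Flattened and specialised (hypothetical): a single class with a single
-- instance at the concrete type. The bodies of the lower layers are
-- duplicated here instead of being reached through further constraints.
class OuterFlat a where
  outerFlat :: a -> String

instance OuterFlat [Maybe Int] where
  outerFlat = concatMap (maybe "-" show)

sample :: [Maybe Int]
sample = [Just 1, Nothing, Just 2]

main :: IO ()
main = do
  putStrLn (outer sample)
  putStrLn (outerFlat sample)
```

Both calls produce the same string; the difference is only in how much instance resolution the compiler has to do, which is the assumption behind this sketch.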

I will investigate further to see if anything else could be done.

@kcsongor kcsongor added this to the 2.1.0.0 milestone Feb 12, 2020
@kcsongor kcsongor self-assigned this Feb 12, 2020
@kcsongor
Owner Author

As an additional point, when loading the file via ghcid, the generic version reloads noticeably faster than the TH version (though both take slightly under 1s) on the multiple=2 setting.

@kcsongor
Owner Author

Another idea I had was to try to eliminate some redundant simplifier runs by carefully phase-annotating the INLINE pragmas, but my experiments on this haven’t been fruitful so far.
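
For reference, a minimal sketch of what phase-annotated INLINE pragmas look like in GHC. The functions `double` and `quadruple` are hypothetical and not taken from the library; the sketch shows only the syntax, not the annotations that were actually tried.

```haskell
module Main where

-- Simplifier phases count down (by default 2, 1, 0).

-- INLINE [1]: do not inline before phase 1; from phase 1 onwards,
-- GHC is keen to inline this function.
double :: Int -> Int
double x = x + x
{-# INLINE [1] double #-}

-- INLINE [~1]: GHC is keen to inline this function before phase 1,
-- and will not inline it from phase 1 onwards.
quadruple :: Int -> Int
quadruple = double . double
{-# INLINE [~1] quadruple #-}

main :: IO ()
main = print (quadruple 3)
```

The annotations change only when inlining may happen, not the result of the program.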

@arybczak
Contributor

Can you push these changes (to a branch if you don't want to merge them yet)? I'm curious.

@kcsongor
Owner Author

Yes of course. I plan on getting back to this in the coming days, and hope to merge it soon.

@arybczak
Contributor

BTW, I checked how #112 affects the benchmark and I noticed weird things.

First of all, I expected the compilation to be slower with #112, but it was actually faster.

Then I checked the Core, and it turned out that even with #112 applied, the Core for multiple constructors isn't equivalent to the TH version (a residue of generics remains and field lookups are linear).

I then raised the unfolding use threshold to 250 to get Core equivalent to the TH version (MULTIPLE=2 needs 250; for MULTIPLE=1, 150 is sufficient), and compilation is even faster (I also used field' instead of field).

Here are times for -funfolding-use-threshold=250 and usage of field':

| multiple | memory | time |
|----------|--------|------|
| 0        | 147M   | 2.5s |
| 1        | 251M   | 4.5s |
| 2        | 227M   | 6.5s |

I have no idea what is going on there (especially with memory usage), but it seems worth investigating.
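
For anyone wanting to reproduce this, a minimal sketch of setting the threshold per module rather than on the command line. The `Person` record and `birthday` function are hypothetical stand-ins that use only base; the real benchmark uses field' from the library instead. The sketch assumes the module is otherwise compiled with optimisation enabled.

```haskell
-- Per-module equivalent of passing -funfolding-use-threshold=250 to GHC.
{-# OPTIONS_GHC -funfolding-use-threshold=250 #-}
module Main where

-- Hypothetical record standing in for the benchmark's data types.
data Person = Person
  { name :: String
  , age  :: Int
  }

-- Plain record update; in the benchmark this would be a lens built
-- with field' (assumption: not shown here to keep the sketch base-only).
birthday :: Person -> Person
birthday p = p { age = age p + 1 }

main :: IO ()
main = print (age (birthday (Person "Ada" 36)))
```

Adding -ddump-simpl (optionally with -ddump-to-file) to the same pragma is one way to inspect the resulting Core.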
