Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimise IndexedSeqOps.foldRight #8929

Merged
merged 1 commit into from May 8, 2020

Conversation

japgolly
Copy link
Contributor

Inherited foldRight implementation calls reverse.foldLeft

Full benchmarks results and source: https://github.com/japgolly/misc/tree/scalaBM

@japgolly
Copy link
Contributor Author

CI is showing a failure but I think it's spurious. I don't have the right access on Travis to re-run the task unfortunately.

@Jasper-M
Copy link
Member

I think there's some magic incantation like /rebuild that triggers CI.

@mkeskells
Copy link
Contributor

/rebuild

@lrytz
Copy link
Member

lrytz commented Apr 27, 2020

The travis failure is still being hunted #8921

@lrytz
Copy link
Member

lrytz commented Apr 27, 2020

I remain sceptical about using apply instead of the iterator (see my previous comment, also @Ichoran's).

The fact that your benchmark shows an improvement could well be due to overfitting: a new JVM is forked that only executes Vector.foldLeft for a specific vector size, and then quits. So the JIT can optimize foldLeft for Vector, and within that, it optimizes Vector.apply for one specific implementation (the megamorphic call becomes monomorphic because of the fixed Vector size). See also this comment by @retronym, changing the benchmark warmup to neutralize the profiles had a big impact, in particular on the Vector.apply method: #8534 (comment).

@japgolly
Copy link
Contributor Author

Thanks a lot for the eyes on this and my other PRs @lrytz. Your info about JMH overfitting is fascinating! I hadn't thought of that at all. How about I add a bunch of IndexedSeq usage/noise in BM setups, re-run them all tonight and see how it goes.

Otherwise if you and @Ichoran prefer, let me know and I can just ignore the benchmarks and just use iterators at this level. It's no problem :)

@lrytz
Copy link
Member

lrytz commented Apr 28, 2020

It depends on how much time you're motivated to invest :) Benchmarking is hard, using JMH doesn't magically make it easy. A big difficulty here is that we're looking at an open supertype, changing it affects multiple implementations. In general, I think we have to apply optimizations to individual concrete collection types (like you did for ArraySeq).

Even besides benchmark numbers, I beleive that iterator-based implementation is the right default at this level in the collections hierarchy.

@SethTisue SethTisue added library:collections PRs involving changes to the standard collection library performance the need for speed. usually compiler performance, sometimes runtime performance. labels Apr 29, 2020
@japgolly
Copy link
Contributor Author

I re-ran benchmarks, this time with a bunch of collection noise in the setup and lo and behold, it made a massive difference to some of the results! See results-02.txt. In terms of what the results tell us are the fastest, nothing changes. while-loops still win against iterators.

  • For foldRight, the improvement over using reverseIterator is so small that I don't think it justifies going against your experience and intuitions.
  • For foldLeft I'm perplexed. It's a huge difference vs the current implementation which Intellij tells me is the IterableOnce impl which uses while + iterator (!). Given what I just said about foldRight I cannot understand why the improvement would be so large. It's nearly twice as fast according to JMH! Any insights that could explain this?

@japgolly
Copy link
Contributor Author

japgolly commented Apr 30, 2020

To clarify, in foldRight a while-loop is only a tiny bit slower that using while+reverseIterator, so why is it that in foldLeft, a while-loop is nearly twice as fast as using while+iterator? Could maybe the lack of finality of IterableOnce#foldLeft affect JIT?

@japgolly japgolly changed the title Optimise IndexedSeqOps.fold{Left,Right} Optimise IndexedSeqOps.foldRight May 7, 2020
@japgolly
Copy link
Contributor Author

japgolly commented May 7, 2020

Hi all. I think we all got stuck here because a while-loop was being reported as faster by JMH which goes against the intuition and experience of a number of people.

I've now updated this PR to only optimise foldRight and leave foldLeft alone, and I've made foldRight use reverseIterator which is a big win and not contentious. It's better to have this merged now as we all agree it's a big improvement, then in future if someone wants to further explore the (slight) wins of switching to while/apply reported by JMH, it can be done in a follow up PR.

@lrytz Please take another look when you're free.

@lrytz lrytz force-pushed the topic/optIndexedSeqFolds branch from 6150705 to a14606d Compare May 8, 2020 07:46
@lrytz
Copy link
Member

lrytz commented May 8, 2020

Thanks, @japgolly. I didn't manage to find the time to look at and think about your benchmark for the previous version in detail.

Pushed a tiny diff to keep the same parameter name.

@lrytz lrytz merged commit f2681d4 into scala:2.13.x May 8, 2020
@japgolly japgolly deleted the topic/optIndexedSeqFolds branch May 8, 2020 08:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
library:collections PRs involving changes to the standard collection library performance the need for speed. usually compiler performance, sometimes runtime performance.
Projects
None yet
6 participants