New lint `missing_iterator_fold` #12211

Philippe-Cholet · 2024-01-29T14:45:02Z

Specialize fold might improve performance of methods relying on it and other iterator adaptors calling fold.

changelog: [missing_iterator_fold]: Check that Iterator implementations specialize the fold method.

I am still questioning myself about the lint name, is "missing" appropriate?

I am unsure if in_external_macro check is needed or not in this situation.

Note: I would (later) suggest similar lints for DoubleEndedIterator::rfold (and Iterator::try_fold/DoubleEndedIterator::try_rfold except they currently can't be specialized on Rust stable since the trait Try is still unstable) and Iterator::size_hint. All those lints could solved with a single LateLintPass.

`cargo dev new_lint --name=missing_iterator_fold --pass=late`

rustbot · 2024-01-29T14:45:07Z

Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @blyxyas (or someone else) soon.

Please see the contribution instructions for more information. Namely, in order to ensure the minimum review times lag, PR authors and assigned reviewers should ensure that the review label (S-waiting-on-review and S-waiting-on-author) stays updated, invoking these commands when appropriate:

@rustbot author: the review is finished, PR author should check the comments and take action accordingly
@rustbot review: the author is ready for a review, this PR will be queued again in the reviewer's queue

blyxyas

Very good first iteration! Thanks for the contribution, I have some small questions and nits, but this is a very good state.

blyxyas · 2024-01-30T16:39:43Z

clippy_lints/src/missing_iterator_fold.rs

+
+impl LateLintPass<'_> for MissingIteratorFold {
+    fn check_item(&mut self, cx: &LateContext<'_>, item: &Item<'_>) {
+        if rustc_middle::lint::in_external_macro(cx.sess(), item.span) {


We generally import this function

blyxyas · 2024-01-30T17:02:49Z

clippy_lints/src/missing_iterator_fold.rs

+                .map(|assoc| assoc.ident.name.as_str())
+                .any(|name| name == "fold");


blyxyas · 2024-01-30T17:16:59Z

clippy_lints/src/missing_iterator_fold.rs

+    /// ```
+    #[clippy::version = "1.77.0"]
+    pub MISSING_ITERATOR_FOLD,
+    nursery,


Hmmmm... what category do you think that this should be? I'm thinking about perf, but the lint may be too demanding to be put in perf

This is definitely about performance but there is also pedantic because this is "rather strict".

Is this actually actionable in all cases? As in, can you always provide a more useful implementation than the default fold implementation? Because if not, I'd go as far as saying it could be a restriction lint. For example, std's Range Iterator implementation also does not implement fold and I'm not sure how/why it should

I don't see a reason why specialize fold would necessarily improve performance in the general case. It certainly is in some cases but not all.
This is definitely a lint that should be allowed in some cases after some consideration (and benchmark). This is exactly our case at rust-itertools.
Not sure why either about Range (while RangeInclusive does if I remember well).

I updated the category to restriction.

blyxyas · 2024-01-30T17:17:55Z

tests/ui/missing_iterator_fold.rs

Just for the sake of completion, could you add another struct but implement fold, just to make sure that it doesn't lint then?

Philippe-Cholet · 2024-02-01T16:07:19Z

@blyxyas Hi! Each point was addressed. Is there anything else or should I just squash my commits? (and rebase?)

Should I suggest a blank fold specialization? (fn fold... { todo!() })

blyxyas · 2024-02-03T12:41:06Z

Should I suggest a blank fold specialization

I don't think so, that would prompt users to just leave it unchanged.

Note: I would (later) suggest similar lints for DoubleEndedIterator::rfold (and Iterator::try_fold/DoubleEndedIterator::try_rfold except they currently can't be specialized on Rust stable since the trait Try is still unstable) and Iterator::size_hint. All those lints could solved with a single LateLintPass.

Hmmmm... These should be the same lint imo. Creating 3 - 4 lints just for slightly different methods would be...

Also, do you have an example where manually implementing fold is beneficial for performance? I'm not seeing how it could improve it.

Philippe-Cholet · 2024-02-03T14:27:51Z

rust-itertools/itertools#775 had benchmarks 34 times faster but that's the exception. Most other fold specializations are "only" up to 2-4 times faster like rust-itertools/itertools#849 and rust-itertools/itertools#848. Some others are only (very roughly) 20% faster.
Some other do not benefit from a specialization and allow this lint would help document that.
Have the lint would also help maintain a massive TODO list and prevent us from adding a new iterator without considering a fold specialization.

Have a single lint for size_hint/[try_][r]fold/... would be bad for flexibility. We would not be able to allow a lint for one of the method without allowing it for others, even with a crate-level configuration like disallowed_types has.

However, I clearly understand that you don't want a bunch of lints.

The current lint could simultanously check for missing DoubleEndedIterator::rfold specializations as well, I have no problem with that because it's on different implementations.
One for size_hint seems important to me as a more precise size hint can allow a single allocation (e.g. when collecting to a vector or any collection) and prevent repetive slow reallocations.
One for try_fold/try_rfold seems reasonable to me but to be specialized, it needs to name the unstable Try trait. However, I noticed a few lints about nightly features such as empty_enum.
This one can wait for Try to be stable, but it deserved to be mentionned.

Leading to a total of three lints (in a same lint pass).

Most Iterator (non-adaptor) methods rely on [try_][r]fold by default (count/last/for_each/partition/all/any/find/find_map/[r]position/reduce/(max|min)[_by[_key]]/[partial_]cmp[_by]/eq[_by]/ne/lt/le/gt/ge and some try_ variants). Have [try_][r]fold methods faster makes all others faster with them and all adaptors relying on them too.
The methods not relying on them is somehow short (nth/by_ref/[try_]collect/unzip, probably one or two more).

I think that specialize those are good practices and can have huge impact on the Iterator ecosystem.
I think that such (somehow limited) inclusions would be beneficial to more than rust-itertools. But it's your call.

EDIT: Maybe merge [try_][r]fold to a single lint, but I'm not fan.

EDIT 2: [try_]fold by default repeatedly calls next. But next might update some internal state (like an index) that we won't need once the iterator is finished (for fold) or maybe we can update the internals only once (for try_fold). Specialize [try_]fold might allow to avoid repetitive updates of the internals. And if we adapt an iterator, we can use the specializations of the adapted iterator for our own specializations.

Philippe-Cholet · 2024-02-09T09:56:22Z

Before this PR, I took advice on zulip and the discussion here eventually took place back on zulip.
The discussion was insightful and lead me to close this.

Philippe-Cholet added 3 commits January 29, 2024 14:28

New lint missing_iterator_fold

90a915d

`cargo dev new_lint --name=missing_iterator_fold --pass=late`

missing_iterator_fold: describe the lint

96d30a0

missing_iterator_fold: test the lint

4e8fc93

rustbot assigned blyxyas Jan 29, 2024

rustbot added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties label Jan 29, 2024

Philippe-Cholet added 2 commits January 29, 2024 16:09

missing_iterator_fold: code the lint logic

3e11421

Ignore if in_external_macro

84b52f8

Philippe-Cholet force-pushed the iterator-fold-late branch from a77a1b0 to 84b52f8 Compare January 29, 2024 15:09

blyxyas requested changes Jan 30, 2024

View reviewed changes

Philippe-Cholet added 4 commits January 30, 2024 18:44

import in_external_macro

80079ba

Collapse map-any

f057f1a

Test it does not lint when fold is specialized

496139e

Select restriction category

2008967

Philippe-Cholet requested a review from blyxyas January 30, 2024 18:41

Philippe-Cholet closed this Feb 9, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

New lint `missing_iterator_fold` #12211

New lint `missing_iterator_fold` #12211

Philippe-Cholet commented Jan 29, 2024 •

edited

rustbot commented Jan 29, 2024

blyxyas left a comment

blyxyas Jan 30, 2024

blyxyas Jan 30, 2024

blyxyas Jan 30, 2024

Philippe-Cholet Jan 30, 2024

y21 Jan 30, 2024

Philippe-Cholet Jan 30, 2024

Philippe-Cholet Jan 30, 2024 •

edited

blyxyas Jan 30, 2024

Philippe-Cholet commented Feb 1, 2024 •

edited

blyxyas commented Feb 3, 2024

Philippe-Cholet commented Feb 3, 2024 •

edited

Philippe-Cholet commented Feb 9, 2024

		.map(\|assoc\| assoc.ident.name.as_str())
		.any(\|name\| name == "fold");

	.map(\|assoc\| assoc.ident.name.as_str())
	.any(\|name\| name == "fold");
	.any(\|assoc\| assoc.ident.name.as_str() == "fold");

New lint missing_iterator_fold #12211

New lint missing_iterator_fold #12211

Conversation

Philippe-Cholet commented Jan 29, 2024 • edited

rustbot commented Jan 29, 2024

blyxyas left a comment

Choose a reason for hiding this comment

blyxyas Jan 30, 2024

Choose a reason for hiding this comment

blyxyas Jan 30, 2024

Choose a reason for hiding this comment

blyxyas Jan 30, 2024

Choose a reason for hiding this comment

Philippe-Cholet Jan 30, 2024

Choose a reason for hiding this comment

y21 Jan 30, 2024

Choose a reason for hiding this comment

Philippe-Cholet Jan 30, 2024

Choose a reason for hiding this comment

Philippe-Cholet Jan 30, 2024 • edited

Choose a reason for hiding this comment

blyxyas Jan 30, 2024

Choose a reason for hiding this comment

Philippe-Cholet commented Feb 1, 2024 • edited

blyxyas commented Feb 3, 2024

Philippe-Cholet commented Feb 3, 2024 • edited

Philippe-Cholet commented Feb 9, 2024

New lint `missing_iterator_fold` #12211

New lint `missing_iterator_fold` #12211

Philippe-Cholet commented Jan 29, 2024 •

edited

Philippe-Cholet Jan 30, 2024 •

edited

Philippe-Cholet commented Feb 1, 2024 •

edited

Philippe-Cholet commented Feb 3, 2024 •

edited