Implement `FlattenOk::{fold, rfold}` #927

kinto-b · 2024-05-01T07:45:46Z

Relates to #755

$ cargo bench --bench specializations flatten_ok/fold -- --baseline flatten_ok

flatten_ok/fold         time:   [3.0078 µs 3.2944 µs 3.5502 µs]
                        change: [-47.164% -42.833% -38.023%] (p = 0.00 < 0.05)
                        Performance has improved.

Edit: .rfold is an almost identical implementation, so I'll put it through in a moment. Just running benchmarks...

Edit:

$ cargo bench --bench specializations flatten_ok/rfold -- --baseline flatten_ok

flatten_ok/rfold        time:   [2.7442 µs 2.8254 µs 2.9134 µs]
                        change: [-42.924% -39.835% -36.485%] (p = 0.00 < 0.05)
                        Performance has improved.

codecov · 2024-05-01T07:47:59Z

Codecov Report

Attention: Patch coverage is 97.22%, with 1 line in your changes missing coverage. Please review.

Project coverage is 94.55%. Comparing base (6814180) to head (568021d).
Report is 64 commits behind head on master.

Files	Patch %	Lines
src/flatten_ok.rs	97.22%	1 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##           master     #927      +/-   ##
==========================================
+ Coverage   94.38%   94.55%   +0.16%     
==========================================
  Files          48       48              
  Lines        6665     6926     +261     
==========================================
+ Hits         6291     6549     +258     
- Misses        374      377       +3

Philippe-Cholet

How can it pass tests without using the fields inner_front and inner_back?! That's troubling!

EDIT: Note that you can use a unique baseline for running benchmarks.
I agree we should make the tests fail before fixing [r]fold as this indicates the tests are a bit buggy.

kinto-b · 2024-05-01T07:57:33Z

@Philippe-Cholet I was just looking back over the code wondering that myself! Investigating now...

Edit: Have to step out for dinner. Will look into this some more later on. I feel as though it shouldn't work. The necessary fix seems pretty simple, but I want to make the tests fail with the current code before I change anything.

kinto-b · 2024-05-01T11:25:31Z

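Ah, so the tests fail to capture what's wrong with this code because they use Option<u8> values for each Ok item in the iterator. An Option yields at most one element, so an inner iterator never has anything left over after being advanced. We need non-singleton values.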

There is a comment there about it being too slow to use Vec<u8>:

itertools/tests/specializations.rs, lines 455 to 456 in dd6a569:

    // `Option<u8>` because `Vec<u8>` would be very slow!! And we can't give `[u8; 3]`.
    fn flatten_ok(v: Vec<Result<Option<u8>, char>>) -> () {

I've tested changing over to Vec<u8> and, though it catches the problem, it takes several minutes for the test to finish once the bug is ironed out.

What's the best approach here do you think?
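To illustrate the point about singleton values, here is a minimal sketch. It assumes a hypothetical, much simplified `MiniFlatten` type (not itertools' actual `FlattenOk`, and with no `Result` handling) whose `buggy_fold` ignores the partially consumed `inner_front`, as the fields question above suggests the first draft did.

```rust
use std::vec::IntoIter;

/// Hypothetical stand-in for a flattening adaptor; only the front side is modelled.
struct MiniFlatten {
    outer: IntoIter<Vec<u8>>,
    inner_front: Option<IntoIter<u8>>,
}

impl MiniFlatten {
    fn new(v: Vec<Vec<u8>>) -> Self {
        Self { outer: v.into_iter(), inner_front: None }
    }

    /// Yields the next element, keeping the current inner iterator in `inner_front`.
    fn next(&mut self) -> Option<u8> {
        loop {
            if let Some(inner) = &mut self.inner_front {
                if let Some(x) = inner.next() {
                    return Some(x);
                }
                self.inner_front = None;
            }
            self.inner_front = Some(self.outer.next()?.into_iter());
        }
    }

    /// Deliberately wrong: folds only over `outer` and drops whatever is
    /// still sitting in `inner_front`.
    fn buggy_fold<B>(self, init: B, f: impl FnMut(B, u8) -> B) -> B {
        self.outer.flatten().fold(init, f)
    }

    /// Folds the leftover of `inner_front` first, then the rest of `outer`.
    fn correct_fold<B>(self, init: B, mut f: impl FnMut(B, u8) -> B) -> B {
        let acc = match self.inner_front {
            Some(inner) => inner.fold(init, &mut f),
            None => init,
        };
        self.outer.flatten().fold(acc, f)
    }
}
```

With one-element inner values such as `[[1], [2], [3]]`, calling `next()` once leaves `inner_front` empty, so both folds see the same remaining items and the bug stays hidden. With `[[1, 2], [3]]`, one call to `next()` leaves `2` in `inner_front`, and only `correct_fold` includes it.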

Philippe-Cholet · 2024-05-01T12:09:21Z

Ah, I think I wrote this comment. Using Vec<u8> here is really slow, so it's not really an option. So if we can't give [u8; 3] instead of Option<u8>, then maybe we can write our own limited vector (what a pain) and implement the necessary traits for this to work. I'm gonna investigate this.

Philippe-Cholet · 2024-05-01T12:39:47Z

So we basically need a light small vector; here is one with max 2 elements. The flatten_ok test passes in under 1 second for me.

use quickcheck::Arbitrary;
use rand::Rng;
...
    // remove comment
    fn flatten_ok(v: Vec<Result<SmallIter2<u8>, char>>) -> () {
...

/// Like `VecIntoIter<T>` with maximum 2 elements.
#[derive(Debug, Clone, Default)]
enum SmallIter2<T> {
    #[default]
    Zero,
    One(T),
    Two(T, T),
}

impl<T: Arbitrary> Arbitrary for SmallIter2<T> {
    fn arbitrary<G: quickcheck::Gen>(g: &mut G) -> Self {
        match g.gen_range(0u8, 3) {
            0 => Self::Zero,
            1 => Self::One(T::arbitrary(g)),
            2 => Self::Two(T::arbitrary(g), T::arbitrary(g)),
            _ => unreachable!(),
        }
    }
    // maybe implement shrink too, maybe not
}

impl<T> Iterator for SmallIter2<T> {
    type Item = T;

    fn next(&mut self) -> Option<Self::Item> {
        match std::mem::take(self) {
            Self::Zero => None,
            Self::One(val) => Some(val),
            Self::Two(val, second) => {
                *self = Self::One(second);
                Some(val)
            }
        }
    }

    fn size_hint(&self) -> (usize, Option<usize>) {
        let len = match self {
            Self::Zero => 0,
            Self::One(_) => 1,
            Self::Two(_, _) => 2,
        };
        (len, Some(len))
    }
}

impl<T> DoubleEndedIterator for SmallIter2<T> {
    fn next_back(&mut self) -> Option<Self::Item> {
        match std::mem::take(self) {
            Self::Zero => None,
            Self::One(val) => Some(val),
            Self::Two(first, val) => {
                *self = Self::One(first);
                Some(val)
            }
        }
    }
}
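For readers who want to see the kind of check a multi-element inner iterator makes meaningful, here is a dependency-free sketch. The helper name `folds_agree_after_advancing` is hypothetical and this is not the itertools test harness: it advances an iterator from both ends, then compares `fold` and `rfold` against plain `next` and `next_back` loops.

```rust
/// Hypothetical helper: advance `iter` a few steps from each end, then check
/// that `fold` and `rfold` agree with `next` / `next_back` loops on what remains.
fn folds_agree_after_advancing<I>(mut iter: I, front_steps: usize, back_steps: usize) -> bool
where
    I: DoubleEndedIterator + Clone,
    I::Item: PartialEq,
{
    for _ in 0..front_steps {
        iter.next();
    }
    for _ in 0..back_steps {
        iter.next_back();
    }

    // Reference results built only from `next`.
    let mut forward = Vec::new();
    let mut it = iter.clone();
    while let Some(x) = it.next() {
        forward.push(x);
    }

    // Reference results built only from `next_back`.
    let mut backward = Vec::new();
    let mut it = iter.clone();
    while let Some(x) = it.next_back() {
        backward.push(x);
    }

    // Results from the (possibly specialized) `fold` and `rfold`.
    let folded = iter.clone().fold(Vec::new(), |mut acc, x| {
        acc.push(x);
        acc
    });
    let rfolded = iter.rfold(Vec::new(), |mut acc, x| {
        acc.push(x);
        acc
    });

    folded == forward && rfolded == backward
}
```

For example, `folds_agree_after_advancing(vec![vec![1u8, 2], vec![3, 4]].into_iter().flatten(), 1, 1)` exercises the state where both the front and the back inner iterators are partially consumed, which one-element inner values cannot reach.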

scottmcm · 2024-05-01T16:11:26Z

~~nit: that looks wrong since it never shrinks from One to Zero.~~ EDIT: No, I'm blind and missed the take, never mind.

(ArrayVec<T, 2> is the classic solution for an iterator like this, but I would also understand not wanting to add a dev-dep just for this.)

Philippe-Cholet · 2024-05-01T16:20:19Z

I thought of ArrayVec<T, N> but I would have to wrap it anyway to implement quickcheck::Arbitrary (and therefore Iterators too). I would have added it if we wanted more control than a fixed max-length of 2.

kinto-b · 2024-05-03T01:20:34Z

@Philippe-Cholet Ah neat, thanks for that! Shall I pop that into its own module within tests/ or just stick it in tests/specializations.rs?

Philippe-Cholet · 2024-05-03T05:42:29Z

@kinto-b Just add this in "tests/specializations.rs" where it's needed. We can move it later if we need to.

Sidenote: For CI to pass again (because of recent Clippy 1.78), you will need to rebase your branch on master.

Philippe-Cholet

Looks good to me.
The specialization test did slow us down. An Option seemed good enough when I wrote it.
I guess the benchmark is up-to-date. -40% is nice enough.

Thanks again!

kinto-b · 2024-05-04T16:14:29Z

Thanks!

Philippe-Cholet requested changes May 1, 2024

Philippe-Cholet added this to the next milestone May 1, 2024

Philippe-Cholet changed the title ~~Implement FlattenOk::fold~~ Implement FlattenOk::{fold, rfold} May 1, 2024

kinto-b force-pushed the feature/FlattenOk-fold branch from e9707bc to fc5aab4 Compare May 1, 2024 11:36

kinto-b added 3 commits May 5, 2024 00:25

Implement FlattenOk::fold

27079c0

Implement FlattenOk::rfold

10a8bb2

Implement two-element 'vector' for flatten_ok test

568021d

Previously the test used Option<u8> but the coverage was bad. We cannot use Vec<u8> because it is too slow.

kinto-b force-pushed the feature/FlattenOk-fold branch from fc5aab4 to 568021d Compare May 4, 2024 14:26

Philippe-Cholet approved these changes May 4, 2024

Philippe-Cholet added this pull request to the merge queue May 4, 2024

Merged via the queue into rust-itertools:master with commit e53f635 May 4, 2024
13 checks passed
