Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature/stateless filtering and transforming #823

Merged
merged 5 commits into from
Jan 31, 2024

Conversation

JakenVeina
Copy link
Collaborator

@JakenVeina JakenVeina commented Jan 6, 2024

Implemented one of the operators described in #758 as well as an equivalent transform operator.

Also made a few housekeeping changes, but I've split those out into separate commits, so they can more-easily be reviewed and/or excluded if others don't agree.

Benchmark results, highlighting the performance of the new operators, versus existing ones:

Method Mean Error StdDev Ratio RatioSD Gen0 Allocated Alloc Ratio
Filter 293.3 μs 5.82 μs 7.78 μs 1.00 0.00 76.1719 311.27 KB 1.00
FilterByValues 229.8 μs 3.16 μs 2.80 μs 0.78 0.02 57.3730 235.05 KB 0.76
Method Mean Error StdDev Median Ratio RatioSD Gen0 Gen1 Allocated Alloc Ratio
Transform 457.1 μs 8.79 μs 18.15 μs 448.8 μs 1.00 0.00 153.3203 25.3906 627.89 KB 1.00
TransformByValues 261.1 μs 5.13 μs 4.28 μs 260.8 μs 0.55 0.03 57.1289 - 235.04 KB 0.37

@JakenVeina JakenVeina added enhancement Housekeeping Pull requests for minor code maintenance issues labels Jan 6, 2024
@dwcullop
Copy link
Member

dwcullop commented Jan 6, 2024

I'm in favor of these additions, but I do have a small issue with the names. FilterByValues is confusing. Values makes it seem like each one is going to be filtered by multiple values. At the very least, it should be singular FilterByValue but I think that ValueFilter might be an even better choice.

These are going to require some documentation to ensure that the provided filter/transform functions are completely deterministic and very efficient because they'll be invoked more times than one might expect.

@JakenVeina
Copy link
Collaborator Author

JakenVeina commented Jan 6, 2024

I'm not sold on the names, either. Other options include FilterDeterministic, FilterSlim, FilterValues. I'm gonna say ValueFilter is a no-go, for me, because there's a lot of discoverability value in having it start with Filter, so it shows up right alongside the other Filter operators.

Regarding documentation, is the documentation I already added insufficient for you?

@dwcullop
Copy link
Member

dwcullop commented Jan 7, 2024

I'm not sold on the names, either. Other options include FilterDeterministic, FilterSlim, FilterValues. I'm gonna say ValueFilter is a no-go, for me, because there's a lot of discoverability value in having it start with Filter, so it shows up right alongside the other Filter operators.

Agreed... Although Intellisense finds anything with "Filter" in the name, the things starting with "Filter" are shown first. So... No ValueFilter and probably not anything with "Value" because that word has special meaning in DotNet and, to some, it may imply it somehow uses a stack-based value type (ala ValueTask), which is not necessarily the case at all.

Of the remaining choices, I like FilterSlim. FilterDeterministic is okay, I guess, but it's rather long and it isn't immediately clear to what the "Deterministic" refers. Other ideas:

  • FilterInline
  • FilterChangeSet (because it just filters the changeset without the intermediate cache)
  • FilterFilterFilter (okay, I guess I'm out of ideas for now)

I feel like it might be better to get the opinions of someone who isn't familiar with DD to find a name that people can know what it does intuitively. My intuition is ruined by familiarity. Maybe we could take a survey in the Rx slack?

Regarding documentation, is the documentslation I already added insufficient for you?

Yes, your documentslation is fine... Great, actually, but the not everyone will see the XML comments, so a Wiki article might also be helpful.

@JakenVeina
Copy link
Collaborator Author

JakenVeina commented Jan 8, 2024

In a perfect world, I'd say it should be Filter() with all the OTHER overloads being FilterWithCaching(), so maybe FilterWithoutCaching() as another option?

a Wiki article might also be helpful.

I can get behind that.

@dwcullop
Copy link
Member

dwcullop commented Jan 8, 2024

In a perfect world, I'd say it should be Filter() with all the OTHER overloads being FilterWithCaching(), so maybe FilterWithoutCaching() as another option?

Of the names we've examined so far, my favorite is FilterSlim. It isn't long too long and it's clear it does something less than the other Filter.

@dwcullop
Copy link
Member

dwcullop commented Jan 9, 2024

Maybe your PR title has the right of it... FilterStateless? I dunno.

Copy link
Collaborator

@RolandPheasant RolandPheasant left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've been sat on this for a while so I could think long and hard about it and I think using optimiser options (see my comments against the code) could be a good solution.

The only draw back would be yet more optional overloads, unless of course we introduce an overload which accepts the transform / filter functions + a record containing all the additional options. This would be similar to my suggestion on #819 and also similar to the new binding options.


namespace DynamicData.Cache.Internal;

internal sealed class TransformByValues<TDestination, TSource, TKey>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Uhhh, yep, that looks pretty much the same, save for a few technical optimizations.

I think it's worth considering that .Convert() has a bit of a discoverability problem, then. So, we could consider options for addressing that:

  1. Add TransformByValues() (or whatever name we agree on) anyway, and have it and .Convert() both call the same implementation behind the scenes.
  2. Option 1 but also mark .Convert() as Obsolete, to be removed some day.
  3. Just try and address it with some new Wiki content, as Darrin already suggested. In which case, I would still swap .Convert() over to the implementation I wrote here.

/// <param name="source">The source stream of collection items to be transformed.</param>
/// <param name="transformFactory">The transformation to be applied to each item.</param>
/// <returns>A stream of collection changesets where upstream collection items are transformed by the given factory function.</returns>
public static IObservable<IChangeSet<TDestination, TKey>> TransformByValues<TDestination, TSource, TKey>(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder whether we can achieve this overload using an optimizer. For example Sort accepts SortOptimisations which amongst other options allows the consumer to specify SortOptimisations.ComparesImmutableValuesOnly.

If something similar was used for Transform, or for Filter we could use different code paths in the background and eliminate the need to new named overloads.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could probably get behind that, in theory, but part of the problem is that this optimization is ONLY applicable for certain scenarios of filtering and transforming. Like, any of the overloads that take IObservable<> of something to control filtering flat-out cannot use this optimization.

So, as an example, if we were to define FilterOptimizations.DontCacheItems, we'd have to do one of a few things:

  1. Only define one overload to take a FilterOptimizations paramter, the one where the filter predicate is just Func<T, bool>. With this, if we ever wanted to add more optimizations for other overloads, we'd have to write SEPARATE enums for those overloads.
  2. Throw an exception if you try and specify FilterOptimizations.DontCacheItems with an overload that supports dynamic filtering, E.G. IObservable<Func<T, bool>>.
  3. Just silently have FilterOptimizations.DontCacheItems do nothing for overloads that support dynamic filtering, leading users to think they're getting a performance boost when they're not.

On a separate note (and this comes with a grain of salt, because I have not done a DEEP dive into it, just seen it on the surface) my current impression of how sorting works in DynamicData is that I don't really like it. I don't like how much seemingly-unnecessary stuff there is inside ISortedChangeSet<>, and SortOptimizations is part of that. I have found that the "extra" stuff in ISortedChangeSet<> makes it impossible to implement some of the most common querying scenarios supported by LINQ. All that being said, I'm initially very wary about the idea of adding FilterOptimizations in the same manner as SortOptimizations.

@RolandPheasant
Copy link
Collaborator

@JakenVeina regarding naming just go with your gut - dealers choice

One more suggestion however is maybe FilterImmutable / TransformImmutable after all there is a GroupByImmutable which although has a slightly different behaviour it may provide consistency.

Let me know when this is ready and let's merge.

@JakenVeina
Copy link
Collaborator Author

JakenVeina commented Jan 19, 2024

If .GroupByImmutable() already exists, then that sounds like a winner.

And my gut says .TransformImmutable() fits better than .Convert() for the same reasons, so I think I'l go with the "alias for now, deprecate for later" approach for that one.

@dwcullop
Copy link
Member

I can deal with the Immutable names. TransformImmutable / FilterImmutable, etc.

@JakenVeina JakenVeina force-pushed the feature/stateless-filtering-and-transforming branch from 0367828 to d1e9859 Compare January 31, 2024 03:45
@JakenVeina
Copy link
Collaborator Author

JakenVeina commented Jan 31, 2024

@RolandPheasant I've updated namings, as discussed, and re-wrote the documentation, after some feedback from @Insire.

Since the older .Convert() operator is already marked as obsolete, I left it as-is, to be removed eventually.

We should be good for a merge.

@Insire
Copy link

Insire commented Jan 31, 2024

I'm quite happy with the new xml comments, which was a major point of discussion i had with @JakenVeina

@JakenVeina JakenVeina merged commit d6d748e into main Jan 31, 2024
1 check passed
@JakenVeina JakenVeina deleted the feature/stateless-filtering-and-transforming branch January 31, 2024 21:55
@JakenVeina JakenVeina mentioned this pull request Jan 31, 2024
Copy link

This pull request has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Feb 15, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
enhancement Housekeeping Pull requests for minor code maintenance issues
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants