Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid consolidate in minsmaxes hierarchical #27085

Merged
merged 2 commits into from
May 23, 2024

Conversation

antiguru
Copy link
Member

@antiguru antiguru commented May 14, 2024

Like #27068 but for MinxMaxesHierarchical.

Change the rendering of mins-maxes-hierarchical plans to avoid an intermediate consolidate. At the moment, we render plans by forking the inputs, arranging and reducing once side, then concatenating the inputs with negated reduction output, and consolidating the result. This makes sure that we consolidate eagerly, but at the same time does duplicate work: The next operator forms an arrangement, so we could just reuse that instead.

Ths PR implements this pattern, removing one consolidate from each stage, and adding it back after the final stage to ensure the stage's output itself is consolidated. Note that we now apply the hash modulus on uncompacted data, whereas it previously was guaranteed to be consolidated. This might increase the cost of the operator by a factor of 2.

The PR also does some refactorings:

  • It applies the initial modulus eagerly to save one operator preparing the hash value.
  • It extracts a build_bucketed_stage function to make the code more readable.
  • I did a cleanup pass to fix a few things I noticed.

Checklist

@antiguru antiguru force-pushed the minmax_no_consolidate branch 2 times, most recently from 60c5a96 to 1e9dc3e Compare May 15, 2024 00:44
@antiguru antiguru marked this pull request as ready for review May 15, 2024 00:49
@antiguru antiguru requested a review from a team May 15, 2024 00:49
@antiguru antiguru force-pushed the minmax_no_consolidate branch 3 times, most recently from 9474482 to f495aad Compare May 16, 2024 19:59
@antiguru antiguru requested a review from teskje May 21, 2024 18:07
Copy link
Contributor

@frankmcsherry frankmcsherry left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good. We left some notes, some of which we might want to do or at least record as things to do. Thanks!

src/compute/src/render/reduce.rs Show resolved Hide resolved
src/compute/src/render/reduce.rs Outdated Show resolved Hide resolved
src/compute/src/render/reduce.rs Show resolved Hide resolved
Signed-off-by: Moritz Hoffmann <mh@materialize.com>
Signed-off-by: Moritz Hoffmann <mh@materialize.com>
@antiguru antiguru merged commit 3c7ffd6 into MaterializeInc:main May 23, 2024
73 checks passed
@antiguru antiguru deleted the minmax_no_consolidate branch May 23, 2024 13:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants