Avoid consolidate in minsmaxes hierarchical #27085

antiguru · 2024-05-14T17:57:34Z

Like #27068 but for MinsMaxesHierarchical.

Change the rendering of mins-maxes-hierarchical plans to avoid an intermediate consolidate. At the moment, we render each stage by forking the input, arranging and reducing one side, concatenating the input with the negated reduction output, and consolidating the result. This makes sure that we consolidate eagerly, but it also duplicates work: the next operator forms an arrangement anyway, so we can just reuse that instead.
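To make the fork / reduce / negate / concat / consolidate shape concrete, here is a minimal sketch in plain Rust using only the standard library. It is not the code in reduce.rs and does not use the differential dataflow API: the `Update` type, `consolidate`, `reduce_non_min`, and `old_stage` are hypothetical names, timestamps are left out, and it assumes the reduction emits every record except one copy of the per-key minimum.

```rust
use std::collections::BTreeMap;

/// An update is ((key, value), diff). This is a simplified stand-in for a
/// differential collection; the time dimension is omitted (assumption).
type Update = ((u64, i64), i64);

/// Sum the diffs of equal records and drop records whose diffs cancel out.
fn consolidate(updates: Vec<Update>) -> Vec<Update> {
    let mut acc: BTreeMap<(u64, i64), i64> = BTreeMap::new();
    for (data, diff) in updates {
        *acc.entry(data).or_insert(0) += diff;
    }
    acc.into_iter().filter(|(_, diff)| *diff != 0).collect()
}

/// The reduced side of the fork: per key, everything except one copy of the
/// minimum. The arrangement is modelled by consolidating the input first.
fn reduce_non_min(input: &[Update]) -> Vec<Update> {
    let arranged = consolidate(input.to_vec());
    let mut mins: BTreeMap<u64, i64> = BTreeMap::new();
    for ((key, value), diff) in &arranged {
        if *diff > 0 {
            mins.entry(*key)
                .and_modify(|m| *m = (*m).min(*value))
                .or_insert(*value);
        }
    }
    // Start from all arranged records, then take away one copy of each minimum.
    let mut out = arranged;
    for (key, min) in mins {
        out.push(((key, min), -1));
    }
    consolidate(out)
}

/// One stage as described above: concatenate the input with the negated
/// reduction output, then consolidate the result.
fn old_stage(input: &[Update]) -> Vec<Update> {
    let negated = reduce_non_min(input).into_iter().map(|(d, r)| (d, -r));
    let mut concatenated = input.to_vec();
    concatenated.extend(negated);
    consolidate(concatenated)
}
```

Calling `old_stage` on a handful of updates leaves exactly one record per key, the minimum, because every other record is cancelled by its negated copy. The trailing `consolidate` is the step this PR removes from the intermediate stages.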

This PR implements this pattern, removing one consolidate from each stage, and adding it back after the final stage to ensure the stage's output itself is consolidated. Note that we now apply the hash modulus on uncompacted data, whereas the data was previously guaranteed to be consolidated. This might increase the cost of the operator by a factor of 2.
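As a rough illustration of why dropping the per-stage consolidate does not change the result, the following plain-Rust sketch chains bucketed stages that emit uncompacted output and consolidates only once at the end. Everything here is a simplified model under stated assumptions, not the code from this PR: `hash` is a hypothetical stand-in for the real hash, `stage_unconsolidated` and `hierarchical_min` are made-up names, the bucket is recomputed from the value in each stage, and the arrangement is modelled by an in-memory consolidate.

```rust
use std::collections::BTreeMap;

/// ((key, value), diff); timestamps are omitted in this model (assumption).
type Update = ((u64, i64), i64);

/// Sum the diffs of equal records and drop records whose diffs cancel out.
fn consolidate(updates: Vec<Update>) -> Vec<Update> {
    let mut acc: BTreeMap<(u64, i64), i64> = BTreeMap::new();
    for (data, diff) in updates {
        *acc.entry(data).or_insert(0) += diff;
    }
    acc.into_iter().filter(|(_, diff)| *diff != 0).collect()
}

/// Hypothetical hash of a value; any deterministic function works here.
fn hash(value: i64) -> u64 {
    (value as u64).wrapping_mul(0x9E37_79B9_7F4A_7C15)
}

/// One bucketed stage without a trailing consolidate. The output is the raw
/// input followed by the negated reduction output, so it may hold up to
/// twice as many records as the input.
fn stage_unconsolidated(input: &[Update], modulus: u64) -> Vec<Update> {
    // The arrangement feeding the reduction consolidates as a side effect.
    let arranged = consolidate(input.to_vec());
    // Minimum per (key, hash % modulus) bucket.
    let mut mins: BTreeMap<(u64, u64), i64> = BTreeMap::new();
    for ((key, value), diff) in &arranged {
        if *diff > 0 {
            mins.entry((*key, hash(*value) % modulus))
                .and_modify(|m| *m = (*m).min(*value))
                .or_insert(*value);
        }
    }
    // Negated reduction output: retract every record, re-add each bucket minimum.
    let mut negated: Vec<Update> = Vec::new();
    for ((key, value), diff) in &arranged {
        negated.push(((*key, *value), -*diff));
    }
    for ((key, _bucket), min) in &mins {
        negated.push(((*key, *min), 1));
    }
    let mut output = input.to_vec();
    output.extend(consolidate(negated));
    output
}

/// Run the stages back to back and consolidate once after the final stage.
fn hierarchical_min(input: &[Update], moduli: &[u64]) -> Vec<Update> {
    let mut current = input.to_vec();
    for &modulus in moduli {
        current = stage_unconsolidated(&current, modulus);
    }
    consolidate(current)
}
```

With a final modulus of 1 all values of a key share one bucket, so `hierarchical_min(&input, &[4, 1])` accumulates to the per-key minimum. In this model each intermediate stage output holds at most twice as many records as that stage's input, which is one way to read the factor of 2 mentioned above.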

The PR also does some refactorings:

It applies the initial modulus eagerly to save one operator preparing the hash value.
It extracts a build_bucketed_stage function to make the code more readable.
I did a cleanup pass to fix a few things I noticed.


frankmcsherry

This looks good. We left some notes, some of which we might want to do or at least record as things to do. Thanks!

src/compute/src/render/reduce.rs

Signed-off-by: Moritz Hoffmann <mh@materialize.com>

antiguru force-pushed the minmax_no_consolidate branch 2 times, most recently from 60c5a96 to 1e9dc3e Compare May 15, 2024 00:44

antiguru requested a review from frankmcsherry May 15, 2024 00:45

antiguru marked this pull request as ready for review May 15, 2024 00:49

antiguru requested a review from a team May 15, 2024 00:49

antiguru force-pushed the minmax_no_consolidate branch 3 times, most recently from 9474482 to f495aad Compare May 16, 2024 19:59

antiguru requested a review from teskje May 21, 2024 18:07

antiguru force-pushed the minmax_no_consolidate branch from f495aad to 2ea211c Compare May 22, 2024 13:40

frankmcsherry approved these changes May 22, 2024


antiguru added 2 commits May 22, 2024 16:08

Avoid consolidate in minsmaxes hierarchical

c1ba6c5

Signed-off-by: Moritz Hoffmann <mh@materialize.com>

Cleanup

aadbc73

Signed-off-by: Moritz Hoffmann <mh@materialize.com>

antiguru force-pushed the minmax_no_consolidate branch from 2ea211c to aadbc73 Compare May 22, 2024 20:37

antiguru merged commit 3c7ffd6 into MaterializeInc:main May 23, 2024
73 checks passed

antiguru deleted the minmax_no_consolidate branch May 23, 2024 13:59

materialize-bot mentioned this pull request May 23, 2024

release: v0.101.0 required reviews #27274

Closed

8 tasks
