Avoid consolidate in minsmaxes hierarchical #27085
Merged
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Like #27068 but for MinxMaxesHierarchical.
Change the rendering of mins-maxes-hierarchical plans to avoid an intermediate consolidate. At the moment, we render plans by forking the inputs, arranging and reducing once side, then concatenating the inputs with negated reduction output, and consolidating the result. This makes sure that we consolidate eagerly, but at the same time does duplicate work: The next operator forms an arrangement, so we could just reuse that instead.
Ths PR implements this pattern, removing one consolidate from each stage, and adding it back after the final stage to ensure the stage's output itself is consolidated. Note that we now apply the hash modulus on uncompacted data, whereas it previously was guaranteed to be consolidated. This might increase the cost of the operator by a factor of 2.
The PR also does some refactorings:
build_bucketed_stage
function to make the code more readable.Checklist
$T ⇔ Proto$T
mapping (possibly in a backwards-incompatible way), then it is tagged with aT-proto
label.