[Enhancement] Enable transparent union rewrite by default (backport #44764) #45586

Merged
merged 2 commits into from
May 14, 2024

@mergify mergify bot commented May 14, 2024

Why I'm doing:

Since transparent MVs are now supported (#42541 and #43304), an MV can be treated as always consistent. Its freshness no longer needs to be checked, because the MV itself unions refreshed and not-yet-refreshed data.

The same holds for MV union rewrite: the MV's freshness does not need to be considered either.

E.g., test_mv1 can be used for MV rewrite whether or not it is up to date.

DROP MATERIALIZED VIEW test_mv1;
CREATE MATERIALIZED VIEW test_mv1 
PARTITION BY dt 
REFRESH DEFERRED MANUAL 
AS SELECT dt, sum(num) as num FROM t1 where dt > '2020-07-01' GROUP BY dt;

function: check_hit_materialized_view("SELECT dt, sum(num) as num FROM t1 where dt > '2020-07-01'  GROUP BY dt order by 1, 2 limit 3;", "test_mv1")
function: check_hit_materialized_view("SELECT dt, sum(num) as num FROM t1 where dt >'2020-07-01' GROUP BY dt having sum(num) > 10 order by 1, 2 limit 3;", "test_mv1")
function: check_hit_materialized_view("SELECT dt, sum(num) as num FROM t1 where dt > '2020-06-20'  GROUP BY dt order by 1, 2 limit 3;", "test_mv1", "UNION")
function: check_hit_materialized_view("SELECT dt, sum(num) as num FROM t1 where dt >'2020-06-20' GROUP BY dt having sum(num) > 10 order by 1, 2 limit 3;", "test_mv1", "UNION")
function: check_hit_materialized_view("SELECT dt, sum(num) as num FROM t1 GROUP BY dt order by 1, 2 limit 3;", "test_mv1", "UNION")
function: check_hit_materialized_view("SELECT dt, sum(num) as num FROM t1 GROUP BY dt having sum(num) > 10 order by 1, 2 limit 3;", "test_mv1", "UNION")
function: check_hit_materialized_view("SELECT * FROM (SELECT dt, sum(num) as num FROM t1 GROUP BY dt having sum(num) > 10 UNION ALL SELECT dt, sum(num) as num FROM t1 GROUP BY dt) t order by 1, 2 limit 3;", "test_mv1", "UNION")
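The "UNION" cases above rely on the query result being the MV's rows plus a compensating aggregate over the base-table range the MV does not cover. Below is a minimal sketch of that equivalence for the `dt > '2020-06-20'` case. It runs on SQLite rather than StarRocks, and the sample rows, the plain table standing in for test_mv1, and the hand-written union query are all assumptions for illustration; it is not the plan the optimizer actually produces.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (dt TEXT, num INTEGER);
    -- Hypothetical sample rows, chosen to straddle the MV's dt > '2020-07-01' filter.
    INSERT INTO t1 VALUES
        ('2020-06-15', 1), ('2020-06-21', 2), ('2020-06-21', 3),
        ('2020-07-01', 4), ('2020-07-02', 5), ('2020-07-02', 6),
        ('2020-07-05', 7);
    -- Stand-in for test_mv1: a plain table holding the MV's defining query result.
    CREATE TABLE test_mv1 AS
        SELECT dt, sum(num) AS num FROM t1 WHERE dt > '2020-07-01' GROUP BY dt;
""")

# The user query, answered directly from the base table.
direct = conn.execute("""
    SELECT dt, sum(num) AS num FROM t1
    WHERE dt > '2020-06-20'
    GROUP BY dt ORDER BY 1, 2
""").fetchall()

# Hand-written union form: MV rows plus a compensation branch over the
# range the MV's predicate does not cover ('2020-06-20' < dt <= '2020-07-01').
union_rewrite = conn.execute("""
    SELECT dt, num FROM (
        SELECT dt, num FROM test_mv1
        UNION ALL
        SELECT dt, sum(num) AS num FROM t1
        WHERE dt > '2020-06-20' AND dt <= '2020-07-01'
        GROUP BY dt
    ) AS u ORDER BY 1, 2
""").fetchall()

# Both forms must return the same rows for the rewrite to be valid.
assert direct == union_rewrite
```

Because `dt` is the grouping key and the two branches cover disjoint `dt` ranges, no group is split across branches, so a plain UNION ALL is enough and no re-aggregation is needed on top.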

What I'm doing:

  • Enable transparent (union) rewrite by default, since an MV is now considered always consistent.
  • Move TRANSPARENT from UnionRewriteMode to SessionVariable, since transparent rewrite can be used together with the other union rewrite modes.
  • Fix an explain costs bug that occurs when the final tree contains a PhysicalProject.

Further

  • Fix a duplication bug in aggregate rewrite.
  • Enable PULL_PREDICATE_V1/PULL_PREDICATE_V2 by default, since MVs can now be used in more scenarios.

Fixes #issue

What type of PR is this:

  • BugFix
  • Feature
  • Enhancement
  • Refactor
  • UT
  • Doc
  • Tool

Does this PR entail a change in behavior?

  • Yes, this PR will result in a change in behavior.
  • No, this PR will not result in a change in behavior.

If yes, please specify the type of change:

  • Interface/UI changes: syntax, type conversion, expression evaluation, display information
  • Parameter changes: default values, similar parameters but with different default values
  • Policy changes: use new policy to replace old one, functionality automatically enabled
  • Feature removed
  • Miscellaneous: upgrade & downgrade compatibility, etc.

Checklist:

  • I have added test cases for my bug fix or my new feature
  • This pr needs user documentation (for new or modified features or behaviors)
    • I have added documentation for my new feature or new function
  • This is a backport pr

Bugfix cherry-pick branch check:

  • I have checked the version labels which the pr will be auto-backported to the target branch
    • 3.3
    • 3.2
    • 3.1
    • 3.0
    • 2.5

This is an automatic backport of pull request #44764 done by [Mergify](https://mergify.com).

Signed-off-by: shuming.li <ming.moriarty@gmail.com>
(cherry picked from commit 38aad60)

# Conflicts:
#	test/sql/test_materialized_view/R/test_mv_partition_compensate_iceberg_part1
#	test/sql/test_materialized_view/R/test_mv_partition_compensate_iceberg_part2
#	test/sql/test_materialized_view/R/test_mv_partition_union_rewrite_mode_iceberg
#	test/sql/test_materialized_view/T/test_mv_partition_compensate_iceberg_part1
#	test/sql/test_materialized_view/T/test_mv_partition_union_rewrite_mode_iceberg
#	test/sql/test_transparent_mv/R/test_transparent_mv_hive
#	test/sql/test_transparent_mv/T/test_transparent_mv_hive
@mergify mergify bot added the conflicts label May 14, 2024

mergify bot commented May 14, 2024

Cherry-pick of 38aad60 has failed:

On branch mergify/bp/branch-3.3/pr-44764
Your branch is up to date with 'origin/branch-3.3'.

You are currently cherry-picking commit 38aad60d86.
  (fix conflicts and run "git cherry-pick --continue")
  (use "git cherry-pick --skip" to skip this patch)
  (use "git cherry-pick --abort" to cancel the cherry-pick operation)

Changes to be committed:
	modified:   fe/fe-core/src/main/java/com/starrocks/qe/SessionVariable.java
	modified:   fe/fe-core/src/main/java/com/starrocks/sql/Explain.java
	modified:   fe/fe-core/src/main/java/com/starrocks/sql/optimizer/MaterializationContext.java
	modified:   fe/fe-core/src/main/java/com/starrocks/sql/optimizer/operator/ScanOperatorPredicates.java
	modified:   fe/fe-core/src/main/java/com/starrocks/sql/optimizer/rule/transformation/materialization/AggregatedMaterializedViewRewriter.java
	modified:   fe/fe-core/src/main/java/com/starrocks/sql/optimizer/rule/transformation/materialization/MVCompensation.java
	modified:   fe/fe-core/src/main/java/com/starrocks/sql/optimizer/rule/transformation/materialization/MVUnionRewriteMode.java
	modified:   fe/fe-core/src/main/java/com/starrocks/sql/optimizer/rule/transformation/materialization/MaterializedViewRewriter.java
	modified:   fe/fe-core/src/main/java/com/starrocks/sql/optimizer/rule/transformation/materialization/OptExpressionDuplicator.java
	modified:   fe/fe-core/src/test/java/com/starrocks/sql/optimizer/rule/transformation/materialization/MvRewritePartialPartitionTest.java
	modified:   fe/fe-core/src/test/java/com/starrocks/sql/optimizer/rule/transformation/materialization/MvRewritePartitionTest.java
	modified:   fe/fe-core/src/test/java/com/starrocks/sql/optimizer/rule/transformation/materialization/MvRewriteUnionTest.java
	modified:   fe/fe-core/src/test/java/com/starrocks/sql/optimizer/rule/transformation/materialization/MvTransparentUnionRewriteHiveTest.java
	modified:   fe/fe-core/src/test/java/com/starrocks/sql/optimizer/rule/transformation/materialization/MvTransparentUnionRewriteOlapTest.java
	modified:   fe/fe-core/src/test/java/com/starrocks/utframe/UtFrameUtils.java
	modified:   test/sql/test_materialized_view/R/test_materialized_view_union_all_rewrite
	modified:   test/sql/test_materialized_view/R/test_mv_partition_compensate_hive
	deleted:    test/sql/test_materialized_view/R/test_mv_partition_compensate_iceberg
	modified:   test/sql/test_materialized_view/R/test_mv_partition_compensate_mysql
	modified:   test/sql/test_materialized_view/R/test_mv_partition_compensate_olap
	modified:   test/sql/test_materialized_view/T/test_materialized_view_union_all_rewrite
	modified:   test/sql/test_materialized_view/T/test_mv_partition_compensate_hive
	modified:   test/sql/test_materialized_view/T/test_mv_partition_compensate_iceberg_part2
	modified:   test/sql/test_materialized_view/T/test_mv_partition_compensate_mysql
	modified:   test/sql/test_materialized_view/T/test_mv_partition_compensate_olap
	modified:   test/sql/test_transparent_mv/R/test_transparent_mv_basic
	modified:   test/sql/test_transparent_mv/R/test_transparent_mv_mysql
	modified:   test/sql/test_transparent_mv/R/test_transparent_mv_olap
	new file:   test/sql/test_transparent_mv/R/test_transparent_mv_union_hive
	new file:   test/sql/test_transparent_mv/R/test_transparent_mv_union_iceberg
	new file:   test/sql/test_transparent_mv/R/test_transparent_mv_union_olap
	modified:   test/sql/test_transparent_mv/T/test_transparent_mv_basic
	modified:   test/sql/test_transparent_mv/T/test_transparent_mv_mysql
	modified:   test/sql/test_transparent_mv/T/test_transparent_mv_olap
	new file:   test/sql/test_transparent_mv/T/test_transparent_mv_union_hive
	new file:   test/sql/test_transparent_mv/T/test_transparent_mv_union_iceberg
	new file:   test/sql/test_transparent_mv/T/test_transparent_mv_union_olap

Unmerged paths:
  (use "git add <file>..." to mark resolution)
	both modified:   test/sql/test_materialized_view/R/test_mv_partition_compensate_iceberg_part1
	both modified:   test/sql/test_materialized_view/R/test_mv_partition_compensate_iceberg_part2
	both modified:   test/sql/test_materialized_view/R/test_mv_partition_union_rewrite_mode_iceberg
	both modified:   test/sql/test_materialized_view/T/test_mv_partition_compensate_iceberg_part1
	both modified:   test/sql/test_materialized_view/T/test_mv_partition_union_rewrite_mode_iceberg
	both modified:   test/sql/test_transparent_mv/R/test_transparent_mv_hive
	both modified:   test/sql/test_transparent_mv/T/test_transparent_mv_hive

To fix up this pull request, you can check it out locally. See documentation: https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/reviewing-changes-in-pull-requests/checking-out-pull-requests-locally

mergify bot commented May 14, 2024

@mergify[bot]: Backport conflict, please resolve the conflict and resubmit the PR

@mergify mergify bot deleted the mergify/bp/branch-3.3/pr-44764 branch May 14, 2024 06:42
@LiShuMing LiShuMing restored the mergify/bp/branch-3.3/pr-44764 branch May 14, 2024 07:02
@LiShuMing LiShuMing reopened this May 14, 2024
@wanpengfei-git wanpengfei-git enabled auto-merge (squash) May 14, 2024 07:02
Signed-off-by: shuming.li <ming.moriarty@gmail.com>

sonarcloud bot commented May 14, 2024

@wanpengfei-git wanpengfei-git merged commit 45d8c50 into branch-3.3 May 14, 2024
28 checks passed
@wanpengfei-git wanpengfei-git deleted the mergify/bp/branch-3.3/pr-44764 branch May 14, 2024 23:37