Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[GLUTEN-5625][VL] Support window range frame #5626

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

WangGuangxin
Copy link
Contributor

@WangGuangxin WangGuangxin commented May 6, 2024

What changes were proposed in this pull request?

Support window range frame by pre-project frame offset for non-SpecialFrameBoundary, based on the discussion facebookincubator/velox#7557

(Fixes: #5625)

How was this patch tested?

UT

Copy link

github-actions bot commented May 6, 2024

#5625

Copy link

github-actions bot commented May 6, 2024

Run Gluten Clickhouse CI

5 similar comments
Copy link

github-actions bot commented May 6, 2024

Run Gluten Clickhouse CI

Copy link

github-actions bot commented May 6, 2024

Run Gluten Clickhouse CI

Copy link

github-actions bot commented May 6, 2024

Run Gluten Clickhouse CI

Copy link

github-actions bot commented May 6, 2024

Run Gluten Clickhouse CI

Copy link

github-actions bot commented May 6, 2024

Run Gluten Clickhouse CI

Copy link

github-actions bot commented May 6, 2024

Run Gluten Clickhouse CI

@WangGuangxin
Copy link
Contributor Author

cc @PHILO-HE @rui-mo

@FelixYBW
Copy link
Contributor

FelixYBW commented May 7, 2024

@PHILO-HE is the PR offload SQL like below?

  sum(x) OVER (
    PARTITION BY id,sur
    ORDER BY date RANGE BETWEEN 6 preceding AND CURRENT ROW
  )

@PHILO-HE
Copy link
Contributor

PHILO-HE commented May 7, 2024

@PHILO-HE is the PR offload SQL like below?

  sum(x) OVER (
    PARTITION BY id,sur
    ORDER BY date RANGE BETWEEN 6 preceding AND CURRENT ROW
  )

@FelixYBW, yes, this pr can remove the fallback for such case.

@FelixYBW
Copy link
Contributor

FelixYBW commented May 7, 2024

Let me do a test once merged. Thank you!

@FelixYBW
Copy link
Contributor

FelixYBW commented May 7, 2024

@WangGuangxin can you help to resolve conflicts?

Copy link

github-actions bot commented May 8, 2024

Run Gluten Clickhouse CI

@WangGuangxin
Copy link
Contributor Author

@WangGuangxin can you help to resolve conflicts?

@FelixYBW done

Copy link
Contributor

@PHILO-HE PHILO-HE left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will spare some time to review in next week. Sorry for late response.

Copy link
Contributor

@rui-mo rui-mo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Just added several comments.

return std::make_tuple(
core::WindowNode::BoundType::kFollowing,
exprConverter_->toVeloxExpr(boundType.following().ref(), inputType));
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems the only difference of the two branches is the expression. Can we differentiate the expressions only and reuse the rest of code?

return std::make_tuple(
core::WindowNode::BoundType::kPreceding,
exprConverter_->toVeloxExpr(boundType.preceding().ref(), inputType));
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.
I tried to reuse the logic here, but since the type of boundType.preceding() and boundType.following() are different, so I have to pass the offset and ref instead of the expression itself

String upperBound,
String lowerBound,
org.apache.spark.sql.catalyst.expressions.Expression upperBound,
org.apache.spark.sql.catalyst.expressions.Expression lowerBound,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we import org.apache.spark.sql.catalyst.expressions.Expression and use it here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Expression is conflict with io.substrait.proto.Expression which alreay imported in this class, and Java doen't have the alias grammer like scala

if (offset < 0) {
Expression.WindowFunction.Bound.Preceding.Builder offsetPrecedingBuilder =
Expression.WindowFunction.Bound.Preceding.newBuilder();
offsetPrecedingBuilder.setOffset(0 - offset);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need -offset?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need. In fact, this branch is the existing logic


// the reference to pre-project range frame boundary.
Expression ref = 2;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added

Copy link

Run Gluten Clickhouse CI

Copy link

Run Gluten Clickhouse CI

Copy link

Run Gluten Clickhouse CI

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[VL] Support window range frame
4 participants