[VL] Spark width_bucket function support #5634

Merged
merged 3 commits into from May 24, 2024

@gaoyangxiaozhu gaoyangxiaozhu commented May 7, 2024

Velox already supports the width_bucket function.

#4039

How was this patch tested?

(Please explain how this patch was tested. E.g. unit tests, integration tests, manual tests)

(If this patch involves UI changes, please attach a screenshot; otherwise, remove this)

@gaoyangxiaozhu gaoyangxiaozhu changed the title [VL ] Spark width_bucket function support [VL] Spark width_bucket function support May 7, 2024

github-actions bot commented May 7, 2024

Thanks for opening a pull request!

Could you open an issue for this pull request on Github Issues?

https://github.com/apache/incubator-gluten/issues

Then could you also rename commit message and pull request title in the following format?

[GLUTEN-${ISSUES_ID}][COMPONENT]feat/fix: ${detailed message}

github-actions bot commented May 7, 2024

Run Gluten Clickhouse CI

gaoyangxiaozhu commented May 7, 2024

The CI fails because Velox's width_bucket treats bucket_number <= 0 as a user error (invalid parameter): https://github.com/facebookincubator/velox/blob/668d578420f20568e2f6c152c8eed77d3ad64c59/velox/functions/prestosql/Arithmetic.h#L384C1-L388C75. Spark, however, returns null in this case: https://github.com/apache/spark/blob/08c6bb9bf32f31b5b9870d56cc4c16ab97616da6/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/mathExpressions.scala#L1749

@zhouyuan, @PHILO-HE and @zhli1142015, this may be a basic C++ question: do you know how to modify the Velox width_bucket function so that it can return null for those special cases, given that the result type is already set to int64_t?
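The difference described in this comment can be sketched in plain C++ without any Velox or Spark dependencies. The names `bucketOf`, `isInvalid`, `widthBucketThrowing` and `widthBucketNullable` are hypothetical and exist only for this illustration. Only the `buckets <= 0` case comes from the comment above; the remaining null conditions (NaN value, non-finite or equal bounds, a bucket count of INT64_MAX) are assumptions based on a reading of the linked Spark source and should be checked against it. The throwing variant does not claim to reproduce the exact checks in Velox's Presto function.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>
#include <limits>
#include <optional>
#include <stdexcept>

// Inputs for which no bucket is computed. Only `buckets <= 0` is taken from
// the discussion; the other conditions are assumptions (see the lead-in).
inline bool isInvalid(double value, double bound1, double bound2, int64_t buckets) {
  return buckets <= 0 ||
      buckets == std::numeric_limits<int64_t>::max() ||
      std::isnan(value) || bound1 == bound2 ||
      !std::isfinite(bound1) || !std::isfinite(bound2);
}

// Equal-width bucket arithmetic. Callers must validate the inputs first.
inline int64_t bucketOf(double value, double bound1, double bound2, int64_t buckets) {
  const double lower = std::min(bound1, bound2);
  const double upper = std::max(bound1, bound2);
  const double n = static_cast<double>(buckets);
  if (bound1 < bound2) {
    if (value < lower) return 0;
    if (value >= upper) return buckets + 1;
    return static_cast<int64_t>(n * (value - lower) / (upper - lower)) + 1;
  }
  // Descending bounds: buckets are counted from the upper bound downwards.
  if (value > upper) return 0;
  if (value <= lower) return buckets + 1;
  return static_cast<int64_t>(n * (upper - value) / (upper - lower)) + 1;
}

// Error-style reporting: invalid input is treated as a user error.
inline int64_t widthBucketThrowing(double value, double bound1, double bound2, int64_t buckets) {
  if (isInvalid(value, bound1, bound2, buckets)) {
    throw std::invalid_argument("invalid width_bucket arguments");
  }
  return bucketOf(value, bound1, bound2, buckets);
}

// Null-style reporting: invalid input yields an empty optional instead.
inline std::optional<int64_t> widthBucketNullable(
    double value, double bound1, double bound2, int64_t buckets) {
  if (isInvalid(value, bound1, bound2, buckets)) {
    return std::nullopt;
  }
  return bucketOf(value, bound1, bound2, buckets);
}
```

Both wrappers share the same arithmetic and differ only in how they report bad input, which is the gap that makes the Gluten test fail when Spark expects null and the native function raises an error.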

FelixYBW commented May 7, 2024

"... those special cases as the result type already be set to ..."

Velox uses the Presto implementation. You may need to create a new function for Velox/Spark, and pass a reference to the null set into the function.
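One way a function with a fixed int64_t result can still report null is to return a bool alongside an output parameter. The sketch below imitates that shape in standalone C++. `SparkWidthBucketSketch` is a hypothetical name; the struct uses no Velox headers or registration macros, and whether Velox's own function interface follows exactly this convention is an assumption to verify against the Velox documentation. As in the discussion, `numBuckets <= 0` yields null; the other null conditions are assumptions drawn from a reading of the linked Spark source.

```cpp
#include <cmath>
#include <cstdint>
#include <limits>

// Hypothetical stand-in for a Spark-flavoured width_bucket function object.
struct SparkWidthBucketSketch {
  // Returns true and writes `result` for a non-null answer.
  // Returns false to report null and leaves `result` untouched.
  bool call(int64_t& result, double value, double bound1, double bound2,
            int64_t numBuckets) const {
    const bool isNull = numBuckets <= 0 ||
        numBuckets == std::numeric_limits<int64_t>::max() ||
        std::isnan(value) || bound1 == bound2 ||
        !std::isfinite(bound1) || !std::isfinite(bound2);
    if (isNull) {
      return false;  // null instead of a user error
    }
    const double width = std::abs(bound2 - bound1);
    // Distance of `value` from bound1, measured in the direction of bound2.
    const double offset = bound1 < bound2 ? value - bound1 : bound1 - value;
    if (offset < 0) {
      result = 0;  // before the first bucket
    } else if (offset >= width) {
      result = numBuckets + 1;  // past the last bucket
    } else {
      result = static_cast<int64_t>(
                   static_cast<double>(numBuckets) * offset / width) + 1;
    }
    return true;
  }
};
```

The caller checks the returned flag before reading `result`, so the value type can stay int64_t while null remains representable.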

facebook-github-bot pushed a commit to facebookincubator/velox that referenced this pull request May 22, 2024
Summary:
Doc: https://spark.apache.org/docs/latest/api/sql/#width_bucket
Code: https://github.com/apache/spark/blob/3b1ea0fde44ec0aef8af24ca9a0a218a1c2d487d/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/mathExpressions.scala#L1746C1-L1790C1

This is a dependency PR for fixing the failure of apache/incubator-gluten#5634.

Pull Request resolved: #9743

Reviewed By: xiaoxmeng

Differential Revision: D57658402

Pulled By: mbasmanova

fbshipit-source-id: b7f35225c260be95869884676d8e253a3a0a0786

@rui-mo, please help review. Thanks!

@rui-mo rui-mo left a comment

Thanks.

@rui-mo rui-mo merged commit ec3f6b3 into apache:main May 24, 2024
41 checks passed
3 participants