[VL] Add a config to ignore fallback cost for scan #5617

Open
wants to merge 6 commits into
base: main

PHILO-HE (Contributor) commented May 6, 2024

It is common for a file format to be unsupported by Gluten, for example the Hive row format, which won't be supported in the near future, while the rest of the operators in the query are already supported by Gluten. In that case there is a good chance we can still get a performance gain: the gain from the remaining operators, net of the R2C (row-to-columnar) cost. So let's add a dedicated config that applies to the scan only.
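To make the proposal concrete, here is a minimal sketch of the idea, not the code in this PR. It assumes a linear pipeline ordered from leaf to root, counts every row/columnar boundary as one unit of cost, and falls the whole pipeline back when the cost reaches a threshold. `Op`, `ScanFallbackSketch`, `transitionCost`, and `shouldFallBackAll` are hypothetical names; only the flag name `ignoreScanFallbackCost` comes from the diff, and the threshold comparison is an assumption about how such a check might work.

```scala
// Hypothetical model of a physical operator; not a Gluten class.
final case class Op(name: String, isScan: Boolean, native: Boolean)

object ScanFallbackSketch {
  // Counts row/columnar boundaries in a pipeline ordered leaf -> root.
  // When ignoreScanFallbackCost is set, the boundary directly above a
  // fallen-back (non-native) scan is not counted.
  def transitionCost(ops: Seq[Op], ignoreScanFallbackCost: Boolean): Int =
    ops.sliding(2).count {
      case Seq(child, parent) =>
        val boundary = child.native != parent.native
        val causedByScan = child.isScan && !child.native
        boundary && !(ignoreScanFallbackCost && causedByScan)
      case _ => false // pipelines with fewer than two operators have no boundary
    }

  // Assumed policy: a negative threshold disables the check; otherwise the
  // whole pipeline falls back once the cost reaches the threshold.
  def shouldFallBackAll(
      ops: Seq[Op],
      threshold: Int,
      ignoreScanFallbackCost: Boolean): Boolean =
    threshold >= 0 && transitionCost(ops, ignoreScanFallbackCost) >= threshold
}
```

With an unsupported scan feeding two supported operators and a threshold of 1, this model falls everything back by default but keeps the supported operators native once the scan's cost is ignored.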


github-actions bot commented May 6, 2024

Thanks for opening a pull request!

Could you open an issue for this pull request on Github Issues?

https://github.com/apache/incubator-gluten/issues

Then could you also rename commit message and pull request title in the following format?

[GLUTEN-${ISSUES_ID}][COMPONENT]feat/fix: ${detailed message}


github-actions bot commented May 6, 2024

Run Gluten Clickhouse CI


@@ -208,7 +208,7 @@ class GlutenConfig(conf: SQLConf) extends Logging {

   def queryFallbackThreshold: Int = conf.getConf(COLUMNAR_QUERY_FALLBACK_THRESHOLD)

-  def fallbackIgnoreRowToColumnar: Boolean = conf.getConf(COLUMNAR_FALLBACK_IGNORE_ROW_TO_COLUMNAR)
+  def ignoreScanFallbackCost: Boolean = conf.getConf(COLUMNAR_IGNORE_SCAN_FALLBACK_COST)
Contributor

Is RowToColumnar still used? I think it's only used by the table scan.

Contributor Author

The config for ignoring RowToColumnar looks useless. So I'm removing it. It needs some discussion with our customer.

Contributor

I think it's still useful: fallbackIgnoreRowToColumnar allows a shuffle to be converted to a columnar shuffle.

Contributor Author

@ulysses-you, I think it is clearer to keep accounting for Row2Columnar, because the current evaluation logic counts all transition costs (C2R/R2C). Row2Columnar doesn't only occur before a columnar shuffle, so it may not be a good idea to keep fallbackIgnoreRowToColumnar just for the purpose of using columnar shuffle. If we prefer columnar shuffle, maybe we can add some dedicated code for that purpose.
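The point that Row2Columnar can appear away from a shuffle can be illustrated with a small sketch. This is a simplified model under stated assumptions, not Gluten's evaluation logic: the pipeline is linear and ordered from leaf to root, and `Node`, `Transition`, `C2R`, `R2C`, and `TransitionSketch` are hypothetical names introduced only for illustration.

```scala
// Hypothetical transition kinds between row-based and columnar operators.
sealed trait Transition
case object C2R extends Transition // columnar child feeds a row-based parent
case object R2C extends Transition // row-based child feeds a columnar parent

// Hypothetical operator model; `columnar` means the operator runs natively.
final case class Node(name: String, columnar: Boolean)

object TransitionSketch {
  // Pipeline is ordered leaf -> root. Each result pairs a transition with
  // the name of the parent operator it would be inserted below.
  def transitions(pipeline: Seq[Node]): Seq[(Transition, String)] =
    pipeline.zip(pipeline.drop(1)).collect {
      case (child, parent) if child.columnar && !parent.columnar =>
        (C2R, parent.name)
      case (child, parent) if !child.columnar && parent.columnar =>
        (R2C, parent.name)
    }
}
```

For a pipeline of columnar Scan, row-based Filter, columnar Project, columnar Shuffle, this model reports a C2R below Filter and an R2C below Project, so the R2C sits in front of the Project rather than directly in front of the shuffle.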


github-actions bot commented May 7, 2024

Run Gluten Clickhouse CI


3 participants