[SPARK-47681][FOLLOWUP] Fix schema_of_variant(decimal). #46549

Closed
wants to merge 1 commit into from

Conversation

chenhao-db
Contributor

What changes were proposed in this pull request?

PR #46338 found that schema_of_variant sometimes could not handle variant decimals correctly and added a fix. However, that fix is incomplete, and schema_of_variant can still fail on some inputs. The reason is that VariantUtil.getDecimal calls stripTrailingZeros. For the input decimal 10.00, the resulting scale is -1 and the unscaled value is 1, but Spark does not allow a negative decimal scale. The correct approach is to construct a Decimal from the BigDecimal and read its precision and scale, as we already do in VariantGet.
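To illustrate the scale issue described above, here is a minimal plain-Java sketch that uses only java.math.BigDecimal. The class DecimalScaleSketch and the helper precisionAndScale are hypothetical names made up for this example; they are not part of Spark and are not the code in this PR. The helper only imitates, under that assumption, how a non-negative-scale decimal type could be derived from a stripped BigDecimal.

```java
import java.math.BigDecimal;

// Hypothetical illustration only; not Spark code and not the code in this PR.
class DecimalScaleSketch {

    // Returns {precision, scale} with scale >= 0 and precision >= scale,
    // which is the shape a DECIMAL(p, s) type needs.
    static int[] precisionAndScale(BigDecimal d) {
        BigDecimal stripped = d.stripTrailingZeros();
        // A negative scale (e.g. 1E+1 for 10.00) is rescaled to 0.
        // Raising the scale never loses digits, so no rounding mode is needed.
        BigDecimal normalized = stripped.scale() < 0 ? stripped.setScale(0) : stripped;
        int scale = normalized.scale();
        // BigDecimal counts digits from the leftmost nonzero digit, so 0.01 has
        // precision 1 but scale 2; widen the precision to at least the scale.
        int precision = Math.max(normalized.precision(), scale);
        return new int[] {precision, scale};
    }

    public static void main(String[] args) {
        BigDecimal stripped = new BigDecimal("10.00").stripTrailingZeros();
        // The stripped value has unscaled value 1 and scale -1.
        System.out.println(stripped.unscaledValue() + " " + stripped.scale());

        int[] ps = precisionAndScale(new BigDecimal("10.00"));
        System.out.println("DECIMAL(" + ps[0] + "," + ps[1] + ")");
    }
}
```

Reading the precision and scale directly from the stripped value would give a scale of -1 for 10.00, whereas normalizing first gives precision 2 and scale 0, consistent with the DECIMAL(2,0) expectation in the unit tests mentioned below.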

This PR also includes a minor change to VariantGet, where the same expression was computed twice.

Why are the changes needed?

They are bug fixes and are required to process decimals correctly.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Added more unit tests. Some of them would fail without the change in this PR (e.g., check("10.00", "DECIMAL(2,0)")). Others would pass either way, but still improve test coverage.

Was this patch authored or co-authored using generative AI tooling?

No.

@github-actions github-actions bot added the SQL label May 13, 2024
@chenhao-db
Contributor Author

@cloud-fan could you help review? Thanks a lot!

@cloud-fan
Contributor

thanks, merging to master!

@cloud-fan cloud-fan closed this in 3456d4f May 13, 2024
2 participants