fix: handle None when converting numerics to parquet #768

Open: 6 commits to merge into base branch main

Linchin (Contributor) commented May 13, 2024

decimal.Decimal raises a TypeError when the input is None. This PR adds a helper function to handle this case, as well as a unit test.

Fixes #719 🦕
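
A minimal sketch of the failure and of a None-safe wrapper, using only the standard library. The function body is an assumption: the PR's full implementation is not shown here, and mapping `None` to `Decimal("NaN")` is inferred from the unit test in this PR.

```python
from decimal import Decimal


def convert(x):
    """Hypothetical None-safe wrapper around decimal.Decimal."""
    if x is None:
        # Assumption: a missing value becomes Decimal("NaN"),
        # which is what the PR's unit test checks for.
        return Decimal("NaN")
    return Decimal(x)


# Decimal(None) itself is rejected with a TypeError.
try:
    Decimal(None)
    raised = False
except TypeError:
    raised = True
```

With this wrapper, `convert("1.5")` yields `Decimal("1.5")`, while `convert(None)` yields a NaN decimal instead of raising.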

@Linchin requested a review from chalmerlowe May 13, 2024 19:14
@Linchin requested review from a team as code owners May 13, 2024 19:14
@product-auto-label bot added the size: s and api: bigquery labels May 13, 2024
@Linchin added the owlbot:run label May 13, 2024
@gcf-owl-bot bot removed the owlbot:run label May 13, 2024

# decimal.Decimal does not support `None` input, add support here.
# https://github.com/googleapis/python-bigquery-pandas/issues/719
def convert(x):
    if x is None:
Collaborator

QUESTION:

The original issue mentions that neither None nor pd.NA is handled appropriately.

This code seems to address only None.
Is the intent to check for both?

If yes, the check below is more general, and it would make the code quick and easy to extend as other edge cases arise.

Suggested change
-    if x is None:
+    if x in {None, pd.NA}:
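
One way to read the suggested check, sketched under assumptions: the set test `x in {None, pd.NA}` matches exactly those two singletons, whereas `pd.isna` (a real pandas function, swapped in here as an alternative and not the PR's code) also treats float NaN as missing. The `convert` body below is hypothetical, and mapping missing values to `Decimal("NaN")` is inferred from the unit test in this PR.

```python
from decimal import Decimal

import pandas as pd

# The suggested membership check matches the two singletons only.
MISSING = {None, pd.NA}
none_matches = None in MISSING
na_matches = pd.NA in MISSING
float_nan_matches = float("nan") in MISSING  # float NaN is not in the set


def convert(x):
    """Hypothetical scalar converter that treats any pandas-missing value as NaN."""
    # pd.isna is True for None, pd.NA and float("nan") when given a scalar.
    if pd.isna(x):
        return Decimal("NaN")
    return Decimal(x)
```

`pd.isna` returns an array for list-like input, so this sketch only suits scalar values.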

Contributor Author

Indeed, thank you for catching this! I'll update the code.

# pandas.testing.assert_frame_equal() doesn't distinguish Decimal('nan')
# vs. None, verify Decimal("nan") directly.
# https://github.com/pandas-dev/pandas/issues/18463
assert result["A"][1].is_nan()
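
A small standard-library illustration of why the assertion above calls `is_nan()` directly: a NaN decimal never compares equal to anything, including another NaN, so an equality check cannot confirm it. The variable names are illustrative only.

```python
from decimal import Decimal

nan_value = Decimal("NaN")

# Equality with NaN is always False, even against another NaN.
equal_to_nan = nan_value == Decimal("NaN")

# is_nan() is the explicit way to confirm the value is a NaN decimal.
confirmed_nan = nan_value.is_nan()

# An ordinary decimal is not reported as NaN.
ordinary_is_nan = Decimal("1.5").is_nan()
```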
Collaborator

QUESTION/PREFERENCE:

If the answer to the pd.NA question above is yes, I would like to see the unit test expanded to also feed in a pd.NA value, to ensure that it gets converted to Decimal("NaN").

Contributor Author

Thanks, I added the unit test for this case :)

Development

Successfully merging this pull request may close these issues.

nullable numeric column does not handled well
2 participants