Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update upper bound for numpy. #24725

Merged
merged 7 commits into from Feb 8, 2023
Merged

Update upper bound for numpy. #24725

merged 7 commits into from Feb 8, 2023

Conversation

tvalentyn
Copy link
Contributor

@tvalentyn tvalentyn commented Dec 20, 2022

Due to numpy/numpy#22004, updating one of the unit tests that creates an nd.array with lists of variable length.


Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily:

  • Mention the appropriate issue in your description (for example: addresses #123), if applicable. This will automatically add a link to the pull request in the issue. If you would like the issue to automatically close on merging the pull request, comment fixes #<ISSUE NUMBER> instead.
  • Update CHANGES.md with noteworthy changes.
  • If this contribution is large, please file an Apache Individual Contributor License Agreement.

See the Contributor Guide for more tips on how to make review process smoother.

To check the build health, please visit https://github.com/apache/beam/blob/master/.test-infra/BUILD_STATUS.md

GitHub Actions Tests Status (on master branch)

Build python source distribution and wheels
Python tests
Java tests
Go tests

See CI.md for more information about GitHub Actions CI.

sdks/python/setup.py Outdated Show resolved Hide resolved
@tvalentyn
Copy link
Contributor Author

Run Python 3.8 PostCommit

@github-actions
Copy link
Contributor

Checks are failing. Will not request review until checks are succeeding. If you'd like to override that behavior, comment assign set of reviewers

@tvalentyn
Copy link
Contributor Author

Another issue:

10:41:08 Traceback (most recent call last):
10:41:08   File "apache_beam/runners/common.py", line 1465, in apache_beam.runners.common.DoFnRunner._invoke_lifecycle_method
10:41:08   File "apache_beam/runners/common.py", line 551, in apache_beam.runners.common.DoFnInvoker.invoke_setup
10:41:08   File "/usr/local/lib/python3.8/dist-packages/apache_beam/ml/inference/base.py", line 427, in setup
10:41:08     self._model = self._load_model()
10:41:08   File "/usr/local/lib/python3.8/dist-packages/apache_beam/ml/inference/base.py", line 420, in _load_model
10:41:08     return self._shared_model_handle.acquire(load)
10:41:08   File "/usr/local/lib/python3.8/dist-packages/apache_beam/utils/shared.py", line 305, in acquire
10:41:08     return _shared_map.acquire(self._key, constructor_fn, tag)
10:41:08   File "/usr/local/lib/python3.8/dist-packages/apache_beam/utils/shared.py", line 246, in acquire
10:41:08     result = control_block.acquire(constructor_fn, tag)
10:41:08   File "/usr/local/lib/python3.8/dist-packages/apache_beam/utils/shared.py", line 139, in acquire
10:41:08     result = constructor_fn()
10:41:08   File "/usr/local/lib/python3.8/dist-packages/apache_beam/ml/inference/base.py", line 409, in load
10:41:08     model = self._model_handler.load_model()
10:41:08   File "/usr/local/lib/python3.8/dist-packages/apache_beam/ml/inference/base.py", line 167, in load_model
10:41:08     return self._unkeyed.load_model()
10:41:08   File "/usr/local/lib/python3.8/dist-packages/apache_beam/ml/inference/tensorrt_inference.py", line 267, in load_model
10:41:08     return TensorRTEngine(engine)
10:41:08   File "/usr/local/lib/python3.8/dist-packages/apache_beam/ml/inference/tensorrt_inference.py", line 134, in __init__
10:41:08     'dtype': np.dtype(trt.nptype(dtype)),
10:41:08   File "/usr/local/lib/python3.8/dist-packages/tensorrt/__init__.py", line 165, in nptype
10:41:08     bool: np.bool,
10:41:08   File "/usr/local/lib/python3.8/dist-packages/numpy/__init__.py", line 284, in __getattr__
10:41:08     raise AttributeError("module {!r} has no attribute "
10:41:08 AttributeError: module 'numpy' has no attribute 'bool'
10:41:08 

@tvalentyn
Copy link
Contributor Author

Rootcause tracked in: NVIDIA/TensorRT#2557

@tvalentyn
Copy link
Contributor Author

Run Python 3.8 PostCommit

@github-actions
Copy link
Contributor

Assigning reviewers. If you would like to opt out of this review, comment assign to next reviewer:

R: @jrmccluskey for label python.

Available commands:

  • stop reviewer notifications - opt out of the automated review tooling
  • remind me after tests pass - tag the comment author after tests pass
  • waiting on author - shift the attention set back to the author (any comment or push by the author will return the attention set to the reviewers)

The PR bot will only process comments in the main thread (not review comments).

@jrmccluskey
Copy link
Contributor

Error message on the precommits:

Container does not include required Beam dependencies or has conflicting dependencies. If Beam dependencies have changed, you need to regenerate base_image_requirements.txt files. See: https://s.apache.org/beam-python-requirements-generate
The command '/bin/sh -c pip check || (echo "Container does not include required Beam dependencies or has conflicting dependencies. If Beam dependencies have changed, you need to regenerate base_image_requirements.txt files. See: https://s.apache.org/beam-python-requirements-generate" && exit 1)' returned a non-zero code: 1

@tvalentyn
Copy link
Contributor Author

thanks. that's WAI. Bot sent it for review too early. I have intentionally temporarily increased the lower bound temporarily to have a meaningful postcommit signal in https://ci-beam.apache.org/job/beam_PostCommit_Python38_PR/674/ (still running)

@tvalentyn
Copy link
Contributor Author

Run Python 3.8 PostCommit

@github-actions github-actions bot added docker and removed docker labels Dec 21, 2022
@tvalentyn
Copy link
Contributor Author

Run Python 3.8 PostCommit

@github-actions github-actions bot added docker and removed docker labels Dec 22, 2022
@github-actions
Copy link
Contributor

Reminder, please take a look at this pr: @jrmccluskey

@github-actions
Copy link
Contributor

github-actions bot commented Jan 3, 2023

Assigning new set of reviewers because Pr has gone too long without review. If you would like to opt out of this review, comment assign to next reviewer:

R: @damccorm for label python.

Available commands:

  • stop reviewer notifications - opt out of the automated review tooling
  • remind me after tests pass - tag the comment author after tests pass
  • waiting on author - shift the attention set back to the author (any comment or push by the author will return the attention set to the reviewers)

@damccorm
Copy link
Contributor

damccorm commented Jan 3, 2023

stop reviewer notifications

@damccorm
Copy link
Contributor

damccorm commented Jan 3, 2023

@tvalentyn I'm happy to review, but I'm stopping the bot since its not ready yet

@github-actions
Copy link
Contributor

github-actions bot commented Jan 3, 2023

Stopping reviewer notifications for this pull request: requested by reviewer

@tvalentyn
Copy link
Contributor Author

cc: @jli

@tvalentyn
Copy link
Contributor Author

tvalentyn commented Jan 20, 2023

Interactive runner tests need a release from facets-overview: PAIR-code/facets#250

@tvalentyn
Copy link
Contributor Author

Run Python 3.8 PostCommit

@tvalentyn
Copy link
Contributor Author

Dataframes (ParquetIO) test requires numpy<1.20.0, this is a preexisting issue:

# pyarrow <3 doesn't work with 1.20.0, but doesn't restrict the bounds
, and shouldn't fail once I relax the lower bound.

@tvalentyn
Copy link
Contributor Author

Interactive runner tests need a release from facets-overview: PAIR-code/facets#250

This has been done

@tvalentyn
Copy link
Contributor Author

Interactive runner tests need a release from facets-overview: PAIR-code/facets#250

This has been done

needs one more patch... PAIR-code/facets#251

@tvalentyn
Copy link
Contributor Author

tvalentyn commented Feb 6, 2023

needs one more patch... PAIR-code/facets#251

should be released now.

One more issue. Looks like some DF tests fail with pandas 1.1.x The bug seems to be in Pandas 1.1.x, the change in behavior is caused by following line of code:

At some point the test is executing something akin to:

import numpy as np
x = np.empty(length=0, dtype='int64')
x.fill(np.NaN)

within pandas code.

This now fails. I am considering bumping up the lower bound of Pandas, the affected version is 2+ yrs old.

@tvalentyn
Copy link
Contributor Author

tvalentyn commented Feb 6, 2023

Pandas 1.1.x issue is reproducible in: import pandas as pd; pd.DataFrame(columns=pd.CategoricalIndex(['A'], dtype='category'), dtype='int64') , so bumping pandas is appropriate.

@codecov
Copy link

codecov bot commented Feb 6, 2023

Codecov Report

Merging #24725 (be88bb7) into master (d20d0b0) will decrease coverage by 0.02%.
The diff coverage is 0.00%.

❗ Current head be88bb7 differs from pull request most recent head 238c496. Consider uploading reports for the commit 238c496 to get more accurate results

@@            Coverage Diff             @@
##           master   #24725      +/-   ##
==========================================
- Coverage   72.96%   72.94%   -0.02%     
==========================================
  Files         745      745              
  Lines       99174    99195      +21     
==========================================
- Hits        72362    72359       -3     
- Misses      25446    25470      +24     
  Partials     1366     1366              
Flag Coverage Δ
python 82.44% <0.00%> (-0.03%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
...hon/apache_beam/ml/inference/tensorrt_inference.py 0.00% <0.00%> (ø)
.../python/apache_beam/testing/test_stream_service.py 88.09% <0.00%> (-4.77%) ⬇️
sdks/python/apache_beam/utils/interactive_utils.py 95.12% <0.00%> (-2.44%) ⬇️
...ks/python/apache_beam/runners/worker/data_plane.py 87.57% <0.00%> (-1.70%) ⬇️
...python/apache_beam/runners/worker/worker_status.py 75.33% <0.00%> (-1.34%) ⬇️
...che_beam/runners/interactive/interactive_runner.py 90.50% <0.00%> (-1.27%) ⬇️
...eam/runners/portability/fn_api_runner/execution.py 92.49% <0.00%> (-0.64%) ⬇️
sdks/python/apache_beam/runners/direct/executor.py 96.46% <0.00%> (-0.55%) ⬇️
sdks/python/apache_beam/transforms/util.py 96.16% <0.00%> (-0.13%) ⬇️
...ks/python/apache_beam/runners/worker/sdk_worker.py 89.24% <0.00%> (ø)
... and 2 more

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@tvalentyn
Copy link
Contributor Author

tvalentyn commented Feb 7, 2023

Pandas 1.1.x issue is reproducible in:

import pandas as pd
pd.DataFrame(columns=pd.CategoricalIndex(['A'], dtype='category'), dtype='int64')

, so bumping pandas is appropriate.

Looking closer, it seems that minimal pandas version that doesn't fail the above on the most recent version of numpy is pandas==1.4.3

@tvalentyn
Copy link
Contributor Author

note that pandas==1.4.3 is not available on Python 3.7, but neither is numpy==1.24.x.

@tvalentyn
Copy link
Contributor Author

Also bumping up lower bounds for pyarrow, for similar reason.

@tvalentyn
Copy link
Contributor Author

Ok, this should now be ready for review, @damccorm

@tvalentyn tvalentyn added this to the 2.45.0 Release milestone Feb 8, 2023
@@ -122,7 +122,16 @@ def __init__(self, engine: trt.ICudaEngine):
self.outputs = []
self.gpu_allocations = []
self.cpu_allocations = []
"""Setup I/O bindings."""

# TODO(https://github.com/NVIDIA/TensorRT/issues/2557):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we have an issue on our end to track this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure, opened: #25387

'pandas>=1.0,<1.6,!=1.5.0,!=1.5.1;python_version<"3.10"',
'pandas>=1.4.3,<1.6,!=1.5.0,!=1.5.1;python_version>="3.10"'
'pandas<1.4.0;python_version=="3.7"',
'pandas>=1.4.3,!=1.5.0,!=1.5.1,<1.6;python_version>="3.8"',
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, this isn't even a year old (its from June 2022) so I imagine we'll be booting some users off of older versions of Pandas. I think that's probably worth it though.

Copy link
Contributor Author

@tvalentyn tvalentyn Feb 8, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True. The last 1.3.x version was released Dec 2021 though. The patch versions of 1.4.x are probably interchangeable to users.

Also customers could try using an older version of Beam. We can make a change to release notes if you think it would help make this option more visible.

Copy link
Contributor

@damccorm damccorm Feb 8, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SGTM - release notes might be nice

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants