Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement NumPy 2.0 migration rule #7702

Merged
merged 5 commits into from Nov 3, 2023
Merged

Conversation

mtsokol
Copy link
Contributor

@mtsokol mtsokol commented Sep 28, 2023

Summary

Hi! Currently NumPy Python API is undergoing a cleanup process that will be delivered in NumPy 2.0 (release is planned for the end of the year).
Most changes are rather simple (renaming, removing or moving a member of the main namespace to a new place), and they could be flagged/fixed by an additional ruff rule for numpy (e.g. changing occurrences of np.float_ to np.float64).

Would you accept such rule?

I named it NPY201 in the existing group, so people will receive a heads-up for changes arriving in 2.0 before actually migrating to it.

This is still a draft PR. I'm not an expert in rust so if any part of code can be done better please share!

NumPy 2.0 migration guide: https://numpy.org/devdocs/numpy_2_0_migration_guide.html
NEP 52: https://numpy.org/neps/nep-0052-python-api-cleanup.html
NumPy cleanup tracking issue: numpy/numpy#23999

Test Plan

A unit test is provided that checks all rule's fix cases.

@charliermarsh
Copy link
Member

Happy to include something like this. I need to think a bit more on whether this should be its own rule, or whether we should just extend deprecated_function (and add a deprecated_constant for the remaining rules).

As of what NumPy version are these fixes safe? As in, are there older, supported versions of NumPy for which applying these rules would cause issues? (Ultimately asking: will Ruff need to know the NumPy version?)

@mtsokol
Copy link
Contributor Author

mtsokol commented Sep 28, 2023

Thank you for the feedback! All these breaking changes will be shipped in 2.0.0 version that is planned for this December. The latest release of numpy is 1.26.0.

Only nightly 2.0.0.dev0 and future 2.0.0 versions are safe to apply these changes.
For past releases these are incompatible ones, therefore we thought to establish it as a separate rule. Downstream users that are still on 1.x would upgrade ruff and get a heads-up errors, which they can suppress by disabling the rule.
Once they decide to upgrade to 2.0 they can run ruff with --fix and get most of changes done (and these renames can be mundane to do manually - I did them for numpy, scipy, etc. and it was magnitude of hundreds of lines of changed code).

If ruff would know numpy version then it could be used to determine whether to apply this rule.

@charliermarsh
Copy link
Member

Thanks @mtsokol. Sorry if I'm misunderstanding, but some of these fixes would be safe to apply in older versions, right? For example, the rules that suggest using np.inf would be safe to apply, since np.inf exists in older versions, etc.

@mtsokol
Copy link
Contributor Author

mtsokol commented Oct 1, 2023

@charliermarsh Ah yes! Actually most of changes from the NumPy 2.0 migration guide can be applied to the previous versions. (My last response was confusing - only a few are backward incompatible, e.g. renaming np.int_ to np.long (numpy/numpy#24794) or moving np.byte_bounds to np.lib.array_utils. Right now I'm working on these renames mostly and I forgot to mention that it's just a fraction of all changes).

np.Infinity -> np.inf, np.float_ -> np.float64 etc. are all backward compatible and are safe to apply to previous versions.

@mtsokol
Copy link
Contributor Author

mtsokol commented Oct 2, 2023

The PR's blueprint is ready from my side - it consist of three types of AST fixes:

  • AutoImport: the deprecated member of the numpy namespace can be replaced with another, e.g. np.float_ with np.float64 or np.INF with np.inf.
  • AutoPurePython: the deprecated member can be replaced with python expression, e.g. np.NZERO with -0.0 or np.issubclass_ with issubclass.
  • Manual: can't be automatically fixed - a guideline is provided on how to fix it.

If the code's shape is correct I will provision the rest of entries and exhaustive unit test. Please share your feedback!

P.S. I get some errors in the CI that I can't reproduce locally:

  |
1 | use ruff_diagnostics::{AutofixKind, Diagnostic, Edit, Fix, Violation};
  |                        ^^^^^^^^^^^ no `AutofixKind` in the root

I followed implementation of other numpy rules. What am I missing?

@mtsokol mtsokol marked this pull request as ready for review October 2, 2023 19:03
@charliermarsh
Copy link
Member

@mtsokol - We recently renamed AutofixKind to AutofixAvailability, so you might be running into something like: GitHub is rebasing your PR on main prior to testing, and so the name doesn't exist anymore.

@mtsokol
Copy link
Contributor Author

mtsokol commented Oct 2, 2023

@mtsokol - We recently renamed AutofixKind to AutofixAvailability, so you might be running into something like: GitHub is rebasing your PR on main prior to testing, and so the name doesn't exist anymore.

Thank you! Now CI tests are passing.

@codspeed-hq
Copy link

codspeed-hq bot commented Oct 2, 2023

CodSpeed Performance Report

Merging #7702 will not alter performance

Comparing mtsokol:npy201 (b86abcd) with main (f64c389)

Summary

✅ 25 untouched benchmarks

@github-actions
Copy link

github-actions bot commented Oct 2, 2023

PR Check Results

Ecosystem

ℹ️ ecosystem check detected changes. (+3, -1, 0 error(s))

airflow (+3, -1)

- [*] 16167 fixable with the `--fix` option (6240 hidden fixes can be enabled with the `--unsafe-fixes` option).
+ [*] 16169 fixable with the `--fix` option (6240 hidden fixes can be enabled with the `--unsafe-fixes` option).
+ airflow/serialization/serializers/numpy.py:78:13: NPY201 [*] `np.float_` will be removed in the NumPy 2.0. Use `numpy.float64` instead.
+ airflow/serialization/serializers/numpy.py:78:60: NPY201 [*] `np.complex_` will be removed in the NumPy 2.0. Use `numpy.complex128` instead.

Rules changed: 1
Rule Changes Additions Removals
NPY201 2 2 0

@Hoxbro
Copy link
Contributor

Hoxbro commented Oct 6, 2023

It looks like you are not changing NaN to nan. Per https://numpy.org/devdocs/numpy_2_0_migration_guide.html

@mtsokol
Copy link
Contributor Author

mtsokol commented Oct 6, 2023

It looks like you are not changing NaN to nan. Per https://numpy.org/devdocs/numpy_2_0_migration_guide.html

We're still finishing adding new changes but here it's just a matter of adding new entries to match in the rule (I will add the rest soon!). I was curious if the overall design of PR and three types of fixes proposed looks correct.

@mtsokol mtsokol force-pushed the npy201 branch 3 times, most recently from 87a96b4 to 6b7cca6 Compare October 24, 2023 09:09
@mtsokol
Copy link
Contributor Author

mtsokol commented Oct 24, 2023

Hi @charliermarsh @Hoxbro,

We finished Python API refactoring for NumPy 2.0. I updated this PR - now the new rule contains all changes introduced to the main namespace (i.e. members' removals).

Out of all 52 cases in the rule only one is backward incompatible (np.byte_bounds). 51 remaining ones are relevant for both NumPy 1.x and 2.0 releases and contribute to cleaner codebase regardless of NumPy version used. There are 29 cases that can be auto-fixed and 23 that require manual intervention.

I added a unit test that checks all rule's fix cases.

@charliermarsh
Copy link
Member

Thanks @mtsokol -- will queue this up for review and try to get to it shortly.

@charliermarsh
Copy link
Member

(Sorry for the delay on this -- we shipped our formatter earlier this week and I've been busy with follow-ups from the release. This is still on my list.)

@charliermarsh charliermarsh enabled auto-merge (squash) November 3, 2023 03:44
@charliermarsh
Copy link
Member

This looks great, thanks @mtsokol! I like the way you structured the rule -- it reads well.

I took some liberties in editing the rule documentation and tweaking a few of the migration messages, with the goal of making them a little more consistent with the copy and voice we use in other rules. If you have any feedback or issues with the changes, just let me know, happy to address in a follow-up PR.

@charliermarsh charliermarsh added the rule Implementing or modifying a lint rule label Nov 3, 2023
@charliermarsh charliermarsh merged commit d04d964 into astral-sh:main Nov 3, 2023
16 checks passed
Copy link

github-actions bot commented Nov 3, 2023

ruff-ecosystem results

Linter (stable)

✅ ecosystem check detected no linter changes.

Linter (preview)

ℹ️ ecosystem check detected linter changes. (+3 -0 violations, +0 -0 fixes in 41 projects)

apache/airflow (+3 -0 violations, +0 -0 fixes)

ruff check --no-cache --exit-zero --select ALL --preview

+ airflow/providers/salesforce/hooks/salesforce.py:263:34: NPY201 [*] `np.NaN` will be removed in NumPy 2.0. Use `numpy.nan` instead.
+ airflow/serialization/serializers/numpy.py:78:13: NPY201 [*] `np.float_` will be removed in NumPy 2.0. Use `numpy.float64` instead.
+ airflow/serialization/serializers/numpy.py:78:60: NPY201 [*] `np.complex_` will be removed in NumPy 2.0. Use `numpy.complex128` instead.

Changes by rule (1 rules affected)

code total + violation - violation + fix - fix
NPY201 3 3 0 0 0

@mtsokol mtsokol deleted the npy201 branch November 3, 2023 07:50
@Hoxbro
Copy link
Contributor

Hoxbro commented Nov 3, 2023

Shouldn't the new numpy imports happen at the same level as the existing numpy import?

@mtsokol
Copy link
Contributor Author

mtsokol commented Nov 3, 2023

@charliermarsh Hi, all tweaks look good to me - thank you for a review!

@pllim
Copy link

pllim commented Nov 3, 2023

Hello! As a user who does not know much about ruff internals, how do I try this out downstream? Do I only have to modify https://github.com/astropy/astropy/blob/main/.ruff.toml ? If so, how? Is this documented somewhere?

@charliermarsh
Copy link
Member

@Hoxbro - Can you give an example of where this is happening?

@charliermarsh
Copy link
Member

@pllim -- You shouldn't need to modify your configuration, but you would need to build from source, which just means that you need to have Rust installed. If you have Rust installed, then you should be able to run something like:

cargo run -p ruff_linter -- check ../path/to/astropy --select NPY --preview

Comment on lines +16 to +18
1 |+from numpy.lib import add_docstring
1 2 | def func():
2 3 | import numpy as np
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@charliermarsh, here is an example.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that instead we should focus on trying to get this to use np.lib.add_docstring here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a great way and will only change the line where changes are needed.

Just needs to be sure that the different submodules are initialized when running import numpy as np.

@pllim
Copy link

pllim commented Nov 3, 2023

but you would need to build from source

Oh, so this isn't one of those code I can just add to the TOML? Or maybe I can in the future when this feature is released?

Not sure if I want to install rust and build this from source on a CI for a Python package.

= help: Use `numpy.lib.array_utils.byte_bounds` instead.

ℹ Fix
1 |+from numpy.lib.array_utils import byte_bounds
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry to be so noisy after the PR has been merged.

I can't do this import in numpy 1.26.0. It is possible to do from numpy.lib.utils import byte_bounds

Copy link
Contributor Author

@mtsokol mtsokol Nov 3, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi! This is the only backward incompatible change in the rule, mentioned in the comment: #7702 (comment). It will be available in array_utils from 2.0.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then I'm really sorry about the noise 🙂

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I should probably make this a suggested fix with a clear message.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in #8474

@charliermarsh
Copy link
Member

@pllim -- It'll be available in the next release (it was just merged last night), so later today or early next week.

@pllim
Copy link

pllim commented Nov 3, 2023

It'll be available in the next release

And after that, I can just edit my .ruff.toml to use this?

@zanieb
Copy link
Member

zanieb commented Nov 3, 2023

@pllim yep!

@pllim
Copy link

pllim commented Nov 3, 2023

What codes are available? Is there a page where I can look this up? Thanks!!

@charliermarsh
Copy link
Member

Everything on the rules page is available in the latest release. (We only update the documentation when we release.)

@pllim
Copy link

pllim commented Nov 3, 2023

Great, thanks! Looking forward to the release.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
rule Implementing or modifying a lint rule
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants