BUG: Distinguish exact vs. equivalent dtype for C type aliases. #21995

eirrgang · 2022-07-16T18:24:19Z

For asarray and for the dtype equality operator,
equivalent dtype aliases were considered exact matches.
This change ensures that the returned array has a descriptor
that exactly matches the requested dtype.

Note: Intended behavior of np.dtype('i') == np.dtype('l')
is to test compatibility, not identity. This change does not
affect the behavior of PyArray_EquivTypes(), and the
__eq__ operator for dtype continues to map to
PyArray_EquivTypes().

Fixes #1468.
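
The distinction between equivalent and identical dtypes can be illustrated with a small sketch. It assumes a platform where at least two of the C integer types `int`, `long`, and `long long` have the same size (which pair that is depends on the platform); the names `codes` and `alias_pairs` are illustrative only.

```python
import numpy as np

# Candidate C integer type codes: int, long, long long.
codes = ["i", "l", "q"]

# Collect pairs of distinct type codes that share an item size here.
alias_pairs = [
    (x, y)
    for i, x in enumerate(codes)
    for y in codes[i + 1:]
    if np.dtype(x).itemsize == np.dtype(y).itemsize
]

for x, y in alias_pairs:
    dx, dy = np.dtype(x), np.dtype(y)
    # `==` tests equivalence; `is` tests object identity.
    print(f"{x!r} vs {y!r}: equal={dx == dy}, same object={dx is dy}")
```

For each pair, `==` reports the dtypes as equal while `is` reports distinct objects, which is the difference the title refers to.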

numpy/core/src/multiarray/multiarraymodule.c

Use the correct API call to actually set the base object.

Fix a false mismatch. Separate dtype objects, even if equivalent, cause distinct array views to be created.

eirrgang · 2022-07-16T21:12:05Z

The updates that were necessary for the tests indicate that this change will surprise some users. Some sort of release note is warranted, and perhaps some sort of update to docs related to the dtype kwarg.

Suggestions for specific docs files to update?

numpy/array_api/tests/test_asarray.py

numpy/ma/tests/test_core.py

* Improve comments/docs.
* Improve descriptiveness of variable names.
* Add additional test expressions that would not pass without this patch.

Shorten some lines.

numpy/core/src/multiarray/multiarraymodule.c

numpy/array_api/tests/test_asarray.py

eirrgang · 2022-07-17T18:15:51Z

> The updates that were necessary for the tests indicate that this change will surprise some users. Some sort of release note is warranted, and perhaps some sort of update to docs related to the dtype kwarg.
>
> Suggestions for specific docs files to update?

I couldn't find any docs that were specific enough to require updating, but I added a release notes doc.

seberg · 2022-07-17T19:38:51Z

Thanks, this is awesome. Those are long docs; if we know a place to put them, we probably should... For the release note, it would be nice to make it much shorter, unfortunately!

EDIT: I (or someone else from us) can take a stab at that though!

doc/release/upcoming_changes/21995.compatibility.rst

eirrgang · 2022-08-26T12:11:54Z

> Thanks, this is awesome. Those are long docs; if we know a place to put them, we probably should... For the release note, it would be nice to make it much shorter, unfortunately!
>
> EDIT: I (or someone else from us) can take a stab at that though!

Is my proposed edit better? Or is there an alternative proposal?

mattip · 2022-09-07T16:55:45Z

Thanks @eirrgang

eirrgang

Sorry. I was away for the weekend and just got caught up. Looks good, but I think there is a typo.

doc/release/upcoming_changes/21995.compatibility.rst

mattip · 2022-09-08T07:02:30Z

See #22227

InessaPawson · 2022-09-08T18:14:26Z

Hi-five on merging your first pull request to NumPy, @eirrgang! We hope you stick around! Your choices aren’t limited to programming – you can review pull requests, help us stay on top of new and old issues, develop educational material, work on our website, add or improve graphic design, create marketing materials, translate website content, write grant proposals, and help with other fundraising initiatives. For more info, check out: https://numpy.org/contribute
Also, consider joining our mailing list. This is a great way to connect with other cool people in our community and be part of important conversations that affect the development of NumPy: https://mail.python.org/mailman/listinfo/numpy-discussion

rgommers · 2022-09-09T07:54:54Z

As reported in gh-22233, there is a serious issue with this PR, likely a memory leak that is causing issues for downstream projects like SciPy and MNE-Python.

I will revert this PR. @eirrgang it would be great if you could resubmit this as a new PR, and then we can merge it again once the problem is understood and addressed. Maybe pytest-leaks or Valgrind will be helpful here in pointing to the root cause.

rgommers · 2022-09-09T07:55:34Z

Ugh, GitHub: Sorry, this pull request couldn’t be reverted automatically. It may have already been reverted, or the content may have changed since it was merged.

seberg · 2022-09-09T07:55:45Z

numpy/core/src/multiarray/multiarraymodule.c

+                            PyArray_FLAGS(oparr),
+                            op,
+                            op
+                            );


Oh, either I just didn't think of it, or expected that this steals a reference to op for the base. We are missing a DECREF, I will make a PR.

eirrgang

Yes, I think I originally tried PyArray_NewFromDescr with op for *data and nullptr for *obj, and PyArray_SetBaseObject or something that steals the reference for the base. Anyway, I guess we were sloppy when switching to PyArray_NewFromDescrAndBase. Thanks for the quick catch!

The new path to preserve dtypes provided by creating a view got the reference counting wrong, because it also hit the incref path that was needed for returning the identity.

This fixes up gh-21995.
Closes gh-22233.
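
A rough way to check from Python that this path does not hold an extra reference to the input is to compare `sys.getrefcount` before and after repeated calls. This is only a sketch: the array, the loop count, and the use of `newbyteorder('native')` to obtain an equivalent dtype object are illustrative choices, not part of the fix.

```python
import sys

import numpy as np

a = np.arange(4)
# An equivalent dtype that is typically a distinct object from a.dtype.
equivalent = a.dtype.newbyteorder("native")

before = sys.getrefcount(a)
for _ in range(1000):
    # Each call returns either `a` itself or a view whose base is `a`.
    b = np.asarray(a, dtype=equivalent)
del b
after = sys.getrefcount(a)

# A reference leaked on every call would show up as a large difference.
print(after - before)
```

If every call leaked one reference to `a`, the difference would be on the order of the loop count rather than zero. Tools such as pytest-leaks or Valgrind, mentioned earlier in the thread, give a more thorough picture.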

leofang · 2022-09-12T18:34:28Z

It is incorrect to add numpy/array_api/tests/test_asarray.py. The test itself is not of concern to Array API. It should be moved elsewhere.

(This is why we need to set up array_api namespace owners... cc: @seberg @rgommers)

seberg · 2022-09-12T18:40:50Z

Yes, sorry, I missed where this test was, it should be elsewhere. I had just thought it looked reasonably thorough. It might also be split up, but I don't care.

In any case, we need to move this test into /core/tests/test_<something>, not here. @eirrgang, do you want to have a look into that?

eirrgang · 2022-09-12T19:21:28Z

> Yes, sorry, I missed where this test was, it should be elsewhere. I had just thought it looked reasonably thorough. It might also be split up, but I don't care.
>
> In any case, we need to move this test into /core/tests/test_<something> not here. @eirrgang do you want to have a look into that.

Sure. I'll make a new PR. As I recall, there wasn't an obvious good place for the test. I'll look again, and try to come up with something reasonable.

seberg · 2022-09-12T19:32:19Z

Way too long, but test_multiarray.py with TestCreation might be an option. Otherwise test_array_coercion.py has somewhat related tests, so you could just add it in that file. The only real problem here is that it belongs under core/tests, not under array_api/tests.

As noted at #21995 (comment), the new test from #21995 was placed in a directory intended for the Array API, and unrelated to the change.

* Consolidate test_dtype_identity into an existing test file. Remove `test_asarray.py`. Create a new `TestAsArray` suite in `test_array_coercion.py`.
* Linting. Wrap some comments that got too long after function became a method (with additional indentation).
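
For orientation, a class-based suite of this kind might look roughly like the following. This is a hypothetical, heavily trimmed sketch and not the actual `TestAsArray` code; the class and method names are invented, and only checks that should hold regardless of how equivalent dtype objects are treated are included.

```python
import numpy as np


class TestAsArraySketch:
    """Hypothetical stand-in, not the real suite from the PR."""

    def test_no_dtype_returns_input(self):
        a = np.arange(3)
        # Without a dtype argument, an exact ndarray is returned as-is.
        assert np.asarray(a) is a

    def test_same_dtype_object_returns_input(self):
        a = np.arange(3)
        # Passing the array's own descriptor requires no new array.
        assert np.asarray(a, dtype=a.dtype) is a

    def test_equivalent_dtype_compares_equal(self):
        a = np.arange(3)
        requested = a.dtype.newbyteorder("native")
        b = np.asarray(a, dtype=requested)
        # Equivalence holds whether or not a new view was created.
        assert b.dtype == requested
        assert (b == a).all()
```

Written as methods on a plain class, the checks can be collected by pytest or called directly.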

WarrenWeckesser · 2022-09-16T14:33:27Z

This appears to have broken some tests in scipy.sparse. Ultimately, the problem comes down to the dtype check requiring that the dtypes are the same object, instead of being equal in the Python sense.

This shows the essence of the problem:

In [27]: a = np.array([10, 20, 30, 40])

In [28]: b = np.asarray(a, dtype=a.dtype.newbyteorder('native'))

In [29]: a is b  # Before this PR, this would be true.
Out[29]: False

In [30]: a.dtype == b.dtype
Out[30]: True

a.dtype.newbyteorder('native') creates a new dtype instance, but the new dtype is equal (i.e. ==) to a.dtype. So the dtypes of a and b are equal, but asarray() still created a view. There are tests in scipy.sparse that expect a is b to be True.
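
The session above can also be written as a standalone script that separates what depends on object identity from what does not. This is a sketch only; the variable names and the choice of `np.shares_memory` and `.base` as the checks are assumptions about what downstream code may care about.

```python
import numpy as np

a = np.array([10, 20, 30, 40])
# A new dtype instance that compares equal to a.dtype.
requested = a.dtype.newbyteorder("native")
b = np.asarray(a, dtype=requested)

# Whether this is True depends on how asarray treats equivalent dtypes.
print("a is b:", a is b)

# These hold either way: no data is copied and the dtypes compare equal.
same_or_view = (b is a) or (b.base is a)
print("dtypes equal:", a.dtype == b.dtype)
print("shares memory:", np.shares_memory(a, b))
print("same object or view of a:", same_or_view)
```

A test that only needs to confirm that no copy was made could check memory sharing rather than `a is b`.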

eirrgang · 2022-09-16T14:42:54Z

> a.dtype.newbyteorder('native') creates a new dtype instance, but the new dtype is equal to (i.e. ==) a.dtype. So the dtypes of a and b are equal, but asarray() still created a view. There are tests in scipy.sparse that expect a is b to be True.

Yes. My understanding from @seberg is that this is the intended behavior, and it is reflected in the updates to the numpy tests and release notes. Please advise if you find contradictions in the documentation. Some documentation was trimmed, so you might find the commit history of the PR informative. See also https://github.com/numpy/numpy/pull/21995/files#r922726648

seberg · 2022-09-19T09:24:50Z

@WarrenWeckesser, yeah, that was intentional, but that doesn't mean it was the best thing to do. Please do follow up if you think we should consider undoing (parts of) this.

github-actions bot added the 00 - Bug label Jul 16, 2022

eirrgang commented Jul 16, 2022

numpy/core/src/multiarray/multiarraymodule.c

eirrgang added 2 commits July 16, 2022 13:48

Use the correct pointer type.

a3eedb0

Coerce the returned pointer type.

78cd6b0

eirrgang commented Jul 16, 2022

numpy/core/src/multiarray/multiarraymodule.c

eirrgang added 4 commits July 16, 2022 15:27

Get the base object.

45281b8

Use the correct API call to actually set the base object.

Add unit testing.

81b9760

Don't regenerate the descriptor unnecessarily.

871a1f9

Fix a false mismatch. Separate dtype objects, even if equivalent, cause distinct array views to be created.

Update comment and obey formatting requirements.

5651445

eric-wieser reviewed Jul 16, 2022

numpy/array_api/tests/test_asarray.py

eric-wieser reviewed Jul 16, 2022

numpy/ma/tests/test_core.py

eirrgang added 2 commits July 17, 2022 10:49

Expand test_asarray.py.

e286f46

* Improve comments/docs.
* Improve descriptiveness of variable names.
* Add additional test expressions that would not pass without this patch.

Lint.

01438a8

Shorten some lines.

eirrgang commented Jul 17, 2022

numpy/core/src/multiarray/multiarraymodule.c

eirrgang commented Jul 17, 2022

numpy/array_api/tests/test_asarray.py

Add release note and further clarify tests.

b1a8ff8

eirrgang commented Jul 20, 2022

doc/release/upcoming_changes/21995.compatibility.rst

DOC: Take a stab at shortening the release note

235b75e

mattip merged commit 65c10c1 into numpy:main Sep 7, 2022

eirrgang commented Sep 7, 2022

doc/release/upcoming_changes/21995.compatibility.rst

mattip mentioned this pull request Sep 8, 2022

DOC: fix up release note #22227

Merged

eirrgang deleted the mei-1468 branch September 8, 2022 14:06

larsoner mentioned this pull request Sep 8, 2022

BUG: Drastic memory usage increase in recent builds #22233

Closed

rgommers added this to the 1.24.0 release milestone Sep 9, 2022

seberg reviewed Sep 9, 2022

seberg mentioned this pull request Sep 9, 2022

BUG: Fix incorrect refcounting in new asarray path #22236

Merged

eirrgang mentioned this pull request Sep 12, 2022

TST: Move new asarray test to a more appropriate place. #22251

Merged

WarrenWeckesser mentioned this pull request Sep 16, 2022

New CI failures in sparse with nightly numpy scipy/scipy#17033

Closed

ngoldbaum mentioned this pull request Aug 21, 2023

BUG: np.asarray return a copy with shared memory #24478

Open
