BUG: Zero-dimensional numpy arrays within records decay to scalars #9442

NeilGirdhar · 2017-07-20T14:12:13Z

First, a shape (4,) numpy array within a record can be assigned:

In [33]: a = np.zeros((10,), dtype=[('k', '<u8'), ('t', '<f4'), ('d', np.bool, (4,))])

In [34]: b = np.ones((), dtype=np.bool)

In [35]: a[0][2][3] = b

But, a shape () cannot:

In [36]: a = np.zeros((10,), dtype=[('k', '<u8'), ('t', '<f4'), ('d', np.bool, ())])

In [37]: a[0][2][()] = b
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-37-075e4ba95b60> in <module>()
----> 1 a[0][2][()] = b

TypeError: 'numpy.bool_' object does not support item assignment

But if it's not in a record, it works just fine:

In [38]: a = np.zeros((), np.bool)

In [39]: a[()] = b


eric-wieser · 2017-07-20T14:40:42Z

There are two parts to this. The error you're getting is the same as this one:

b = np.zeros(3, np.bool)
b[0][()] = np.ones((), dtype=np.bool)

This fails because an np.bool_ scalar and a 0d array of np.bool_ are different things. This part is not a bug; it is very much by design.
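To make the distinction concrete, here is a minimal sketch (the variable names are illustrative, and `np.bool_` is used in place of `np.bool` so that it runs on more NumPy versions). Integer indexing yields a scalar, which rejects item assignment, while a genuine 0d array accepts it:

```python
import numpy as np

b = np.zeros(3, dtype=np.bool_)

element = b[0]                          # integer indexing yields an np.bool_ scalar
zero_d = np.zeros((), dtype=np.bool_)   # a genuine 0d array

try:
    element[()] = True                  # scalars do not support item assignment
    scalar_assignment_ok = True
except TypeError:
    scalar_assignment_ok = False

zero_d[()] = True                       # 0d arrays do support item assignment
```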

The other problem is that your dtype is being ignored:

>>> a.dtype
dtype([('k', '<u8'), ('t', '<f4'), ('d', '?')])

In general, it seems to be impossible to specify a 0d subdtype:

>>> np.dtype((np.bool_, (2,))).subdtype
(dtype('bool'), (2,))
>>> np.dtype((np.bool_, ())).subdtype
None

This is arguably a bug
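The same observation can be checked on a record dtype. This is a sketch of the behaviour described above, assuming a NumPy version that still drops a shape of `()`; the names `with_shape`, `without_shape` and `record` are illustrative only:

```python
import numpy as np

with_shape = np.dtype((np.bool_, (2,)))
without_shape = np.dtype((np.bool_, ()))

# A record field declared with shape () ends up as a plain scalar field.
record = np.dtype([('d', np.bool_, ())])
field = record['d']

print(with_shape.subdtype)
print(without_shape.subdtype)
print(field, field.shape, field.subdtype)
```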

eric-wieser · 2017-07-20T14:47:21Z

Here's the code that deliberately ignores a shape of ()

eric-wieser · 2017-07-20T14:52:38Z

Even removing that doesn't help though, as a huge number of numpy functions end with something along the lines of

if arr.ndim == 0:
    arr = arr[()]
return arr
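As an illustration of why that pattern loses the 0d array, here is a small sketch. `unwrap_0d` is a hypothetical helper written for this example, not a NumPy function:

```python
import numpy as np

def unwrap_0d(arr):
    # Hypothetical helper mirroring the pattern quoted above:
    # a 0d array is replaced by the scalar it contains.
    if arr.ndim == 0:
        arr = arr[()]
    return arr

zero_d_result = unwrap_0d(np.zeros((), dtype=np.bool_))
one_d_result = unwrap_0d(np.zeros(3, dtype=np.bool_))

print(type(zero_d_result))   # a NumPy scalar type, not ndarray
print(type(one_d_result))    # still an ndarray
```

Once the scalar has been returned, nothing downstream can tell that the input was a 0d array.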

eric-wieser · 2017-07-20T14:53:43Z

A workaround would be to index as a['d'][0,...][the_index] = b, which would then work for any shape of field

NeilGirdhar · 2017-07-20T14:57:41Z

Thanks for linking the code. I was going to ask you if I could help, but I'm very busy right now. I may come back to this. I don't understand your workaround in the context of my second definition of a.

eric-wieser · 2017-07-20T14:59:07Z

Here it is for each of your two definitions of a:

a['d'][0,...][3] = b
a['d'][0,...][()] = b

The workaround works by first avoiding indexing a np.void, which doesn't support preserving dimension. At any rate, indexing by field name is much more readable.

... in an index means "always return a view, never a scalar"
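A short sketch of that rule on a plain array (the names are illustrative): the index without `...` yields a detached scalar, while the index with `...` yields a 0d view that writes through to the parent.

```python
import numpy as np

b = np.zeros(3, dtype=np.bool_)

as_scalar = b[0]       # a scalar copy of the element
as_view = b[0, ...]    # a 0d array that views b's memory

as_view[()] = True     # writes through to b[0]

print(b)
print(np.shares_memory(as_view, b))
```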

NeilGirdhar · 2017-07-20T15:02:23Z

That works, but it doesn't make any sense to me 😄

eric-wieser · 2017-07-20T15:05:01Z

Step by step:

i = ()  # or 3, in your other example
a[0][2][i]   # original
a[0]['d'][i] # index by field name, not field index
a['d'][0][i] # field name index can go anywhere
a['d'][0,...][i] # adding ... makes this return a 0d array, not an np.bool_

a[0,...]['d'][i] works too
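Putting the steps together, the following sketch runs the workaround against both definitions of `a` from the original report. The names `a1` and `a2` are introduced here for illustration, and `np.bool_` replaces `np.bool` so that it runs on more NumPy versions:

```python
import numpy as np

b = np.ones((), dtype=np.bool_)

# First definition: field 'd' has shape (4,)
a1 = np.zeros((10,), dtype=[('k', '<u8'), ('t', '<f4'), ('d', np.bool_, (4,))])
a1['d'][0, ...][3] = b

# Second definition: field 'd' declared with shape ()
a2 = np.zeros((10,), dtype=[('k', '<u8'), ('t', '<f4'), ('d', np.bool_, ())])
a2['d'][0, ...][()] = b

# The alternative ordering, with the Ellipsis applied before the field name
a2[1, ...]['d'][()] = b

print(a1['d'][0])
print(a2['d'][:3])
```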

NeilGirdhar · 2017-07-20T15:06:51Z

Got it, thanks.

eric-wieser · 2017-07-20T15:50:14Z

To summarize, I think the underlying bug here is a total lack of support for distinguishing 0d fields and scalar fields in a subdtype.

I don't think there's an easy fix, because part of the problem is that subdtypes are not first-class citizens in the dtype world, as they decay very quickly into extra dimensions.

NeilGirdhar · 2017-07-20T15:54:00Z

I don't know enough about how they're implemented, but I guess you can't just remove that condition in the code that you linked?

eric-wieser · 2017-07-20T15:57:05Z

I've tried that, and it doesn't help. The next problem is that a['d'] is indistinguishable for dtypes (bool, ()) and bool, because of how subdtype expansion works - in general, a['field'].shape == a.shape + a.dtype['field'].shape, but a['field'].dtype.shape == ()
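The shape relation above can be checked directly. This is a sketch under the assumption that the NumPy version in use still drops a `()` shape; the array names are illustrative only:

```python
import numpy as np

sub = np.zeros(10, dtype=[('d', np.bool_, (4,))])
plain = np.zeros(10, dtype=[('d', np.bool_)])
zero_d = np.zeros(10, dtype=[('d', np.bool_, ())])

# The subarray shape is appended to the array shape on field access...
expanded_shape = sub.shape + sub.dtype['d'].shape
print(sub['d'].shape, expanded_shape)

# ...and the resulting array's dtype no longer carries any shape.
print(sub['d'].dtype.shape)

# The (bool, ()) field and the plain bool field look identical.
print(plain.dtype == zero_d.dtype, plain['d'].shape, zero_d['d'].shape)
```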

hpaulj · 2017-07-20T22:09:49Z

I think that subdtype expansion is an important consideration. Such an expansion is one of the most common ways of using a structured array. Most of the recfunctions work by copying data from one array to another by field name. It would be difficult to define an expansion (and its indexing) that distinguishes between scalar and 0d fields, and at the same time remains consistent with 1d and higher dimensional fields.

eric-wieser · 2017-07-20T22:27:49Z

The type of (compatibility-breaking) change I'm envisaging is:

>>> a_dt = np.dtype([('v', int, (3,))])
>>> a = np.empty(4, a_dt)

>>> x = a['v']  # new behaviour
>>> x.dtype
dtype(('<i4', 3))
>>> x.shape
(4,)

>>> x_old = x.view(int)  # workaround to regain the old behaviour
>>> x_old.dtype
dtype('<i4')
>>> x_old.shape
(4, 3)

Of course, that's the type of change we can never make, unless we start using context managers to enable new semantics

NeilGirdhar · 2017-07-20T22:49:48Z

You could keep a list of all of the breaking changes you would like to make, and then if that list gets long enough, one day, implement that context manager? (Because I agree, this is not super-motivating.)

eric-wieser · 2017-07-20T23:34:40Z

Actually, it turns out that context managers over global state are not a safe way to change semantics: #9444

NeilGirdhar · 2017-07-20T23:38:18Z

That's fascinating. I don't know the answer. Please consider posting this to python-ideas to start a discussion about how this is supposed to work.

NeilGirdhar changed the title ~~Zero-dimensional numpy arrays within records support don't support item assignment~~ Zero-dimensional numpy arrays within records don't support item assignment Jul 20, 2017

eric-wieser changed the title ~~Zero-dimensional numpy arrays within records don't support item assignment~~ BUG: Zero-dimensional numpy arrays within records decay to scalars Sep 25, 2017

eric-wieser added the 00 - Bug label Sep 29, 2017
