Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ENH: review return values for PyArray_DescrNew #20960

Merged
merged 14 commits into from
Feb 2, 2022
Merged

Conversation

mattip
Copy link
Member

@mattip mattip commented Feb 1, 2022

Fixes #19038

Review all the places PyArray_DescrNew and first derivative functions are used, and make sure to check the return value. Functions grepped for:

  • PyArray_DescrNew (API function)
  • PyArray_DescrNewFromType (API function)
  • PyArray_DescrNewByteOrder (API function)
  • PyArray_DESCR_REPLACE (macro)
  • ensure_dtype_nbo (helper function)
  • _descriptor_from_pep3118_format (helper function)
  • arraydescr_newbyteorder (helper function)

An alternative would be to use exit(1) where malloc fails in PyArray_DescrNew since allocating a new descriptor should only fail on out-of-memory, and there is no real way to recover from that. Here are a few recommendations for using exit(1) (thanks @rgommers):

@@ -649,8 +649,7 @@ fromstring_null_term_c_api(PyObject *dummy, PyObject *byte_obj)
if (string == NULL) {
return NULL;
}
descr = PyArray_DescrNewFromType(NPY_FLOAT64);
return PyArray_FromString(string, -1, descr, -1, " ");
return PyArray_FromString(string, -1, NULL, -1, " ");
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Allow PyArray_FromString to allocate the descriptor:

if (dtype == NULL) {
dtype=PyArray_DescrFromType(NPY_DEFAULT_TYPE);
if (dtype == NULL) {
return NULL;

@mattip mattip changed the title Retval ENH: review return values for PyArray_DescrNew Feb 1, 2022
@mattip mattip added 01 - Enhancement 09 - Backport-Candidate PRs tagged should be backported labels Feb 1, 2022
Copy link
Member

@seberg seberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Tricky stuff (although meaningless probability for running into)... a few suggestions that should be good, and 2 places where suggesting didn't do the trick. Let me know if you prefer me applying those changes.

numpy/core/src/multiarray/ctors.c Show resolved Hide resolved
@@ -1382,7 +1382,7 @@ PyArray_DescrNewFromType(int type_num)

old = PyArray_DescrFromType(type_num);
new = PyArray_DescrNew(old);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't work PyArray_DescrNew does not accept NULL

numpy/core/src/multiarray/getset.c Outdated Show resolved Hide resolved
numpy/core/src/multiarray/methods.c Show resolved Hide resolved
numpy/core/src/multiarray/methods.c Show resolved Hide resolved
numpy/core/src/multiarray/methods.c Show resolved Hide resolved
numpy/core/src/multiarray/nditer_constr.c Outdated Show resolved Hide resolved
numpy/core/src/multiarray/scalartypes.c.src Outdated Show resolved Hide resolved
numpy/core/src/umath/ufunc_type_resolution.c Outdated Show resolved Hide resolved
@seberg
Copy link
Member

seberg commented Feb 1, 2022

I don't mind accepting NULL in for PyArray_NewFromDescr (passing through the error). I generally prefer if the caller sanitizes, but it seems a good exception to me here.

mattip and others added 2 commits February 1, 2022 20:24
Co-authored-by: Sebastian Berg <sebastian@sipsolutions.net>
Co-authored-by: Sebastian Berg <sebastian@sipsolutions.net>
mattip and others added 2 commits February 2, 2022 00:44
Co-authored-by: Sebastian Berg <sebastian@sipsolutions.net>
Co-authored-by: Sebastian Berg <sebastian@sipsolutions.net>
@mattip
Copy link
Member Author

mattip commented Feb 1, 2022

PyArray_NewFromDescr is an API function, so it is on us to check the inputs.

if (descr == NULL) {
#ifdef __GNUC__
#pragma GCC diagnostic pop
#endif
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure why I had to work so hard to do this check. The function is NUMPY_API, so it is on us to sanitize input. Why does gcc think descr can never be NULL?

Copy link
Member

@seberg seberg Feb 1, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't agree that API automatically means we have to sanitize, I do not think the Python C-API does this (EDIT: to be clear, does this always, not often, I think also for Python it is pretty common). But it is fairly common for us to do, and in this case I think the pattern "relying" on it (at least theoretically) is common enough to do it.

The generated header include:

NPY_NO_EXPORT NPY_STEALS_REF_TO_ARG(2) NPY_GCC_NONNULL(1) NPY_GCC_NONNULL(2) PyObject * PyArray_NewFromDescr \
       (PyTypeObject *, PyArray_Descr *, int, npy_intp const *, npy_intp const *, void *, int, PyObject *);

So we mark this as NONNULL explicitly, actually. This is done here:

'PyArray_NewFromDescr': (94, StealRef(2), NonNull([1, 2])),

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we can get away with not sanitizing inputs (I think best practices is always to be liberal with input and strict with output), but using NonNull on API interfaces seems to me to be an anti-pattern. I will open an issue.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in #20971

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No idea about the merit of having this in the public API. I think we should just change it to NonNull([1]) for the sake of this PR, though?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we do we can only put it into 1.23

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm, isn't that fully ABI/API compatible, since it is just an attribute (and relaxing an attribute at that)?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, lets discuss tomorrow or so. I thought we may not have to put any error at all (anywhere), because passing in NULL should only happen if the NULL exists due to an error being already in-progress. And if that is not true, Python will set a SystemError...

Copy link
Member Author

@mattip mattip Feb 2, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's leave that for #20971. For this PR, I pushed the check one level down. removed the check

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed the check.

mattip and others added 5 commits February 2, 2022 01:21
Co-authored-by: Sebastian Berg <sebastian@sipsolutions.net>
The dealloc is now part of the Py_DECREF(ret) and handled there.
Doing it here would decref it twice.
It is probably not good to call PyObject_GetIter() if dtype is NULL
and an error is already in progress...
(If we check for it, lets try to do it right.)
Copy link
Member

@seberg seberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I pushed a small fixup and removed all custom error messages (to be clear, I am OK with re-adding them).

So effectively, this fixes up a bit of shady error checks, and otherwise now semi-officially allows calling a few/most(?) array-creation functions with a NULL dtype. These functions will then return NULL immediately while assuming that an error is already set.

`DescrNewFromType` cannot fail in most cases, but if it does,
DescrNew does not accept NULL as input.
@charris charris merged commit ac87f07 into numpy:main Feb 2, 2022
@charris
Copy link
Member

charris commented Feb 2, 2022

Thanks Matti, thanks Sebastian.

charris pushed a commit to charris/numpy that referenced this pull request Feb 2, 2022
* ENH: review return value from PyArray_DescrNew* calls

* BUG: remove unused variable

* BUG: typo

* Update numpy/core/src/multiarray/methods.c

Co-authored-by: Sebastian Berg <sebastian@sipsolutions.net>

* Update numpy/core/src/multiarray/methods.c

Co-authored-by: Sebastian Berg <sebastian@sipsolutions.net>

* Update numpy/core/src/multiarray/getset.c

Co-authored-by: Sebastian Berg <sebastian@sipsolutions.net>

* Update numpy/core/src/multiarray/methods.c

Co-authored-by: Sebastian Berg <sebastian@sipsolutions.net>

* fixes from review

* Update numpy/core/src/umath/ufunc_type_resolution.c

Co-authored-by: Sebastian Berg <sebastian@sipsolutions.net>

* move check to internal function

* remove check

* Remove unnecessary dealloc

The dealloc is now part of the Py_DECREF(ret) and handled there.
Doing it here would decref it twice.

* MAINT: Remove custom error message (and small cleanup)

It is probably not good to call PyObject_GetIter() if dtype is NULL
and an error is already in progress...
(If we check for it, lets try to do it right.)

* Fixup DescrNewFromType

`DescrNewFromType` cannot fail in most cases, but if it does,
DescrNew does not accept NULL as input.

Co-authored-by: Sebastian Berg <sebastian@sipsolutions.net>
@charris charris removed the 09 - Backport-Candidate PRs tagged should be backported label Feb 2, 2022
charris added a commit that referenced this pull request Feb 3, 2022
ENH: review return values for PyArray_DescrNew (#20960)
@charris
Copy link
Member

charris commented Feb 3, 2022

I notice that numpy.sort is also part of the CVE, did we fix that?

@seberg
Copy link
Member

seberg commented Feb 3, 2022

yes, it was.

melissawm pushed a commit to melissawm/numpy that referenced this pull request Apr 12, 2022
* ENH: review return value from PyArray_DescrNew* calls

* BUG: remove unused variable

* BUG: typo

* Update numpy/core/src/multiarray/methods.c

Co-authored-by: Sebastian Berg <sebastian@sipsolutions.net>

* Update numpy/core/src/multiarray/methods.c

Co-authored-by: Sebastian Berg <sebastian@sipsolutions.net>

* Update numpy/core/src/multiarray/getset.c

Co-authored-by: Sebastian Berg <sebastian@sipsolutions.net>

* Update numpy/core/src/multiarray/methods.c

Co-authored-by: Sebastian Berg <sebastian@sipsolutions.net>

* fixes from review

* Update numpy/core/src/umath/ufunc_type_resolution.c

Co-authored-by: Sebastian Berg <sebastian@sipsolutions.net>

* move check to internal function

* remove check

* Remove unnecessary dealloc

The dealloc is now part of the Py_DECREF(ret) and handled there.
Doing it here would decref it twice.

* MAINT: Remove custom error message (and small cleanup)

It is probably not good to call PyObject_GetIter() if dtype is NULL
and an error is already in progress...
(If we check for it, lets try to do it right.)

* Fixup DescrNewFromType

`DescrNewFromType` cannot fail in most cases, but if it does,
DescrNew does not accept NULL as input.

Co-authored-by: Sebastian Berg <sebastian@sipsolutions.net>
dbalchev pushed a commit to dbalchev/numpy that referenced this pull request Apr 19, 2022
* ENH: review return value from PyArray_DescrNew* calls

* BUG: remove unused variable

* BUG: typo

* Update numpy/core/src/multiarray/methods.c

Co-authored-by: Sebastian Berg <sebastian@sipsolutions.net>

* Update numpy/core/src/multiarray/methods.c

Co-authored-by: Sebastian Berg <sebastian@sipsolutions.net>

* Update numpy/core/src/multiarray/getset.c

Co-authored-by: Sebastian Berg <sebastian@sipsolutions.net>

* Update numpy/core/src/multiarray/methods.c

Co-authored-by: Sebastian Berg <sebastian@sipsolutions.net>

* fixes from review

* Update numpy/core/src/umath/ufunc_type_resolution.c

Co-authored-by: Sebastian Berg <sebastian@sipsolutions.net>

* move check to internal function

* remove check

* Remove unnecessary dealloc

The dealloc is now part of the Py_DECREF(ret) and handled there.
Doing it here would decref it twice.

* MAINT: Remove custom error message (and small cleanup)

It is probably not good to call PyObject_GetIter() if dtype is NULL
and an error is already in progress...
(If we check for it, lets try to do it right.)

* Fixup DescrNewFromType

`DescrNewFromType` cannot fail in most cases, but if it does,
DescrNew does not accept NULL as input.

Co-authored-by: Sebastian Berg <sebastian@sipsolutions.net>
seberg added a commit to seberg/numpy that referenced this pull request Apr 24, 2022
* ENH: review return value from PyArray_DescrNew* calls

* BUG: remove unused variable

* BUG: typo

* Update numpy/core/src/multiarray/methods.c

Co-authored-by: Sebastian Berg <sebastian@sipsolutions.net>

* Update numpy/core/src/multiarray/methods.c

Co-authored-by: Sebastian Berg <sebastian@sipsolutions.net>

* Update numpy/core/src/multiarray/getset.c

Co-authored-by: Sebastian Berg <sebastian@sipsolutions.net>

* Update numpy/core/src/multiarray/methods.c

Co-authored-by: Sebastian Berg <sebastian@sipsolutions.net>

* fixes from review

* Update numpy/core/src/umath/ufunc_type_resolution.c

Co-authored-by: Sebastian Berg <sebastian@sipsolutions.net>

* move check to internal function

* remove check

* Remove unnecessary dealloc

The dealloc is now part of the Py_DECREF(ret) and handled there.
Doing it here would decref it twice.

* MAINT: Remove custom error message (and small cleanup)

It is probably not good to call PyObject_GetIter() if dtype is NULL
and an error is already in progress...
(If we check for it, lets try to do it right.)

* Fixup DescrNewFromType

`DescrNewFromType` cannot fail in most cases, but if it does,
DescrNew does not accept NULL as input.

Co-authored-by: Sebastian Berg <sebastian@sipsolutions.net>
@mattip mattip deleted the retval branch December 27, 2022 16:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Missing return-value validation of the function PyArray_DescrNew
3 participants