-
-
Notifications
You must be signed in to change notification settings - Fork 9.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ENH: review return values for PyArray_DescrNew #20960
Conversation
@@ -649,8 +649,7 @@ fromstring_null_term_c_api(PyObject *dummy, PyObject *byte_obj) | |||
if (string == NULL) { | |||
return NULL; | |||
} | |||
descr = PyArray_DescrNewFromType(NPY_FLOAT64); | |||
return PyArray_FromString(string, -1, descr, -1, " "); | |||
return PyArray_FromString(string, -1, NULL, -1, " "); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Allow PyArray_FromString to allocate the descriptor:
numpy/numpy/core/src/multiarray/ctors.c
Lines 3743 to 3746 in c65bc21
if (dtype == NULL) { | |
dtype=PyArray_DescrFromType(NPY_DEFAULT_TYPE); | |
if (dtype == NULL) { | |
return NULL; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! Tricky stuff (although meaningless probability for running into)... a few suggestions that should be good, and 2 places where suggesting didn't do the trick. Let me know if you prefer me applying those changes.
@@ -1382,7 +1382,7 @@ PyArray_DescrNewFromType(int type_num) | |||
|
|||
old = PyArray_DescrFromType(type_num); | |||
new = PyArray_DescrNew(old); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't work PyArray_DescrNew
does not accept NULL
I don't mind accepting |
Co-authored-by: Sebastian Berg <sebastian@sipsolutions.net>
Co-authored-by: Sebastian Berg <sebastian@sipsolutions.net>
Co-authored-by: Sebastian Berg <sebastian@sipsolutions.net>
Co-authored-by: Sebastian Berg <sebastian@sipsolutions.net>
|
numpy/core/src/multiarray/ctors.c
Outdated
if (descr == NULL) { | ||
#ifdef __GNUC__ | ||
#pragma GCC diagnostic pop | ||
#endif |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure why I had to work so hard to do this check. The function is NUMPY_API, so it is on us to sanitize input. Why does gcc think descr can never be NULL?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't agree that API automatically means we have to sanitize, I do not think the Python C-API does this (EDIT: to be clear, does this always, not often, I think also for Python it is pretty common). But it is fairly common for us to do, and in this case I think the pattern "relying" on it (at least theoretically) is common enough to do it.
The generated header include:
NPY_NO_EXPORT NPY_STEALS_REF_TO_ARG(2) NPY_GCC_NONNULL(1) NPY_GCC_NONNULL(2) PyObject * PyArray_NewFromDescr \
(PyTypeObject *, PyArray_Descr *, int, npy_intp const *, npy_intp const *, void *, int, PyObject *);
So we mark this as NONNULL explicitly, actually. This is done here:
numpy/numpy/core/code_generators/numpy_api.py
Line 140 in 855b5ed
'PyArray_NewFromDescr': (94, StealRef(2), NonNull([1, 2])), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we can get away with not sanitizing inputs (I think best practices is always to be liberal with input and strict with output), but using NonNull on API interfaces seems to me to be an anti-pattern. I will open an issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done in #20971
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No idea about the merit of having this in the public API. I think we should just change it to NonNull([1])
for the sake of this PR, though?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we do we can only put it into 1.23
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmmm, isn't that fully ABI/API compatible, since it is just an attribute (and relaxing an attribute at that)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, lets discuss tomorrow or so. I thought we may not have to put any error at all (anywhere), because passing in NULL should only happen if the NULL
exists due to an error being already in-progress. And if that is not true, Python will set a SystemError
...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's leave that for #20971. For this PR, I pushed the check one level down. removed the check
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I removed the check.
Co-authored-by: Sebastian Berg <sebastian@sipsolutions.net>
The dealloc is now part of the Py_DECREF(ret) and handled there. Doing it here would decref it twice.
It is probably not good to call PyObject_GetIter() if dtype is NULL and an error is already in progress... (If we check for it, lets try to do it right.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I pushed a small fixup and removed all custom error messages (to be clear, I am OK with re-adding them).
So effectively, this fixes up a bit of shady error checks, and otherwise now semi-officially allows calling a few/most(?) array-creation functions with a NULL
dtype. These functions will then return NULL
immediately while assuming that an error is already set.
`DescrNewFromType` cannot fail in most cases, but if it does, DescrNew does not accept NULL as input.
Thanks Matti, thanks Sebastian. |
* ENH: review return value from PyArray_DescrNew* calls * BUG: remove unused variable * BUG: typo * Update numpy/core/src/multiarray/methods.c Co-authored-by: Sebastian Berg <sebastian@sipsolutions.net> * Update numpy/core/src/multiarray/methods.c Co-authored-by: Sebastian Berg <sebastian@sipsolutions.net> * Update numpy/core/src/multiarray/getset.c Co-authored-by: Sebastian Berg <sebastian@sipsolutions.net> * Update numpy/core/src/multiarray/methods.c Co-authored-by: Sebastian Berg <sebastian@sipsolutions.net> * fixes from review * Update numpy/core/src/umath/ufunc_type_resolution.c Co-authored-by: Sebastian Berg <sebastian@sipsolutions.net> * move check to internal function * remove check * Remove unnecessary dealloc The dealloc is now part of the Py_DECREF(ret) and handled there. Doing it here would decref it twice. * MAINT: Remove custom error message (and small cleanup) It is probably not good to call PyObject_GetIter() if dtype is NULL and an error is already in progress... (If we check for it, lets try to do it right.) * Fixup DescrNewFromType `DescrNewFromType` cannot fail in most cases, but if it does, DescrNew does not accept NULL as input. Co-authored-by: Sebastian Berg <sebastian@sipsolutions.net>
ENH: review return values for PyArray_DescrNew (#20960)
I notice that numpy.sort is also part of the CVE, did we fix that? |
yes, it was. |
* ENH: review return value from PyArray_DescrNew* calls * BUG: remove unused variable * BUG: typo * Update numpy/core/src/multiarray/methods.c Co-authored-by: Sebastian Berg <sebastian@sipsolutions.net> * Update numpy/core/src/multiarray/methods.c Co-authored-by: Sebastian Berg <sebastian@sipsolutions.net> * Update numpy/core/src/multiarray/getset.c Co-authored-by: Sebastian Berg <sebastian@sipsolutions.net> * Update numpy/core/src/multiarray/methods.c Co-authored-by: Sebastian Berg <sebastian@sipsolutions.net> * fixes from review * Update numpy/core/src/umath/ufunc_type_resolution.c Co-authored-by: Sebastian Berg <sebastian@sipsolutions.net> * move check to internal function * remove check * Remove unnecessary dealloc The dealloc is now part of the Py_DECREF(ret) and handled there. Doing it here would decref it twice. * MAINT: Remove custom error message (and small cleanup) It is probably not good to call PyObject_GetIter() if dtype is NULL and an error is already in progress... (If we check for it, lets try to do it right.) * Fixup DescrNewFromType `DescrNewFromType` cannot fail in most cases, but if it does, DescrNew does not accept NULL as input. Co-authored-by: Sebastian Berg <sebastian@sipsolutions.net>
* ENH: review return value from PyArray_DescrNew* calls * BUG: remove unused variable * BUG: typo * Update numpy/core/src/multiarray/methods.c Co-authored-by: Sebastian Berg <sebastian@sipsolutions.net> * Update numpy/core/src/multiarray/methods.c Co-authored-by: Sebastian Berg <sebastian@sipsolutions.net> * Update numpy/core/src/multiarray/getset.c Co-authored-by: Sebastian Berg <sebastian@sipsolutions.net> * Update numpy/core/src/multiarray/methods.c Co-authored-by: Sebastian Berg <sebastian@sipsolutions.net> * fixes from review * Update numpy/core/src/umath/ufunc_type_resolution.c Co-authored-by: Sebastian Berg <sebastian@sipsolutions.net> * move check to internal function * remove check * Remove unnecessary dealloc The dealloc is now part of the Py_DECREF(ret) and handled there. Doing it here would decref it twice. * MAINT: Remove custom error message (and small cleanup) It is probably not good to call PyObject_GetIter() if dtype is NULL and an error is already in progress... (If we check for it, lets try to do it right.) * Fixup DescrNewFromType `DescrNewFromType` cannot fail in most cases, but if it does, DescrNew does not accept NULL as input. Co-authored-by: Sebastian Berg <sebastian@sipsolutions.net>
* ENH: review return value from PyArray_DescrNew* calls * BUG: remove unused variable * BUG: typo * Update numpy/core/src/multiarray/methods.c Co-authored-by: Sebastian Berg <sebastian@sipsolutions.net> * Update numpy/core/src/multiarray/methods.c Co-authored-by: Sebastian Berg <sebastian@sipsolutions.net> * Update numpy/core/src/multiarray/getset.c Co-authored-by: Sebastian Berg <sebastian@sipsolutions.net> * Update numpy/core/src/multiarray/methods.c Co-authored-by: Sebastian Berg <sebastian@sipsolutions.net> * fixes from review * Update numpy/core/src/umath/ufunc_type_resolution.c Co-authored-by: Sebastian Berg <sebastian@sipsolutions.net> * move check to internal function * remove check * Remove unnecessary dealloc The dealloc is now part of the Py_DECREF(ret) and handled there. Doing it here would decref it twice. * MAINT: Remove custom error message (and small cleanup) It is probably not good to call PyObject_GetIter() if dtype is NULL and an error is already in progress... (If we check for it, lets try to do it right.) * Fixup DescrNewFromType `DescrNewFromType` cannot fail in most cases, but if it does, DescrNew does not accept NULL as input. Co-authored-by: Sebastian Berg <sebastian@sipsolutions.net>
Fixes #19038
Review all the places
PyArray_DescrNew
and first derivative functions are used, and make sure to check the return value. Functions grepped for:PyArray_DescrNew
(API function)PyArray_DescrNewFromType
(API function)PyArray_DescrNewByteOrder
(API function)PyArray_DESCR_REPLACE
(macro)ensure_dtype_nbo
(helper function)_descriptor_from_pep3118_format
(helper function)arraydescr_newbyteorder
(helper function)An alternative would be to use
exit(1)
where malloc fails inPyArray_DescrNew
since allocating a new descriptor should only fail on out-of-memory, and there is no real way to recover from that. Here are a few recommendations for usingexit(1)
(thanks @rgommers):https://titanwolf.org/Network/Articles/Article?AID=55f68680-740b-4f0b-ad05-a991894108bb.
https://eli.thegreenplace.net/2009/10/30/handling-out-of-memory-conditions-in-c