-
-
Notifications
You must be signed in to change notification settings - Fork 9.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
np.dtype('float64').__class__
not picklable in numpy master
#16692
Comments
Not as good as a bisect but that may help a little bit. I looked at the dev weels available at https://pypi.anaconda.org/scipy-wheels-nightly/simple/numpy/ So the behaviour change seems to appear a little time ago, between commit daffccf (May 24 last wheel without the issue) and cbde478 (May 31 first wheel with the issue).
|
Oh, that makes sense, hmmm, I did not think about pickles, have to figure out how to fix that... I guess I may have to implement actual pickling manually. Basically the pickling for this this has to use |
@lesteve curious, can you explain what the quirks you are working around are and how they interact. On master The quickest fix would be to change the name of the class to some underscored version for now I guess. Although I like the printing of |
Hmmm, there is one other problem (or semi-problem), in that whatever I add now, has to remain supported. Plus, it is impossible to make this work in a forward compatible way. If you pickle on master, and send it to an older NumPy, that cannot work, since the old NumPy doesn't have a notion for this. And I do not think there is any way to achieve even a nice error message. But, I guess pickling |
@lesteve how pressing is this for you guys? The only easy solution to this right now would be to give all of these a valid fully qualified name (which technically has to remain valid forever). I want to add |
@seberg can we switch to the underscores now and switch to heaptypes before the next release or is that too aggressive/disruptive ? |
Yeah, I think it is not a big problem to switch to HeapTypes... There are some gory details to figure out (creating HeapType classes from a metaclass is not something the Python C-API makes straight forward). So I thought this was much simpler for now. Just not sure we will have it done next week :). We still have to make |
Maybe we can rewrite the code in scikit-learn to avoid pickling dtype classes and avoid the problem? |
That would be great, although in the mid-term, we have to fix it anyway, I think. Its weird not to be able to pickle a class. |
Sorry I completely dropped the ball on this one ...
I was involved in writing these lines 4-5 years ago, I browsed the associated issues and I remember that they were needed to fix some hairy bugs in joblib. I created joblib/joblib#1080 to discuss whether it is possible to avoid relying on
My biased opinion as a downstream numpy consumer is that it would be great if this could be fixed in numpy. If I put my maintainer hat, I do feel the pain of needing to maintain something where you mix Python C-API, metaclass and gory details in the same sentence to make life easier for users 😉 ... |
I have a workaround ready in joblib/joblib#1082 if numpy 1.20 ships without fixing the picklability of dtype classes. |
Good to know. Honestly, pickling the class in this way seems a bit shady to begin with, and the reason why you are doing it is likely a bigger issue than whether or not the class can be pickled. So at least for joblib, we should rather make sure that unpickling gives you the singleton instance when appropriate. |
@BvB93 we don't have any ambition putting the DType classes into |
I'm going to push this off as I doubt a fix will be available for 1.20.0. |
I just realized that there is after all a simple solution to this. Sorry about this, we will have a fix in 1.20.1 I think, and that is currently planned to be released by next week. |
Introducing the metaclass without a canonical top-level name broke pickling of `type(np.dtype(...))` (which was always just `np.dtype`). While a better solution will likely be possible by making the DTypes HeapTypes and this solution may not work for all imaginable cases (i.e. it is plausible for a dtype to not have a scalar type associated), using a `copyreg` registration for the metaclass surprisingly works without any issues and seems like the simplest solution right now. Closes numpygh-16692, numpygh-18325
@BvB93, yeah, thanks for looking. I had not thought of this approach/realized it works for static "builtin" types. I am not certain it will solve all issues around this, since joblib (and similar) might do more than just pickle the class (for one try to actually instantiate it with some arguments ;)... outrageous, I know!). |
Just a FYI in joblib, we decided to change slightly the way our hashing work to avoid the problem: joblib/joblib#1136. Just trying to be more explicit: if the only complaint about this problem was from joblib then one option could be to not fix this issue. Reading the comments above, it seems that Dask may be affected by this from dask/dask#7170 (but not 100% clear yet) |
Introducing the metaclass without a canonical top-level name broke pickling of `type(np.dtype(...))` (which was always just `np.dtype`). While a better solution will likely be possible by making the DTypes HeapTypes and this solution may not work for all imaginable cases (i.e. it is plausible for a dtype to not have a scalar type associated), using a `copyreg` registration for the metaclass surprisingly works without any issues and seems like the simplest solution right now. Closes numpygh-16692, numpygh-18325
Removed commits included in branch by imperfect branch rebasing on assembling hickle-5-RC branch. Numpy has introduced with version 1.20 a disrupting change which prevents numpy.dtype type objects like np.int16 be pickled. Trying to pickle numpy.dtypes causes pickling error in python 3.7 and python 3.8 which utilize numpy > 1.20. As this error will according to numpy issue numpy/numpy#16692 not before numpy 1.21 is released. To fix for now set numpy requirement to be less than 1.20. Recheck required when numpy 1.21 is released. If fixed update requirement that all numpy 1.20 versions will be properly excluded from beeing supported by hickle.
Removed commits included in branch by imperfect branch rebasing on assembling hickle-5-RC branch. Numpy has introduced with version 1.20 a disrupting change which prevents numpy.dtype type objects like np.int16 be pickled. Trying to pickle numpy.dtypes causes pickling error in python 3.7 and python 3.8 which utilize numpy > 1.20. As this error will according to numpy issue numpy/numpy#16692 not before numpy 1.21 is released. To fix for now set numpy requirement to be less than 1.20. Recheck required when numpy 1.21 is released. If fixed update requirement that all numpy 1.20 versions will be properly excluded from beeing supported by hickle. Latest cryptography packages required by tox changed their dependency upon rust compiler. This change makes installation of virtualenv fail if not like for pip its latest versoin is used. Therefor virtualenv is now included in pip --update line Deviating from former plans missing indexer on regex match class in python 3.5 fixed. Latest HDF5 versions (>= 1.12) seem to not support anymor windows 32 bit platform. This causes all appveyor builds on windows 32 bit to fail exempt for python 3.5. It seems h5py 2.10 seems to be the last supporting python 3.5. Added TOXENV entry to appveyor yaml as tox_appveryor seems not anymore limit environments to be tested to the selected testing environment.
Removed commits included in branch by imperfect branch rebasing on assembling hickle-5-RC branch. Numpy has introduced with version 1.20 a disrupting change which prevents numpy.dtype type objects like np.int16 be pickled. Trying to pickle numpy.dtypes causes pickling error in python 3.7 and python 3.8 which utilize numpy > 1.20. As this error will according to numpy issue numpy/numpy#16692 not before numpy 1.21 is released. To fix for now set numpy requirement to be less than 1.20. Recheck required when numpy 1.21 is released. If fixed update requirement that all numpy 1.20 versions will be properly excluded from beeing supported by hickle. Latest cryptography packages required by tox changed their dependency upon rust compiler. This change makes installation of virtualenv fail if not like for pip its latest versoin is used. Therefor virtualenv is now included in pip --update line Deviating from former plans missing indexer on regex match class in python 3.5 fixed. Latest HDF5 versions (>= 1.12) seem to not support anymor windows 32 bit platform. This causes all appveyor builds on windows 32 bit to fail exempt for python 3.5. It seems h5py 2.10 seems to be the last supporting python 3.5. Added TOXENV entry to appveyor yaml as tox_appveryor seems not anymore limit environments to be tested to the selected testing environment.
Removed commits included in branch by imperfect branch rebasing on assembling hickle-5-RC branch. Numpy has introduced with version 1.20 a disrupting change which prevents numpy.dtype type objects like np.int16 be pickled. Trying to pickle numpy.dtypes causes pickling error in python 3.7 and python 3.8 which utilize numpy > 1.20. As this error will according to numpy issue numpy/numpy#16692 not before numpy 1.21 is released. To fix for now set numpy requirement to be less than 1.20. Recheck required when numpy 1.21 is released. If fixed update requirement that all numpy 1.20 versions will be properly excluded from beeing supported by hickle. Latest cryptography packages required by tox changed their dependency upon rust compiler. This change makes installation of virtualenv fail if not like for pip its latest versoin is used. Therefor virtualenv is now included in pip --update line Deviating from former plans missing indexer on regex match class in python 3.5 fixed. Latest HDF5 versions (>= 1.12) seem to not support anymor windows 32 bit platform. This causes all appveyor builds on windows 32 bit to fail exempt for python 3.5. It seems h5py 2.10 seems to be the last supporting python 3.5. Added TOXENV entry to appveyor yaml as tox_appveryor seems not anymore limit environments to be tested to the selected testing environment.
Removed commits included in branch by imperfect branch rebasing on assembling hickle-5-RC branch. Numpy has introduced with version 1.20 a disrupting change which prevents numpy.dtype type objects like np.int16 be pickled. Trying to pickle numpy.dtypes causes pickling error in python 3.7 and python 3.8 which utilize numpy > 1.20. As this error will according to numpy issue numpy/numpy#16692 not before numpy 1.21 is released. To fix for now set numpy requirement to be less than 1.20. Recheck required when numpy 1.21 is released. If fixed update requirement that all numpy 1.20 versions will be properly excluded from beeing supported by hickle. Latest cryptography packages required by tox changed their dependency upon rust compiler. This change makes installation of virtualenv fail if not like for pip its latest versoin is used. Therefor virtualenv is now included in pip --update line Deviating from former plans missing indexer on regex match class in python 3.5 fixed. Latest HDF5 versions (>= 1.12) seem to not support anymor windows 32 bit platform. This causes all appveyor builds on windows 32 bit to fail exempt for python 3.5. It seems h5py 2.10 seems to be the last supporting python 3.5. Added TOXENV entry to appveyor yaml as tox_appveryor seems not anymore limit environments to be tested to the selected testing environment.
We are seeing some errors in scikit-learn CI using numpy dev recently see scikit-learn/scikit-learn#17707 (comment) for some context.
After debugging the problem it comes from the fact that
np.dtype('float64').__class__
is not picklable in numpy development version. I haven't done a git bisect but I can do one if that would help.For completeness the reason we care about this lies within
joblib
:https://github.com/joblib/joblib/blob/ac0f1528ffec9cd53cabbafaca887ac880697862/joblib/hashing.py#L222-L235
Reproducing code example:
Output numpy development version:
Ouptut numpy 1.19.0:
The text was updated successfully, but these errors were encountered: