
DEP: Next step in scalar type alias deprecations/futurewarnings #22607

Merged: 5 commits merged into numpy:main on Nov 21, 2022

Conversation

@seberg (Member) commented Nov 17, 2022

Finalizes the scalar type alias deprecations, making them an error.
However, at the same time it adds a `FutureWarning` that new aliases
will be introduced in the future.
(They would eventually be preferred over the `str_`, etc. versions.)

It may make sense to introduce this FutureWarning soon,
since it interacts with things such as changing the representation of
strings to `np.str_("")` if the preferred alias becomes `np.str`.

It also introduces a new deprecation to remove the scalar aliases
ending in a 0 bit size (`object0`, `str0`, etc.) and the bit-size
`bool8` alias. (Unfortunately, these are still allowed as part of
`np.sctypeDict`.)

xref gh-22021

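For reference, a minimal sketch of the resulting behavior (an illustration assuming NumPy 1.24 as released, not code from this PR):

```python
import warnings
import numpy as np

# The aliases ending in 0 and bool8 now emit a DeprecationWarning:
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    np.bool8(True)
assert any(issubclass(w.category, DeprecationWarning) for w in caught)

# The expired builtin aliases raise an AttributeError, but additionally
# emit the new FutureWarning about the alias returning as a scalar:
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    try:
        np.str  # AttributeError: module 'numpy' has no attribute 'str'
    except AttributeError:
        pass
assert any(issubclass(w.category, FutureWarning) for w in caught)
```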
@seberg (Member, Author) commented Nov 17, 2022

@rgommers in the spirit of having a chance to get this into the next release (since you marked the issue for that), maybe you can have a look/review?

I did feel like adding that `object0` deprecation, but I could pull it out, since it is somewhat different. Also, I added `long` to the list of new aliases. This has a bit of a point, because I feel it may help navigate transitioning the default integer eventually. (`int` is a terrible alias for `long` if we want the default integer to not be `long`.)

@seberg (Member, Author) commented Nov 17, 2022

@rossbar not sure it matters, but CircleCI is showing a lot of these warnings:

reading sources... [ 82%] reference/generated/numpy.show_config .. reference/generated/numpy.ufunc.identity
/home/circleci/repo/venv/lib/python3.8/site-packages/sphinx/ext/autosummary/__init__.py:678: FutureWarning: In the future `np.bytes` will be defined as the corresponding NumPy scalar.  (This may have returned Python scalars in past versions.
  return getattr(mod, name_parts[-1]), mod, modname
/home/circleci/repo/venv/lib/python3.8/site-packages/sphinx/ext/autosummary/__init__.py:701: FutureWarning: In the future `np.bytes` will be defined as the corresponding NumPy scalar.  (This may have returned Python scalars in past versions.
  obj = getattr(obj, obj_name)
/home/circleci/repo/venv/lib/python3.8/site-packages/sphinx/ext/autosummary/__init__.py:678: FutureWarning: In the future `np.str` will be defined as the corresponding NumPy scalar.  (This may have returned Python scalars in past versions.
  return getattr(mod, name_parts[-1]), mod, modname
/home/circleci/repo/venv/lib/python3.8/site-packages/sphinx/ext/autosummary/__init__.py:701: FutureWarning: In the future `np.str` will be defined as the corresponding NumPy scalar.  (This may have returned Python scalars in past versions.
  obj = getattr(obj, obj_name)
reading sources... [ 85%] reference/generated/numpy.ufunc.nargs

I am a bit confused by it; I think they may be coming from the parameter types, like:

Parameters
----------
    arg : str or None

which seems strange? CI doesn't fail, but should I worry about it?

@rossbar (Contributor) commented Nov 17, 2022

Yeah, that is a lot of warnings. I wouldn't think it's related to the parameter descriptions, as those are only parsed by numpydoc, and these warnings are originating from autosummary. From the warning it looks like this is coming from how autosummary does object discovery, but I'll have to take a closer look.
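(If these doc-build warnings ever needed silencing, one hypothetical workaround, not something done in this PR, would be a warning filter in the Sphinx conf.py:)

```python
# conf.py -- hypothetical workaround, not part of this PR:
import warnings

warnings.filterwarnings(
    "ignore",
    message=r"In the future `np\.\w+` will be defined as the corresponding NumPy scalar",
    category=FutureWarning,
)
```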

@rgommers (Member) commented:

Thanks @seberg!

> I did feel like adding that `object0` deprecation, but I could pull it out, since it is somewhat different.

That looks like a good idea to me.

> Also, I added `long` to the list of new aliases. This has a bit of a point, because I feel it may help navigate transitioning the default integer eventually. (`int` is a terrible alias for `long` if we want the default integer to not be `long`.)

This seems less desirable. `np.long` is already a deprecated alias now, to `np.compat.long`. And `long` itself is a terrible thing in C, a good way to get portability issues. I'm not sure I understand why you want it - don't we already have `intp` for this purpose?

@seberg (Member, Author) commented Nov 20, 2022

To be clear, `long` is part of those that now fail, but it also gives a FutureWarning that it will be defined (with a changed meaning) in a future version.

@seberg (Member, Author) commented Nov 20, 2022

Right now, I thought mainly that it may be useful to have the option, but let's see:

  • `np.int` should probably not exist for a while; it would have to be `long`, which is bad? `np.int_` is currently actually not deprecated and is the canonical spelling of "long".
  • If `np.long` is available, then we could get rid of the confusing `np.int_` (which is why I thought the warning is good).

The big problem I am aware of is Cython, since it has the following (and the unsigned equivalents):

ctypedef npy_long       int_t
ctypedef npy_longlong   long_t
ctypedef npy_longlong   longlong_t

And somehow, such use needs to transition. Maybe `long_t` could just be removed (it is a weird name for something else, and `np.long` is gone).
If we repurpose it quickly, `np.int_t` could be removed too? But even then, the NumPy "default integer" may have to be spelled as `cnp.intp_t` in the future, or we change the bit-width of Cython code using `np.int_t`. OTOH, the width only changes on Windows, so maybe we can get away with it.
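To make the Windows wrinkle concrete, a small illustration (output is platform-dependent; assumes NumPy 1.x on a 64-bit system):

```python
import numpy as np

# np.int_ is C long: 8 bytes on 64-bit Linux/macOS, but 4 bytes on 64-bit Windows.
# np.intp is pointer-sized, so it is 8 bytes on any 64-bit platform.
print(np.dtype(np.int_).itemsize)      # 8 on 64-bit Linux/macOS, 4 on Windows
print(np.dtype(np.intp).itemsize)      # 8 on any 64-bit platform
print(np.dtype(np.longlong).itemsize)  # 8 everywhere
```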

@rgommers (Member) commented:

Hmm, such integer puzzles always hurt my head. Removing `long_t` in Cython sounds nice.

> If `np.long` is available, then we could get rid of the confusing `np.int_` (which is why I thought the warning is good).

`long` is bad too, though. I'm still confused by this. We have `np.long`, we deprecated it, and that seems like a good idea to me. We want to get rid of it, not bring it back.

Some relevant content from the 1.20.0 release notes:

    Deprecated name    Identical to  NumPy scalar type names
    =================  ============  ==================================================================
    ``numpy.long``     ``int``       `numpy.int_` (C ``long``), `numpy.longlong` (largest integer type)

and its guidance on which integer types to use:

  • np.int64 or np.int32 to specify the precision exactly. This ensures that results cannot depend on the computer or operating system.
  • np.int_ or int (the default), but be aware that it depends on the computer and operating system.
  • The C types: np.cint (int), np.int_ (long), np.longlong.
  • np.intp, which is 32-bit on 32-bit machines and 64-bit on 64-bit machines. This can be the best type to use for indexing.

We are still missing a good guide to this, I think; I usually have to dig up https://github.com/scikit-learn/scikit-learn/wiki/C-integer-types%3A-the-missing-manual. My understanding of our current situation is:

  • We want to guide users to int32, int64 etc. for almost everything, and intp for indexing.
  • We don't want any aliases for Python types
  • Everything else is secondary, and should be de-emphasized. We have exact mappings to C data types which we must keep, and tons of cruft that we need to deprecate.

So after writing this: your point is that np.long is a better name for the C long type than np.int_? That's probably true. So yes, I agree that ideally that'd be the name.
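A short illustration of that guidance (a sketch; the printed dtypes assume 64-bit Linux/macOS and NumPy 1.x):

```python
import numpy as np

a = np.arange(3, dtype=np.int64)  # exact width: results portable across platforms
b = np.arange(3)                  # default integer: platform-dependent
i = np.intp(0)                    # pointer-sized: the natural indexing type

print(a.dtype)  # int64 everywhere
print(b.dtype)  # int64 here, but int32 on 64-bit Windows with NumPy 1.x
```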

@seberg (Member, Author) commented Nov 20, 2022

Yes, that is my point. The old `long` had to go, because it was the Python `int` (inherited from Python 2's `long`, which must be why Cython is still on `long long` 😱).

In any case, I think `np.int_` sounds like "default integer" (if not like C `int`), which is terrible if we change the default integer to be `intp` (to get 64-bit ints on 64-bit Windows).

We don't have to follow through, but adding a FutureWarning opens us up a tiny bit more to that, I think.

@rgommers (Member) commented:

Okay, makes sense, +1 from me. Let me review the rest of this PR then.

@rgommers (Member) left a comment:

Overall LGTM, just some minor comments. Removing `finfo.machar` is the main one; in case of time pressure before release, that can also be done in a follow-up.

    )

    # NumPy 1.22, 2021-10-20
    __deprecated_attrs__["MachAr"] = (
@rgommers (Member):

`np.finfo().machar` was deprecated at the same time, can you remove that as well?

@seberg (Member, Author):

Of course; I was thinking to do a follow-up.


    # add mapping for both the bit name and the numarray name
    sctypeDict[myname] = info.type
    # Add to the main namespace if desired:
@rgommers (Member):

Not sure what this comment means. `allTypes` is not in the main namespace, and I think we'd like to keep it that way. Remove?

@seberg (Member, Author):

`allTypes` is what ends up in the main namespace:

# Now add the types we've determined to this module
for key in allTypes:
    globals()[key] = allTypes[key]
    __all__.append(key)

@rgommers (Member):

Ah okay, missed that bit.

@rgommers rgommers merged commit 742545f into numpy:main Nov 21, 2022
@rgommers (Member) commented:

LGTM now, in it goes. Thanks Sebastian

@rgommers rgommers added this to the 1.24.0 release milestone Nov 21, 2022
@dhomeier mentioned this pull request Nov 30, 2022
vrabaud added a commit to vrabaud/opencv that referenced this pull request Dec 15, 2022
This change replaces references to a number of deprecated NumPy
type aliases (np.bool, np.int, np.float, np.complex, np.object,
np.str) with their recommended replacement (bool, int, float,
complex, object, str).

Those types were deprecated in 1.20 and are removed in 1.24,
cf numpy/numpy#22607.
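(As an illustration of this kind of migration, a hypothetical helper script; the pattern and scope are assumptions, not part of the actual commit:)

```python
# Hypothetical migration helper: rewrite removed NumPy aliases
# (np.bool, np.int, ...) to their builtin replacements across a source tree.
import pathlib
import re

ALIASES = "bool|int|float|complex|object|str"
PATTERN = re.compile(rf"\bnp\.({ALIASES})\b")  # \b leaves np.str_ and np.int8 intact

for path in pathlib.Path(".").rglob("*.py"):
    text = path.read_text()
    fixed = PATTERN.sub(lambda m: m.group(1), text)
    if fixed != text:
        path.write_text(fixed)
```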
@seberg (Member, Author) commented Feb 7, 2023

I am not sure I am convinced we shouldn't eventually get rid of the `bool8` name. But I am happy with saying that we should delay the deprecation until `np.bool` itself is established (we could maybe hide it from `__dir__` for now, though).

IMO `int8` is very different: just like in C/C++, the bits denote the type (what you can store / its behavior) and not just how it is stored.

@rgommers (Member) commented Feb 7, 2023

> I am not sure I am convinced we shouldn't eventually get rid of the `bool8` name. But I am happy with saying that we should delay the deprecation until `np.bool` itself is established (we could maybe hide it from `__dir__` for now, though).

That seems fine to me too.

> IMO `int8` is very different: just like in C/C++, the bits denote the type (what you can store / its behavior) and not just how it is stored.

Agreed. The 8 in `bool8` is just noise and doesn't have semantic meaning. Plus, if we'd ever want to change the builtin boolean dtype to a 1-bit dtype, we should be free to do so (no plans, and various possible hiccups in practice, but in theory it makes sense). And I've personally never seen `bool8` used anywhere. We should definitely hide it, and get rid of it later on.
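(A minimal sketch of that hiding idea, via a module-level __dir__ from PEP 562; the _hidden name is hypothetical, not NumPy's actual implementation:)

```python
# Hypothetical module-level __dir__ (PEP 562): hide soft-deprecated names
# from dir()/tab completion while leaving attribute access intact.
_hidden = {"bool8"}

def __dir__():
    return [name for name in globals() if name not in _hidden]
```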

srowen pushed a commit to apache/spark that referenced this pull request Mar 3, 2023
…ypes

### Problem description
NumPy has started changing the aliases of some of its data types. This means that users with the latest version of NumPy will face either warnings or errors, depending on the type they are using. This affects all users on numpy > 1.20.0.
One of the types was fixed back in September with this [pull](#37817) request.

[numpy 1.24.0](numpy/numpy#22607): The scalar type aliases ending in a 0 bit size: np.object0, np.str0, np.bytes0, np.void0, np.int0, np.uint0 as well as np.bool8 are now deprecated and will eventually be removed.
[numpy 1.20.0](numpy/numpy#14882): Using the aliases of builtin types like np.int is deprecated

### What changes were proposed in this pull request?
From numpy 1.20.0 we receive a deprecation warning on np.object (https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations), and from numpy 1.24.0 we receive an attribute error:

```
attr = 'object'

    def __getattr__(attr):
        # Warn for expired attributes, and return a dummy function
        # that always raises an exception.
        import warnings
        try:
            msg = __expired_functions__[attr]
        except KeyError:
            pass
        else:
            warnings.warn(msg, DeprecationWarning, stacklevel=2)

            def _expired(*args, **kwds):
                raise RuntimeError(msg)

            return _expired

        # Emit warnings for deprecated attributes
        try:
            val, msg = __deprecated_attrs__[attr]
        except KeyError:
            pass
        else:
            warnings.warn(msg, DeprecationWarning, stacklevel=2)
            return val

        if attr in __future_scalars__:
            # And future warnings for those that will change, but also give
            # the AttributeError
            warnings.warn(
                f"In the future `np.{attr}` will be defined as the "
                "corresponding NumPy scalar.", FutureWarning, stacklevel=2)

        if attr in __former_attrs__:
>           raise AttributeError(__former_attrs__[attr])
E           AttributeError: module 'numpy' has no attribute 'object'.
E           `np.object` was a deprecated alias for the builtin `object`. To avoid this error in existing code, use `object` by itself. Doing this will not modify any behavior and is safe.
E           The aliases was originally deprecated in NumPy 1.20; for more details and guidance see the original release note at:
E               https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
```

From numpy version 1.24.0 we receive a deprecation warning on np.object0 (and every other np.<dtype>0 alias) and on np.bool8:
```
>>> np.object0(123)
<stdin>:1: DeprecationWarning: `np.object0` is a deprecated alias for ``np.object0` is a deprecated alias for `np.object_`. `object` can be used instead.  (Deprecated NumPy 1.24)`.  (Deprecated NumPy 1.24)
```

### Why are the changes needed?
The changes are needed so pyspark can be compatible with the latest numpy and avoid

- attribute errors on data types being deprecated from version 1.20.0: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
- warnings on deprecated data types from version 1.24.0: https://numpy.org/devdocs/release/1.24.0-notes.html#deprecations

### Does this PR introduce _any_ user-facing change?
The change will suppress the warning and avoid the error, both coming from numpy 1.24.0.

### How was this patch tested?
I assume that the existing tests should catch this (see also the Extra questions section).

I found this to be a problem in my work's project, where our unit tests use the toPandas() function, which converts to np.object. Attaching the run result of our test:

```

_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
/usr/local/lib/python3.9/dist-packages/<my-pkg>/unit/spark_test.py:64: in run_testcase
    self.handler.compare_df(result, expected, config=self.compare_config)
/usr/local/lib/python3.9/dist-packages/<my-pkg>/spark_test_handler.py:38: in compare_df
    actual_pd = actual.toPandas().sort_values(by=sort_columns, ignore_index=True)
/usr/local/lib/python3.9/dist-packages/pyspark/sql/pandas/conversion.py:232: in toPandas
    corrected_dtypes[index] = np.object  # type: ignore[attr-defined]
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

attr = 'object'

    def __getattr__(attr):
        # Warn for expired attributes, and return a dummy function
        # that always raises an exception.
        import warnings
        try:
            msg = __expired_functions__[attr]
        except KeyError:
            pass
        else:
            warnings.warn(msg, DeprecationWarning, stacklevel=2)

            def _expired(*args, **kwds):
                raise RuntimeError(msg)

            return _expired

        # Emit warnings for deprecated attributes
        try:
            val, msg = __deprecated_attrs__[attr]
        except KeyError:
            pass
        else:
            warnings.warn(msg, DeprecationWarning, stacklevel=2)
            return val

        if attr in __future_scalars__:
            # And future warnings for those that will change, but also give
            # the AttributeError
            warnings.warn(
                f"In the future `np.{attr}` will be defined as the "
                "corresponding NumPy scalar.", FutureWarning, stacklevel=2)

        if attr in __former_attrs__:
>           raise AttributeError(__former_attrs__[attr])
E           AttributeError: module 'numpy' has no attribute 'object'.
E           `np.object` was a deprecated alias for the builtin `object`. To avoid this error in existing code, use `object` by itself. Doing this will not modify any behavior and is safe.
E           The aliases was originally deprecated in NumPy 1.20; for more details and guidance see the original release note at:
E               https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations

/usr/local/lib/python3.9/dist-packages/numpy/__init__.py:305: AttributeError
```

Although I cannot provide the code, running the following in Python should show the problem:
```
>>> import numpy as np
>>> np.object0(123)
<stdin>:1: DeprecationWarning: `np.object0` is a deprecated alias for ``np.object0` is a deprecated alias for `np.object_`. `object` can be used instead.  (Deprecated NumPy 1.24)`.  (Deprecated NumPy 1.24)
123
>>> np.object(123)
<stdin>:1: FutureWarning: In the future `np.object` will be defined as the corresponding NumPy scalar.
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.9/dist-packages/numpy/__init__.py", line 305, in __getattr__
    raise AttributeError(__former_attrs__[attr])
AttributeError: module 'numpy' has no attribute 'object'.
`np.object` was a deprecated alias for the builtin `object`. To avoid this error in existing code, use `object` by itself. Doing this will not modify any behavior and is safe.
The aliases was originally deprecated in NumPy 1.20; for more details and guidance see the original release note at:
    https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
```
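For comparison, the replacements keep working (a sketch, assuming NumPy 1.24):
```
>>> object          # the builtin is the drop-in replacement for np.object
<class 'object'>
>>> np.object_      # the NumPy scalar type is unaffected
<class 'numpy.object_'>
>>> np.object_(123)
123
```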

I do not have a use case in my tests for np.object0, but I fixed it following the suggestion from NumPy.

### Supported Versions:
I propose this fix to be included in pyspark 3.3 and onwards.

### JIRA
I know a JIRA ticket should be created; I sent an email and am waiting for the answer, so that I can document the case there as well.

### Extra questions:
By grepping for np.bool and np.object I see that the tests include them. Shall we change them also? Data types with a trailing _ are, I think, not affected.

```
git grep np.object
python/pyspark/ml/functions.py:        return data.dtype == np.object_ and isinstance(data.iloc[0], (np.ndarray, list))
python/pyspark/ml/functions.py:        return any(data.dtypes == np.object_) and any(
python/pyspark/sql/tests/test_dataframe.py:        self.assertEqual(types[1], np.object)
python/pyspark/sql/tests/test_dataframe.py:        self.assertEqual(types[4], np.object)  # datetime.date
python/pyspark/sql/tests/test_dataframe.py:        self.assertEqual(types[1], np.object)
python/pyspark/sql/tests/test_dataframe.py:                self.assertEqual(types[6], np.object)
python/pyspark/sql/tests/test_dataframe.py:                self.assertEqual(types[7], np.object)

git grep np.bool
python/docs/source/user_guide/pandas_on_spark/types.rst:np.bool       BooleanType
python/pyspark/pandas/indexing.py:            isinstance(key, np.bool_) for key in cols_sel
python/pyspark/pandas/tests/test_typedef.py:            np.bool: (np.bool, BooleanType()),
python/pyspark/pandas/tests/test_typedef.py:            bool: (np.bool, BooleanType()),
python/pyspark/pandas/typedef/typehints.py:    elif tpe in (bool, np.bool_, "bool", "?"):
python/pyspark/sql/connect/expressions.py:                assert isinstance(value, (bool, np.bool_))
python/pyspark/sql/connect/expressions.py:                elif isinstance(value, np.bool_):
python/pyspark/sql/tests/test_dataframe.py:        self.assertEqual(types[2], np.bool)
python/pyspark/sql/tests/test_functions.py:            (np.bool_, [("true", "boolean")]),
```

If yes: since the bool change was merged already, should we fix it too?

Closes #40220 from aimtsou/numpy-patch.

Authored-by: Aimilios Tsouvelekakis <aimtsou@users.noreply.github.com>
Signed-off-by: Sean Owen <srowen@gmail.com>
sanjibansg added a commit to sanjibansg/root that referenced this pull request Aug 23, 2023
….19 or >=1.24

Because of the changed behavior of np.bool and similar aliases for builtin
data types, we need to restrict the numpy version to the stated range for sonnet.

For more information, refer here:
numpy/numpy#14882
numpy/numpy#22607
lmoneta added a commit to root-project/root that referenced this pull request Sep 4, 2023
…s) and restricting numpy version
Simon-Lopez added a commit to BRGM/ComPASS that referenced this pull request Oct 2, 2023
This is a temporary workaround and is related to:
numpy/numpy#22607