Add basic dtype typing support to the main numpy namespace #19252

Open
BvB93 opened this issue Jun 15, 2021 · 5 comments

@BvB93
Member

BvB93 commented Jun 15, 2021

Back in #17719 the first steps were taken toward introducing static typing support for array dtypes.

Since the dtype has a substantial effect on the semantics of an array, there is a lot of type-safety
to be gained if the various function-annotations in numpy can actually utilize this information.
Examples of this would be the rejection of string-arrays for arithmetic operations, or inferring the
output dtype of mixed float/integer operations.
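To make these two examples concrete, here is a minimal sketch of the runtime behavior that dtype-aware annotations would let a type checker predict ahead of time. It only shows what NumPy does at runtime. The array names are illustrative, and nothing here is a claim about what the current stubs report.

```python
import numpy as np

ints = np.array([1, 2, 3], dtype=np.int64)
floats = np.array([0.5, 1.5, 2.5], dtype=np.float64)
strings = np.array(["a", "b", "c"])

# Mixed integer/float arithmetic is promoted to a floating dtype at runtime.
mixed = ints + floats
print(mixed.dtype)

# Subtracting a number from a string array has no matching ufunc loop,
# so NumPy raises an exception that derives from TypeError.
try:
    strings - 1
except TypeError as exc:
    string_error = type(exc).__name__
else:
    string_error = None
print(string_error)
```

With dtype-aware annotations, the second case could in principle be flagged before the code runs, and the first could be given a precise result type instead of Any.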

The Plan

With this in mind I'd ideally like to implement some basic dtype support throughout the main numpy
namespace (xref #16546) before the release of 1.22.

Now, what does "basic" mean in this context? It means any array-like or dtype-like that can be parametrized
w.r.t. np.generic. Notably this excludes builtin scalar types and character codes (literal strings), as the
only way of implementing the latter two is via excessive use of overloads.

Realistically, I only expect dtype support for builtin scalar types (e.g. func(..., dtype=float))
to be added with the help of a mypy plugin, e.g. via injecting a type-check-only method into the likes of
builtins.int that holds some sort of explicit reference to np.int_.

Examples

Two examples wherein the dtype can be automatically inferred:

from typing import TYPE_CHECKING
import numpy as np

AR_1 = np.array(np.float64(1))
AR_2 = np.array(1, dtype=np.float64)

if TYPE_CHECKING:
    reveal_type(AR_1)  # note: Revealed type is "numpy.ndarray[Any, numpy.dtype[numpy.floating*[numpy.typing._64Bit*]]]"
    reveal_type(AR_2)  # note: Revealed type is "numpy.ndarray[Any, numpy.dtype[numpy.floating*[numpy.typing._64Bit*]]]"

Three examples wherein dtype-support is substantially more difficult to implement:

AR_3 = np.array(1.0)
AR_4 = np.array(1, dtype=float)
AR_5 = np.array(1, dtype="f8")

if TYPE_CHECKING:
    reveal_type(AR_3)  # note: Revealed type is "numpy.ndarray[Any, numpy.dtype[Any]]"
    reveal_type(AR_4)  # note: Revealed type is "numpy.ndarray[Any, numpy.dtype[Any]]"
    reveal_type(AR_5)  # note: Revealed type is "numpy.ndarray[Any, numpy.dtype[Any]]"

In the latter three cases one can always manually declare the dtype of the array:

import numpy.typing as npt

AR_6: npt.NDArray[np.float64] = np.array(1.0)

if TYPE_CHECKING:
    reveal_type(AR_6)  # note: Revealed type is "numpy.ndarray[Any, numpy.dtype[numpy.floating*[numpy.typing._64Bit*]]]"
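
As a rough illustration of why the harder cases matter, the sketch below checks that all five spellings used above produce the same dtype at runtime, even though only the first two are inferred statically in the examples. The helper as_float64 and the name AR_7 are hypothetical and not part of NumPy. The helper simply centralizes the explicit annotation in one place.

```python
import numpy as np
import numpy.typing as npt

# The five spellings from the examples; at runtime they all yield float64.
arrays = {
    "np.float64 scalar": np.array(np.float64(1)),
    "dtype=np.float64": np.array(1, dtype=np.float64),
    "python float": np.array(1.0),
    "dtype=float": np.array(1, dtype=float),
    'dtype="f8"': np.array(1, dtype="f8"),
}
for label, arr in arrays.items():
    print(f"{label:20} -> {arr.dtype}")


def as_float64(value: npt.ArrayLike) -> npt.NDArray[np.float64]:
    """Hypothetical helper: convert to float64 and carry the dtype in the return type."""
    return np.asarray(value, dtype=np.float64)


AR_7 = as_float64(1.0)
print(AR_7.dtype)
```

A wrapper like this keeps the annotation and the dtype argument next to each other, at the cost of one extra function per dtype.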
@BvB93 BvB93 added this to the 1.22.0 release milestone Jun 15, 2021
@BvB93 BvB93 changed the title Add basic dtype support typing to the main numpy namespace. Add basic dtype support typing to the main numpy namespace Jun 15, 2021
@BvB93 BvB93 changed the title Add basic dtype support typing to the main numpy namespace Add basic dtype typing support to the main numpy namespace Jun 15, 2021
@BvB93 BvB93 removed this from the 1.22.0 release milestone Oct 21, 2021
@FrancescElies
Contributor

We are interested in this one and are following this issue :). Are there any ideas on supporting this when one needs to specify the byte order?

Following your previous example:

arr = np.array(1, dtype=np.float64)  # (1) mypy should eventually detect the right array dtype.
arr2 = np.array(1, dtype="<f8")  # (2) same byte order as (1) on little-endian machines
arr3 = np.array(1, dtype=">f8")  # (3)

How would you write (3) so that mypy understands what it is without having to explicitly define the type for variable arr3?

@BvB93
Member Author

BvB93 commented Nov 3, 2021

How would you write (3) so that mypy understands what it is without having to explicitly define the type for variable arr3?

It's more contrived than I'd like it to be, but you could take advantage of the np.dtype constructor here,
as it contains overloads for all ~40 (!) non-flexible string literals. I'm not sure this is a better alternative to an
explicit type annotation though...

arr3 = np.array(1, dtype=np.dtype(">f8"))
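
A small runtime sketch of this suggestion follows. The names big and little are illustrative. It assumes only that np.dtype accepts the ">f8" and "<f8" character codes, and it checks runtime behavior rather than what a type checker reports. As far as I can tell the static type does not distinguish byte order, so treat that part as an assumption.

```python
import numpy as np

big = np.dtype(">f8")     # big-endian 64-bit float
little = np.dtype("<f8")  # little-endian 64-bit float

arr3 = np.array(1, dtype=big)

# The array keeps the byte order that was requested through np.dtype.
print(arr3.dtype == big)

# Both spellings describe an 8-byte float...
print(arr3.dtype.kind, arr3.dtype.itemsize)

# ...but they are distinct dtypes at runtime, whatever the machine's native order.
print(big == little)
```

Because the dtype object is built once and reused, the byte order is stated in a single place.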

@FrancescElies
Contributor

Though np.dtype requires a bit more typing, I like it better than an explicit type annotation, as I only have one place where I define the type. That avoids wrong definitions like the following.

import numpy.typing as npt
arr: npt.NDArray[np.uint8] = np.array(1, dtype=(">f8"))  # prone to mistakes

Thanks for the suggestion

@NeilGirdhar
Contributor

Is this the right issue for tracking the following problem?

from typing import Any
import numpy as np
import numpy.typing as npt

RealArray = npt.NDArray[np.floating[Any]]
x: RealArray = np.zeros(10)
reveal_type(x)
reveal_type(np.square(x))
reveal_type(np.square(x) + np.square(x))

gives

❯ pyright a.py
  /home/neil/src/efax/a.py:8:13 - information: Type of "x" is "ndarray[Any, dtype[floating[_64Bit]]]"
  /home/neil/src/efax/a.py:9:13 - information: Type of "np.square(x)" is "ndarray[Any, dtype[Any]]"
  /home/neil/src/efax/a.py:10:13 - information: Type of "np.square(x) + np.square(x)" is "ndarray[Any, dtype[bool_]]"
0 errors, 0 warnings, 3 informations 
Completed in 0.547sec
❯ mypy a.py
a.py:8: note: Revealed type is "numpy.ndarray[Any, numpy.dtype[numpy.floating[Any]]]"
a.py:9: note: Revealed type is "numpy.ndarray[Any, numpy.dtype[Any]]"
a.py:10: note: Revealed type is "Any"

@NeilGirdhar
Contributor

@BvB93 Should I file a new issue for the above problem?
