ENH: numpy.typing for type checking, documentation and Numpy compilers #26380

paugier · 2024-05-03T09:49:17Z

Proposed new feature or change:

Type annotations in code using Numpy (and other Python array libraries) can be used for 3 purposes:

- Python-Numpy compilers (for example Cython or Pythran)
- documentation
- type checking

Currently numpy.typing is more oriented towards type checking. It would be nice if numpy.typing could also be used to easily add information useful for documentation and Python-Numpy compilers. It seems to me that these needs are a bit different from what Mypy currently supports.

For some projects (namely fluidsim, fluidfft, fluidimage), we already use type annotations so that Transonic can automatically produce Pythran, Numba and Cython code. For these projects, I now often feel the need to add type annotations only for documentation, even for functions/classes that are not compiled. Unfortunately, numpy.typing is really not yet suitable for these needs.

Moreover, I see some discussions about enhancing numpy.typing (for example #16544, see also https://github.com/ramonhagenaars/nptyping) and the solutions proposed seem quite complicated and not very suitable for documentation and Python-Numpy compilers. I mean I see nothing simple and short for something like Array["2d", Type(np.float32, np.float64), "C"] (I guess one can guess what it means).

For these purposes (doc and compilers), the most common pieces of type information concern

- the number of dimensions of an array (sometimes fused, i.e. ndim 2 or 3),
- dtypes (sometimes fused, i.e. float64 or complex128) and
- memory contiguity/strides.

It is also very useful and common to specify that a function is limited to some particular arrays (only C contiguous for example, or only ndim equal to 2 or 3).

Specifying the number of elements in one dimension (#16544) can also be useful, but it is less common than specifying the number of dimensions of an array.
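To make the three kinds of constraints concrete, here is a minimal runtime sketch of what such an annotation would have to encode. The helper `check_array` and its parameter names are hypothetical; it is not part of NumPy, Transonic or any compiler, and it only illustrates the information content (fused ndim, fused dtype, C contiguity).

```python
import numpy as np


def check_array(arr, ndims=None, dtypes=None, c_contiguous=False):
    """Return arr unchanged if it satisfies the constraints, else raise TypeError.

    Hypothetical helper written for illustration only.
    """
    if not isinstance(arr, np.ndarray):
        raise TypeError(f"expected numpy.ndarray, got {type(arr).__name__}")
    # "Fused" number of dimensions: any value listed in ndims is accepted.
    if ndims is not None and arr.ndim not in ndims:
        raise TypeError(f"ndim {arr.ndim} not in {tuple(ndims)}")
    # "Fused" dtype: any dtype listed in dtypes is accepted.
    if dtypes is not None:
        allowed = [np.dtype(d) for d in dtypes]
        if arr.dtype not in allowed:
            raise TypeError(f"dtype {arr.dtype} not in {allowed}")
    # Memory layout: only checked when explicitly requested.
    if c_contiguous and not arr.flags.c_contiguous:
        raise TypeError("array is not C contiguous")
    return arr
```

A call such as `check_array(a, ndims=(2, 3), dtypes=(np.float32, np.float64), c_contiguous=True)` carries roughly the same information as the `Array[...]` notation discussed in this issue, only at run time instead of in the annotation.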

Numpy compilers have their own way to describe arrays, often inspired by C notations:

With Transonic, one can use annotations in either C or Python style:

import numpy as np
from transonic import Array, Type, NDim

A2D = "float32[:,:]"
# equivalent
A2Dbis = Array["2d", np.float32]

Afused = Array[NDim(2, 3), Type(np.float32, np.float64)]
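For readers unfamiliar with the C-style strings, the sketch below shows what a notation such as "float32[:,:]" encodes: a dtype name followed by one ":" per dimension. The function `parse_array_spec` is hypothetical and is not how Transonic parses its annotations; dtype names are resolved with `np.dtype`, so "float" is read here as float64.

```python
import re

import numpy as np

# dtype name, optionally prefixed with "np.", then one ":" per dimension
_SPEC = re.compile(r"^(?:np\.)?(?P<dtype>\w+)\[(?P<dims>\s*:\s*(?:,\s*:\s*)*)\]$")


def parse_array_spec(spec):
    """Return (dtype, ndim) for a string such as "float32[:,:]".

    Hypothetical helper for illustration; it handles only this simple form.
    """
    match = _SPEC.match(spec.strip())
    if match is None:
        raise ValueError(f"unsupported array specification: {spec!r}")
    ndim = match.group("dims").count(":")
    return np.dtype(match.group("dtype")), ndim
```

With this reading, "float32[:,:]" is a two-dimensional float32 array and "np.int64[:]" is a one-dimensional int64 array.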

I'm not saying that numpy.typing should support such things, but it seems to me that it is important when designing numpy.typing to consider the different purposes of type annotations in code using Numpy and not to be mostly focused on what is currently supported by Mypy.

Simple things like specifying that an array is a one or two-dimensional array of float64 should be simple and short with numpy.typing.

Here is a short real-life example of annotations used only for documentation. In Fluidimage, I recently wrote the following while rediscovering and refactoring code written by other developers:

class ThinPlateSplineSubdom:

    num_centers: int
    tps_matrices: List["float[:,:]"]
    norm_coefs: "float[:]"
    norm_coefs_domains: List["float[:]"]

    num_new_positions: int
    ind_new_positions_domains: List["np.int64[:]"]
    norm_coefs_new_pos: "float[:]"
    norm_coefs_new_pos_domains: List["float[:]"]

It would be nice if I could replace that by elegant annotations using numpy.typing.
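For comparison, here is a rough sketch of the same class using what `numpy.typing.NDArray` can express through dtype parametrization. It assumes that "float" in the strings above means float64. The alias names `Float64Array` and `Int64Array` are made up for this example, and the number of dimensions survives only as comments, which is exactly the information this issue asks to be able to state.

```python
from typing import List

import numpy as np
import numpy.typing as npt

# Hypothetical aliases: they carry the dtype but not the number of dimensions.
Float64Array = npt.NDArray[np.float64]
Int64Array = npt.NDArray[np.int64]


class ThinPlateSplineSubdom:

    num_centers: int
    tps_matrices: List[Float64Array]  # each array is 2d
    norm_coefs: Float64Array  # 1d
    norm_coefs_domains: List[Float64Array]  # each array is 1d

    num_new_positions: int
    ind_new_positions_domains: List[Int64Array]  # each array is 1d
    norm_coefs_new_pos: Float64Array  # 1d
    norm_coefs_new_pos_domains: List[Float64Array]  # each array is 1d
```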

rgommers · 2024-05-22T08:43:21Z

Thanks for the nice write-up and suggestions @paugier. I completely agree with the gist of what you wrote, and would like type annotations to be useful for documentation and Python compilers as well.

> I mean I see nothing simple and short for something like Array["2d", Type(np.float32, np.float64), "C"] (I guess one can guess what it means).

Dtype parametrization exists:

from typing import Any

import numpy as np
import numpy.typing as npt

def func_return_float64(x: npt.NDArray[Any]) -> npt.NDArray[np.float64]:
    # more complex combinations of allowed dtype inputs or outputs are also supported
    return x.astype(np.float64)
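As an illustration of the "more complex combinations" mentioned in the comment above, a fused dtype can be written as a union inside `NDArray`. This is a small sketch: the alias `FloatArray` and the function `scale` are hypothetical names, and how precisely a type checker infers the result of the multiplication may vary between checkers and NumPy versions.

```python
from typing import Union

import numpy as np
import numpy.typing as npt

# Hypothetical alias for "array of float32 or float64" (a fused dtype).
FloatArray = npt.NDArray[Union[np.float32, np.float64]]


def scale(x: FloatArray, factor: float) -> FloatArray:
    # Multiplying by a Python float keeps the dtype of a float32/float64 array.
    return x * factor
```

Once defined, the alias keeps signatures short, which is also what matters for documentation.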

Shape support was blocked until very recently, and there's a lot of interest in (and relevant discussion on) gh-16544. So hopefully this will materialize soon.

Contiguity is very much a niche special case compared to shape and dtype, so let's leave that one aside for now. It shouldn't be hard, but also it's something that end user code shouldn't have to worry about in 99.x% of cases (yes, some compilers do, but that's internals).

> For these projects, I now often feel the need to add type annotations only for documentation, even for functions/classes that are not compiled. Unfortunately, numpy.typing is really not yet suitable for these needs.

It should be, although the limitations are often on the Sphinx side. The typical problem is that for type annotations to be correct, they should contain unions and protocols that are complex. For documentation purposes, what is needed is really solid support for type aliases so that the ugliness of the large unions gets hidden correctly, and you could have things like x : ArrayLike | int | float render as the understandable type in html docs.

> Simple things like specifying that an array is a one or two-dimensional array of float64 should be simple and short with numpy.typing.

Agreed. It'd be great if it looked something like:

NDArray[np.float64, Any]
NDArray[np.float64, npt.3D]
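The spellings above are wished-for syntax, not something that exists. As a rough sketch of what can be written without new syntax, `np.ndarray` itself takes two type parameters, a shape type and a dtype type, so a tuple of ints can stand for the number of dimensions. The alias `Matrix64` and the helper `as_matrix64` are hypothetical, and how much of the shape a type checker actually verifies depends on the NumPy version and the checker, so the helper also checks `ndim` at run time.

```python
import numpy as np

# Hypothetical alias: a 2d float64 array (requires Python 3.9+ for tuple[int, int]).
Matrix64 = np.ndarray[tuple[int, int], np.dtype[np.float64]]


def as_matrix64(values) -> Matrix64:
    """Convert values to a 2d float64 array, raising ValueError otherwise."""
    arr = np.asarray(values, dtype=np.float64)
    # The annotation alone does not enforce anything at run time.
    if arr.ndim != 2:
        raise ValueError(f"expected a 2d array, got ndim={arr.ndim}")
    return arr
```

This is noticeably longer than the notations discussed in this issue, which is the point being made.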

rgommers added 01 - Enhancement Static typing labels May 22, 2024
