missing np.get_string_function (counterpart of np.set_string_function) #11266

Open
boeddeker opened this issue Jun 7, 2018 · 8 comments

@boeddeker
Contributor

Is it possible to implement a np.get_string_function as counterpart of np.set_string_function?

I found the code that stores the printing callbacks:

PyArray_SetStringFunction(PyObject *op, int repr)
{
    if (repr) {
        /* Dispose of previous callback */
        Py_XDECREF(PyArray_ReprFunction);
        /* Add a reference to new callback */
        Py_XINCREF(op);
        /* Remember new callback */
        PyArray_ReprFunction = op;
    }
    else {
        /* Dispose of previous callback */
        Py_XDECREF(PyArray_StrFunction);
        /* Add a reference to new callback */
        Py_XINCREF(op);
        /* Remember new callback */
        PyArray_StrFunction = op;
    }
}

and it looks like there is currently no way to read PyArray_ReprFunction or PyArray_StrFunction from Python.
Since my C skills are limited, maybe someone who is familiar with the numpy source code could implement such a function?

My use case is that I have my own pprint and simply reset the string_function at the end, but the correct way would be to restore the previous one:

import numpy as np
from pprint import pprint as original_pprint

def pprint(obj):
    # Temporarily replace the array repr with a short summary
    np.set_string_function(
        lambda a: f"array(shape={a.shape}, dtype={a.dtype})"
    )
    original_pprint(obj)
    np.set_string_function(None)  # Wrong if the user had previously changed the string_function
@eric-wieser
Member

eric-wieser commented Jun 7, 2018

I can think of another way to solve your problem, but it involves accessing private internals of pprint:

import pprint
import numpy as np

def _pprint_ndarray(self, object, stream, indent, allowance, context, level):
    stream.write('array(shape={})'.format(object.shape))

pprint.PrettyPrinter._dispatch[np.ndarray.__repr__] = _pprint_ndarray

Which results in:

In [43]: pprint.pprint(np.arange(100))
array(shape=(100,))

In [44]: pprint.pprint(np.arange(10))
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

The short array still prints normally, since pprint only uses the custom printer if the default repr is too long to fit within the line width.

@boeddeker
Contributor Author

Thanks for the suggestion; with that I can solve my particular problem.
But I would prefer to solve this without private objects, with a solution on the numpy side.
For example, IPython also has a pprint, and a workaround there also requires access to a private object.

Here is the code that is necessary to solve the problem I described earlier:

import pprint
import IPython.lib.pretty
import numpy as np

def my_pprint(obj):
    def _pprint_ndarray(self, object, stream, indent, allowance, context, level):
        stream.write(f'{object.__class__.__name__}(shape={object.shape})')
        
    if np.ndarray.__repr__ in pprint.PrettyPrinter._dispatch:
        restore = True
        old = pprint.PrettyPrinter._dispatch[np.ndarray.__repr__]
    else:
        restore = False
    
    pprint.PrettyPrinter._dispatch[np.ndarray.__repr__] = _pprint_ndarray
    pprint.pprint(obj)
    
    if restore:
        pprint.PrettyPrinter._dispatch[np.ndarray.__repr__] = old
    else:
        del pprint.PrettyPrinter._dispatch[np.ndarray.__repr__]
        
def my_ipy_pprint(obj):
    def _ipy_pprint_ndarray(obj, p, cycle):
        p.text(f'{obj.__class__.__name__}(shape={obj.shape})')

    if np.ndarray in IPython.lib.pretty._type_pprinters:
        restore = True
        old = IPython.lib.pretty._type_pprinters[np.ndarray]
    else:
        restore = False
    
    IPython.lib.pretty._type_pprinters[np.ndarray] = _ipy_pprint_ndarray
    IPython.lib.pretty.pprint(obj)
    
    if restore:
        IPython.lib.pretty._type_pprinters[np.ndarray] = old
    else:
        del IPython.lib.pretty._type_pprinters[np.ndarray]
    
def test_print(obj):
    print('\npprint:')
    pprint.pprint(obj)
    print('\nmy_pprint:')
    my_pprint(obj)

    print('\nipy_pprint:')
    IPython.lib.pretty.pprint(obj)
    print('\nmy_ipy_pprint:')
    my_ipy_pprint(obj)
    
    print()

test_print(np.arange(10))
test_print(np.arange(30))
test_print({'a': np.arange(10)})
test_print({'a': np.arange(30)})

The output:

pprint:
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

my_pprint:
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

ipy_pprint:
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

my_ipy_pprint:
ndarray(shape=(10,))


pprint:
array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16,
       17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29])

my_pprint:
ndarray(shape=(30,))

ipy_pprint:
array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16,
       17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29])

my_ipy_pprint:
ndarray(shape=(30,))


pprint:
{'a': array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])}

my_pprint:
{'a': array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])}

ipy_pprint:
{'a': array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])}

my_ipy_pprint:
{'a': ndarray(shape=(10,))}


pprint:
{'a': array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16,
       17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29])}

my_pprint:
{'a': ndarray(shape=(30,))}

ipy_pprint:
{'a': array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16,
        17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29])}

my_ipy_pprint:
{'a': ndarray(shape=(30,))}

@eric-wieser
Member

eric-wieser commented Jun 7, 2018

You can do better than patching pprint, if you want a temporarily different printer:

class MyPrettyPrinter(pprint.PrettyPrinter):
    _dispatch = collections.ChainMap({}, pprint.PrettyPrinter._dispatch)
    def _pprint_ndarray(...): ...
    _dispatch[np.ndarray.__repr__] = _pprint_ndarray

my_pprint = MyPrettyPrinter().pprint

This is better than either your solution above or one involving get_string_function, as it touches no global state, so is threadsafe.

I'm not saying that get_string_function isn't a sensible addition, I just don't think it solves your problem well.
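
The subclass idea above leaves the printer function elided. A minimal runnable sketch follows, generalized so it works for any class; the names `make_compact_printer`, `summarize`, and `CompactPrinter` are hypothetical and introduced only for illustration. It still relies on the private `PrettyPrinter._dispatch` table, so it may break between Python versions.

```python
import collections
import pprint


def make_compact_printer(cls, summarize):
    """Build a PrettyPrinter subclass that prints `cls` instances via `summarize`.

    Relies on the private `PrettyPrinter._dispatch` table, which may change
    between Python versions.
    """

    def _pprint_compact(self, object, stream, indent, allowance, context, level):
        # pprint calls this only when repr(object) does not fit in the width.
        stream.write(summarize(object))

    class CompactPrinter(pprint.PrettyPrinter):
        # Writes go to the empty first mapping, so the base table is untouched.
        _dispatch = collections.ChainMap({}, pprint.PrettyPrinter._dispatch)

    CompactPrinter._dispatch[cls.__repr__] = _pprint_compact
    return CompactPrinter
```

For the case in this thread, one would presumably call `make_compact_printer(np.ndarray, lambda a: f"array(shape={a.shape}, dtype={a.dtype})")` and then use `.pprint` on an instance of the returned class. No global state is modified, so other users of `pprint` are unaffected.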

@boeddeker
Contributor Author

You are right, your suggestion is a really nice solution for my particular problem and solves it better than a solution with get_string_function (except the access of a private member).

Nevertheless, it would be nice to be able to set the string_function with a context manager. That way it would apply to all printing without a specific implementation for each print function (e.g. Python's print, pprint, IPython's pprint, ...).

@eric-wieser
Member

eric-wieser commented Jun 8, 2018

> it would be nice to set string_function with a context manager

Unfortunately that approach is neither thread-safe nor asyncio safe (#9444)

But I suppose we already have that problem elsewhere in numpy, so that's not a good argument against adding the function

@boeddeker
Contributor Author

I know that it is not thread-safe, but print should in general be treated as not thread-safe,
i.e. do not use it in asyncio or threads, or if you do, expect garbled output (e.g. when using a threaded progress bar).

Also, set_string_function itself is not thread-safe.

@eric-wieser
Member

> but print should be in general handled as not thread safe

Good point - but repr / pformat should be threadsafe

@Carreau
Contributor

Carreau commented Mar 20, 2023

I would suggest additionally providing a context manager; perhaps set_string_function could even return a context manager that restores the previous string_function on exit.
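
Until something like this exists in numpy, the idea can be approximated in pure Python by doing the bookkeeping around the setter. The sketch below is not numpy API: `TrackedStringFunction` and `_Restore` are hypothetical names, and the wrapper only knows about functions installed through it, so a change made by calling the underlying setter directly goes unnoticed. It also modifies global state and is therefore no more thread-safe than the setter itself.

```python
class TrackedStringFunction:
    """Python-level bookkeeping around a setter that has no matching getter."""

    def __init__(self, setter):
        # Assumption: `setter` has the signature setter(f, repr=True),
        # like np.set_string_function.
        self._setter = setter
        self._current = {True: None, False: None}

    def get(self, repr=True):
        """Return the function last installed through this wrapper (None = default)."""
        return self._current[bool(repr)]

    def set(self, f, repr=True):
        """Install `f` immediately; the returned object restores the previous one on exit."""
        key = bool(repr)
        previous = self._current[key]
        self._setter(f, repr=key)
        self._current[key] = f
        return _Restore(self, previous, key)


class _Restore:
    """Context manager that reinstalls the previously tracked function."""

    def __init__(self, owner, previous, key):
        self._owner = owner
        self._previous = previous
        self._key = key

    def __enter__(self):
        return self

    def __exit__(self, *exc_info):
        self._owner._setter(self._previous, repr=self._key)
        self._owner._current[self._key] = self._previous
        return False  # never swallow exceptions
```

Assuming a numpy version that still provides `np.set_string_function`, usage would look like `tracker = TrackedStringFunction(np.set_string_function)` followed by `with tracker.set(my_short_repr): print(obj)`. Because `set` applies the function right away, it can also be called without `with`, which keeps the plain setter behaviour.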

3 participants