Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cosmo: process array-like inputs #12169

Merged
merged 1 commit into from
Sep 24, 2021
Merged

Conversation

nstarman
Copy link
Member

@nstarman nstarman commented Sep 15, 2021

Summary:

The preferred input types to Cosmology methods are 1) numbers (e.g. float) and 2) ndarray / Quantity / ndarray ducktypes.
All other input types are converted to an ndarray.

Also did 1 + z -> z + 1 to avoid int.__add__ failure triggering int.__radd__. Now it starts with the passing ndarray.__add__.

Motivation:
speed, simplicity, supporting ducktypes, and eventual vectorization.

Checklist for package maintainer(s)

This checklist is meant to remind the package maintainer(s) who will review this pull request of some common things to look for. This list is not exhaustive.

  • Do the proposed changes actually accomplish desired goals?
  • Do the proposed changes follow the Astropy coding guidelines?
  • Are tests added/updated as required? If so, do they follow the Astropy testing guidelines?
  • Are docs added/updated as required? If so, do they follow the Astropy documentation guidelines?
  • Is rebase and/or squash necessary? If so, please provide the author with appropriate instructions. Also see "When to rebase and squash commits".
  • Did the CI pass? If no, are the failures related? If you need to run daily and weekly cron jobs as part of the PR, please apply the Extra CI label.
  • Is a change log needed? If yes, did the change log check pass? If no, add the no-changelog-entry-needed label. If this is a manual backport, use the skip-changelog-checks label unless special changelog handling is necessary.
  • Is a milestone set? Milestone must be set but astropy-bot check might be missing; do not let the green checkmark fool you.
  • At the time of adding the milestone, if the milestone set requires a backport to release branch(es), apply the appropriate backport-X.Y.x label(s) before merge.

@nstarman nstarman added this to the v5.0 milestone Sep 15, 2021
@nstarman nstarman added this to In progress in Cosmology, the Expansion via automation Sep 15, 2021
@nstarman nstarman changed the title Cosmo not arraylike Cosmo restrict input types Sep 15, 2021
@nstarman nstarman force-pushed the cosmo_not_arraylike branch 2 times, most recently from 37bf309 to 400a821 Compare September 15, 2021 04:05
Copy link
Contributor

@mhvk mhvk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nstarman - I think this looks good. Some inline comments, but perhaps also a suggestion that may make this less of a breaking change (I'm a bit worried by how many tests you had to adjust...): I think that almost everywhere, one uses z+1.0. If you were to make that into a little function, one could keep the iterables. E.g.,

def zp1(z):
    try:
        return z + 1.0
    except TypeError:  # Maybe list?
        return u.Quantity(z, redshift_unit).value + 1.0

This doesn't cost more than the present isiterable check for float, and for everything else the extra time for the function call is negligible. Of course, in principle this also allows to add a deprecation warning if you feel strongly one should end up being strict.

In the meantime, I will approve, as I think this is really your call (and I'll be out of contact for a bit starting tomorrow).

@@ -737,8 +695,7 @@ def de_density_scale(self, z):
# The integral we actually use (the one given in Linder)
# is rewritten in terms of z, so looks slightly different than the
# one in the documentation string, but it's the same thing.

if isiterable(z):
if np.shape(z):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I'd just keep this as isiterable - np.shape is 4 times slower for non-ndarray. Alternatively, if not isinstance(z, float) and np.shape(z):?

Copy link
Member Author

@nstarman nstarman Sep 15, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks.
Did some experimenting, getattr(z, "shape", ())) is faster yet.
numbers.Number and scalar ndarray => False
array => True

astropy/cosmology/flrw.py Show resolved Hide resolved
astropy/cosmology/flrw.py Outdated Show resolved Hide resolved
@@ -1151,8 +1085,10 @@ def _integral_comoving_distance_z1z2(self, z1, z2):
d : `~astropy.units.Quantity` ['length']
Comoving distance in Mpc between each input redshift.
"""
z1 = z1 if not hasattr(z1, "unit") else u.Quantity(z1, cu.redshift).value
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again a bit confused why this is needed.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are needed because of ufunc fun. They fail on Quantities, so must be degraded to numpy arrays.
This is more a bug fix from the redshift units PR.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When #11893 and followup are in, I think this can be avoided.

astropy/cosmology/flrw.py Outdated Show resolved Hide resolved
astropy/cosmology/flrw.py Outdated Show resolved Hide resolved
Cosmology, the Expansion automation moved this from In progress to Reviewer approved Sep 15, 2021
@nstarman nstarman force-pushed the cosmo_not_arraylike branch 2 times, most recently from 33eb9ae to a68ea84 Compare September 15, 2021 19:42
@nstarman
Copy link
Member Author

nstarman commented Sep 15, 2021

@nstarman - I think this looks good. Some inline comments, but perhaps also a suggestion that may make this less of a breaking change (I'm a bit worried by how many tests you had to adjust...):

Thanks for the review.
About the tests, all of them where just changing list to numpy.array (or Quantity to add some tests of redshift).
This PR is less of a breaking change than I initially thought. Fixing any break is as easy as np.array(<list>)

I think that almost everywhere, one uses z+1.0. If you were to make that into a little function, one could keep the iterables. E.g.,
This doesn't cost more than the present isiterable check for float, and for everything else the extra time for the function call is negligible. Of course, in principle this also allows to add a deprecation warning if you feel strongly one should end up being strict.

I like this idea. Lists and tuples will be deprecated as of v5.0 and removed in v5.2.

This means I will revert all the changes to the test file (and put a deprecation warning test beforehand).

@mhvk
Copy link
Contributor

mhvk commented Sep 15, 2021

If we go the zp1(z) route, we can in principle also just not deprecate at all. While in principle I like just ducktyping, in practice numpy has set the precedent that lists and other iterables are OK, and the naive expectation that code will do something like z = u.Quantity(z, cu.redshift) would also allow a list.

Anyway, as said before, this really is up to you! One option is a FutureWarning or PendingDeprecationWarning, which I think is for "we're thinking about deprecating this, let us know if you think it is a bad idea".

@astrofrog
Copy link
Member

astrofrog commented Sep 16, 2021

I'm somewhat concerned about the breaking changes here - even just having to give an array instead of a list is going to frustrate a lot of users. Think about how painful it has been for users to 'just' have to adjust all their scripts to add parentheses in print statements!

What is the motivation for the breaking change? Cosmology is one of the oldest sub packages and we have marked it as mature for many versions here:

https://docs.astropy.org/en/stable/stability.html

We have to have really good reasons at this point to break things such as passing a list of values to the cosmology functions which seems like something people might do especially if we had it in our docs and tests ourselves? It doesn't strike me as a rare corner case.

Another way to put it is, what is the downside of doing np.asarray or the Quantity approach mentioned by @mhvk inside the functions? Does it significantly hurt performance? Does it complicate the code a lot?

If there is a benefit to the user from this change, we should be sure to mention it in the changelog. In any case, we should definitely have a deprecation period rather than breaking this in 5.0.

@nstarman
Copy link
Member Author

nstarman commented Sep 18, 2021

@astrofrog. Thanks for the input. As @mhvk suggested, I'm implementing this with a function

from numbers import Number
import astropy.units as u
import astropy.cosmology.units as cu

def aszarr(z):
    if isinstance(z, Number) or hasattr(z, "shape"):  # scalar or numpy duck-type
        return z
    # not a Number or ndarray-like
    return u.Quantity(z, cu.redshift).value

def method(self, z):
    z = aszarr(z)
   ...

Now the Cosmology methods can always assume that operations like z + 1 won't error and I can simplify and speed up the code in a number of places. This aszarr check is faster than isiterable and allows for NumPy duck-typing, so cupy and dask can (maybe) work. That is a longer term goal and this a first step.

As you say, API stability is important so I'm not going to deprecate lists, etc. in this PR or even raise a warning. Another long-term goal is to speed up all of cosmology by some c backend (either numpy gufunc stuff and/or mypyc transpiling). This move might require deprecating some input types, but until I'm certain and have a PR for this major speedup, I'll refrain from deprecating any input types.

Copy link
Contributor

@mhvk mhvk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess the zp1() function is not there yet, so I looked at this too early. A few somewhat random comments on the way...

astropy/coordinates/distances.py Outdated Show resolved Hide resolved
astropy/cosmology/flrw.py Outdated Show resolved Hide resolved
if np.isscalar(z):
return prefac * self._Neff
else:
if getattr(z, "shape", ()):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I realize this was here before, but why insist on returning something with the same shape? Broadcasting will take care of it anyway. If you do want it, maybe np.full(np.shape(z), prefac * self._Neff)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I completely agree.
I'm working on a followup PR to address all the output shapes / types / etc.
For here and now, I'm trying not to change any outputs, only how the inputs are handled.

@nstarman nstarman changed the title Cosmo restrict input types Cosmo: process array-like inputs Sep 23, 2021
Copy link
Contributor

@mhvk mhvk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good! My only question is in-line, but it is small, so happy with whatever you decide on that.

@@ -166,7 +167,7 @@ They also accept arrays of redshifts:

.. doctest-requires:: scipy

>>> cosmo.age([0.5, 1, 1.5]).value # doctest: +FLOAT_CMP
>>> cosmo.age(np.array([0.5, 1, 1.5])).value # doctest: +FLOAT_CMP
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if you still want to use np.array here - I guess it is best practice, but perhaps then encourage using Quantity and write as [0.5, 1, 1.5] * cu.redshift?

Separately, I know it was already here, but why the .value here? Would seem better to show the Quantity output.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SGTM.
Pushed.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Passed.
Merging.

Signed-off-by: Nathaniel Starkman (@nstarman) <nstarkman@protonmail.com>
@nstarman nstarman added the 💤 merge-when-ci-passes Do not use: We have auto-merge option now. label Sep 24, 2021
@nstarman nstarman merged commit cc4e648 into astropy:main Sep 24, 2021
Cosmology, the Expansion automation moved this from Reviewer approved to Done Sep 24, 2021
@nstarman nstarman deleted the cosmo_not_arraylike branch September 24, 2021 19:59
@nstarman nstarman mentioned this pull request Sep 28, 2021
9 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

None yet

3 participants