Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor quantity_input #10655

Closed
wants to merge 3 commits into from
Closed

Conversation

nstarman
Copy link
Member

@nstarman nstarman commented Aug 14, 2020

Description

This is a big set of changes, PLEASE HELP.

This DRAFT pull request is to coordinate a number of PRs improving quantity_input and unit-aware static typing:

The goals are:

  1. Use quantity_input with statically typed code
  2. Allow SpecificTypeQuantity subclasses, like Distance or Angle, to be used instead of Quantity in the input verification of quantity_input
  3. Speed up calling quantity_input-wrapped functions by pre-computing as much of the conversion mechanics at wrapper initialization.
  4. a) Allow inputs to be cast to specific units (or physical types), not just verified
    b) post-cast, to pass input to function as .to_value() so that functions can work with ndarrays guaranteed to be in the correct units
  5. Expose all of these improvements to the user in a python-standard way, so that they can be changed even after the function has been initialized.

Select Ancillary Benefits:
See whole comment for details

  • unit-aware static type hints can be used independently of quantity_input
  • UnitSpec classes can be used independently of quantity_input to provide specific unit/physical-type validation/conversion/etc.
  • User-defined UnitSpec subclasses are automatically registered and usable by UnitSpec and quantity_input

Summary of each PR

  1. add unit-aware quantity annotations. #10662 : Introduce unit aware annotations.
  • set Annotated[Quantity, <unit>] as official unit-aware Quantity annotation
  • add __class_getitem__ method to Quantity to shortcut creating.
    Caveat: py3.7+
  • create stand-in Annotated class. Not valid for static typing. Run-time only. But same API as static implementation.
  • work with quantity_input
  1. Rewrite quantity_input internals so that all information is stored in wrapper.__annotations__, converting valid inputs to Quantity[<unit or pt>] from PR#1. This is still not valid static typing, but it will trivially solve Allow return annotations other than Quantity #10156 by having a) non-unit / physicaltype return annotations ignored and b) allowing quantity_input(return=None) to store None in wrapper.__annotations__ instead of Quantity[<unit or pt>]

  2. UnitSpec to solve Decorator for converting quantity to arrays in specific units? #7368 and multiple annotations

  • Extend Unit to your UnitSpec so that quantity_input becomes more powerful (addressing Decorator for converting quantity to arrays in specific units? #7368)
  • Change quantity_input so that it uses the actual class, making Angle and Distance possible (maybe will be easiest if done together with (2)).
    have quantity_input convert this to Annotated[Quantity, UnitSpec(<unit>), other] for internal use.
    Add method to Quantity to create UnitSpec annotations
  1. switch to valid static typing, having had time to discuss this more broadly with the rest of Astropy.




Overall Summary

Subject to change.

Static Typing

Current State:

There is no means to create unit-aware type annotations. For instance, the annotations used by quantity_input, like u.km or distance, are not compliant with static typing since the inputs are of (sub)type Quantity.

Proposed Change:

This problem can be solved by the typing.Annotated class, from this PEP which allows valid static type annotations to include metadata. Now, unit-aware type annotations will be of the form Annotated[Quantity, <unit>, other_annotations], where the unit must be the first annotation.

A __class_getitem__ method will be added to Quantity to shortcut creating unit aware annotations.
Quantity[<unit>, other_annotations] == Annotated[Quantity, <unit>, other_annotation].
As a class method, this will similarly work for all subclasses of Quantity.

quantity_input will accept unit-aware quantity annotations of this type. Both the type (eg Quantity) and unit must be valid inputs.

Implementation Details:

typing_extensions is the official backport for Annotated and works on any py3.5+. Until py3.9+ is the minimum python version, when Annotated is in the built-in typing library, we use the typing_extension backport. Until the typing_extension backport is added as an astropy dependency we emulate API for unsupported features.


Pre-Computation and User Accessibility

Current State:

quantity_input uses function annotations to determine the required argument units and since it also uses functools.wraps those annotations are propagated to the wrapper returned by quantity_input. However, other inputs to quantity_input, such as keyword arguments are not similarly stored.

Proposed Change:

The wrapper's __annotations__ will store all unit/physical_type information, merging the information provided from the keyword arguments with the original function's __annotations__. The original function will be unchanged and accessible in the __wrapped__ attribute. Since decorators can be applied as functions, not just with the "pie" syntax, this will allow multiple different versions of the same function to be created with different unit specifications. Not only that, but storing all information in __annotations__ means that the desired units are pre-computed, increasing function call speed, and are also editable by knowledgeable users in a python-standard format.


UnitSpec

For quantity_input, the Annotated[Quantity, <unit>, other_annotations] is not specific enough, since it only annotates which argument should be processed, not what to actually do with them. We introduce a UnitSpec (abbreviation for "unit specification") class to do both.

quantity_input will accept the standard Annotated[Quantity, <unit>, other_annotations], but also accept the more detailed annotation Annotated[Quantity, UnitSpec(<unit>), other_annotations]. Internally, any of non-unitspec annotations will be converted.

The UnitSpec.__call__ will control how quantity_input interacts with each input — if it just verifies unit / physical_type, or casts to unit / physical_type, or does to_value, etc. UnitSpec will enable different treatment of each input: for a function func(x, y), x might just be verified, but y = y.to_value(u.km).

UnitSpec will have the following pre-specified options for the __call__ output, set with the "action" attribute.

  • "validate" : to only validate physical type
  • "convert" : to convert quantities to specific units / physical type
  • "to_value" : takes .to_value after conversion
  • "from_value" : reassigns units if none present & verifies units if present.

The following subclasses of UnitSpec will also be implemented:

  • UnitSpecValidate : which can only validate physical type
  • UnitSpecConvert : which will convert quantities to specific units / physical type
  • UnitSpecValue : a subclass of UnitSpecConvert which takes .to_value after conversion
  • UnitSpecAssign : which reassigns units if none present & verifies units if present.

Using __init_subclass__ these and any user-created subclasses of UnitSpec will be added to a registry. The "action" kwarg of UnitSpec must be a key in this registry and will be used by UnitSpec to set the behavior of UnitSpec.__call__. In this way UnitSpec can stand in for any of its subclasses, which are only really needed to make a behavior immutable.

UnitSpec will drastically simplify a few things in quantity_input, and potentially elsewhere.

For instance, in quantity_input, all that's needed to check a multiply-annotated argument and then perform the validation is

if isAnnotated(antn):
    md = [isinstance(x, UnitSpecBase) for x in antn.__metadata__]
    if not any(md):  # skip to next argument
        continue
    elif sum(md) > 1 :  # more than one UnitSpec
        raise Exception()

    unitspec = antn.__metadata__[np.nonzero(md)]
    if name == "return":  # save the return casting until actually call function.
        return_spec = unitspec
    else:
        ba.arguments[name] = unitspec(ba.arguments[name])

quantity_input implementation

Order of precedence of unit specifications:

  1. Kwargs into quantity_input:
    quantity_input(x=u.hour)(foo).
    This allows quantity_input to wrap functions from external libraries without modifying the function.
  2. function annotations
    So that calling quantity_input as a function overrides the default unit specifications.
  3. units from argument default values
    To respect type hints

Argument-specific behavior in quantity_input is allowed by mixing UnitSpec subclasses in wrapper.__annotations__.

For simple means of ensuring uniform UnitSpec behavior, there will be a few subclasses of QuantityInput which convert any user-created unit-aware Quantity annotations to the appropriate UnitSpec subclasses. The complexities of the wrapper construction can happen in super and only ever need to be written once. The UnitSpec auto-registry very easily allows user-created UnitSpec and therefore user-created QuantityInput subclasses.

  • ValidateQuantityInput : converts UnitSpec to UnitSpecValidate
  • ConvertQuantityInput : converts UnitSpec to UnitSpecConvert
  • DeQuantityInput : converts UnitSpec to UnitSpecValue

For example, to make a to_value flavor of quantity_input is a simple for-loop replacing UnitSpec with UnitSpecValue, the subclass that performs conversion and takes the value. While this is the default behavior for UnitSpec and the subclass used by UnitSpec(...action = 'value'), the subclass is immutable and more explicit.

class DeQuantityInput(QuantityInput):
    """QuantityInput, with ``to_value(unit)`` on all specified inputs."""

    def __call__(self, wrapped_function):

        # use all the machinery from superclass
        wrapper = super().__call__(wrapped_function)

        # but now need to ensure that everything is correct UnitSpec
        _antns_ = dict()
        for name, antn in wrapper.__annotations__.items():

            if isAnnotated(antn):
                unitspec = antn.__metadata__[0]
                if isinstance(unitspec, UnitSpecBase):
                    if name == "return":
                        _antns_[name] = UnitSpec(unitspec, action="from_value")
                    else:
                        _antns_[name] = UnitSpecValue(unitspec)

        wrapper.__annotations__.update(_antns_)

        return wrapper

Examples

Assuming the following imports

import typing_extensions as T
from typing_extensions import Annotated
import astropy.units as u
from astropy.units import quantity_input, Quantity, UnitSpec
import astropy.coordinates as coord

Example 1)

@quantity_input(v="speed")
def foo(x: u.m, v: u.Quantity, t=2 * u.s, y: T.Optional[u.Quantity[u.km]]=None, flag: T.Optional[str]=None) -> u.Quantity["distance"]:
     return x + v * t

# annotations
foo.__annotations__

{
"x": Annotated[Quantity, UnitSpec(u.m | validate Quantity)],
"v": Annotated[Quantity, UnitSpec("speed" | validate Quantity)],
"t": Annotated[Quantity, UnitSpec(u.s | validate Quantity)],
"y": T.Union[Annotated[Quantity, UnitSpec(u.km | validate Quantity)], NoneType]
"flag": T.Union[str, NoneType],
"return": Annotated[Quantity, UnitSpec("distance" | validate Quantity)]
}

# and the original function
foo.__wrapped__.__annotations__

{
"x": u.m, "v": Quantity, "flag": T.Optional[str],
"y": T.Union[Annotated[Quantity, u.km], NoneType], "return": "distance"
}

To force x input to be Distance or subclass

foo.__annotations__["x"].__origin__ = coord.Distance
foo.__annotations__["x"].__metadata__[0].dtype = coord.Distance
foo.__annotations__["x"]

Annotated[Distance, UnitSpec(u.m | validate Distance)]

To change t to action "convert"

foo.__annotations__["t"].__metadata_[0].action = "convert"
foo.__annotations__["t"]

Annotated[Quantity, UnitSpec(u.s | convert Quantity)]

To make a to_value flavor of quantity_input is a simple for-loop replacing UnitSpec with UnitSpecValue, the subclass that performs conversion and takes the value. While this is the default behavior for UnitSpec and the subclass used by UnitSpec(...action = 'value'), the subclass is immutable and more explicit.

class DeQuantityInput(QuantityInput):
    """QuantityInput, with ``to_value(unit)`` on all specified inputs."""

    def __call__(self, wrapped_function):

        # use all the machinery from superclass
        wrapper = super().__call__(wrapped_function)

        # but now need to ensure that everything is correct UnitSpec
        _antns_ = dict()
        for name, antn in wrapper.__annotations__.items():

            if isAnnotated(antn):
                unitspec = antn.__metadata__[0]
                if isinstance(unitspec, UnitSpecBase):
                    if name == "return":
                        _antns_[name] = UnitSpec(unitspec, action="from_value")
                    else:
                        _antns_[name] = UnitSpecValue(unitspec)

        wrapper.__annotations__.update(_antns_)

        return wrapper

Fixes #10156, #5606, #7368

@astropy-bot astropy-bot bot added the units label Aug 14, 2020
@pep8speaks
Copy link

pep8speaks commented Aug 14, 2020

Hello @nstarman 👋! It looks like you've made some changes in your pull request, so I've checked the code again for style.

There are no PEP8 style issues with this pull request - thanks! 🎉

Comment last updated at 2022-01-03 11:56:11 UTC

@mhvk
Copy link
Contributor

mhvk commented Aug 14, 2020

I like very much where this is going!

But as with other rather big PRs, I worry this is of a size where it will be difficult (at least for me) to find time to properly review... (I'm supposed to work on resubmission of a paper right now!).

Of course, I want to be sure not to cause too much work either, but might it be an idea to try to break this down into a couple of steps, which we can judge separately (and ideally each are useful)? E.g.,

  1. Introduce Quantity.__class_getitem__ and the basic Quantity[Unit()] type, which is then used inside quantity_input, but otherwise as yet without any other change to the code, and without any worry about other annotations for quantity inputs (which no current user of quantity_input can be using), except allowing to ignore the annotation for the return type, thus fixing Allow return annotations other than Quantity #10156.
  2. Extend Unit to your UnitSpec so that quantity_input becomes more powerful (addressing Decorator for converting quantity to arrays in specific units? #7368) (or perhaps going the subclassing of quantity_input route; it is not quite obvious to me UnitSpec is truly needed - it seems a bit contrary to the general typing idea).
  3. Change quantity_input so that it uses the actual class, making Angle and Distance possible (maybe will be easiest if done together with (2)).
  4. Worry about dealing with other annotations that might be present on the function (i.e., introduce dependency on typing_extensions only here).
  5. Annotate the code added.

An advantage of this order is that we go first towards enhancing quantity_input (for which review by @Cadair and me will suffice), and leave adding a new run-time dependency and general annotation for later (new dependencies are always somewhat contentious; and general annotation is something for which I'd definitely like more general input of the astropy developers since it may include choices that affect other parts too; I do know there is interest in typing).

What do you think? I hope that the above would not make much more work... If we go this route, it might be easiest to open a new PR, and leave this one to give an overall idea of where this is going (your description on top is very helpful!).

p.s. Final friendly request: if possible, try to minimize code style changes; it makes review easier.

@nstarman
Copy link
Member Author

nstarman commented Aug 16, 2020

Thanks for the detailed response. I agree that this PR might be best split into a few pieces. It kind of ballooned as I was writing the description. While I have much of the code written, as they say: the first 80% takes only 20% of the time. Getting that last 20% will be time consuming.

I've been thinking a while about the paradigm of coordinating everything through annotated typing and if there were an elegant alternative. Since we have the essential long-term goal of allowing static typing, I do not think there is any reasonable alternative. Therefore, I think the best incremental approach is to mimic much of the static typing approach, without actually producing or using static typing. Specifically, instead of using the typing_extensions.Annotated, which requires typing_extensions as a dependency, we just copy / exactly mimic that class for use in Quantity.__class_getitem__. When Astropy supports static typing all we need to do is deprecate our Annotated class in favor of the valid static type version.

My proposed series of PRs is:

  1. "Introduce Quantity.__class_getitem__ and the basic Quantity[Unit()] type, which is then used inside quantity_input, but otherwise as yet without any other change to the code, and without any worry about other annotations for quantity inputs (which no current user of quantity_input can be using)"

  2. Rewrite quantity_input internals so that all information is stored in wrapper.__annotations__, converting valid inputs to Quantity[<unit or pt>] from PR#1. This is still not valid static typing, but it will trivially solve Allow return annotations other than Quantity #10156 by having a) non-unit / physicaltype return annotations ignored and b) allowing quantity_input(return=None) to store None in wrapper.__annotations__ instead of Quantity[<unit or pt>]

  3. UnitSpec to solve Decorator for converting quantity to arrays in specific units? #7368 and multiple annotations

I think a good fix to the question of UnitSpec is to have Quantity[<unit or pt>] create the more minimal Annotated[Quantity, <unit>, other] and have quantity_input convert this to Annotated[Quantity, UnitSpec(<unit>), other] for internal use.

I would suggest, though this is optional, to add another method to Quantity that directly creates the UnitSpec version. See below for reasons.

  1. switch to valid static typing, having had time to discuss this more broadly with the rest of Astropy.

How are these types of multiple PRs generally coordinated?


WIth regards to the necessity of UnitSpec, no it is not strictly necessary. However, it drastically simplifies a few things.

For instance, in quantity_input, all that's needed to check a multiply-annotated argument and then perform the validation is

if isAnnotated(antn):
    md = [isinstance(x, UnitSpecBase) for x in antn.__metadata__]
    if not any(md):  # skip to next argument
        continue
    elif sum(md) > 1 :  # more than one UnitSpec
        raise Exception()

    unitspec = antn.__metadata__[np.nonzero(md)]
    if name == "return":  # save the return casting until actually call function.
        return_spec = unitspec
    else:
        ba.arguments[name] = unitspec(ba.arguments[name])

Also, to subclass QuantityInput for a to_value flavor, one need only change the UnitSpec. The complexities of the wrapper construction can happen in super and only ever need to be written once. The UnitSpec auto-registry very easily allows user-created UnitSpec and therefore user-created QuantityInput subclasses.

For example, to make the to_value flavor of quantity_input is a simple for-loop replacing UnitSpec with UnitSpecValidate, the subclass that only performs unit / physical type validation. While this is the default behavior for UnitSpec and the subclass used by UnitSpec(...action = 'validate'), the subclass is immutable and more explicit.

class DeQuantityInput(QuantityInput):
    """QuantityInput, with ``to_value(unit)`` on all specified inputs."""

    def __call__(self, wrapped_function):

        # use all the machinery from superclass
        wrapper = super().__call__(wrapped_function)

        # but now need to ensure that everything is correct UnitSpec
        _antns_ = dict()
        for name, antn in wrapper.__annotations__.items():

            if isAnnotated(antn):
                unitspec = antn.__metadata__[0]
                if isinstance(unitspec, UnitSpecBase):
                    if name == "return":
                        _antns_[name] = UnitSpec(unitspec, action="from_value")
                    else:
                        _antns_[name] = UnitSpecValue(unitspec)

        wrapper.__annotations__.update(_antns_)

        return wrapper

Furthermore, argument-specific behavior in quantity_input is allowed by mixing UnitSpec subclasses in wrapper.__annotations__.

@mhvk
Copy link
Contributor

mhvk commented Aug 16, 2020

@nstarman - excellent summary, it looks like we are on the same page for the schedule. The arguments for UnitSpec also make sense, as it neatly separates the analysis of arguments (quantity_input) and what to actually do with them (which in principle might even be different for different arguments).

@nstarman
Copy link
Member Author

nstarman commented Aug 17, 2020

PR #10662 might be delayed to Astropy 5.0, when py3.6 EOLs.
I don't think this changes the direction of the following PRs as Quantity[<unit>] was a convenience method to create Annotated[Quantity, <unit>]. It will be a little less convenient for the user to write out, but certainly not prohibitive.

@pllim
Copy link
Member

pllim commented Aug 17, 2020

Well, the delay is up for discussions. 😬 I am just thinking that if we can avoid much custom code if we wait for bump to Python 3.7, there is no reason not to wait? Anyways, other ideas welcome!

@nstarman
Copy link
Member Author

nstarman commented Aug 18, 2020

I completely agree about avoiding a metaclass for Quantity just to accommodate Python 3.6. It's especially not a problem as the proposed Quantity[<unit>] is only a convenience method.

@Cadair
Copy link
Member

Cadair commented Aug 18, 2020

This looks awesome, I will try and find enough time to really dive into it soon 🚀

@pllim
Copy link
Member

pllim commented Oct 22, 2020

Hello, given the draft status of this PR and the impending feature freeze tomorrow, I am moving the milestone. Thank you for your understanding.

@pllim pllim modified the milestones: v4.2, v4.3 Oct 22, 2020
@nstarman
Copy link
Member Author

nstarman commented Oct 22, 2020 via email

@pllim
Copy link
Member

pllim commented Oct 22, 2020

Thanks for your patience, @nstarman ! Please hold off on moving the change log until new section exists. 😄

@mhvk mhvk mentioned this pull request Apr 19, 2021
10 tasks
@nstarman
Copy link
Member Author

@pllim, plz push to next milestone :)

@mhvk mhvk modified the milestones: v4.3, v5.0 Apr 19, 2021
@adrn
Copy link
Member

adrn commented Sep 27, 2021

It's that time again @nstarman :) -- any sense of what the next steps here would be?

@nstarman
Copy link
Member Author

nstarman commented Sep 29, 2021

@adrn It's mostly about figuring out what todo in #10662. NumPy has plans that complicate that PR somewhat. Once that's in I have most of this PR finished.

Edit: #10662 is ready for review. Once merged, I can return to this.

Signed-off-by: Nathaniel Starkman (@nstarman) <nstarkman@protonmail.com>
Signed-off-by: Nathaniel Starkman (@nstarman) <nstarkman@protonmail.com>

style fixes

Signed-off-by: Nathaniel Starkman (@nstarman) <nstarkman@protonmail.com>
Signed-off-by: Nathaniel Starkman (@nstarman) <nstarkman@protonmail.com>

style fix

Signed-off-by: Nathaniel Starkman (@nstarman) <nstarkman@protonmail.com>
@github-actions
Copy link

github-actions bot commented Jun 3, 2022

Hi humans 👋 - this pull request hasn't had any new commits for approximately 4 months. I plan to close this in 30 days if the pull request doesn't have any new commits by then.

In lieu of a stalled pull request, please consider closing this and open an issue instead if a reminder is needed to revisit in the future. Maintainers may also choose to add keep-open label to keep this PR open but it is discouraged unless absolutely necessary.

If this PR still needs to be reviewed, as an author, you can rebase it to reset the clock.

If you believe I commented on this pull request incorrectly, please report this here.

@nstarman
Copy link
Member Author

nstarman commented Jun 4, 2022

Sigh. I have a working implementation, but I'm unhappy with the large code burden it adds.

@github-actions
Copy link

github-actions bot commented Jul 4, 2022

I'm going to close this pull request as per my previous message. If you think what is being added/fixed here is still important, please remember to open an issue to keep track of it. Thanks!

If this is the first time I am commenting on this issue, or if you believe I closed this issue incorrectly, please report this here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Allow return annotations other than Quantity
6 participants