Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add arguments when subcommands are used #89

Open
wookayin opened this issue Nov 23, 2023 · 10 comments
Open

Add arguments when subcommands are used #89

wookayin opened this issue Nov 23, 2023 · 10 comments

Comments

@wookayin
Copy link

wookayin commented Nov 23, 2023

Example: https://brentyi.github.io/tyro/examples/02_nesting/05_subcommands_func/

I would like to have a global option to the program, not options to specific subcommands, say app.py --verbose, app.py --version, or app.py --config foo.bar whilest subcommands app.py checkout --branch=... app.py commit --message ... are still working.

What I tried:

 from typing import Union, Annotated, Any
 import tyro
+import dataclasses
 
 def checkout(branch: str) -> None:
     """Check out a branch."""
     print(f"{branch=}")
 
 
 def commit(message: str, all: bool = False) -> None:
     """Make a commit."""
     print(f"{message=} {all=}")
 
+@dataclasses.dataclass
+class Arg:
+    foo: int = 1
+

 if __name__ == "__main__":
     tyro.cli(
         Union[
             Annotated[
                 Any,
                 tyro.conf.subcommand(name="checkout", constructor=checkout),
             ],
             Annotated[
                 Any,
                 tyro.conf.subcommand(name="commit", constructor=commit),
             ],
+            Arg,
        ]
    )

But the arguments defined in the dataclass Arg is always unrecognized Unrecognized arguments: --foo when subcommands are used.

@wookayin
Copy link
Author

wookayin commented Nov 23, 2023

I saw #90, thanks for the fix, but it recognizes Arg as a subcommand rather than arguments (i.e. no --foo exposed). Perhaps the code snippet/semantics above is to create a new subcommand? even when it is not annotated with tyro.conf.subcommand? From the semantics of union, I think this should be treated as if we were doing tyro.cli(Arg) rather than a subcommand.

❯❯❯ python 89.py --help
usage: 89.py [-h] {checkout,commit,arg}

╭─ arguments ─────────────────────────────────────────────╮
│ -h, --help              show this help message and exit │
╰─────────────────────────────────────────────────────────╯
╭─ subcommands ───────────────────────────────────────────╮
│ {checkout,commit,arg}                                   │
│     checkout            Check out a branch.             │
│     commit              Make a commit.                  │
│     arg                                                 │
╰─────────────────────────────────────────────────────────╯
❯❯❯ python 89.py --foo 1
╭─ Unrecognized arguments ───────────────────╮
│ Unrecognized or misplaced arguments: --foo │
│ ────────────────────────────────────────── │
│ Perhaps you meant:                         │
│     --foo INT                              │
│         (default: 1)                       │
│             in 89.py arg --help            │
│ ────────────────────────────────────────── │
│ For full helptext, run 89.py --help        │
╰────────────────────────────────────────────╯

@brentyi
Copy link
Owner

brentyi commented Nov 23, 2023

Hi! I've been meaning to reply but am still ironing out a few more edge cases that your example (and the recent introduction of constructor=) has exposed.

I appreciate you filing this issue; to clarify, tyro.conf.subcommand isn't necessary to create subcommands.

For the Union semantics: I can see why it's confusing, but what x: Union[A, B, C] means in Python's type system is that x has the type of one of {A,B,C}. This is a union at the type level, not a union of the arguments/fields that exist in A, B, and C.

So when we write:

@dataclasses.dataclass
class Args1:
    x: int

@dataclasses.dataclass
class Args2:
    y: int

what tyro does when tyro.cli(Union[Args1, Args2]) is called is provide the option of instantiating either Args1 or Args2:

image

To clarify your example, when we write the union what we mean is:

# Choose and instantiate one of...
         Union[
             # (1) An `Any` type, using `checkout` as a constructor.
             Annotated[
                 Any,
                 tyro.conf.subcommand(name="checkout", constructor=checkout),
             ],
             # (2) An `Any` type, using `commit` as a constructor.
             Annotated[
                 Any,
                 tyro.conf.subcommand(name="commit", constructor=commit),
             ],
             # (3) An `Arg` type, using its default constructor.
            Arg,
        ]

...which results in 3 subcommands.

Does that make sense to you?

@wookayin
Copy link
Author

wookayin commented Nov 23, 2023

Thanks for your reply. But I am not sure if this has the correct semantics.

Simply doing the following without Unions

tyro.cli(Args)

will yield

❯❯❯ python 89_Args_only.py --help
usage: 89_Args_only.py [-h] [--foo INT]

╭─ arguments ───────────────────────────────────────╮
│ -h, --help        show this help message and exit │
│ --foo INT         (default: 1)                    │
╰───────────────────────────────────────────────────╯

It sounds like the semantics of Union in the context of tyro is to always induce subcommands (not by tyro.conf.subcommand), as in https://brentyi.github.io/tyro/examples/02_nesting/02_subcommands/, but I am not sure if this is the best way. I was thinking like the subcommand (subparsers) were achieved by tyro.conf.subcommand, but it feels not and it sounds quite confusing to me. The behavior and semantic of Union[T1, T2, ...] could be clearly documented.

If x is of type Union[A, B, C] then an instance of A is valid for x, not something like Subcommand[A]. What I mean here is if tyro.cli(Args) exposes the global-level argument, then typo.cli(Union[Args, Subcommand1, Subcommand2]) should also do the same. However, changing the semantic of Union will be a breaking change, but I would argue that simply relying on Union for sub-parsers may not be the best possible design. We would need a more explicit semantic or annotations for subparsers, given we have an explicit api for that: tyro.conf.subcommand.

All that said, then how should I do in order to achieve the behavior I would want? (python 89.py --foo where foo is defined in the dataclass Args)

@brentyi
Copy link
Owner

brentyi commented Nov 23, 2023

Sorry in advance for the length here; if anything's unclear please do let me know!


It sounds like the semantics of Union in the context of tyro is to always induce subcommands (not by tyro.conf.subcommand)

Yes, this is basically true and (to me) follows naturally given the meaning of a Union type. tyro always follows Python's official typing semantics where possible; see here for unions. If tyro sees Union[T1, T2] it will always assume that you want either T1 or T2.

In the case of Union[int, str], tyro will assume you want either an int or a string:

def main(x: Union[int, str]) -> None:
    ...

tyro.cli(main)

where --x 5 will set x=5 and --x five will set x="five".

In the case of Union[Args1, Args2], where Args1 and Args2 are dataclasses, tyro will similarly instantiate only one of the two types. These correspond to a pair of mutually exclusive groups of arguments, which is what subcommands are designed to enable choosing between.

tyro.conf.subcommand() exists to configure options for subcommands and are not necessary to create them, the same way tyro.conf.arg() exists to configure options for arguments and are not necessary to create them. You're right that we could/should document all of these things better.


If x is of type Union[A, B, C] then an instance of A is valid for x, not something like Subcommand[A]

I'm not really following this, could you elaborate? In the current setup if tyro.cli() is passed Union[A, B, C], then it will return either an instance of A, and instance of B, or an instance of C. We can choose between these using subcommands; I can't think of an alternative behavior that would be more intuitive. (some other libraries like simple-parsing and datargs have also converged to ~the same behavior)


All that said, then how should I do in order to achieve the behavior I would want? (python 89.py --foo where foo is defined in the dataclass Args)

I think it's not 100% clear to me how exactly you want to use your global arguments, but here are some patterns that might be helpful:

import tyro
from typing import Union

import dataclasses

@dataclasses.dataclass
class Subcommand1:
    a: int

@dataclasses.dataclass
class Subcommand2:
    b: int

@dataclasses.dataclass
class Args:
    subcommand: Union[Subcommand1, Subcommand2] 
    foo: int


tyro.cli(Args)
import dataclasses
from typing import Union

import tyro


@dataclasses.dataclass
class Subcommand1:
    a: int

@dataclasses.dataclass
class Subcommand2:
    b: int

@dataclasses.dataclass
class Args:
    foo: int


def main(args: Args, subcommand: Union[Subcommand1, Subcommand2]) -> None:
    ...

tyro.cli(main)
import dataclasses
from typing import Union

import tyro


@dataclasses.dataclass
class SharedArgs:
    foo: int


@dataclasses.dataclass
class Subcommand1(SharedArgs):
    a: int


@dataclasses.dataclass
class Subcommand2(SharedArgs):
    b: int


subcommand = tyro.cli(Union[Subcommand1, Subcommand2])

@wookayin
Copy link
Author

wookayin commented Nov 24, 2023

Thank you very, very much for your explanation and a very comprehensive, thoughtful answer! I really appreciate your time and efforts on my issue and the great project as well :)

I'm not really following this, could you elaborate?

Just to clarify what I wanted to say here:

  • Let's suppose tyro.cli(f) where f = Union[Arg1, Arg2]
  • The behavior of tyro.cli(Arg1) and tyro.cli(Arg1 | Arg2) are different, which is confusing given that, if isinstance(x, Arg1) then isinstance(x, Arg1 | Arg2) is true.

These correspond to a pair of mutually exclusive groups of arguments, which is what subcommands are designed to enable choosing between.

Yes, I can see why we had such a design and semantics on Union[T1, T2] for subparsers. If we define the semantics of Union as Union always create a subcommand/subparser, that all makes sense. It is implicitly assumed that because T1, T2, etc. should be treated mutually exclusive as they might conflict; we are not like merging T1, T2. I think this is a reasonable design, although the semantics of Union in Python does not necessarily imply that it is always mutually exclusive.

However, in order to implement what is requested in this issue (global arguments) we'll need a bit more sophisticated mechanism.

Let me first elaborate on a use case that I'd like to achieve here. Basically, I'd like to add some "global arguments" that is not tied with any subparsers:

 usage: 89.py [-h] {checkout,commit}
 
 ╭─ arguments ─────────────────────────────────────────────╮
 │ -h, --help              show this help message and exit │
+| --version               print version and exit          │  <--- add here, something like this
 ╰─────────────────────────────────────────────────────────╯
 ╭─ subcommands ───────────────────────────────────────────╮
 │ {checkout,commit,arg}                                   │
 │     checkout            Check out a branch.             │
 │     commit              Make a commit.                  │
 ╰─────────────────────────────────────────────────────────╯

Let's say we are building a git-like program with some subcommands and global options without subcommand:

python git.py --version
python git.py -c core.foo=bar
python git.py -c core.foo=bar commit
python git.py -c core.foo=bar commit -m "hello world"

Some global options like -c would make a sense to be a part of "shared arguments", but some other special handlers such as --version shouldn't be.

(1) One idea is to have a special typing object:

tyro.cli(tyro.Subcommands(
   Arg1, Arg2,
   arguments=GlobalArgs,
])

# .. or

tyro.cli(tyro.Subcommands(
  Union[
    Annotated[Any, tyro.conf.subcommand(name="checkout", constructor=checkout),
    Annotated[Any, tyro.conf.subcommand(name="commit", constructor=commit),
  ],
  arguments=GlobalArgs,
))

which would be more explicit than Union[...]. Or using "Annotated":

tyro.cli(Annotated[
   Union[...],
   tyro.conf.????(arguments=GlobalArgs)
])

although I don't like the use of Annotated very much because it is never explicit.

(2) Or use something like unboxing or unwrapping (out of subcommand), i.e., add a metadata like

tyro.cli(Union[
    Annotated[Any, tyro.conf.subcommand(name="checkout", constructor=checkout),
    Annotated[Any, tyro.conf.subcommand(name="commit", constructor=commit),
    # Note the new field unwrap=True (or this can be implicitly assumed by having an "empty" name without the new field)
    Annotated[Any, tyro.conf.subcommand(name="", constructor=global_args, unwrap=True)
])

(though we need a better name that is much clearer than this)

which will not create a subparser but add arguments to the main parser. NOTE:

  • We may want to add a constraint: there should be only one unwrap-ped subcommand.
  • Since this is an argument rather than a subcommand, we should also allow to configure whether to make them mutually exclusive (i.e. execute the global arg and do not invoke subcommands) or allow both.

What do you think? Thanks a lot again!

@brentyi
Copy link
Owner

brentyi commented Nov 24, 2023

the semantics of Union in Python does not necessarily imply that it is always mutually exclusive.

I'm not sure I agree with this. Assuming no inheritance or structural subtyping (typing.Protocol), when we write

x = tyro.cli(Union[Args1, Args2])

it's not possible for both isinstance(x, Args1) and isinstance(x, Args2) to be True, right? To me this implies mutual exclusivity.


I also appreciate you taking the time to write out some suggestions, it was helpful! I think a core question for me is: how would you want to access the global arguments?

If you wrote:

x = tyro.cli(tyro.Subcommands(
   Arg1, Arg2,
   arguments=GlobalArgs,
])

it's not clear to me what reveal_type(x) would resolve to.

I do think the core idea of wanting to instantiate both (1) Arg1 or Arg2 and (2) GlobalArgs is currently very possible.

For example you can use tyro to instantiate a tuple that contains exactly that:

import dataclasses
from typing import Annotated, Tuple, Union

import tyro


@dataclasses.dataclass
class Checkout:
    """Check out a branch."""

    x: int


@dataclasses.dataclass
class Commit:
    """Commit something."""

    y: int


@dataclasses.dataclass
class Args:
    version: bool = False
    """Print version and exit."""


global_args, checkout_or_commit = tyro.cli(
    Tuple[
        Args,
        Union[Checkout, Commit],
    ]
)
print(global_args, checkout_or_commit)

This should also type check correctly in pyright/pylance!

A downside is that nesting things in Tuple[] will prefix arguments with the tuple indices, like --0.version. You can use tyro.conf.arg() to erase these and produce --version:

global_args, checkout_or_commit = tyro.cli(
    Tuple[
        Annotated[Args, tyro.conf.arg(name="")],
        Annotated[Union[Checkout, Commit], tyro.conf.arg(name="")],
    ]
)

Where the end result is:

(py310) ?brent/retain_subcommand_annotations ~/tyro python subcommand_example.py --help
usage: subcommand_example.py [-h] [--version | --no-version] {checkout,commit}

╭─ arguments ──────────────────────────────────────────────────────╮
│ -h, --help              show this help message and exit          │
│ --version, --no-version                                          │
│                         Print version and exit. (default: False) │
╰──────────────────────────────────────────────────────────────────╯
╭─ subcommands ────────────────────────────────────────────────────╮
│ {checkout,commit}                                                │
│     checkout            Check out a branch.                      │
│     commit              Commit something.                        │
╰──────────────────────────────────────────────────────────────────╯

Does this seem consistent with your desired behavior?

@wookayin
Copy link
Author

wookayin commented Nov 25, 2023

Thanks for the response.

it's not possible for both isinstance(x, Args1) and isinstance(x, Args2) to be True, right? To me this implies mutual exclusivity.

But yes, only under some assumptions --- like no inheritance or structural subtyping. In general Union[T1, T2, ...] does NOT have a semantic of mutual-exclusive OR.

Even more simpler example: Union[Literal["foo"], str]: x can be both a Literal["foo"] and a str, though in this case these would not induce subcommands. When it comes to dataclass or functions (concrete types), yes, this will mean mutual-exclusive OR. While the (general) semantic of Union is not necessarily XOR, but in the context of tyro it would be very reasonable to define so (provided clear documentation/clarification).


The idea of using Tuple[] (or even nested dataclass or namedtuple) is brilliant! Here, the semantic is "conjunction" rather than "disjunction" (Union), so this is exactly what I want. The trick is to use name="" to remove the unnecessary tuple index or field for dataclass/namedtuple. The resulting typing seems quite verbose, complex, and not quite intuitive with lots of Annotated, but this does the trick! Except that it actually does not work.

A fully reproducible code
import dataclasses
from pathlib import Path
from typing import Annotated, Tuple, Union

import tyro.conf


@dataclasses.dataclass
class Checkout:
    """Check out a branch."""
    branch: str


@dataclasses.dataclass
class Commit:
    """Commit something."""
    input: tyro.conf.Positional[Path]


@dataclasses.dataclass
class Arg:
    verbose: bool = True


def case1():
    # 1. OK
    o = tyro.cli(
        Union[
            Checkout,
            Commit,
        ],
        args=["commit", "./path.txt"]
    )

    # should be Commit(input=PosixPath('path.txt'))
    print(o)


def case2():
    arg, action = tyro.cli(Tuple[
        Arg,
        Annotated[
            Union[
                Checkout,
                Commit,
            ],
            tyro.conf.arg(name=""),
        ]
    ], args=["commit", "./path.txt"])

    # should be Commit(input=PosixPath('path.txt'))
    assert isinstance(arg, Arg)
    assert isinstance(action, Commit)
    print(action)


def case3():
    # 3. Error
    # tyro._instantiators.UnsupportedTypeAnnotationError:
    # Expected fully type-annotated callable, but <function Singleton.__new__ at 0x103058720>
    # with arguments ('args', 'kwds') has no annotation for 'args'.

    o = tyro.cli(Tuple[
        Annotated[
            Arg,
            tyro.conf.arg(name=""),
        ],
        Annotated[
            Union[
                Annotated[Checkout, tyro.conf.subcommand(name="checkout")],
                Annotated[Commit, tyro.conf.subcommand(name="commit")],
            ],
            tyro.conf.arg(name=""),
        ]
    # ], args=["--help"])
    ], args=["commit", "./path.txt"])
    print(o)


def case4():
    # Weird type eerror
    # tyro._instantiators.UnsupportedTypeAnnotationError:
    # Unsupported type annotation for the field .
    # To suppress this error, assign the field a default value.
    #
    # Tuple[T1] -> maybe this is an invalid type definition?
    o = tyro.cli(Tuple[
        Annotated[
            Union[
                Annotated[Checkout, tyro.conf.subcommand(name="checkout")],
                Annotated[Commit, tyro.conf.subcommand(name="commit")],
            ],
            tyro.conf.arg(name=""),
        ]
    ], args=["commit", "./path.txt"])
    print(o)


if __name__ == '__main__':
    # case1()
    # case2()
    case3()
    # case4()

The error is:

tyro._instantiators.UnsupportedTypeAnnotationError: Expected fully type-annotated callable, but <function Singleton.__new__ at 0x1015a87c0> with arguments ('args', 'kwds') has no annotation for 'args'.

@brentyi
Copy link
Owner

brentyi commented Nov 25, 2023

Got it, agree with what you wrote about union semantics!

I also agree that the Annotated verbosity for achieving the behavior you want is not great, but hopefully the tradeoffs / reasoning for the default behavior makes sense. I suppose if one wants to reduce verbosity for more specific applications it's always possible to write helper functions that wrap tyro.cli() and abstract away the necessary Annotated[] flags.

For avoiding the cryptic tyro.conf.arg(name="") annotations, here's another (still verbose) option you can try with #93 merged:

global_args, checkout_or_commit = tyro.cli(
    Annotated[
        Tuple[
            Union[
                Checkout,
                Commit,
            ],
            Args,
        ],
        tyro.conf.OmitArgPrefixes,
        tyro.conf.OmitSubcommandPrefixes,
    ]
)

@wookayin
Copy link
Author

Thanks for the tip tyro.conf.OmitArgPrefixes and tyro.conf.OmitSubcommandPrefixes. One question -- they do apply recursively to the nested subtree of what is being "annotated", right?

@brentyi
Copy link
Owner

brentyi commented Nov 25, 2023

Yep, all of the tyro.conf.[SomeCapitalizedOption] options are recursively applied. There's also no way to "un-apply" them at the moment.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants