Feature Request: Add CLI argument parsing for model initialization #209

Open
kschwab opened this issue Jan 18, 2024 · 11 comments · May be fixed by #214

@kschwab
Contributor

kschwab commented Jan 18, 2024

Hi @hramezani,

I wanted to propose adding CLI argument parsing into Pydantic settings. Under the hood, it is essentially the same as environment variable parsing, so not very difficult to add (relatively speaking). This would be nice to have as it would complete all sources of ingestion at the application level (FILES, ENV, CLI). I am in the process of completing a rough draft and wanted to see if there was interest in bringing something like this into main. Would love to contribute and take feedback on what would be desired if interested.

As in the documentation's example for parsing environment variables, nested settings would take precedence over a JSON value for the parent field, etc. Essentially, the below:

# your environment
export V0=0
export SUB_MODEL='{"v1": "json-1", "v2": "json-2"}'
export SUB_MODEL__V2=nested-2
export SUB_MODEL__V3=3
export SUB_MODEL__DEEP__V4=v4

would be equivalent to:

# your cli
app --v0=0 --sub_model='{"v1": "json-1", "v2": "json-2"}' --sub_model.v2 nested-2 \
--sub_model.v3 3 --sub_model.deep.v4 v4
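
To make the naming equivalence concrete, here is a minimal sketch of how one environment variable name could be translated into the matching CLI option. The helper `env_to_cli` is hypothetical and exists only for illustration; it is not part of pydantic-settings, and it assumes `__` as the nested delimiter and `.` as the CLI separator, as in the examples above.

```python
def env_to_cli(name: str, value: str, delimiter: str = "__") -> str:
    """Translate one environment variable into the equivalent CLI option.

    Hypothetical helper: split on the nested delimiter, lowercase each
    part, and join the parts with dots.
    """
    dotted = ".".join(part.lower() for part in name.split(delimiter))
    return f"--{dotted}={value}"


print(env_to_cli("V0", "0"))
print(env_to_cli("SUB_MODEL__DEEP__V4", "v4"))
```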

Then, within the application it would simply be:

from pydantic import BaseModel

from pydantic_settings import BaseSettings, SettingsConfigDict


class DeepSubModel(BaseModel):  
    v4: str


class SubModel(BaseModel):  
    v1: str
    v2: bytes
    v3: int
    deep: DeepSubModel


class Settings(BaseSettings):
    model_config = SettingsConfigDict(env_nested_delimiter='__')

    v0: str
    sub_model: SubModel


print(Settings.parse_cli().model_dump())
"""
{
    'v0': '0',
    'sub_model': {'v1': 'json-1', 'v2': b'nested-2', 'v3': 3, 'deep': {'v4': 'v4'}},
}
"""

Thoughts?

@kschwab kschwab changed the title Add CLI argument parsing for model initialization Feature Request: Add CLI argument parsing for model initialization Jan 18, 2024
@hramezani
Member

Thanks, @kschwab for this feature request.

I think if we want to support CLI parsing, it has to be added as a new setting source class like other source classes. So, I think Settings.parse_cli() is not the right way to go.

@hramezani
Member

There was a similar issue before that was closed. At that point, they suggested using typer.

@samuelcolvin @dmontagu what do you think?

@samuelcolvin
Member

I think this would be great. As I said on the original issue, it can start small, e.g. fields can only be set via named arguments like --foo 123.
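
As a rough illustration of the "start small" idea, the sketch below populates a model purely from named arguments. It uses a standard-library dataclass as a stand-in for a pydantic model, and the `from_cli` helper and the `Settings` fields are assumptions made up for this example, not an existing API.

```python
import argparse
from dataclasses import dataclass, fields


@dataclass
class Settings:
    # Stand-in for a pydantic model; the field names are illustrative only.
    foo: int
    name: str = "demo"


def from_cli(cls, argv):
    """Build an instance of `cls` from named arguments such as `--foo 123`."""
    parser = argparse.ArgumentParser()
    for field in fields(cls):
        # SUPPRESS keeps absent options out of the namespace, so the
        # dataclass defaults apply when an option is not given.
        parser.add_argument(
            f"--{field.name}", type=field.type, default=argparse.SUPPRESS
        )
    return cls(**vars(parser.parse_args(argv)))


print(from_cli(Settings, ["--foo", "123"]))
```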

@samuelcolvin
Member

I agree with @hramezani that this should be implemented as a separate Source.

PR welcome.

@frederikaalund

Shameless plug: cyto contains a CLI-based settings Source for pydantic.

It uses click under the hood to parse the CLI arguments. There is an extensive test suite.

Feel free to copy or take inspiration from cyto (it's under an MIT license).

@kschwab
Contributor Author

kschwab commented Jan 19, 2024

> I think if we want to support CLI parsing, it has to be added as a new setting source class like other source classes. So, I think Settings.parse_cli() is not the right way to go.

Yep, I agree with this as well. Once I started the integration, this was the direction I took. The only concern here was backwards compatibility. As it currently stands, I introduced CliSettingsSource as a core source in settings_customise_sources, but we can easily move it if desired. IMO it would be nice to have it as a core built-in source. For now, it would look like this:

print(Settings(_cli_parse_args=True))
"""
{
    'v0': '0',
    'sub_model': {'v1': 'json-1', 'v2': b'nested-2', 'v3': 3, 'deep': {'v4': 'v4'}},
}
"""

> There was a similar issue (pydantic/pydantic#756) before that was closed. At that point, they suggested using typer.

Thanks for highlighting this thread, I had not seen it. @dmontagu's opening statement, "I would really like a lightweight CLI-argument parsing class, similar to BaseSettings", captures our interest and use case as well.

> I think this would be great. As I said on the original issue, it can start small, e.g. fields can only be set via named arguments like --foo 123.

Perfect, this is where I started. The main points I intend to cover are:

  • Generated --help documentation was a must. This becomes very clean once pydantic#6563 ("Extract attribute docstrings for FieldInfo.description") merges. I've been using that branch locally for testing.
  • Subcommands and Positional args using annotations. I took the stance of single subcommand per model and positional args as the exception not the default.
  • Short option flags, e.g. -f.
  • List[...] fields using JSON format --arg=[1,2] or repeated arguments --arg 1 --arg 2. I think lazy eval would be nice here as well, e.g. --arg=1,2.
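
For the List[...] item above, here is a minimal sketch of how the three input styles could be normalised into one list. It relies only on argparse and json from the standard library; the `parse_list_arg` helper is hypothetical and not taken from the draft PR, and element types are deliberately left as parsed so that model validation can coerce them later.

```python
import argparse
import json
from typing import Any


def parse_list_arg(argv: list[str]) -> list[Any]:
    """Collect --arg values given as JSON, repeated options, or comma lists."""
    parser = argparse.ArgumentParser()
    # action="append" gathers every occurrence of --arg into one list.
    parser.add_argument("--arg", action="append", default=None)
    raw_values = parser.parse_args(argv).arg or []

    items: list[Any] = []
    for raw in raw_values:
        text = raw.strip()
        if text.startswith("["):
            # JSON form, e.g. --arg=[1,2]
            items.extend(json.loads(text))
        else:
            # Single value or comma-separated form, e.g. --arg 1 or --arg=1,2
            items.extend(part.strip() for part in text.split(","))
    return items


print(parse_list_arg(["--arg=[1,2]"]))
print(parse_list_arg(["--arg", "1", "--arg", "2"]))
print(parse_list_arg(["--arg=1,2"]))
```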

Points I have not covered:

  • Dict[...], although this would likely fall under JSON input. I'll take a look.
  • alias commands.
  • I did not cover fancy help text generation with colors etc. I think end users can extend that if they want.

To summarize, it's basically a shim layer over the existing EnvSettingsSource with help text generation. In fact, the final result after argparse looks exactly like an environment variable Dict[str, str] that we hand off to the already established flow. e.g.:

{
    'v0': '0',
    'sub_model.v1': 'json-1',
    'sub_model.v2': 'nested-2',
    'sub_model.v3': '3',
    'sub_model.deep.v4': 'v4'
}
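
To illustrate the hand-off, the sketch below expands a flat, dotted Dict[str, str] into a nested structure, letting dotted keys override values from a JSON blob on the parent key. This is an assumption-laden approximation for discussion, not how EnvSettingsSource is actually implemented; the `nest` helper is hypothetical, and leaf values stay strings so that validation can coerce them afterwards.

```python
import json
from typing import Any


def nest(flat: dict[str, str], delimiter: str = ".") -> dict[str, Any]:
    """Expand dotted keys into nested dicts; deeper keys win over JSON blobs."""
    nested: dict[str, Any] = {}
    # Handle shallow keys first so that a JSON value for 'sub_model' is in
    # place before overrides such as 'sub_model.v2' are applied.
    for key in sorted(flat, key=lambda k: k.count(delimiter)):
        value: Any = flat[key]
        if value.lstrip().startswith("{"):
            try:
                value = json.loads(value)
            except json.JSONDecodeError:
                pass  # not valid JSON, keep the raw string
        *parents, leaf = key.split(delimiter)
        node = nested
        for part in parents:
            # Assumes intermediate values are dicts (true for this example).
            node = node.setdefault(part, {})
        node[leaf] = value
    return nested


flat = {
    "v0": "0",
    "sub_model": '{"v1": "json-1", "v2": "json-2"}',
    "sub_model.v2": "nested-2",
    "sub_model.v3": "3",
    "sub_model.deep.v4": "v4",
}
print(nest(flat))
```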

I still have some local cleanup to do but will push a draft PR once complete.

@samuelcolvin
Member

The source should be off by default, that avoids any backwards compatibility issues.

I'm not sure if you need sub-commands or shortened names initially, just populating a model from named arguments is a great start.

@kschwab
Contributor Author

kschwab commented Jan 19, 2024

The backwards compatibility concern was only with respect to settings_customise_sources, which would gain an additional cli_settings param:

    @classmethod
    def settings_customise_sources(
        cls,
        settings_cls: type[BaseSettings],
        init_settings: PydanticBaseSettingsSource,
        cli_settings: PydanticBaseSettingsSource,
        env_settings: PydanticBaseSettingsSource,
        dotenv_settings: PydanticBaseSettingsSource,
        file_secret_settings: PydanticBaseSettingsSource,
    ) -> tuple[PydanticBaseSettingsSource, ...]:

If we kick that out, then there are no conflicts. From the user's perspective, it would just mean that CLI parsing is enabled by overriding settings_customise_sources:

    @classmethod
    def settings_customise_sources(
        cls,
        settings_cls: type[BaseSettings],
        init_settings: PydanticBaseSettingsSource,
        env_settings: PydanticBaseSettingsSource,
        dotenv_settings: PydanticBaseSettingsSource,
        file_secret_settings: PydanticBaseSettingsSource,
    ) -> tuple[PydanticBaseSettingsSource, ...]:
        return CliSettingsSource(settings_cls), env_settings, init_settings
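
For readers unfamiliar with source ordering: in pydantic-settings, sources returned earlier in the tuple take priority over later ones. The sketch below mimics that first-wins lookup with plain dicts and collections.ChainMap. The dict contents are made up for illustration, and the merge here is shallow, which is a simplification compared with the library's handling of nested values.

```python
from collections import ChainMap

# Illustrative values only; each dict stands in for one settings source.
cli_values = {"v0": "from-cli"}
env_values = {"v0": "from-env", "alpha": "from-env"}
init_values = {"alpha": "from-init", "beta": "from-init"}

# Same order as the returned tuple: CLI first, then env, then init.
# ChainMap looks a key up in each mapping in turn; the first hit wins.
merged = dict(ChainMap(cli_values, env_values, init_values))
print(merged)
```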

@kschwab kschwab linked a pull request Jan 22, 2024 that will close this issue
@kschwab
Contributor Author

kschwab commented Jan 22, 2024

@samuelcolvin I punted on the short opts but kept the subcommands. Dictionaries are also included 👍🏾

@mpkocher

mpkocher commented Jan 22, 2024

> If we kick that out, then there are no conflicts. From the user's perspective, it would just mean that CLI parsing is enabled by overriding settings_customise_sources:
>
>     @classmethod
>     def settings_customise_sources(
>         cls,
>         settings_cls: type[BaseSettings],
>         init_settings: PydanticBaseSettingsSource,
>         env_settings: PydanticBaseSettingsSource,
>         dotenv_settings: PydanticBaseSettingsSource,
>         file_secret_settings: PydanticBaseSettingsSource,
>     ) -> tuple[PydanticBaseSettingsSource, ...]:
>         return CliSettingsSource(settings_cls), env_settings, init_settings

Does this mean that if a user has a model M with a required field alpha that is provided by another settings source (e.g., dotenv), --alpha is no longer strictly required at the CLI level?

@kschwab
Contributor Author

kschwab commented Jan 22, 2024

@mpkocher that's a good point. I think there are two user groups for this feature. Those that want to use pydantic to create CLIs and those that want to use a CLI to interact with pydantic models.

If you’re in the latter group, the answer is yes. alpha would not be required at the CLI source because it is only one of several potential sources. i.e. you only care that one of the sources provides alpha.

However, if you’re in the former group, you probably do care that alpha is strictly required at the CLI source. In this case, pydantic is primarily used as a definition for your CLI, meaning if something is required you want it required at the CLI. I'll add a flag to enable this behavior.
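
A minimal sketch of what such a flag could look like, using only argparse. The `build_parser` helper and its `enforce_required` parameter are hypothetical names chosen for this example; the actual flag name and behaviour in the PR may differ.

```python
import argparse


def build_parser(
    required_by_field: dict[str, bool], enforce_required: bool
) -> argparse.ArgumentParser:
    """Create a parser where model-required fields are optionally CLI-required."""
    parser = argparse.ArgumentParser(prog="app")
    for name, required in required_by_field.items():
        parser.add_argument(
            f"--{name}",
            # Only demand the option on the CLI when the flag is enabled;
            # otherwise another source (env, dotenv, ...) may supply it.
            required=required and enforce_required,
            # SUPPRESS omits absent options so other sources can fill them.
            default=argparse.SUPPRESS,
        )
    return parser


# Relaxed mode: --alpha may be missing, so the CLI contributes nothing.
relaxed = build_parser({"alpha": True}, enforce_required=False)
print(vars(relaxed.parse_args([])))

# Strict mode: --alpha must be given on the command line.
strict = build_parser({"alpha": True}, enforce_required=True)
print(vars(strict.parse_args(["--alpha", "1"])))
```

In strict mode, omitting --alpha makes argparse print a usage error and exit, which matches the "pydantic as a CLI definition" use case.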
