Feature Request: Add CLI argument parsing for model initialization #209

kschwab · 2024-01-18T04:39:39Z

I wanted to propose adding CLI argument parsing into Pydantic settings. Under the hood, it is essentially the same as environment variable parsing, so not very difficult to add (relatively speaking). This would be nice to have as it would complete all sources of ingestion at the application level (FILES, ENV, CLI). I am in the process of completing a rough draft and wanted to see if there was interest in bringing something like this into main. Would love to contribute and take feedback on what would be desired if interested.

As in the environment variable parsing example, nested settings would take precedence, etc. Essentially, the following:

# your environment
export V0=0
export SUB_MODEL='{"v1": "json-1", "v2": "json-2"}'
export SUB_MODEL__V2=nested-2
export SUB_MODEL__V3=3
export SUB_MODEL__DEEP__V4=v4

would be equivalent to:

# your cli
app --v0=0 --sub_model='{"v1": "json-1", "v2": "json-2"}' --sub_model.v2 nested-2 \
--sub_model.v3 3 --sub_model.deep.v4 v4

Then, within the application it would simply be:

from pydantic import BaseModel

from pydantic_settings import BaseSettings, SettingsConfigDict


class DeepSubModel(BaseModel):  
    v4: str


class SubModel(BaseModel):  
    v1: str
    v2: bytes
    v3: int
    deep: DeepSubModel


class Settings(BaseSettings):
    model_config = SettingsConfigDict(env_nested_delimiter='__')

    v0: str
    sub_model: SubModel


print(Settings.parse_cli().model_dump())
"""
{
    'v0': '0',
    'sub_model': {'v1': 'json-1', 'v2': b'nested-2', 'v3': 3, 'deep': {'v4': 'v4'}},
}
"""
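
The naming correspondence between the environment and CLI forms above can be sketched as a small helper. This is only an illustration of the proposed mapping, not code from pydantic-settings: the function name env_to_cli_flag is hypothetical, and it assumes the CLI uses '.' wherever the environment uses env_nested_delimiter.

```python
def env_to_cli_flag(env_name: str, env_nested_delimiter: str = "__") -> str:
    """Translate an environment variable name into the proposed CLI flag."""
    # Hypothetical helper: split on the nested delimiter, lower-case each
    # part, and join the parts with '.' as in the CLI example.
    parts = env_name.split(env_nested_delimiter)
    return "--" + ".".join(part.lower() for part in parts)


for name in ("V0", "SUB_MODEL", "SUB_MODEL__V2", "SUB_MODEL__DEEP__V4"):
    print(f"{name} -> {env_to_cli_flag(name)}")
```

The delimiter is passed in rather than hard-coded so the sketch follows whatever env_nested_delimiter the settings class configures.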

Thoughts?

hramezani · 2024-01-19T10:15:09Z

Thanks, @kschwab for this feature request.

I think if we want to support CLI parsing, it has to be added as a new setting source class like other source classes. So, I think Settings.parse_cli() is not the right way to go.

hramezani · 2024-01-19T10:33:26Z

There was a similar issue before that was closed. At that point, they suggested using typer.

@samuelcolvin @dmontagu what do you think?

samuelcolvin · 2024-01-19T10:46:05Z

I think this would be great, as I said on the original issue, it can start small - e.g. fields can only be set via named arguments like --foo 123.

samuelcolvin · 2024-01-19T10:47:10Z

I agree with @hramezani that this should be implemented as a separate Source.

PR welcome.

frederikaalund · 2024-01-19T11:47:24Z

Shameless plug: cyto contains a CLI-based settings Source for pydantic.

It uses click under the hood to parse the CLI arguments. There is an extensive test suite.

Feel free to copy or take inspiration from cyto (it's under an MIT license).

kschwab · 2024-01-19T17:45:24Z

> I think if we want to support CLI parsing, it has to be added as a new setting source class like other source classes. So, I think Settings.parse_cli() is not the right way to go.

Yep, I agree with this as well. Once I started integrating, this was the direction I took. The only point here was backwards compatibility. As it currently sits, I introduced CliSettingsSource as a core source in settings_customise_sources, but we can easily move it if desired. IMO it would be nice to have it as a core built-in source. For now, it would look like this:

print(Settings(_cli_parse_args=True).model_dump())
"""
{
    'v0': '0',
    'sub_model': {'v1': 'json-1', 'v2': b'nested-2', 'v3': 3, 'deep': {'v4': 'v4'}},
}
"""

> There was a similar issue (pydantic/pydantic#756) before that was closed. At that point, they suggested using typer.

Thanks for highlighting this thread, I had not seen it. @dmontagu's opening statement "I would really like a lightweight CLI-argument parsing class, similar to BaseSettings" captures our interest and use case as well.

> I think this would be great, as I said on the original issue, it can start small - e.g. fields can only be set via named arguments like --foo 123.

Perfect, this is where I started. The main points I intend to cover are:

- Generated --help documentation was a must. This becomes very clean once "Extract attribute docstrings for FieldInfo.description" (pydantic#6563) merges. I've been using that branch locally for testing.
- Subcommands and positional args using annotations. I took the stance of a single subcommand per model, with positional args as the exception, not the default.
- Short option flags, e.g. -f.
- List[...] fields using JSON format --arg=[1,2] or repeated arguments --arg 1 --arg 2. I think lazy eval would be nice here as well, e.g. --arg=1,2.
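
The three list input forms mentioned above could be normalized along these lines. This is a minimal sketch using only argparse and json, not the draft implementation: the function name parse_list_option is hypothetical, and it assumes that items are kept as strings so the model's own validation can coerce them later.

```python
import argparse
import json


def parse_list_option(argv: list[str], name: str) -> list[str]:
    """Collect a list-valued option given as JSON, repeated flags, or a,b,c."""
    parser = argparse.ArgumentParser()
    # action="append" gathers every occurrence of the flag into one list.
    parser.add_argument(f"--{name}", action="append", default=[])
    raw_values = getattr(parser.parse_args(argv), name)

    items: list[str] = []
    for raw in raw_values:
        if raw.startswith("["):
            # JSON form: --arg=[1,2]
            items.extend(str(value) for value in json.loads(raw))
        else:
            # Repeated form (--arg 1 --arg 2) or lazy form (--arg=1,2).
            items.extend(part.strip() for part in raw.split(","))
    return items


print(parse_list_option(["--arg=[1,2]"], "arg"))
print(parse_list_option(["--arg", "1", "--arg", "2"], "arg"))
print(parse_list_option(["--arg=1,2"], "arg"))
```

One consequence of the lazy form in this sketch is that a single item containing a comma has to be passed as JSON instead.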

Points I have not covered:

- Dict[...], although this would likely fall under JSON input. I'll take a look.
- Alias commands.
- Fancy help text generation with colors etc. I think end users can extend that if they want.

To summarize, it's basically a shim layer over the existing EnvSettingsSource with help text generation. In fact, the final result after argparse looks exactly like an environment variable Dict[str, str] that we hand off to the already established flow. e.g.:

{
    'v0': '0',
    'sub_model.v1': 'json-1',
    'sub_model.v2': 'nested-2',
    'sub_model.v3': '3',
    'sub_model.deep.v4': 'v4'
}
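
As a rough illustration of the shim idea, the sketch below uses argparse to produce a flat dict with dotted keys and then folds it into the nested shape. It is not the draft PR's code: parse_flat and nest are hypothetical names, the option names are listed by hand instead of being derived from the model, and type coercion is left to validation.

```python
import argparse
import json
from typing import Any


def parse_flat(argv: list[str], names: list[str]) -> dict[str, str]:
    """Parse --name value / --name=value options into a flat str -> str dict."""
    parser = argparse.ArgumentParser(allow_abbrev=False)
    for name in names:
        # An explicit dest keeps the dotted name as a single key, and
        # SUPPRESS leaves options that were not given out of the result.
        parser.add_argument(f"--{name}", dest=name, default=argparse.SUPPRESS)
    return vars(parser.parse_args(argv))


def nest(flat: dict[str, str], delimiter: str = ".") -> dict[str, Any]:
    """Fold dotted keys into nested dicts; deeper keys override JSON blobs."""
    result: dict[str, Any] = {}
    # Shallow keys first, so a JSON object given for 'sub_model' is in place
    # before 'sub_model.v2' overrides one of its fields.
    for key in sorted(flat, key=lambda k: k.count(delimiter)):
        *parents, leaf = key.split(delimiter)
        target = result
        for part in parents:
            target = target.setdefault(part, {})
        value: Any = flat[key]
        try:
            decoded = json.loads(value)
        except ValueError:
            decoded = None
        # Only JSON objects are expanded; scalars stay as strings.
        target[leaf] = decoded if isinstance(decoded, dict) else value
    return result


names = ["v0", "sub_model", "sub_model.v2", "sub_model.v3", "sub_model.deep.v4"]
argv = [
    "--v0=0",
    '--sub_model={"v1": "json-1", "v2": "json-2"}',
    "--sub_model.v2", "nested-2",
    "--sub_model.v3", "3",
    "--sub_model.deep.v4", "v4",
]
print(nest(parse_flat(argv, names)))
```

Sorting by nesting depth is what gives the more specific --sub_model.v2 precedence over the v2 value inside the JSON passed to --sub_model.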

I still have some local cleanup to do but will push a draft PR once complete.

samuelcolvin · 2024-01-19T17:53:22Z

The source should be off by default, that avoids any backwards compatibility issues.

I'm not sure if you need sub-commands or shortened names initially, just populating a model from named arguments is a great start.

kschwab · 2024-01-19T18:02:14Z

It was just with respect to the settings_customise_sources, adding the additional cli_settings param:

    @classmethod
    def settings_customise_sources(
        cls,
        settings_cls: type[BaseSettings],
        init_settings: PydanticBaseSettingsSource,
        cli_settings: PydanticBaseSettingsSource,
        env_settings: PydanticBaseSettingsSource,
        dotenv_settings: PydanticBaseSettingsSource,
        file_secret_settings: PydanticBaseSettingsSource,
    ) -> tuple[PydanticBaseSettingsSource, ...]:

If we kick that out, then no conflicts. From the user's perspective, it would just mean that CLI parsing is enabled by overriding settings_customise_sources:

    @classmethod
    def settings_customise_sources(
        cls,
        settings_cls: type[BaseSettings],
        init_settings: PydanticBaseSettingsSource,
        env_settings: PydanticBaseSettingsSource,
        dotenv_settings: PydanticBaseSettingsSource,
        file_secret_settings: PydanticBaseSettingsSource,
    ) -> tuple[PydanticBaseSettingsSource, ...]:
        return CliSettingsSource(settings_cls), env_settings, init_settings

kschwab · 2024-01-22T06:08:34Z

@samuelcolvin I punted on the short opts but kept the subcommands. Dictionaries are also included 👍🏾

mpkocher · 2024-01-22T11:56:19Z

> If we kick that out, then no conflicts. From the user's perspective, it would just mean that CLI parsing is enabled by overriding settings_customise_sources:
>
>     @classmethod
>     def settings_customise_sources(
>         cls,
>         settings_cls: type[BaseSettings],
>         init_settings: PydanticBaseSettingsSource,
>         env_settings: PydanticBaseSettingsSource,
>         dotenv_settings: PydanticBaseSettingsSource,
>         file_secret_settings: PydanticBaseSettingsSource,
>     ) -> tuple[PydanticBaseSettingsSource, ...]:
>         return CliSettingsSource(settings_cls), env_settings, init_settings

Does this mean if a user has a model M that has a required field alpha that is defined in another setting source (e.g., dotenv), that --alpha is no longer strictly required at the CLI level?

kschwab · 2024-01-22T16:51:36Z

@mpkocher that's a good point. I think there are two user groups for this feature. Those that want to use pydantic to create CLIs and those that want to use a CLI to interact with pydantic models.

If you’re in the latter group, the answer is yes. alpha would not be required at the CLI source because it is only one of several potential sources. i.e. you only care that one of the sources provides alpha.

However, if you’re in the former group, you probably do care that alpha is strictly required at the CLI source. In this case, pydantic is primarily used as a definition for your CLI, meaning if something is required you want it required at the CLI. I'll add a flag to enable this behavior.
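
The difference between the two groups can be sketched with plain dicts standing in for the sources. This is an illustration only, not how pydantic-settings resolves values: resolve_settings is a hypothetical function, and the cli_enforce_required parameter name is an assumption for the flag mentioned above.

```python
from typing import Any, Mapping


def resolve_settings(
    required: set[str],
    cli: Mapping[str, Any],
    *fallbacks: Mapping[str, Any],
    cli_enforce_required: bool = False,
) -> dict[str, Any]:
    """Merge sources (CLI has top priority) and check required fields."""
    if cli_enforce_required:
        # "Pydantic defines my CLI": required fields must come from the CLI.
        missing_cli = sorted(required - set(cli))
        if missing_cli:
            raise ValueError(f"missing required CLI arguments: {missing_cli}")

    merged: dict[str, Any] = {}
    # Apply the lowest-priority source first so earlier sources win.
    for source in reversed((cli, *fallbacks)):
        merged.update(source)

    # "CLI is one of several sources": any source may supply the field.
    missing = sorted(required - set(merged))
    if missing:
        raise ValueError(f"missing required settings: {missing}")
    return merged


dotenv = {"alpha": "from-dotenv"}
print(resolve_settings({"alpha"}, {}, dotenv))
try:
    resolve_settings({"alpha"}, {}, dotenv, cli_enforce_required=True)
except ValueError as exc:
    print(exc)
```

With the flag off, alpha supplied by dotenv satisfies the requirement; with it on, the same input is rejected because --alpha was not passed.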

pydantic-hooky bot assigned samuelcolvin Jan 18, 2024

pydantic-hooky bot added the unconfirmed label Jan 18, 2024

kschwab changed the title ~~Add CLI argument parsing for model initialization~~ Feature Request: Add CLI argument parsing for model initialization Jan 18, 2024

samuelcolvin mentioned this issue Jan 19, 2024

Command-line argument parsing pydantic/pydantic#756

Closed

hramezani added feature request and removed unconfirmed labels Jan 19, 2024

kschwab linked a pull request Jan 22, 2024 that will close this issue

Add CLI Settings Source #214

Open

mpkocher mentioned this issue Jan 22, 2024

Compatibility to pydantic >= 2 mpkocher/pydantic-cli#56

Open

LSinev mentioned this issue Mar 11, 2024

Write tests for calling CLI arguments downstream to ensure correctly-returned types EleutherAI/lm-evaluation-harness#1518

Closed
