Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reconsider how multi-valued options combine between the config file and the command-line? #328

Open
jherland opened this issue May 11, 2023 · 1 comment
Labels
CLI later P3 minor: not priorized question Further information is requested research-needed type: feature request

Comments

@jherland
Copy link
Member

How should multi-valued options on the command-line (e.g. --code foo.py bar.py) combine with corresponding multi-valued options from the config file (e.g. tool.fawltydeps.code = ["bar.py", "baz.py"]). (Also don't forget the fawltydeps_* environment variables that exist between the config file and command-line in the configuration cascade...)

The current behavior is to override/replace completely for all Settings members that are defined at each level. No effort is taken to more "smartly" integrate the values at one level (e.g. the config file) with values at another level (e.g. the command line).

Quoting the relevant discussion on #321:

@Nour-Mws writes:

Are we also defaulting to override config with cli arguments for other options that might have multiple values?

@jherland writes:

Yes, here's what happens for e.g. .code (using environment vs. CLI, but the same holds for config file vs. CLI):

$ fawltydeps_code='["fawltydeps/main.py", "fawltydeps/packages.py"]' \
  fawltydeps --list-imports --detailed
fawltydeps/main.py:20: pydantic
fawltydeps/packages.py:20: importlib_metadata
fawltydeps/packages.py:37: tomli

$ fawltydeps_code='["fawltydeps/main.py", "fawltydeps/packages.py"]' \
  fawltydeps --list-imports --detailed --code fawltydeps/types.py fawltydeps/utils.py
fawltydeps/types.py:15: typing_extensions
fawltydeps/utils.py:9: importlib_metadata

When I first wrote the Settings class, I added the following TODO (and it's still there):

    TODO? Currently overrides happen _completely_, for example specifying
    --ignore-undeclared on the command-line will _replace_ an ignore_undeclared
    directive in pyproject.toml. We may want to consider allowing some
    directives to _combine_ (although that will carry further complications).

The "further complications" I allude to here are more complex than one might imagine at first:

  • We would probably need to introspect over the Settings members and/or their type and define different ways of combining them:
    • Most sets (code/deps/pyenvs/custom_mapping_file/ignore_*) combine by union
    • custom_mapping combines by merging dicts (a la accumulate_mappings())
    • How do we want actions to combine? One could argue for either replace or union, but probably replace makes most sense.
    • output_format should likely replace?
    • Same with deps_parser_choice?
    • The verbosity number should probably sum() across levels?
  • When we decide to take the union of multi-valued members, then how do we deal with wanting to remove something that was added at an earlier stage? For example: the config file says code = ["foo.py", "bar.py"], but I want FD to look at code from bar.py and baz.py; I can add baz.py at the command-line, but how do I remove foo.py?
  • We would maybe need to add more options (--remove-code=foo.py or --no-code=foo.py) or more complex option parsing (--code "+baz.py" "-foo.py") to deal with this.
  • How to deal with different options that "cooperate" in unexpected ways? E.g. deps = ["pyproject.toml"] in the config file, but --deps my_reqs.pip --deps-parser-choice requirements.txt on the command line would probably combine into the following nonsense in Settings:
    deps = {"pyproject.toml", "my_reqs.pip"}
    deps_parser_choice = ParserChoice.REQUIREMENTS_TXT

When I first wrote this code in #167, I decided to not open this Pandora's box of complexity, and to leave everything as strictly overriding/replacing for now.

After revisiting it now, I still think this is the sanity-preserving choice! 😅

Tagging this as fairly low-priority for now, given that we do not yet have much external pressure to do anything about this.

@jherland
Copy link
Member Author

From #365 (comment):

@zz1874 writes:
Also, this brings me to think, as our list of development dependencies continues to grow, should we consider implementing functionality that allows users to add other tools to the DEFAULT_IGNORE_UNUSED list without overwriting it entirely? This could provide more flexibility for users while maintaining our core list of dependencies.
Any ideas?

@Nour-Mws writes:
I rather prefer the user takes full responsibility of declaring this list the moment they decide to add a dependency to it. I believe having an in-between state is in the realm of diminishing returns of efforts to functionality.

TL;DR: Yes, I think I agree with @Nour-Mws here.

There are many similar questions of this kind that can be asked:

  • should --ignore-unused on the command-line replace or extend (aka. union with) the DEFAULT_IGNORE_UNUSED set?
  • should ignore_unused in the config file replace or union the DEFAULT_IGNORE_UNUSED set?
  • should --ignore-unused on the command-line replace or union ignore_unused in the config file?
  • same questions for --ignore-undeclared
  • what about actions? If we set actions = ['check_undeclared'] in the config file, and you pass --check_unused on the command-line, do you get both? or only the latter?
  • what about code, deps, pyenvs? Are these to be treated the same as the --ignore-* options? Does it matter whether the default is an empty set or not?
  • some settings combine in complex ways: deps and deps_parser_choice interact in ways that make it challenging to support both deps = ['pyproject.toml'] and deps_parser_choice = 'pyproject.toml' in the config file, combined with --deps-parser-choice requirements.txt on the command-line.
  • what about verbosity? The default here is 0, and when passing --verbose or --quiet options on the command-line, the verbosity level is calculated accordingly (subtracting the number of --quiet from the number og --verbose). If you set verbosity = 3 in the config file, and then pass --quiet on the command-line, should the resulting verbosity be 2 or -1?

Taking a step back, I think of these questions as instances of a pattern of a few, more general questions. Since our settings are implemented as a cascade/layering of [defaults, config_file, environment_vars, command_line_opts], I think the more general questions can be phrased like this:

  1. Should directives later in the cascade always replace corresponding directives from earlier in the cascade?
  2. Should we also allow unioning (instead of replacing) for some of the directives? If so, which?
  3. Even when a setting is not a set (e.g. verbosity), it might still make sense to combine the multiple layers in some other way?
  4. Some settings are booleans (install_deps), and for these, replace is probably the only sane choice?
  5. Does it make sense for some layers to replace (e.g. config file vs defaults), and other layers to union (e.g. command line vs. config file)?

For now, we've answered the questions like this:

  1. Yes. The final value of a directive is taken from the latest layer in the cascade that specifies it, and the value is fully specified at that layer (i.e. we always replace, there is no extending or other combining happening between layers).
  2. No, this is currently not implemented.
  3. N/A. see above.
  4. Yes, this is in line with how we answer Q1.
  5. No. We value consistency in how these directives are interpreted, and having a simple and straightforward policy is valuable both from a user perspective as well as an implementation perspective.

I'm happy for us to continue like this, but I also see that there is some demand for extending/combining some directives instead always/only replacing.

If we were to add functionality for this, I think we should do so very carefully and deliberately. Here is an example of what we could consider doing (a rough sketch and I'm not sure about many things here):

  • Add new command-line options that extend/union, rather than replace:
    • --add-ignore-unused
    • --add-ignore-undeclared
    • --add-pyenv
    • --add-code
    • --add-deps
    • --add-custom-mapping-file
  • We could consider adding corresponding --remove-* options to remove items from the defaults, but this looks like a slippery slope of complexity for very little gain.
  • These are command-line options, and they are only accessible from the command-line, i.e. you cannot use the config file to extend the defaults. Allowing extension/removal at multiple levels of the cascade would explode the complexity, both for our implementation and for our users.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLI later P3 minor: not priorized question Further information is requested research-needed type: feature request
Projects
None yet
Development

No branches or pull requests

1 participant