Iterating over `ArgMatches` #1206

neysofu · 2018-03-09T12:04:34Z

Maintainer's notes

This is blocked on Lifetimes of App, Arg, ArgGroup: removing or relaxing #1041 / Consider accepting a Cow instead of Into<&'b str> in App::about #2150 where we move away from opaque, non-public Ids for arguments to something the user can programmatically act on

Hi, I am using Clap in a little side project and I would like to suggest a minor feature. I am ready to implement it myself and send a PR if needed, I just need a few pointers. The feature at hand is an ordered iterator over ArgMatches: as far as my understanding goes, Clap doesn't allow to freely iterate over the parsed arguments. What if my application APIs are dependent upon the order of the arguments? GNU find comes to my mind. Calling index_of for every argument can be expensive if you have a lot of them.

If this feature were to be implemented, what would the return type and name of the method be?

The text was updated successfully, but these errors were encountered:

kbknapp · 2018-03-19T17:58:33Z

Sorry for the wait, I've been on vacation with my kids this past week :)

Do you mean iterating over only positional values, or over all args that have been used? Also, do you envision this iterating over non-matched args as well?

If you'd like to take a stab at implementing I'd be happy to mentor you! This is how I'd attack it:

Draw out how you see this working, including the questions I listed above (including any edge cases you can think of)
I start with some usage code (such as in a test) to make sure the ergonomics are correct/comfortable
Write some tests in tests/matches.rs for the various uses you've drawn up in step one
I then just test that file alone with cargo test --test matches or can further test just a single function in that file with cargo test --test matches -- name_of_function
Start the implementation (this will vary greatly depending on how you've designed it)

For the implementation once we have a good design I can give you some pointers as to where the changes you'll want to make are located / any particulars associated with those changes.

Once all your new tests pass, I'd run a full test suite cargo test --features "yaml vec_map wrap_help unstable" and make sure nothing broke.

If you need to see the debug output while you're building testing a single function use cargo test --test matches --features debug -- name_of_function

neysofu · 2018-03-19T19:54:31Z

First of all, thanks for taking the time to address this! I hope you enjoyed your vacation.

The API I had in mind wouldn't be constrained only to positional arguments, but to all the arguments that have been used. Currently ArgMatches allows you to know which arguments have been used and their index in respect to all other arguments, yet it lacks a way to actually iterate over them.

I mean adding an API to access the order in which options are given. Ex.:

cmd --abc --xyz
cmd --xyz --abc

Let's say my command's behavior is specific to the order in which the options abc and xyz are given. As long as it's only two arguments I can get away with calling index_of for one of the two and figuring out which comes first. But what if I provide many another options? My idea is offer a method over ArgMatches that gives an iterator over the arguments in an ordered fashion:

arg_matches_1.iter_raw() == vec!["abc", "xyz"];
arg_matches_2.iter_raw() == vec!["xyz", "abc"];

This example is just off the top of my head and I am sure it could be improved.

Please correct me if I talked nonsense at any point because I am not very familiar with the API.

kbknapp · 2018-03-19T20:25:28Z

To make sure it's not an XY problem, how does your CLI work with the order of args? I've always seen conflicts/overrides as the main driving factor for when order matters. Both of which handled in the parsing logic.

The only other form I've seen is for things like include/exclude logic where there aren't really hard conflicts/overrides at the parsing level, but further up the execution stack there are. I've only seen this between a handlful of args at the most, and hence iterating over the indices isn't an issue.

This is of course not to say my list is exhaustive, I'm sure there are other use cases out there!

This actually wouldn't be a huge change for clap to implement this, but it would introduce an additional dep (such as indexmap ) and thus would need to be behind a non-default cargo feature.

The change you'd make is using cfg statements to change the hashmap to an indexmap and provide an ArgMatches method to simply return an Iterator of the requested info, perhaps as a tuple? (Such as (name, &[vals])?) This could be similar to how Arg::values_of works.

For the iterator itself, it'd fine to use the Values impl as a guide.

kbknapp · 2018-03-19T20:28:45Z

Another more lengthy option (but arguably more correct) is instead of returning an Iterator of some custom data type tuple, returning the MatchedArg itself. I say more lengthy because you'd have to document MatchedArg and provide the appropriate methods to get at it's internals such as values, occurrences, and indices. However, this is more flexible in the long run as you wouldn't need to add more ArgMatches methods in order to deal with different types of iteration...you'd just let the calling code deal with it has it wants.

neysofu · 2018-03-20T15:22:38Z

My application has a custom version subcommand that works this way:

// Prints human readable version if no options are given
$ cmd version
Cmd 2.42.1-rc1 (fj23o83b)

// Switches to line-by-line format if options are given, to facilitate machine readability. The requested data is printed in the specified order.
$ cmd version --tags --major
rc1
2

However there are many other use cases. Tasks execution order in a Gulp-like application, apply filters in a specific order in GNU find, ecc.

neysofu · 2018-03-20T15:24:41Z

This actually wouldn't be a huge change for clap to implement this, but it would introduce an additional dep (such as indexmap ) and thus would need to be behind a non-default cargo feature.

That's a pity. I thought it would be as simple as adding a method to ArgMatches, with no fundamental changes to the actual program.

kbknapp · 2018-03-21T00:33:11Z

Right now, clap uses a HashMap to hold the matches, which is inherently not in order. Switching to an IndexMap would be basically a drop in replacement, but because it's another dep I'd want people to have to opt-in to it.

So it basically would be just a cfg statement which adds the method on the matches struct and uses an indexmap instead of hashmap. And I'd create a struct to act as the iterator, and impl Iterator for it.

Something like (details left out for brevity):

#[cfg(not(feature = "indexmap"))]
struct ArgMatches {
    args: HashMap
}

#[cfg(feature = "indexmap")]
struct ArgMatches {
    args: IndexMap
}

impl ArgMatches {
    #[cfg(feature = "indexmap")]
    fn iter(&self) -> MatchIter {
        MatchIter { iter: self.args.iter() }
    }
}

#[cfg(feature = "indexmap")]
struct MatchIter {
    iter: // whatever IndexMap::iter returns
}

#[cfg(feature = "indexmap")]
impl Iterator for MatchIter {
    type Item = // whatever args.iter().next() returns

    fn next(&mut self) -> Self::Item {
        self.iter().next()
    }
}

So it's a pretty straight forward change.

neysofu · 2018-03-21T04:44:53Z

Thanks for the tips, I'll start working on a prototype then. 👍

neysofu · 2018-03-24T21:49:51Z

I feel hesitant adding an iter method to ArgMatches; it feels like a duplicate of indexmap's functionality. ArgMatches.args: IndexMap is already public, so what's wrong with the user doing matches.args.iter()?

neysofu · 2018-03-24T23:19:41Z

You can take a look at the prototype here. There is some more work to do, specifically:

write documentation for both the feature and the iter method;
remove the feature from the default set (it was a quick hack to make it build every time without passing arguments);
add some more tests.

Please let me know how you like the API and if you want any changes. Congrats on the codebase BTW, I know this is a small change but still, it is a pleasure to dive in.

kbknapp · 2018-03-25T03:17:56Z

I feel hesitant adding an iter method to ArgMatches; it feels like a duplicate of indexmap's functionality. ArgMatches.args: IndexMap is already public, so what's wrong with the user doing matches.args.iter()?

My current policy is public, but undocumented internals are effectively private and should not be relied on. Sure, I can't stop anyone from using them directly, but there's the caveat that if I change them without warning it's considered a breaking change. Hence the #[doc(hidden)]

Most of the time public but undocumented is because it needs to be public because other mods (or eventually crates) need to touch it, but those mods (or crates) are maintained by myself or other clap members and thus it's essentially just a private API.

kbknapp · 2018-03-25T03:22:03Z

Looks like a good start! It's a small nit, but let's change the feature name to iter_matches because iter is somewhat ambiguous and may conflict with other types of iteration we may do in the future.

Let me know if you have any other questions 👍

neysofu · 2018-03-25T05:18:39Z

I'll create a PR for discussing further details, sounds good?

kbknapp · 2018-03-25T19:21:25Z

Yep, I'd actually prefer it that way because it's easier to comment and review 😄 👍

XX · 2018-04-14T20:07:09Z

I also need this functionality

bradwood · 2020-04-26T19:57:12Z

Hi all, I'm also looking to use something like this. I'd like to process my args in a functional fashion. The PR, however, seems to have died a death. Given that 3.0 is on the horizon how much of a bad idea is it for me to just use matches.args.iter() in 2.x? Also, will this make it into 3.0?

PS, I don't need ordering.

CreepySkeleton · 2020-04-27T07:59:01Z

@bradwood Not as bad as you'd think: 2.x branch is frozen. I'd be really surprised if we change something regarding publicity of those fields because there is plenty of code out there relying on those, ugh. So, if you aren't planning on switching to 3.0 and you're OK with the fact that every time you use private API god kills a kitten, you can do that, yes.

Beware: those will be truly private in 3.0.

I suspect it won't make it into 3.0. The main question is: what are you going to iterate through? You can't have &Arg from ArgMatches because it stores Ids, basically a hash of those. On doesn't simply restore the Arg from the hash.

bradwood · 2020-04-27T12:32:10Z

i am trying to call map() on each arg I find to invoke the appropriate handler function based on the argument passed...

And, as regards the kittens, I'm cool with that! :)

CreepySkeleton · 2020-04-27T12:39:11Z

clap has Arg::validator that is called on every value passed in. And of course, you can always iterate over std::env::args() and map it as you please; you don't really need clap for that.

pksunkara · 2020-04-27T14:48:39Z

I think I can see a certain use case for this where people want to do dependency injection pattern for evaluating their args.

kbknapp · 2020-04-27T15:37:51Z

[I] am trying to call map() on each arg I find to invoke the appropriate handler function based on the argument passed...

For a while I was floating around the idea of adding filter/map capabilities to arguments themselves, in a similar manner to the current validator. In essence it's just a chance for people to call a function when a particular argument/value is used and do something with its value persistently. Some use cases were, "I want my CLI to accept options yes|no but I want the value held my context struct to be some concrete type."

But others also wanted more a "handler" type functionality (which is something some python libraries offer). This can be done already by (hackily) using the validator functionality.

Currently, the mapping of values can be done pretty easily after the parsing in user code manually, but not during parsing.

I'm not 100% sure where I fall on the thinking of this one though, as is the added complexity and public API space worth the small benefit of doing something at parse time instead of manually afterwards?

Also perhaps more importantly, have users thought through what happens if a particular map/filter/handler whatever we'd call it has side affects, but parsing ultimately fails? Maybe that's just a, "hey probably best you don't have side affects, but if you do be aware it can have XYZ consequences." I mean you can already (mis)use the validators for handlers, so I'd be less inclined to include that functionality.

CreepySkeleton · 2020-04-27T17:02:36Z

I want my CLI to accept options yes|no but I want the value held my context struct to be some concrete type

Resolved by derive (and partially by value_of_t* methods).

I'm not sure I get your "handlers" idea, I imagine it's something akin #1185:

a single Arg::custom_parse which accepts a function that takes the arg/value currently being parsed, and returns an Option where None is skip this value (i.e. parse it as a different arg), or Some(T) is "save this to the matches" it could be the value as passed in, or it could be something different entirely (i.e. a map).

Some thoughts:

When is it ran, exactly? Before default values/environment variables were populated or after? What about required/conflicts stuff? How does it interact with old-style validators? With groups? Possible values?

I'm not sure I'm ready to answer these questions.
We've already got the flush-related bug. The problem here that clap exits with sys::exit() which implies that no destructors are being accounted for, no finallizers are being run, the text still awaiting to be processed by IO facilities is lost, etc... When I'm reading the description above, I'm given impression that "heavy" side-effect functions are OK, so will be users.

In nutshell: I think this kind of tool is too vague and too powerful for a command line parsing library to provide. I don't think clap should be trying to accommodate every single CLI in existence, the number of well-supported designs is quite enough already.

bradwood · 2020-04-27T17:38:45Z

FWIW, I'm not needing these map()s to run during arg parsing, I'm happy to pass an iterator of matched arguments into a function pipeline. I'll take a look at the validator, as I've not really looked properly at that. It might work for my use case.

CreepySkeleton · 2020-04-27T17:47:08Z

@bradwood Out of morbid curiosity, what is the core problem you're trying to solve? Why do you think map is a good way to approach this kind of problem?

I suspect you're just "doing it wrong".

kbknapp · 2020-04-27T18:48:41Z

Resolved by derive (and partially by value_of_t* methods).

Exactly. I'm describing a problem/issue that existed prior to derive macros. The "handlers" was just a common theme that popped up every so often, especially when people were moving from Python based CLIs which offer a "when this argument is used, run this function." Although that's a totally different way to build/design a CLI and thus I don't think it's really applicable to how clap runs.

I don't think clap should be trying to accommodate every single CLI in existence, the number of well-supported designs is quite enough already.

100% agree. I'm just providing some historic context 😉

@bradwood I agree with @CreepySkeleton it sounds like you may be approaching the problem from a particular design angle and we may be seeing it as an XY problem.

Some of the principal issues with a feature mapping over arguments (and their values) is:

For us (clap) to provide it, it needs to be general enough to fit all (or most) use cases, and with feature in particular there are a lot of things to consider and edge cases
If we're talking about true map (i.e. T->U), that can essentially already be done easily as @CreepySkeleton mentioned
If we're talking about handlers (i.e. perform some action for each argument); unless these actions are only blind side affects they:
- need some sort of context passed to/from them (other args, maybe application context, etc.)
- cannot rely on deterministic order or hierarchy
- most likely need the concept of fallibility (what happens on failure)

Those final three bullets are some of what making this feature a general clap feature difficult. If you don't have any of those requirements and are truly only wanting blind side affects, I'd question if there may be a better way to represent the CLI.

If you can give some examples of what you're trying to do, we may be able to either point you in a direction or give you a way to accomplish that task with existing features.

antonshkurenko · 2022-01-18T11:59:42Z

Writing this comment because of #3286: absence of iterator blocks our update from 2x to 3x

epage · 2022-01-18T16:25:04Z

@tonyshkurenko can you expand on the use case for why this blocks updating?

antonshkurenko · 2022-01-25T15:26:54Z

I mean we literally used field

clap/src/args/arg_matches.rs

Line 66 in d51ae89

pub args: HashMap<&'a str, MatchedArg>,

Part of our app was based on this. Now it's not available (correct me, if I'm wrong) without any alternative. That's why we can't update

epage · 2022-01-25T15:40:48Z

Yes, I am asking for your use case or why you use the field. What problem are you trying to solve by accessing args?

fosskers · 2022-01-25T18:05:01Z

I was attempting this about a year ago, and my particular use case was either:

to be able to filter back out matched flags under certain circumstances
to be able to convert all the matched flags back into their original string forms

epage · 2022-02-01T22:02:20Z

I was looking at exposing a present_arg_ids() -> impl Iterator but it looks like this is not something we'd fix in clap3 but have to wait for clap4.

We only store an Id for args, so this is the most of what we can return. We can make it so users can check that its a specific arg. The problem is then using this to do any dynamic operations on ArgMatches. The problem is that ArgMatches functions take a impl Key. To properly initialize a new Id from the Key, we have to add a impl Display for Id. Id only has a reasonable Display in debug mode. If we expose Id and impl Display, there is a good chance users will leverage the Display and being able to catch that not working depends on how thorough they test. Alternatively, we could look to drop the Display bounds in release modes and only impl Display for Id in debug mode.

In general, resolving this is getting hairy and something we are unlikely to want to take on. Our current plan is to wait around 6 months at minimum between major breaking releases (which this change would probably be part of), so we are looking around June for resolving this.

epage · 2022-02-01T22:06:54Z

Also, when implementing this, my expectation is that we'll provide something like:

pub fn present_arg_ids(&self) -> Iterator<Item=&Id>

The order will be undefined. If at the time, we are still using IndexMap internally, we should intentionally mix up the values via a HashSet in debug builds.
- Use of IndexMap is an implementation detail and we are considering removing it to reduce our dependencies
- We aren't using IndexMap correctly for ensuring ordering is preserved, so seeing it there is misleading
- Ordering of the first occurrence of each arg id is only meaningful if multiple occurrences is not used but in most schemes where ordering matters, multiple occurrences will likely be used. Providing ordering is a trap for people to not consider this
I suggest the function's example creates a Vec<(Id, &str)> to show how to order all occurrences of all args by index.

omarabid · 2022-02-06T12:54:17Z

So if I want to iterate over matched arguments (flags) how should I approach this today?

epage · 2022-02-07T15:33:08Z

You'll need to define a slice of all of you argument IDs (names) and iterate through that.

omarabid · 2022-02-07T17:12:20Z

You'll need to define a slice of all of you argument IDs (names) and iterate through that.

Does that mean I have to check if the flag was set. Because I have dozens of flags, so it seems to me to be a waste to check for all flags when I only want to know which flags were set.

epage · 2022-02-07T17:19:18Z

Yes. This is just a workaround. Like I said, we cannot fix this until clap 4.0.

A few minor cleanups to make it easier to submit PRs: * migrate to 2021 edition * run `cargo fmt` * bump some dependencies to their latest * Note that Clap is locked to 2.x until clap-rs/clap#1206 is fixed in v4 (?) * spelling fixes TODO for another PR: * fix many clippy [from_over_into](https://rust-lang.github.io/rust-clippy/master/index.html#from_over_into) warnings in the src/map/map_value.rs file. * README.md still has a few bugs. Uncomment a section in the lib.rs to show them during the test run.

nyurik · 2022-06-03T18:11:02Z

@epage thx for all the hard work on this! Is there a PR or some other work being done in this space? Per #2763 (comment), it seems all layered configuration projects need to be able to enumerate over ArgMatches to support Clap3+derive. Thanks!

epage · 2022-06-03T18:19:17Z

From my note I added in the issue

This is blocked on #1041 / #2150 where we move away from opaque, non-public Ids for arguments to something the user can programmatically act on

We cannot make these changes in clap v3 and but this should be available in clap v4.0.0. My expectation is that I'm going to soon wrap up the work I'm doing for clap 3.2, give time to collect and handle any feedback, and then start on v4.0.0. v3.2 is a big change, see #3732. My intention is to make this big change right before 4.0 so we don't have to live a long time with any mistakes identified through feedback.

For now, we are focusing only on iterating over the argument ids and not the values. This provides a building block for more obscure use cases like iterating over argument values, in order. We are not providing it out of the box at the moment both to not overly incentize a less common case, because it would abstract away a performance hit, and because we want to let people experiment with this and if a common path emerges we can consider it then if there is enough users. Fixes clap-rs#1206

kbknapp added T: new feature labels Mar 19, 2018

neysofu mentioned this issue Apr 1, 2018

New feature: 'iter_matches' #1240

Closed

epage added S-triage Status: New; needs maintainer attention. and removed D: medium labels Dec 9, 2021

epage mentioned this issue Jan 12, 2022

List all parsed args #3286

Closed

2 tasks

epage added this to the 4.0 milestone Feb 1, 2022

epage added M-breaking-change Meta: Implementing or merging this will introduce a breaking change. E-medium Call for participation: Experience needed to fix: Medium / intermediate and removed S-triage Status: New; needs maintainer attention. labels Feb 1, 2022

nyurik mentioned this issue May 29, 2022

Lint, cleanup, fmt, clippy, 2021 edition maurocordioli/viperus#5

Open

nyurik mentioned this issue Jun 5, 2022

Example using clap v3 derive storyfeet/clap_conf#4

Open

epage mentioned this issue Jul 16, 2022

Backwards incompatibility for ArgMatches and UnwindSafe #3876

Closed

2 tasks

epage mentioned this issue Aug 12, 2022

fix(parser)!: Store args in a group, rather than values #4072

Merged

epage mentioned this issue Aug 15, 2022

feat(parser): Report what arg ids are present #4080

Merged

epage closed this as completed in #4080 Aug 15, 2022

nyurik mentioned this issue Aug 15, 2022

clap derive support missing/broken bnjjj/twelf#15

Open

epage mentioned this issue Aug 19, 2022

Change visibility of some structures and functions for config-rs integration #4093

Closed

2 tasks

Iterating over ArgMatches #1206

Iterating over ArgMatches #1206

Comments

neysofu commented Mar 9, 2018 • edited by epage

kbknapp commented Mar 19, 2018

neysofu commented Mar 19, 2018 • edited

kbknapp commented Mar 19, 2018

kbknapp commented Mar 19, 2018 • edited

neysofu commented Mar 20, 2018 • edited

neysofu commented Mar 20, 2018

kbknapp commented Mar 21, 2018

neysofu commented Mar 21, 2018

neysofu commented Mar 24, 2018 • edited

neysofu commented Mar 24, 2018 • edited

kbknapp commented Mar 25, 2018

kbknapp commented Mar 25, 2018

neysofu commented Mar 25, 2018

kbknapp commented Mar 25, 2018

XX commented Apr 14, 2018

bradwood commented Apr 26, 2020

CreepySkeleton commented Apr 27, 2020

bradwood commented Apr 27, 2020

CreepySkeleton commented Apr 27, 2020

pksunkara commented Apr 27, 2020

kbknapp commented Apr 27, 2020

CreepySkeleton commented Apr 27, 2020

bradwood commented Apr 27, 2020

CreepySkeleton commented Apr 27, 2020 • edited

kbknapp commented Apr 27, 2020 • edited

antonshkurenko commented Jan 18, 2022

epage commented Jan 18, 2022

antonshkurenko commented Jan 25, 2022

epage commented Jan 25, 2022

fosskers commented Jan 25, 2022

epage commented Feb 1, 2022

epage commented Feb 1, 2022

omarabid commented Feb 6, 2022

epage commented Feb 7, 2022

omarabid commented Feb 7, 2022

epage commented Feb 7, 2022

nyurik commented Jun 3, 2022

epage commented Jun 3, 2022

Iterating over `ArgMatches` #1206

Iterating over `ArgMatches` #1206

neysofu commented Mar 9, 2018 •

edited by epage

neysofu commented Mar 19, 2018 •

edited

kbknapp commented Mar 19, 2018 •

edited

neysofu commented Mar 20, 2018 •

edited

neysofu commented Mar 24, 2018 •

edited

neysofu commented Mar 24, 2018 •

edited

CreepySkeleton commented Apr 27, 2020 •

edited

kbknapp commented Apr 27, 2020 •

edited