Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

clap_derive: Vec/Option<Vec> behavior is inconsistent with other types #1772

Closed
CreepySkeleton opened this issue Mar 31, 2020 · 31 comments
Closed
Labels
A-derive Area: #[derive]` macro API C-enhancement Category: Raise on the bar on expectations
Milestone

Comments

@CreepySkeleton
Copy link
Contributor

CreepySkeleton commented Mar 31, 2020

This is actually a summary of TeXitoi/structopt#364

Current behavior

  • Vec<T> and Option<Vec<T>> are not required = true by default.
  • Vec<T> is multiple = true by default which allows not only multiple values (--foo 1 2 3) but also multiple occurrences (--foo 1 --foo 2 3).
  • Option<Vec<T>> additionally allows zero number of values (--foo).

What's wrong

  • The fact that Vec<T> is not required by default is inconsistent with all the other types in clap_derive that are required by default unless they are wrapped in Option (except bool but it's a very special case).
  • The fact that Option<Vec<T>> is different from Vec<T>, different not in "not required" sense, confuses newcomers.
  • The fact that Vec<T> allows multiple occurrences along with values is misleading.

Proposal

  • Use min_values = 1 for both Option<Vec<T>> and Vec<T> instead of multiple = true, allowing only non-zero number of values and disallow multiple occurrences (--foo 1 2 but not --foo nor --foo 1 --foo 2). If a user wants to allow zero values or multiple occurrences as well, they can explicitly specify it via min_values = 0 and multiple = true respectively.
  • Use required = true for Vec<T>.

cc @TeXitoi @Dylan-DPC @pksunkara

@CreepySkeleton CreepySkeleton added T: new feature A-derive Area: #[derive]` macro API C-enhancement Category: Raise on the bar on expectations and removed T: new feature labels Mar 31, 2020
@TeXitoi
Copy link
Contributor

TeXitoi commented Mar 31, 2020

The fact that Vec is not required by default is inconsistent with all the other types in clap_derive that are required by default unless they are wrapped in Option (except bool but it's a very special case).

I strongly disagree on this. A vector can have 0 elements, and that's the most used case. If you need at least one, you can add min_values(1) yourself.

@CreepySkeleton
Copy link
Contributor Author

If we use min_values(0), an empty vec would represent --foo <no values>, but "no --foo" would be disallowed due to required.

Option<Vec<T>> would represent just what it does currently:

command Vec<T> result Option<Vec<T>> result
app --foo 1 2 [1, 2] Some([1, 2])
app --foo [] Some([])
app ERRROR None

@pksunkara pksunkara added this to the 3.0 milestone Apr 9, 2020
@pksunkara
Copy link
Member

The min_values and occurrences issue should be fixed in #1026. Let's leave it out of the discussion here.

The fact that Option<Vec> is different from Vec, different not in "not required" sense, confuses newcomers.

Can you explain what this means? I don't understand the sentence.

I think we should have min_values = 0 and multiple = true for both but required = true for Vec<T>.

@TeXitoi
Copy link
Contributor

TeXitoi commented Apr 13, 2020

I really disagree. Most of the time you're using a Vec, you just want the list of parameters, and you don't really care if an empty parameter was provided. Forcing using an option of vec in the most common case would impact a lot the ergonomics.

@cecton
Copy link
Contributor

cecton commented Apr 17, 2020

Here is a use case where I find structopt very confusing: paritytech/substrate@6b0eed4

When I read the struct I see that listen_addr is a required argument and port is optional. But the structopt parameters shows me that both arguments are conflicting. Before that commit, the parameters weren't declared as conflicting and I had no way to know that listen_addr was actually optional. The only way I could know that is by reading the table.

I personally do think that Vec<_> should mean required (the argument must be provided) and Option<Vec<_>> should mean optional (the argument can be omitted).

I invite you to read the lengthy discussion here.

Forcing using an option of vec in the most common case would impact a lot the ergonomics.

Maybe but it will be consistent with the rest of the API.

@CreepySkeleton
Copy link
Contributor Author

CreepySkeleton commented Apr 18, 2020

Pff, it took me some time to get back to, sorry for the delays guys.

The min_values and occurrences issue should be fixed in #1026. Let's leave it out of the discussion here.

That issue is related here indeed, but we can't leave it out of this discussion because this is about defaults in derive, not about distinguishing between the two.

Forcing using an option of vec in the most common case would impact a lot the ergonomics.

@TeXitoi I see your point about Option<Vec<T>> vs Vec<T> ergonomics. I agree with you, simple Vec is much more handy than Option<Vec<T>> to work with.

But we wouldn't be forcing anything! Users would still be very much able to do #[clap(required = false)] if they want to keep a raw Vec

struct Opts {
    // This Vec is not option so it's required BY DEFAULT
    // because it's not bool and not Option
    required_vec: Vec<u32>,

    // We can override default behavior 
    // while keeping the raw Vec 
    #[clap(required = false)]
    not_required_vec: Vec<u32>,

    // We can use Option<Vec> instead of overriding
    // We lose some of the ergonomics but acquire the 
    // ability to distinguish between <empty list> an <no option at all>
    opt_vec: Option<Vec<u32>> 
}

In other words, this is question of good defaults, and I believe I've found the answer.

I think that our willingness of changing defaults should depend of what the most widespread usage is in practice. I decided to check via grep.app and this has been eye opening for me:

(IMPORTANT: this research assumes that #[structopt] attributes are one-line which is is quite frequently not true. If somebody has an idea how to cover

#[structopt(
 method = name,
 method2 = expr, method3 = expr)

attributes with regexes, speak up!)

  • required Vec or not.


    Conclusion: current default is exactly what most of our users need (~85%). Changing it would be unwise.
  • min_values = 0 vs min_values = 1 (behavior of multiple = true)

    • I searched "contains min_values" and there are only 13 results. None of them are min_values = 0. A little bonus: almost all of them are min_values = 1 which is the default behavior, lol.

    Conclusion: nobody wants `min_values = 0`. Making it default would be pointless.
  • Multiple values vs multiple occurrences. Default behavior for now is both. This is kind of thing that is hard to grep for, so I tried to figure out how many users are aware of multiple occurrences in the first place:

    • Contains number_of_values - 11 results.
    • Contains max_values: 1 result.
    • Contains multiple = true - 11 matches.


      This is all I've managed to come up with. Looks like there are not many people out there who are aware or want multiple occurrences, but the results here are probably very inaccurate. I vote for using multiple_values = true and ask the community after the beta is out.

@pksunkara
Copy link
Member

I think that our willingness of changing defaults should depend of what the most widespread usage is in practice.

I don't agree with this. Maybe all of them tried Option<Vec<T>> first and then decided that it didn't make a difference so they optimised it back to Vec<T>.

Also, we are not technically breaking this because this a separate library clap and not structopt. I would opt for it easier to understand w.r.t semantics in how to use the types for the fields and being different with each other rather than existing usage.

min_values should always be 0. As you saw, people saw Vec and with semantics they thought min_values might have been zero and that is why they customised the behaviour to be min_values = 1. (Related to #1682)

Note: multiple means multiple occurrences not values according to clap docs

Type required multiple
Vec<T> true false
Option<Vec<T>> false false
Vec<Vec<T>> true true
Option<Vec<Vec<T>>> false true
command Vec<T> Option<Vec<T>> Vec<Vec<T>> Option<Vec<Vec<T>>>
app -f 1 2 [1, 2] Some([1, 2]) [[1, 2]] Some([[1, 2]])
app -f [] Some([]) [[]] Some([[]])
app ERRROR None ERROR None
app -f 1 2 -f 3 4 [3, 4] Some([3, 4]) [[1, 2], [3, 4]] Some([[1, 2], [3, 4]])
app -ff [] Some([]) [[], []] Some([[], []])

By default, if one wants to support a -vvv kind of flag, they would have to use Option<Vec<Vec<bool>>>. But they always have the option of customizing stuff just by describing it using methods like how you already proposed.

#[clap(short, multiple = true, required = false, max_values = 0)]
verbose: Vec<bool>

@CreepySkeleton
Copy link
Contributor Author

Maybe all of them tried Option<Vec> first and then decided that it didn't make a difference so they optimised it back to Vec.

Sounds very unlikely, but maybe. @TeXitoi , has anybody ever told you something like "I tried Option<Vec> first, but settled with Vec because it's handy"? I remember only one such message: TeXitoi/structopt#285 .

Also, we are not technically breaking this because this a separate library clap and not structopt

I agree, absolutely. But we aren't talking just about breaking changes, we're talking about good defaults and whether it's worth to change them. If we find that the current defaults are not optimal, we shall change them, otherwise we shall leave them as is.

As you saw, people saw Vec and with semantics they thought min_values might have been zero and that is why they customised the behaviour to be min_values = 1

Eleven (min_values = 1) people out of 214 (total number of Vecs discovered). Very few people have been confused, so min_values = 1 default is preferable for Vec and min_values = 0 is preferable for Option<Vec>. In my opinion.


I like your Vec<Vec<T>> idea. We need to fix #1026 first though. Also, in the light of #1682, we may also specialize Vec<[T; N]> (arrays) and Vec<(T1, T2, ...)> (tuples) along with the Option<...> variants of them.

By default, if one wants to support a -vvv kind of flag, they would have to use Option<Vec<Vec>>

This is exactly what parse(from_occurrences) is for, isn't it?

@CreepySkeleton
Copy link
Contributor Author

CreepySkeleton commented Apr 18, 2020

Note: multiple means multiple occurrences not values according to clap docs

This part of reply ended up being literally drenched in sarcasm, an therefore was moved to it's own comment. I'm not blaming anyone and neither I mean to offend anybody. Just stating that the state of affairs is ridiculous.

Sure thing. According to 2.x docs. Well, let's test it! Playground

extern crate clap;

use clap::{App, Arg};

fn main() {
    let m = App::new("app")
        .arg(Arg::with_name("arg").multiple(true).takes_value(true))
        .arg(Arg::with_name("opt").long("foo").multiple(true).takes_value(true))
        .get_matches_from(&["test", "val1", "val2", "--foo", "optv1", "optv2"]);
        
    println!("{:?}", m.values_of("arg").unwrap().collect::<Vec<_>>());
    println!("{:?}", m.values_of("opt").unwrap().collect::<Vec<_>>());
}

So, we expect it to error, right?

["val1", "val2"]
["optv1", "optv2"]

Wait, what?

In practice, I've seen a lot of people taking multiple as synonym for "multiple values". And this is how it does work: multiple values + multiple occurrences. This is the way it's been working for, how long? Five years? People got used to it, it's in their DNA now. We can't just change the behavior (while preserving the name of the method) and expect it to be received well. And we didn't.

Well, the master is much better! Right? Right. You'd expect it to be documented properly...

Specifies that the argument may have an unknown number of multiple values. Without any other
settings, this argument may appear only *once*.

For example, `--opt val1 val2` is allowed, but `--opt val1 val2 --opt val3` is not.

Of course. "May appear only once". Crystal clear. What the function looks like?

pub fn multiple(mut self, multi: bool) -> Self {
    if multi {
        self.setb(ArgSettings::MultipleOccurrences); // <- YOU MUST BE KIDDING
        self.setting(ArgSettings::MultipleValues)
    } else {
        self.unsetb(ArgSettings::MultipleOccurrences);
        self.unset_setting(ArgSettings::MultipleValues)
    }
}

Sarcasm aside, I very much like how it's done in master:

// multiple values + multiple occurrences
pub fn multiple(mut self, multi: bool) -> Self;

// multiple values only
pub fn multiple_values(self, multi: bool) -> Self;

// multiple occurrences only
pub fn multiple_occurrences(self, multi: bool) -> Self;

Well done. Well done indeed. No sarcasm, the API is clear and tidy. Except copy-pasting docs and examples from 2.x probably wasn't the brightest idea.

@TeXitoi
Copy link
Contributor

TeXitoi commented Apr 18, 2020

Option<Vec> is a very recent feature. No one asked anything about Option<Vec> for several years.

I've never seen any question about Option<Vec> except from @cecton as far as I remember. Just one but report saying that Option<Vec> was a breaking change because someone was using Option<Vec<u8>>.

@CreepySkeleton
Copy link
Contributor Author

CreepySkeleton commented Apr 18, 2020

Also, the last like of @pksunkara 's table is impossible unless user explicitly specifies takes_value = false. All of the variants should be takes_value = true by default (multiple_values implies it).

app -ff [] Some([]) [[], []] Some([[], []])

@pksunkara
Copy link
Member

In that case, assume that I meant multiple_occurrences only when I said multiple in my earlier comment.

What do you mean by takes_value = true by default? Does all the types we mentioned above has that as default? Then doesn't it mean that app -f is not allowed too? Vec<T> would never be allowed 0 values.

@CreepySkeleton
Copy link
Contributor Author

Does all the types we mentioned above has that as default?

In clap 2: it does so explicitly, this is documented:

Setting multiple(true) for an option with no other details, allows multiple values and multiple occurrences because it isn't possible to have more occurrences than values for options

In clap 3: not explicitly, I haven't been able to locate the place this is handled in, but I just checked this program:

use clap::{App, Arg};

fn main() {
    let m = App::new("app")
        .arg(Arg::with_name("opt").long("foo").multiple_values(true))
        .get_matches();

    println!("{:?}", m.values_of("opt").map(|v| v.collect::<Vec<_>>()));
}

And yes, it works just like if takes_value(true) has been set:

D:\workspace\probe>cargo run --
    Finished dev [unoptimized + debuginfo] target(s) in 0.13s
     Running `target\debug\probe.exe`
None

D:\workspace\probe>cargo run -- --foo val1
    Finished dev [unoptimized + debuginfo] target(s) in 0.14s
     Running `target\debug\probe.exe --foo val1`
Some(["val1"])

D:\workspace\probe>cargo run -- --foo val1 val2
    Finished dev [unoptimized + debuginfo] target(s) in 0.13s
     Running `target\debug\probe.exe --foo val1 val2`
Some(["val1", "val2"])

app -fff from the table would have been parsed as (also checked with master)

app -f ff
    -- --
  flag value

@pksunkara
Copy link
Member

@kbknapp Since we have you here, please take a look at this issue.

@CreepySkeleton
Copy link
Contributor Author

Actually, if you wrap your head around the terminology a bit, it becomes quite clear:

  • Non-special types are required by default. (For instance, String is not special)
  • bool is an example of special type, it's not required by default,
  • Vec, Option, arrays, tuples and such can be considered "adapter" for other types. Precise behavior depends on adapter and the type it wraps.

It's a problem of good documentation more than anything, about the way it conveys this setup to user. I'm amending my previous statement about required with Vec, I see no problem with it anymore. Points about number_of_values and multiple_occurrences vs multiple_values still stand.

@kbknapp
Copy link
Member

kbknapp commented Apr 29, 2020

Whew ok that was a read!

Ever time I see issues pop up around multiple I cringe a little inside. In clap 1 there was only the concept of multipe = true, and I only ever really cared about -f v v -f v v (i.e. both occurrences and values) for no other reason than having not put a lot of thought into it.

Then clap 2 came around and there was confusion when people wanted multiple values, but not multiple occurrences, or vice versa. It was also over-loaded with flags which only occurrences are possible. This spawned things like number_of_values or min_values/max_values. Again, it was super confusing for the subset of people that had these various requirements, but most just adapted and moved on.

Looking back, I wish I'd made the sane default of multiple=true and no other info being -f v -f v is allowed -f v v is not. Which is what I attempted to make the default in 3.x because this was by far the most common request, most people expected -f v -f v (minus people mostly familiar with IRC like CLIs which was a minority). It also no longer required the number_of_values=1, which in all honest looks kind of confusing if you're not already familiar with clap...."Why am I say I want multiple values, AND also saying number of values is 1?!"

I also wanted to use the 3.x breaking changes to fix this confusion once and for all. I have a much better idea of what is most common. I want the defaults to be the most common case, and allow those who need something different be able to do so.

Unfortunately before I went on hiatus I didn't finish the docs which I'm sure has lead to even more confusion around the various types of multiples, and their defaults or interactions with each other.

I see multiples as one of those things, "It sounds sooo simple at first, why can't it just work!?" Until you dig into the details and realize, "Ooooh. Yeah I can totally see how some would want X while others need Y, and you need to distinguish between them sometimes, but not always." It's kind of like Strings in that manner; such a simple concept yet so hard to do correctly. ...Or CLIs for that matter 😜

Ok, so back to the topic at hand:

I find myself agreeing with @CreepySkeleton here. We're talking about sane defaults. I can also totally respect @TeXitoi's sentiment of Option<Vec<T>> is much more of a pain to deal with than Vec<T>. However I disagree that all options should default to 0 values. In fact, I've come across very few CLIs that rely on --foo <no value>. They do exist for sure, but most I've run into either expect a value, or expect no argument to be used (with or without a default value).

So I think this comes down to balancing Rust ergonomics and semantics (Vec<T>, where all Vecs can have zero values by default) with clap ergonomics and semantics (Option<T> means optional argument, thus Option<Vec<T>> means optional option which accepts multiple values...combined with in clap speak all options by default must have at least 1 value). I think it's perfectly reasonable to impose clap defaults/semantics here. People are building a struct specifically around claps semantics, so I think that although Vec<T> can totally have zero or more values by default the fact that it does not lining up with claps 1 or more values by default is fine and people will first look to their knowledge about clap semantics when building these structs. This is in part because there is no intrinsic Rust type that is one or more values.

I would suggest simply calling it out in the docs directly, that although the default is Vec<T> means required = true, if you want the ergonomics of Vec<T> just override the required attribute with false

Also, perhaps I'm out of touch, but if I wanted to allow multiple values with occurrences via derive in clap I'd instinctivily try to do Option<Vec<Vec<T>>>

command type
-f v -f v Some([v, v])
-f v v Some([v, v])
-f v v -f v v Some([[v, v], [v, v]])

app -fff from the table would have been parsed as [-f=ff]

Correct. That is intended behaviour due to how parsing is structured. I don't think there is anything we can, or would want to do about this.

It's a problem of good documentation more than anything, about the way it conveys this setup to user.

100% agree.


Having said all that, if we settled on Vec<T> defaulting to not-required because Rust ergonomics are more important, it's not a huge issue to me so long as it's documented very clearly.

@sourcefrog
Copy link
Contributor

I just came here from TeXitoi/structopt#396 and wanted to say, I really hope in a v3 breaking change, Clap can address the multiple / number_of_values confusion, both for the API and for the new derive macros.

Looking back, I wish I'd made the sane default of multiple=true and no other info being -f v -f v is allowed -f v v is not. Which is what I attempted to make the default in 3.x because this was by far the most common request, most people expected -f v -f v (minus people mostly familiar with IRC like CLIs which was a minority). It also no longer required the number_of_values=1, which in all honest looks kind of confusing if you're not already familiar with clap...."Why am I say I want multiple values, AND also saying number of values is 1?!"

+100 this bit. It really sticks out as a trap in what's generally a friendly API, and is exacerbated by the fact that you can easily not notice the bug is there.

@epage
Copy link
Member

epage commented Jul 9, 2021

Catching up on this issue. Sorry if I missed something or am just repeating others.

Multiple values: From a code author and user perspective, I have found variable number of values (-f 1 and -f 1 2) to be confusing and brittle. In the spirit of "pit of success" (make the right things easy), this should require extra work to opt-in to. I'd go as far as saying that the user should always be required to specify an attribute for this

Delimited values: When I want to support multiple values, I tend to do it with a delimiter (-f 1,2 or Rust's --features "foo bar"). If Rust support &'static str in const-generics, we could easily provide a DelimitedValues container for users that implements FromStr, IntoIterator, etc.

When I'm writing a CLI, I tend to prefer multiple occurrences with delimited values all flattened, There tends to be some friction in this process, so I tend to only do it when I really need it. When I do skip on one, I skip on delimited values, preferring multiple occurrences. It tends to be more composable. If I need require at least one value, it is across both occurrences and delimited values.

The fact that Vec is not required by default is inconsistent with all the other types in clap_derive that are required by default unless they are wrapped in Option (except bool but it's a very special case).

So far, the talk seems to be about Option being the way of specifying required and anything else is a deviation provided for ergonomics (Vec, bool).

Something being left out is alternative mental models, like focusing on the type's semantics, for which Option and Vecs behavior could be made consistent.

So let's break down some type semantics:

  • i32: single value (N)
  • Option<i32>: 0 or 1 values (?)
  • Vec<i32>: 0 or more values (*)
  • (i32, i32): exactly two values (N) (never seen this documented but we'll talk about it later)

(we don't have a native type for "1 or more" (+), though there is the vec1 crate and ranged number of values ({M,N}) is too specialized to worry about)

And with the notes at the top, our considerations are:

  • Variable number of values is brittle and we should at least not make too easy, maybe even make it difficult.
  • It takes extra work to customize the default parsing logic (parse attribute) for natively supported types
  • I'm not aware of a way to plug in a type that behaves like Option, Vec, or bool

My proposal would be to match the type semantics, focusing on occurrences:

  • Option<i32>: 0 or 1 occurrence
  • Vec<i32>: 0 or more occurrences
  • i32: 1 occurrence
  • (i32, i32): multiple values (fixed number), this is a weird one because fixed number of occurrences is rare enough not to provide but fixed number of values seems common enough. Python's argparse goes as far as automatically handling this with type=(str, str) and metavar=("NAME", "VALUE") when you say nargs=2.
  • attribute: multiple values (variable number)

Note: one downside to implementing any intuitive behavior is people have to predict both what was intuitive to you and how complete your implementation is. No-intuitive behavior is fully explicit and predictable. A theoretical fully-intuitive system would be, by definition, predictable. When you land in the middle, people feel some uncertainty. How quickly that uncertainty is dispelled is proportionate to the quality of your library(1) in framing (API design, docs explaining) (2) making it quick to answer questions (reference). As an aside, I think structopt has let me down on both of these fronts and hopefully we'll iterate and improve on this. This doesn't mean we should never try to make things intuitive, we should just avoid cleverness and recognize two reasonable people will have different opinions on where things land with this.

@pksunkara
Copy link
Member

Tuples and other adapters can be handled/discussed in #1717. What we need to finalize in this issue are the Option and Vec combination adapters.

@epage
Copy link
Member

epage commented Jul 10, 2021

To clarify, my intent was not to pull #1717 into this but to look at the problem holistically to see how it can affect decisions made here.

@pksunkara
Copy link
Member

I think we need to first decide on a table like https://docs.rs/structopt/0.3.22/structopt/#type-magic

epage added a commit to epage/clap that referenced this issue Nov 1, 2021
I noticed this while investigating clap-rs#2692.  Since we are making
multiple-occurrences a thing for positional arguments, this allows us to
remove a special case.

Another way to look at this is that we should make the default whatever
we do for dervies (clap-rs#1772).  I'm going to propose we make the derive
always turn `Vec<i32>` into multiple occurences and not multiple values
(with users being able to change it through attributes), but that is an
in-work proposal and not decided yet.
epage added a commit to epage/clap that referenced this issue Nov 1, 2021
I noticed this while investigating clap-rs#2692.  Since we are making
multiple-occurrences a thing for positional arguments, this allows us to
remove a special case.

Another way to look at this is that we should make the default whatever
we do for dervies (clap-rs#1772).  I'm going to propose we make the derive
always turn `Vec<i32>` into multiple occurences and not multiple values
(with users being able to change it through attributes), but that is an
in-work proposal and not decided yet.

BREAKING CHANGE: `Arg::from(...)` will now use `multiple_occurrences`
for a positional `...`, rather than `multiple_values`.
epage added a commit to epage/clap that referenced this issue Nov 2, 2021
I noticed this while investigating clap-rs#2692.  Since we are making
multiple-occurrences a thing for positional arguments, this allows us to
remove a special case.

Another way to look at this is that we should make the default whatever
we do for dervies (clap-rs#1772).  I'm going to propose we make the derive
always turn `Vec<i32>` into multiple occurences and not multiple values
(with users being able to change it through attributes), but that is an
in-work proposal and not decided yet.

BREAKING CHANGE: `Arg::from(...)` will now use `multiple_occurrences`
for a positional `...`, rather than `multiple_values`.
epage added a commit to epage/clap that referenced this issue Nov 4, 2021
In experimenting on clap-rs#1772, I want to write test cases for various
combinations of required or not, values vs occurrences, etc.  There
wasn't really a clear place to put these.

On top of that, I wanted there to be a clear place in the tests for
describing the behavior of special types, to make it easier to audit and
easier to see how a PR for clap-rs#1772 changes things.

As part of this effort in organizing these tests, I reduced the number
of tests that use special types.  This better focuses these tests on the
cases they are intending to cover, rather than pulling in unrelated
features.  This makes it easier to audit special types and makes it so
failures give more focused results, making it easier to see what broke.
epage added a commit to epage/clap that referenced this issue Nov 4, 2021
In experimenting on clap-rs#1772, I want to write test cases for various
combinations of required or not, values vs occurrences, etc.  There
wasn't really a clear place to put these.

On top of that, I wanted there to be a clear place in the tests for
describing the behavior of special types, to make it easier to audit and
easier to see how a PR for clap-rs#1772 changes things.

As part of this effort in organizing these tests, I reduced the number
of tests that use special types.  This better focuses these tests on the
cases they are intending to cover, rather than pulling in unrelated
features.  This makes it easier to audit special types and makes it so
failures give more focused results, making it easier to see what broke.
epage added a commit to epage/clap that referenced this issue Nov 4, 2021
In experimenting on clap-rs#1772, I want to write test cases for various
combinations of required or not, values vs occurrences, etc.  There
wasn't really a clear place to put these.

On top of that, I wanted there to be a clear place in the tests for
describing the behavior of special types, to make it easier to audit and
easier to see how a PR for clap-rs#1772 changes things.

As part of this effort in organizing these tests, I reduced the number
of tests that use special types.  This better focuses these tests on the
cases they are intending to cover, rather than pulling in unrelated
features.  This makes it easier to audit special types and makes it so
failures give more focused results, making it easier to see what broke.
epage added a commit to epage/clap that referenced this issue Nov 4, 2021
In experimenting on clap-rs#1772, I want to write test cases for various
combinations of required or not, values vs occurrences, etc.  There
wasn't really a clear place to put these.

On top of that, I wanted there to be a clear place in the tests for
describing the behavior of special types, to make it easier to audit and
easier to see how a PR for clap-rs#1772 changes things.

As part of this effort in organizing these tests, I reduced the number
of tests that use special types.  This better focuses these tests on the
cases they are intending to cover, rather than pulling in unrelated
features.  This makes it easier to audit special types and makes it so
failures give more focused results, making it easier to see what broke.
epage added a commit to epage/clap that referenced this issue Nov 4, 2021
In experimenting on clap-rs#1772, I want to write test cases for various
combinations of required or not, values vs occurrences, etc.  There
wasn't really a clear place to put these.

On top of that, I wanted there to be a clear place in the tests for
describing the behavior of special types, to make it easier to audit and
easier to see how a PR for clap-rs#1772 changes things.

As part of this effort in organizing these tests, I reduced the number
of tests that use special types.  This better focuses these tests on the
cases they are intending to cover, rather than pulling in unrelated
features.  This makes it easier to audit special types and makes it so
failures give more focused results, making it easier to see what broke.
epage added a commit to epage/clap that referenced this issue Nov 5, 2021
Before:
- `bool`: a flag
- `Option<_>`: not required
- `Option<Option<_>>` is not required and when it is present, the value
  is not required
- `Vec<_>`: multiple values, optional
- `Option<Vec<_>>`: multiple values, min values of 1, optional

After:
- `bool`: a flag
- `Option<_>`: not required
- `Option<Option<_>>` is not required and when it is present, the value
  is not required
- `Vec<_>`: multiple occurrences, optional
  - optional: `Vec` implies 0 or more, so should not imply required
- `Option<Vec<_>>`: multiple occurrences, optional
  - optional: Use over `Vec` to detect when no option being present when
    using multiple values

Motivations:

My priorities were:
1. Are we getting in the users way?
2. Does the API make sense?
3. Does the API encourage best practices?

I was originally concerned about the lack of composability with
`Option<Option<_>>` and `Option<Vec<_>>` (and eventually `Vec<Vec<_>>`).
It prescribes special meaning to each type depending on where it shows
up, rather than providing a single meaning for a type generally.  You
then can't do things like have `Option<_>` mean "required argument with
optional value" without hand constructing it.  However, in practice the
outer type correlates with the argument occurrence and the inner type with the
value.   It is rare to want the value behavior without also the
occurrence behavior.  So I figure it is probably fine as long as people
can set the flags to manually get the behavior they want.

`Vec<_>` implies multiple occurrences, rather than multiple values.
Anecdotally, whenver I've used the old `Arg::multiple`, I thought I was
getting `Arg::multiple_occurrences` only.  `Arg::multiple_values`,
without any bounds or delimeter requirement, can lead to a confusing
user experience but isn't a good default for these.  On top of that, if
someone does have an unbounded or a delimeter multiple values, they are
probably also using multiple occurrences.

`Vec<_>` is optional because a `Vec` implies 0 or more, so we stick to
the meaning of the rust type.

`Option<Vec<_>>` ends up matching `Vec<_>` which an raise the question
of why have it.  Some users might prefer the type.  Otherwise, this is
so users can no when the argument is present or not when using
`min_values(0)`.  Rather than defining an entire policy around this and
having users customize it, or setting `min_values(0)` without the rest
of a default policy, this gives people a blank slate to work from.

Another option would have been to not infer a setting if someone sets a
handful of settings manually, which would have avoided the confusion in
Issue clap-rs#2599 but I see that being confusing (for someone who knows the
default, they will be expecting it to be additive; which flags?) and
brittle (as flags are added or changed, how do we ensure we keep this
up?)

Tests were added to ensure we support people customizing the behavior to
match their needs.

This is not solving:
- `Vec<Vec<_>>`, see clap-rs#2924
- `(T1, T2)`, `Vec<(T1, T2)>`, etc, see clap-rs#1717

Fixes clap-rs#1772
Fixes clap-rs#2599
See also clap-rs#2195
epage added a commit to epage/clap that referenced this issue Nov 5, 2021
Before:
- `bool`: a flag
- `Option<_>`: not required
- `Option<Option<_>>` is not required and when it is present, the value
  is not required
- `Vec<_>`: multiple values, optional
- `Option<Vec<_>>`: multiple values, min values of 0, optional

After:
- `bool`: a flag
- `Option<_>`: not required
- `Option<Option<_>>` is not required and when it is present, the value
  is not required
- `Vec<_>`: multiple occurrences, optional
  - optional: `Vec` implies 0 or more, so should not imply required
- `Option<Vec<_>>`: multiple occurrences, optional
  - optional: Use over `Vec` to detect when no option being present when
    using multiple values

Motivations:

My priorities were:
1. Are we getting in the users way?
2. Does the API make sense?
3. Does the API encourage best practices?

I was originally concerned about the lack of composability with
`Option<Option<_>>` and `Option<Vec<_>>` (and eventually `Vec<Vec<_>>`).
It prescribes special meaning to each type depending on where it shows
up, rather than providing a single meaning for a type generally.  You
then can't do things like have `Option<_>` mean "required argument with
optional value" without hand constructing it.  However, in practice the
outer type correlates with the argument occurrence and the inner type with the
value.   It is rare to want the value behavior without also the
occurrence behavior.  So I figure it is probably fine as long as people
can set the flags to manually get the behavior they want.

`Vec<_>` implies multiple occurrences, rather than multiple values.
Anecdotally, whenver I've used the old `Arg::multiple`, I thought I was
getting `Arg::multiple_occurrences` only.  `Arg::multiple_values`,
without any bounds or delimeter requirement, can lead to a confusing
user experience but isn't a good default for these.  On top of that, if
someone does have an unbounded or a delimeter multiple values, they are
probably also using multiple occurrences.

`Vec<_>` is optional because a `Vec` implies 0 or more, so we stick to
the meaning of the rust type.

`Option<Vec<_>>` ends up matching `Vec<_>` which an raise the question
of why have it.  Some users might prefer the type.  Otherwise, this is
so users can no when the argument is present or not when using
`min_values(0)`.  Rather than defining an entire policy around this and
having users customize it, or setting `min_values(0)` without the rest
of a default policy, this gives people a blank slate to work from.

Another option would have been to not infer a setting if someone sets a
handful of settings manually, which would have avoided the confusion in
Issue clap-rs#2599 but I see that being confusing (for someone who knows the
default, they will be expecting it to be additive; which flags?) and
brittle (as flags are added or changed, how do we ensure we keep this
up?)

Tests were added to ensure we support people customizing the behavior to
match their needs.

This is not solving:
- `Vec<Vec<_>>`, see clap-rs#2924
- `(T1, T2)`, `Vec<(T1, T2)>`, etc, see clap-rs#1717

Fixes clap-rs#1772
Fixes clap-rs#2599
See also clap-rs#2195
epage added a commit to epage/clap that referenced this issue Nov 5, 2021
Before:
- `bool`: a flag
- `Option<_>`: not required
- `Option<Option<_>>` is not required and when it is present, the value
  is not required
- `Vec<_>`: multiple values, optional
- `Option<Vec<_>>`: multiple values, min values of 0, optional

After:
- `bool`: a flag
- `Option<_>`: not required
- `Option<Option<_>>` is not required and when it is present, the value
  is not required
- `Vec<_>`: multiple occurrences, optional
  - optional: `Vec` implies 0 or more, so should not imply required
- `Option<Vec<_>>`: multiple occurrences, optional
  - optional: Use over `Vec` to detect when no option being present when
    using multiple values

Motivations:

My priorities were:
1. Are we getting in the users way?
2. Does the API make sense?
3. Does the API encourage best practices?

I was originally concerned about the lack of composability with
`Option<Option<_>>` and `Option<Vec<_>>` (and eventually `Vec<Vec<_>>`).
It prescribes special meaning to each type depending on where it shows
up, rather than providing a single meaning for a type generally.  You
then can't do things like have `Option<_>` mean "required argument with
optional value" without hand constructing it.  However, in practice the
outer type correlates with the argument occurrence and the inner type with the
value.   It is rare to want the value behavior without also the
occurrence behavior.  So I figure it is probably fine as long as people
can set the flags to manually get the behavior they want.

`Vec<_>` implies multiple occurrences, rather than multiple values.
Anecdotally, whenever I've used the old `Arg::multiple`, I thought I was
getting `Arg::multiple_occurrences` only.  `Arg::multiple_values`,
without any bounds or delimiter requirement, can lead to a confusing
user experience and isn't a good default for these.  On top of that, if
someone does have an unbounded or a delimiter multiple values, they are
probably also using multiple occurrences.

`Vec<_>` is optional because a `Vec` implies 0 or more, so we stick to
the meaning of the rust type.

`Option<Vec<_>>` ends up matching `Vec<_>` which an raise the question
of why have it.  Some users might prefer the type.  Otherwise, this is
so users can no when the argument is present or not when using
`min_values(0)`.  Rather than defining an entire policy around this and
having users customize it, or setting `min_values(0)` without the rest
of a default policy, this gives people a blank slate to work from.

Another option would have been to not infer a setting if someone sets a
handful of settings manually, which would have avoided the confusion in
Issue clap-rs#2599 but I see that being confusing (for someone who knows the
default, they will be expecting it to be additive; which flags?) and
brittle (as flags are added or changed, how do we ensure we keep this
up?)

Tests were added to ensure we support people customizing the behavior to
match their needs.

This is not solving:
- `Vec<Vec<_>>`, see clap-rs#2924
- `(T1, T2)`, `Vec<(T1, T2)>`, etc, see clap-rs#1717

Fixes clap-rs#1772
Fixes clap-rs#2599
See also clap-rs#2195
epage added a commit to epage/clap that referenced this issue Nov 5, 2021
Before:
- `bool`: a flag
- `Option<_>`: not required
- `Option<Option<_>>` is not required and when it is present, the value
  is not required
- `Vec<_>`: multiple values, optional
- `Option<Vec<_>>`: multiple values, min values of 0, optional

After:
- `bool`: a flag
- `Option<_>`: not required
- `Option<Option<_>>` is not required and when it is present, the value
  is not required
- `Vec<_>`: multiple occurrences, optional
  - optional: `Vec` implies 0 or more, so should not imply required
- `Option<Vec<_>>`: multiple occurrences, optional
  - optional: Use over `Vec` to detect when no option being present when
    using multiple values

Motivations:

My priorities were:
1. Are we getting in the users way?
2. Does the API make sense?
3. Does the API encourage best practices?

I was originally concerned about the lack of composability with
`Option<Option<_>>` and `Option<Vec<_>>` (and eventually `Vec<Vec<_>>`).
It prescribes special meaning to each type depending on where it shows
up, rather than providing a single meaning for a type generally.  You
then can't do things like have `Option<_>` mean "required argument with
optional value" without hand constructing it.  However, in practice the
outer type correlates with the argument occurrence and the inner type with the
value.   It is rare to want the value behavior without also the
occurrence behavior.  So I figure it is probably fine as long as people
can set the flags to manually get the behavior they want.

`Vec<_>` implies multiple occurrences, rather than multiple values.
Anecdotally, whenever I've used the old `Arg::multiple`, I thought I was
getting `Arg::multiple_occurrences` only.  `Arg::multiple_values`,
without any bounds or delimiter requirement, can lead to a confusing
user experience and isn't a good default for these.  On top of that, if
someone does have an unbounded or a delimiter multiple values, they are
probably also using multiple occurrences.

`Vec<_>` is optional because a `Vec` implies 0 or more, so we stick to
the meaning of the rust type.  At least for me, I also rarely need a
required with multiple occurrences argument but more often need optional
with multiple occurrences.

`Option<Vec<_>>` ends up matching `Vec<_>` which an raise the question
of why have it.  Some users might prefer the type.  Otherwise, this is
so users can no when the argument is present or not when using
`min_values(0)`.  Rather than defining an entire policy around this and
having users customize it, or setting `min_values(0)` without the rest
of a default policy, this gives people a blank slate to work from.

Another option would have been to not infer a setting if someone sets a
handful of settings manually, which would have avoided the confusion in
Issue clap-rs#2599 but I see that being confusing (for someone who knows the
default, they will be expecting it to be additive; which flags?) and
brittle (as flags are added or changed, how do we ensure we keep this
up?)

Tests were added to ensure we support people customizing the behavior to
match their needs.

This is not solving:
- `Vec<Vec<_>>`, see clap-rs#2924
- `(T1, T2)`, `Vec<(T1, T2)>`, etc, see clap-rs#1717

Fixes clap-rs#1772
Fixes clap-rs#2599
See also clap-rs#2195
epage added a commit to epage/clap that referenced this issue Nov 5, 2021
Before:
- `bool`: a flag
- `Option<_>`: not required
- `Option<Option<_>>` is not required and when it is present, the value
  is not required
- `Vec<_>`: multiple values, optional
- `Option<Vec<_>>`: multiple values, min values of 0, optional

After:
- `bool`: a flag
- `Option<_>`: not required
- `Option<Option<_>>` is not required and when it is present, the value
  is not required
- `Vec<_>`: multiple occurrences, optional
  - optional: `Vec` implies 0 or more, so should not imply required
- `Option<Vec<_>>`: multiple occurrences, optional
  - optional: Use over `Vec` to detect when no option being present when
    using multiple values

Motivations:

My priorities were:
1. Are we getting in the users way?
2. Does the API make sense?
3. Does the API encourage best practices?

I was originally concerned about the lack of composability with
`Option<Option<_>>` and `Option<Vec<_>>` (and eventually `Vec<Vec<_>>`).
It prescribes special meaning to each type depending on where it shows
up, rather than providing a single meaning for a type generally.  You
then can't do things like have `Option<_>` mean "required argument with
optional value" without hand constructing it.  However, in practice the
outer type correlates with the argument occurrence and the inner type with the
value.   It is rare to want the value behavior without also the
occurrence behavior.  So I figure it is probably fine as long as people
can set the flags to manually get the behavior they want.

`Vec<_>` implies multiple occurrences, rather than multiple values.
Anecdotally, whenever I've used the old `Arg::multiple`, I thought I was
getting `Arg::multiple_occurrences` only.  `Arg::multiple_values`,
without any bounds or delimiter requirement, can lead to a confusing
user experience and isn't a good default for these.  On top of that, if
someone does have an unbounded or a delimiter multiple values, they are
probably also using multiple occurrences.

`Vec<_>` is optional because a `Vec` implies 0 or more, so we stick to
the meaning of the rust type.  At least for me, I also rarely need a
required with multiple occurrences argument but more often need optional
with multiple occurrences.

`Option<Vec<_>>` ends up matching `Vec<_>` which can raise the question
of why have it.  Some users might prefer the type.  Otherwise, this is
so users can detect whether the argument is present or not when using
`min_values(0)`.  Rather than defining an entire policy around this and
having users customize it, or setting `min_values(0)` without the rest
of a default policy, this gives people a blank slate to work from.

Another design option would have been to not infer any special-type
settings if someone sets a handful of settings manually, which would
have avoided the confusion in Issue clap-rs#2599 but I see that being confusing
(for someone who knows the default, they will be expecting it to be
additive; which flags disable inferred settings?) and brittle (as flags
are added or changed, how do we ensure we keep this up?).

Tests were added to ensure we support people customizing the behavior to
match their needs.

This is not solving:
- `Vec<Vec<_>>`, see clap-rs#2924
- `(T1, T2)`, `Vec<(T1, T2)>`, etc, see clap-rs#1717
- `Vec<Option<_>>` and many other potential combinations

Fixes clap-rs#1772
Fixes clap-rs#2599
See also clap-rs#2195
@epage
Copy link
Member

epage commented Nov 5, 2021

Before:

  • bool: a flag
  • Option<_>: not required
  • Option<Option<_>> is not required and when it is present, the value
    is not required
  • Vec<_>: multiple values, optional
  • Option<Vec<_>>: multiple values, min values of 0, optional

After:

  • bool: a flag
  • Option<_>: not required
  • Option<Option<_>> is not required and when it is present, the value
    is not required
  • Vec<_>: multiple occurrences, optional
    • optional: Vec implies 0 or more, so should not imply required
  • Option<Vec<_>>: multiple occurrences, optional
    • optional: Use over Vec to detect when no option being present when
      using multiple values

Motivations:

My priorities were:

  1. Are we getting in the users way?
  2. Does the API make sense?
  3. Does the API encourage best practices?

I was originally concerned about the lack of composability with
Option<Option<_>> and Option<Vec<_>> (and eventually Vec<Vec<_>>).
It prescribes special meaning to each type depending on where it shows
up, rather than providing a single meaning for a type generally. You
then can't do things like have Option<_> mean "required argument with
optional value" without hand constructing it. However, in practice the
outer type correlates with the argument occurrence and the inner type with the
value. It is rare to want the value behavior without also the
occurrence behavior. So I figure it is probably fine as long as people
can set the flags to manually get the behavior they want.

Vec<_> implies multiple occurrences, rather than multiple values.
Anecdotally, whenever I've used the old Arg::multiple, I thought I was
getting Arg::multiple_occurrences only. Arg::multiple_values,
without any bounds or delimiter requirement, can lead to a confusing
user experience and isn't a good default for these. On top of that, if
someone does have an unbounded or a delimiter multiple values, they are
probably also using multiple occurrences.

Vec<_> is optional because a Vec implies 0 or more, so we stick to
the meaning of the rust type. At least for me, I also rarely need a
required with multiple occurrences argument but more often need optional
with multiple occurrences.

Option<Vec<_>> ends up matching Vec<_> which can raise the question
of why have it. Some users might prefer the type. Otherwise, this is
so users can detect whether the argument is present or not when using
min_values(0). Rather than defining an entire policy around this and
having users customize it, or setting min_values(0) without the rest
of a default policy, this gives people a blank slate to work from.

Another design option would have been to not infer any special-type
settings if someone sets a handful of settings manually, which would
have avoided the confusion in Issue #2599 but I see that being confusing
(for someone who knows the default, they will be expecting it to be
additive; which flags disable inferred settings?) and brittle (as flags
are added or changed, how do we ensure we keep this up?).

Tests were added to ensure we support people customizing the behavior to
match their needs.

For now, I am leaving off:

To see this in action, see #2993

@epage
Copy link
Member

epage commented Dec 7, 2021

Something like #2993 was recently merged, along with a new derive reference.

@epage epage closed this as completed Dec 7, 2021
@pksunkara
Copy link
Member

Vec<Option<_>> and many other potential combinations

@epage Are we tracking this somewhere? I couldn't find any issue about this since we closed this one.

@epage
Copy link
Member

epage commented Dec 8, 2021

Not yet but I think we should wait until there are users asking for them.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-derive Area: #[derive]` macro API C-enhancement Category: Raise on the bar on expectations
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants