Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Partially parse args, capturing unknown args into a vec, rather than error #1404

Open
nkeor opened this issue Jan 23, 2019 · 30 comments
Open
Labels
A-parsing Area: Parser's logic and needs it changed somehow. C-enhancement Category: Raise on the bar on expectations S-waiting-on-design Status: Waiting on user-facing design to be resolved before implementing

Comments

@nkeor
Copy link

nkeor commented Jan 23, 2019

Maintainer's notes

Workaround

  • Parse with the args you care about with Command::ignore_errors

Affected Version of clap

  • 2.32.0

Bug or Feature Request Summary

It would be great to have a way to save unknow args into a Vec<> or slice. For example, an Arg::with_name() option, e.g. Arg::with_name('unknow').unknow_args() or something like this.
Maybe it could be that instead of calling get_matches() on the arg list, add a get_know_matches() that returns a tuple, the first element being what would be get_matches() and the second a vector of the unknow args... Something like Python's argparse:

args, unknown = parser.parse_known_args()
let (matches, unknow_matches) = App::new("myprog")
    .version("0.1")
    .author("You <you@example.com>")
    .about("Something about your program")
    .arg(Arg::with_name("debug")
        .short("d")
        .multiple(true)
        .help("Turn debugging information on"))
     .arg(Arg::with_name("quiet")
         .short("q")
         .long("shut-up")
         .help("Shut up"))
      .get_know_matches();

And then, if you call myprog -d --something, you have in matches the normal clap behaviour, and in unknow_matches a vector containg '--something'

I don't know if I'm explaining it well, as English is not my primary language.
EDIT: Save unknow in vector or slice instead of ignoring them

The reason for this is that i'm building a program that has "subargs" (e.g. myprogram -Xh is the message help for myprogram -X, but not the same help as myprogram -Yh or myprogram -h.) I can build this by adding -X and -Y arguments, and run another function based on which arg was used, but clap needs to know all arguments, and that's why this would be nice.

@nkeor nkeor changed the title Ignore unknow args Ignore and save in vector unknow args Jan 25, 2019
@nkeor
Copy link
Author

nkeor commented Feb 3, 2019

I found this: #1361 It's exactly what I was looking for... maybe we should focus on that issue.

@zkat
Copy link
Contributor

zkat commented Dec 15, 2019

I could still quite use this: I want to be able to pass-through the args to a specific subcommand to a child process, ignoring all the direct args to the subcommand itself.

@cecton
Copy link
Contributor

cecton commented Jan 30, 2020

It would be useful for me too

@cecton
Copy link
Contributor

cecton commented Jan 30, 2020

I just found out how to do it! I used: AppSettings::TrailingVarArg, AppSettings::AllowLeadingHyphen

I did it with structopt but you can translate to the clap equivalent.

@pksunkara pksunkara added this to the 3.1 milestone Apr 9, 2020
@Urhengulas
Copy link

Hi @CreepySkeleton, @pksunkara,

I'd be interested in implementing this, if there is some guidance in how to approach this best. My idea would be to do it very similar to AppSettings::TrailingVarArg and put all the "ignored arguments" into the final trailing positional argument.

What do you think?

@pksunkara
Copy link
Member

I don't think anything needs implementation here. As @cecton says, you can use those app settings to allow the end users to add extra arguments.

@Urhengulas
Copy link

I don't think anything needs implementation here. As @cecton says, you can use those app settings to allow the end users to add extra arguments.

Hi @pksunkara, I've tried that, but this only ignores arguments "at the end", not "in the middle". Let me outline my situation:

What I want

Let's assume I have following app:

use std::path::PathBuf;
use structopt::StructOpt;

#[derive(Debug, StructOpt)]
struct Opt {
    #[structopt(short = "-L", long, parse(from_os_str))]
    library_path: Vec<PathBuf>,

    #[structopt(short, long, parse(from_os_str))]
    output: PathBuf,

    #[structopt(short = "T", long)]
    script: Vec<String>,
}

fn main() {
    let opt: Opt = Opt::from_args();
    dbg!(opt);
}

Therefore my cli can be invoked like this:

$ cli -T memory.x -o odir/ -L ldir/ -T memory2.x
[src/main.rs:29] opt = Opt {
    library_path: [
        "ldir/",
    ],
    output: "odir/",
    script: [
        "memory2.x",
        "memory.x",
    ],
}

But now I also want to allow the user to add arbitrary arguments at all positions, like this (which is the short form of this):

$ cli \
    --eh-frame-hdr \
    -flavor gnu \
    -L ldir1/ \
    file.o \
    -o odir/ \
    --gc-sections \
    -L ldir2/ \
    -L ldir3/ \
    -Tmemory.x \
    -Bdynamic

In my software I only care about -L, -T, -o, but all the other arguments should be accepted anyways, since I want to pass them through to another cli. Tbh, I don't necessarily need to capture them, but could also drop them, since currently I obtain all arguments to be passed further with fn std::env::args().

Therfore I'd like aboves invocation to result in sth like this:

[src/main.rs:29] opt = Opt {
    library_path: [
        "ldir1/",
        "ldir2/",
        "ldir3/",
    ],
    output: "dir/",
    script: [
        "memory.x",
    ],
    # `_rest` is nice to have, but actually not needed in my case
    _rest: [
        "--eh-frame-hdr",
        "-flavor",
        "gnu",
        "file.o",
        "--gc-sections",
        "-Bdynamic",
    ]
}

Why &[AppSettings::TrailingVarArg, AppSettings::AllowLeadingHyphen] doesn't work

I've tried the solution suggested above, but this didn't work for me for two reasons:

  1. It only accepts arguments added at the end, but not in the middle of the argument list
    • gets accept: cli -T memory.x -o odir/ -L ldir/ -T memory2.x -Bdynamic
    • get rejected: cli -T memory.x -Bdynamic -o odir/ -L ldir/ -T memory2.x
  2. Needed arguments after the first unknown argument don't get captured anymore
    • If I invoke the cli like this: cli -T memory.x -o odir/ -L ldir/ -T memory2.x -Bdynamic -T memory3.x, the last -T memory3.x gets not added to script, but rather _rest.

After having written all of this down it sounds like a AppSettings::IgnoreAdditionalArgs. What do you think?

@pksunkara
Copy link
Member

You are welcome to try implementing it but I think this would need a complete parsing logic refactor since we need to design the new logic from ground up.

@Urhengulas
Copy link

You are welcome to try implementing it but I think this would need a complete parsing logic refactor since we need to design the new logic from ground up.

@pksunkara I am not familiar with the clap parsing logic, but that is what I feared.. Do you know some workaround or other crate which could help in my case by any chance?

@pksunkara
Copy link
Member

Implementing #1880 might help you with this. You can try researching that direction.

@cecton
Copy link
Contributor

cecton commented Mar 30, 2021

@pksunkara I am not familiar with the clap parsing logic, but that is what I feared.. Do you know some workaround or other crate which could help in my case by any chance?

You can ssk the user to provide the parameters for the inner process separately:

./my_program --my-argument --my-other-argument -- --inner-process-argument1 --inner-process-argument2

That's what -- is usually for in Unix-like command lines.

@Urhengulas
Copy link

You can ssk the user to provide the parameters for the inner process separately:

./my_program --my-argument --my-other-argument -- --inner-process-argument1 --inner-process-argument2

That's what -- is usually for in Unix-like command lines.

@cecton Thank you for the tip. The problem is that I want to provide compatibility with another cli, ld to be precise (man page), and allow the user to invoke it just like they would invoke ld. Therefore I can't enforce additional usage rules.

But I only want to capture some of the arguments to do some preprocessing and then hand over all the arguments to ld.


The approach I am trying to go for now, is to preprocess the args to only include the ones i am interested in and then parse this with StructOpt::from_iter<I: IntoIterator>(iter: I).

@Urhengulas
Copy link

Implementing #1880 might help you with this. You can try researching that direction.

This feature actually looks like a possible solution for me!

@epage epage added A-parsing Area: Parser's logic and needs it changed somehow. C-enhancement Category: Raise on the bar on expectations and removed C: args labels Dec 8, 2021
@epage epage changed the title Ignore and save in vector unknow args Capture unknown args into a vec, rather than error Dec 9, 2021
@epage epage changed the title Capture unknown args into a vec, rather than error Partially parse args, capturing unknown args into a vec, rather than error Dec 9, 2021
@epage
Copy link
Member

epage commented Dec 9, 2021

This would also cover the pre-parsing use case in #1880 by defining a subset of arguments and using that to parse

@epage
Copy link
Member

epage commented Dec 9, 2021

As mentioned in #1880, IgnoreErrors might help with some use cases

@epage
Copy link
Member

epage commented Dec 17, 2021

We need to decide

  • How to opt-in
  • How to expose this in ArgMatches
  • How to expose this in derive

Examples include

We add a AppSettings::PartialParse. We then store everything in ArgMatches::unknown: Vec<OsString>. Like with #[clap(external_subcommand)], the unknown are accessed via a special attribute #[clap(unknown)].

We add a AppSettings::PartialParse. debug_asserts fail if external_subcommand is also set. We then store everything in "" argm like external subcommands. Like with #[clap(external_subcommand)], the unknown are accessed via a special attribute #[clap(unknown_args)].

We add a App::unknown_args(arg: Id). When called, unknown args will be put in ArgMatches[arg]. When not set, we'll error. We then expose this in derive with #[clap(unknown_args)].

Earlier, someone brought up a parse_known. In Python's argparse, they use a parse_known_args() -> Tuple[Namespace, list[str]] (contrasted with parse() -> Namespace). I lean away from this approach because

  • We already have a proliferation of parse methods because of our lack of default parameters
  • Our ArgMatches is already a bit heftier than Namespace, so we can augment it with other data like the unknown, in contrast to Python needing to return it via a tuple
  • We already have a pattern of settings changing how the parse behaves
  • We already have precedence with external subcommand for using the "" field in ArgMatches

@MarijnS95
Copy link

Has any progress been made on this issue? In cargo-apk we have one special -- "subcommand" for which we want to plainly invoke cargo with the environment that cargo-apk detects and sets up, together with (almost) all the arguments passed by the user: even those that are unrecognised by our application. The few Args that we do recognize here are used by us to decide how to set up the environment, and manually passed into std::process::Command when invoking cargo.

As seen in that example and mentioned above trailing_var_arg = true only brings us so far as it kicks in on encountering the first unknown argument whereas we'd like the user to mix these arguments however they please, and only pick out the few bits that we need.

@epage
Copy link
Member

epage commented Nov 7, 2022

No, no progress at this time. See the 4.x milestone for our current focus areas.

@MarijnS95
Copy link

@epage that's unfortunate, it doesn't seem part of 4.x. Is this anything you think an external contributor with no prior clap-internals experience can get started on (given this needs design first), or is there a workaround we could use? I've already committed to using clap since it's quite frankly completely superior to manual error-prone and help-less argument parsing, but broken some of our users by overlooking an implementation for this.

One workaround I assume is to just get all arguments into a Vec<String> (use trailing_var_arg = true without any other args, or is there something better) and then parse that vector into a separate Args struct where I can hopefully ignore unrecognized flags?

@epage
Copy link
Member

epage commented Nov 7, 2022

Earlier, I had posted some API design questions that first need to be resolved. Someone creating a proposal for that would be the first step.

We do have Command::ignore_errors for doing the workaround you mentioned.

@MarijnS95
Copy link

MarijnS95 commented Nov 9, 2022

@epage unfortunately I have a harder time than expected implementing this. Here's what I'm trying so far: https://github.com/MarijnS95/android-ndk-rs/compare/broken-trailing-args

The main problem is that as soon as I add #[clap(ignore_errors = true)], clap starts complaining about the first argument being missing, which is just a bool flag:

$ cargo r --release -p cargo-apk -- apk -- doc --no-deps
    Finished release [optimized] target(s) in 0.03s
     Running `target/release/cargo-apk apk -- doc --no-deps`
[cargo-apk/src/main.rs:104] &cargo_args = [
    "--no-deps",
]
error: The following required argument was not provided: quiet

Usage: cargo-apk [OPTIONS]

For more information try '--help'

EDIT: It seems that the error only continues to the next missing field (workspace) if I provide -q before any unknown arguemnts, as if ignore_errors = true still bails on the first unknown argument and subsequently complains about fields being left uninitialized...

And there are probably more things in that implementation that I can cleanup/optimize. Suggestions welcome!

EDIT2: I've taken the easy route and added a -- separator, even though this may be slightly confusing if they pass cargo-apk supported arguments after the -- and then wonder why cargo-apk ignores certain configuration values...

MarijnS95 added a commit to rust-mobile/ndk that referenced this issue Nov 14, 2022
… with `--` separator

`clap` [does not currently support] parsing unknown arguments into a
side `Vec`, which is exactly what `cargo apk --` needs to parse a few
known `cargo` arguments (such as `--target` and `-p`) but forward the
rest verbatim to the underlying `cargo` subcommand.

`allow_hyphen_values = true` could partially help us out with this, but
it parses all remaining arguments into that `Vec` upon encountering the
first unknown flag/arg, resulting in all known flags after that to also
be treated as "unknown" instead of filling up our `args: Args` struct.

Since [a workaround for this isn't currently functioning], introduce
pure trailing args with an additional `--` separator to make it clear
which arguments go to `cargo-apk` (and are almost all, except
`--device`, forwarded to `cargo`) and which are only passed to `cargo
<subcommand>`.

[does not currently support]: clap-rs/clap#1404
[a workaround for this isn't currently functioning]: clap-rs/clap#1404 (comment)
@epage
Copy link
Member

epage commented Nov 14, 2022

To answer my own questions, I feel like treating unknown args as positionals with a setting, like what @MarijnS95 did for cargo-apk with the former allow_hyphen_values would be the easest way to represent this in clap. Once we get grouped values, the use of this will be further improved.

The main question left for me is more about mapping this to the use case to make sure people are getting what they want. If an end-user has a bug like --takes-value-forwarded --flag-notforwarded positional-forwarded, then a wrapper application will treat it as --flag-notforwaded -- --takes-value-forwarded positional-forwarded and will exhibit wrong behavior, from bad errors to actually running incorrectly. Even if we have grouped values, there isn't a way to express that when forwarding these arguments to the command below.

Is this just a given that people have to live with or is there something that can be done to improve this situation?

@epage
Copy link
Member

epage commented Nov 14, 2022

I guess another question is if this new setting should capture -- or if we should treat that normally. I lean towards treating it normally as it loses some of its meaning if we capture that in a positional and then keep parsing. This does mean that a user might need to -- -- to escape within the forwarded arguments.

@epage
Copy link
Member

epage commented Nov 14, 2022

Hmm, I guess another question is what to do about unknown short arguments.

  • If any are unknown, do we treat the whole set as unknown and forward it?
  • Do we ignore unknown and artificially create the unknown argument to put into the captured positional?
    • How do we know what of the following might be known arguments vs captured values?
  • Do we just error on unknown shorts?

@MarijnS95
Copy link

I feel like treating unknown args as positionals with a setting, like what @MarijnS95 did for cargo-apk with the former allow_hyphen_values would be the easest way to represent this in clap.

@epage For the record, as per my message above, I have not fully been able to get this workaround to work.

As for the rest, all valid questions(especially whether an argument following a flag is a positional or captured value) though I am unsure how to deal with --.

@epage
Copy link
Member

epage commented Nov 21, 2022

@epage For the record, as per my message above, I have not fully been able to get this workaround to work.

I was more referring to a new setting that would apply to a multi-value positional. A question earlier was where to store the unknown arguments and a positional was a great alternative.

@MarijnS95
Copy link

MarijnS95 commented Nov 21, 2022

@epage Sure 🙂 - just curious if you can help me out with the issue in #1404 (comment) or if I should create a new issue for that as to not derail this (much) further (it does feel of similar nature though, wrt bailing out on the first unknown argument because at that point the parser has no idea if the rest should be positionals or values for flags). Though I still intend on deferring to rust-mobile/ndk#363 because this all seems too finicky/complex.

@epage
Copy link
Member

epage commented Nov 21, 2022

Sorry, I had missed that. The challenge with ignore_errors is it doesn't get much use and there are a lot of corner cases where things might not be handled correctly. For example, from your post, I'm guessing defaults aren't applied. With the derive, you also have to make sure all non-default fields are wrapped in Option so the derive can construct your struct (note: all flags like --verbose are implicitly defaulted).

So long as you are using Option and run into problems with ignore_errors, feel free to create issues.

MarijnS95 added a commit to rust-mobile/ndk that referenced this issue Nov 21, 2022
… with `--` separator

`clap` [does not currently support] parsing unknown arguments into a
side `Vec`, which is exactly what `cargo apk --` needs to parse a few
known `cargo` arguments (such as `--target` and `-p`) but forward the
rest verbatim to the underlying `cargo` subcommand.

`allow_hyphen_values = true` could partially help us out with this, but
it parses all remaining arguments into that `Vec` upon encountering the
first unknown flag/arg, resulting in all known flags after that to also
be treated as "unknown" instead of filling up our `args: Args` struct.

Since [a workaround for this isn't currently functioning], introduce
pure trailing args with an additional `--` separator to make it clear
which arguments go to `cargo-apk` (and are almost all, except
`--device`, forwarded to `cargo`) and which are only passed to `cargo
<subcommand>`.

[does not currently support]: clap-rs/clap#1404
[a workaround for this isn't currently functioning]: clap-rs/clap#1404 (comment)
MarijnS95 added a commit to rust-mobile/ndk that referenced this issue Nov 22, 2022
… with `--` separator

`clap` [does not currently support] parsing unknown arguments into a
side `Vec`, which is exactly what `cargo apk --` needs to parse a few
known `cargo` arguments (such as `--target` and `-p`) but forward the
rest verbatim to the underlying `cargo` subcommand.

`allow_hyphen_values = true` could partially help us out with this, but
it parses all remaining arguments into that `Vec` upon encountering the
first unknown flag/arg, resulting in all known flags after that to also
be treated as "unknown" instead of filling up our `args: Args` struct.

Since [a workaround for this isn't currently functioning], introduce
pure trailing args with an additional `--` separator to make it clear
which arguments go to `cargo-apk` (and are almost all, except
`--device`, forwarded to `cargo`) and which are only passed to `cargo
<subcommand>`.

[does not currently support]: clap-rs/clap#1404
[a workaround for this isn't currently functioning]: clap-rs/clap#1404 (comment)
@MarijnS95
Copy link

For completeness I got inspired by https://github.com/sonos/dinghy/blob/87463fec2df24bc01c98817b26f8b04c4ec007fa/cargo-dinghy/src/cli.rs#L108-L154 (which more so seems like an incomplete implementation of trailing_var_arg = true, allow_hyphen_values = true) to build the "recognized argument" filtering myself based on Args::command().get_arguments(), see the full write-down and code in rust-mobile/ndk#363! @epage any feedback and improvements are appreciated, even though it succeeds the few basic but important tests I set up for it.

MarijnS95 added a commit to rust-mobile/ndk that referenced this issue Nov 23, 2022
… with `--` separator

`clap` [does not currently support] parsing unknown arguments into a
side `Vec`, which is exactly what `cargo apk --` needs to parse a few
known `cargo` arguments (such as `--target` and `-p`) but forward the
rest verbatim to the underlying `cargo` subcommand.

`allow_hyphen_values = true` could partially help us out with this, but
it parses all remaining arguments into that `Vec` upon encountering the
first unknown flag/arg, resulting in all known flags after that to also
be treated as "unknown" instead of filling up our `args: Args` struct.

Since [a workaround for this isn't currently functioning], introduce
pure trailing args with an additional `--` separator to make it clear
which arguments go to `cargo-apk` (and are almost all, except
`--device`, forwarded to `cargo`) and which are only passed to `cargo
<subcommand>`.

[does not currently support]: clap-rs/clap#1404
[a workaround for this isn't currently functioning]: clap-rs/clap#1404 (comment)
MarijnS95 added a commit to rust-mobile/ndk that referenced this issue Nov 23, 2022
…`cargo` (#363)

* cargo-apk: Reimplement "default" subcommand trailing args for `cargo` with `--` separator

`clap` [does not currently support] parsing unknown arguments into a
side `Vec`, which is exactly what `cargo apk --` needs to parse a few
known `cargo` arguments (such as `--target` and `-p`) but forward the
rest verbatim to the underlying `cargo` subcommand.

`allow_hyphen_values = true` could partially help us out with this, but
it parses all remaining arguments into that `Vec` upon encountering the
first unknown flag/arg, resulting in all known flags after that to also
be treated as "unknown" instead of filling up our `args: Args` struct.

Since [a workaround for this isn't currently functioning], introduce
pure trailing args with an additional `--` separator to make it clear
which arguments go to `cargo-apk` (and are almost all, except
`--device`, forwarded to `cargo`) and which are only passed to `cargo
<subcommand>`.

[does not currently support]: clap-rs/clap#1404
[a workaround for this isn't currently functioning]: clap-rs/clap#1404 (comment)

* cargo-apk: Separate unrecognized `cargo` arguments from cargo-apk `Args`

With some custom logic, and assuming (validated with an `assert!`) our
`Args` struct doesn't have any positionals, we can implement argument
separation ourselves: this allows the user to mix and match `cargo
<subcommand>` arguments with arguments recognized by `cargo-apk`, and
expect `cargo-apk` to set up the environment as expected (as it did
previously) by taking these arguments into account while disregarding
_only_ unknown arguments.

* Add `args: Args` back to `Ndk` subcommand to see them in `-h`

Mixed `cargo-apk` and `cargo` args will still be split out from
`cargo_args`, and they'll be appended to existing values for `args`.
rib pushed a commit to rust-mobile/cargo-apk that referenced this issue Dec 6, 2022
…`cargo` (#363)

* cargo-apk: Reimplement "default" subcommand trailing args for `cargo` with `--` separator

`clap` [does not currently support] parsing unknown arguments into a
side `Vec`, which is exactly what `cargo apk --` needs to parse a few
known `cargo` arguments (such as `--target` and `-p`) but forward the
rest verbatim to the underlying `cargo` subcommand.

`allow_hyphen_values = true` could partially help us out with this, but
it parses all remaining arguments into that `Vec` upon encountering the
first unknown flag/arg, resulting in all known flags after that to also
be treated as "unknown" instead of filling up our `args: Args` struct.

Since [a workaround for this isn't currently functioning], introduce
pure trailing args with an additional `--` separator to make it clear
which arguments go to `cargo-apk` (and are almost all, except
`--device`, forwarded to `cargo`) and which are only passed to `cargo
<subcommand>`.

[does not currently support]: clap-rs/clap#1404
[a workaround for this isn't currently functioning]: clap-rs/clap#1404 (comment)

* cargo-apk: Separate unrecognized `cargo` arguments from cargo-apk `Args`

With some custom logic, and assuming (validated with an `assert!`) our
`Args` struct doesn't have any positionals, we can implement argument
separation ourselves: this allows the user to mix and match `cargo
<subcommand>` arguments with arguments recognized by `cargo-apk`, and
expect `cargo-apk` to set up the environment as expected (as it did
previously) by taking these arguments into account while disregarding
_only_ unknown arguments.

* Add `args: Args` back to `Ndk` subcommand to see them in `-h`

Mixed `cargo-apk` and `cargo` args will still be split out from
`cargo_args`, and they'll be appended to existing values for `args`.
@MarijnS95
Copy link

MarijnS95 commented Dec 8, 2022

Unfortunately this part of that workaround happens to to not be fully solid, as args.update_from_arg_matches(&m) seems to be setting flags in args back to false if wasn't provided in m.

Likewise when values for an array (i.e. --features x) appear on the left of unrecognized arguments (and end up in #[clap(flatten)] args: Args) and on the right side like --features x --something-unrecognized --features y, args.update_from_arg_matches(&m) overwrites args.features: ["x"] with args.features: ["y"].

I am not sure if args.update_from_arg_matches() is intended to effectively discard existing parse results in args (as it may be hard to understand what values were intended, though I'd say values that weren't specified in m shouldn't be overwritten nor cleared) but is there otherwise different functionality I can use to achieve my goal, or should I just discard that functionality as it really only makes supported fields show up in our cargo apk -- --help anyway...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-parsing Area: Parser's logic and needs it changed somehow. C-enhancement Category: Raise on the bar on expectations S-waiting-on-design Status: Waiting on user-facing design to be resolved before implementing
Projects
None yet
Development

No branches or pull requests

8 participants