Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Early support for reading program arguments from env-variable (issue #712) #734

Closed
wants to merge 3 commits into from

Conversation

purew
Copy link

@purew purew commented Nov 6, 2016

Hi Kevin,

This is Anders from Thursday's meetup. Take a look at the code and let me know what you want changed or what is missing?

Also, I wasn't sure where you put the unit-tests so it went in the same file. I also didn't spend any time on documentation until the interface is ironed out.


This change is Reviewable

@coveralls
Copy link

coveralls commented Nov 6, 2016

Coverage Status

Coverage increased (+0.009%) to 91.264% when pulling df90d4a on PureW:master into 087cee7 on kbknapp:master.

Copy link
Member

@kbknapp kbknapp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great so far, I think if we make those two simple changes I'd be good with a merge.

src/app/mod.rs Outdated
@@ -30,6 +31,8 @@ use errors::Result as ClapResult;
pub use self::settings::AppSettings;
use completions::Shell;

const DEFAULT_ARGS_ENV_VAR: &'static str = "VAR";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can do away with a default variable name. The user typing None is just as if they'd typed the env var name VAR

src/app/mod.rs Outdated
/// env-var `env_var_name`.
///
/// Defaults to read from "VAR" if `env_var_name` is `None`.
pub fn get_matches_with_env(self, env_var_name: Option<String>)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's change this function declaration to:

pub fn get_matches_with_env<S: Into<&'b str>(self, env_var_name: S) -> ArgMatches<'a> {

This will allow people to use string slices too and not have to allocate a string if they don't need to. This will also require a few lines to change inside the function too, but not many.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahh right, I've used the Into-trait once or twice before. I always have to lookup the syntax though :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't that be <S:AsRef<str>(self, env_var_name: &S)?
env_var_name isn't stored anywhere, so why the lifetime 'b?
And &str doesn't implement From<&String>.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tormol Good catch! Yeah the AsRef trait should be used, not Into

@kbknapp
Copy link
Member

kbknapp commented Nov 6, 2016

Also, when you change these commits, could you use the commit subject to something like feat: adds the ability to get matches from environmental vars?

This will allow that commit to be picked up automatically by clog and put into the next versions change log. 😄

@kbknapp
Copy link
Member

kbknapp commented Nov 6, 2016

Ah yeah, and the unit tests are fine where they are. I'm good with using those or placing them as integration tests in the tests/ directory. Thanks again for tackling this issue! 👍

@purew
Copy link
Author

purew commented Nov 6, 2016

About the commit-message, should I rebase the history to a single commit?

EDIT: Nevermind, I found the contribution-guidelines!

@purew
Copy link
Author

purew commented Nov 6, 2016

I just realized that this simple solution might be too simple. Take the following case:

VAR="--arg1 val1" ./prog --arg1 val2 --arg1 val3

if the program calls values_of("arg1") I think it returns (val1, val2, val3) which is likely not what the user wants. The program arguments should take complete precedence over env-var I believe...

src/app/mod.rs Outdated
.chain(vargs.split(0x20)
.into_iter()
.map(|s| s.to_os_string()))
.collect();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can use b' ' instead of 0x20
.split() returns an iterator (osstringext::OsSplit), so .into_iter() does nothing.
Vec implements std::iter::Extend so this can be written as args.extend(vargs.split(b' ').map(...)).

@tormol
Copy link
Contributor

tormol commented Nov 6, 2016

.chain() will return the elements of the first iterator first, so you would get ["arg2,"arg3","arg1"].
Putting env first will allow options in the program arguments to override those set by ARG.

There are some other issues with concatenating:

  • ARG='--' ./prog --opt will turn --opt into an argument
  • ARG='--opt-with-optional-argument' ./prog arg will make arg an argument to the option.

Fixing that might be difficult, and i'd guess environmental variables will mostly be used to set options, so I don't think this is a big problem.

Separating arguments with .split() makes it impossible to give arguments with spaces in them,
and repeated spaces results in n-1 empty slices.

bash doesn't seem to split on non-ASCII whitespace either, so that's probably fine but you might want to separate on tabs and newlines too.

@kbknapp
Copy link
Member

kbknapp commented Nov 6, 2016

I just realized that this simple solution might be too simple...

Ah yeah, I'd forgotten we talked about that. There's not really a good way around this, unless we keep track of how a particular argument was matched. That may be quite a bit more complicated to implement.

@kbknapp
Copy link
Member

kbknapp commented Nov 6, 2016

@tormol regards to your bullet points above, I think so long as the env args are always first, or last it's fine. So long as it's deterministic and documented that way.

Personally I'd prefer that the env args come first. Allowing them to be overriden if the consumer of the library uses Arg::overrides_with. Either way though, we'll need some way to solve the $ VAR="--opt val1" prog --opt val2. I'm of the mind that env args are of a lower precedent than manually input args.

There are quite a few edge cases that have many different correct answers depending on use case.

If an option allows multiple values, should $ VAR="--opt val1" prog --opt val2 cause the final value to be val2, or val1, val2? Etc.

@kbknapp
Copy link
Member

kbknapp commented Nov 6, 2016

We could also have different methods for _pre_pending vs _ap_pending these env args.

@purew
Copy link
Author

purew commented Nov 10, 2016

Just pushed a rebased commit, but I'm having a problem with an error I haven't seen before:

error[E0282]: unable to infer enough type information about `_`
    --> src/app/mod.rs:1655:44
     |
1655 |                                           .get_matches_with_env(None);
     |                                            ^^^^^^^^^^^^^^^^^^^^ cannot infer type for `_`
     |
     = note: type annotations or generic parameter binding required

Doesn't sound like a difficult error but I don't understand what _ is...

I tried splitting on space, tab, and newline but OsStr.split does seems to only support splitting on a char, see comments.

@kbknapp
Copy link
Member

kbknapp commented Nov 10, 2016

The error comes from rustc not being able to tell what kind of Option<T> to use for the None value. I forget the exact syntax, but you'd have to do something like Option::<OsStr>::None This is part of the reason I'd rather just remove the Option and default VAR.

@tormol
Copy link
Contributor

tormol commented Nov 10, 2016

I have written an iterator that returns the arguments in a variable.
It supports escaping bytes with \ and quoting arguments with " and '.
It's completely untested.

@purew
Copy link
Author

purew commented Nov 11, 2016

This is part of the reason I'd rather just remove the Option and default VAR

@kbknapp Ohh, I misunderstood your first comment about VAR. Just pushed commit that removes the Option.

@tormol Looks interesting! Is it safe to put byte-sequences in env-vars?

@coveralls
Copy link

coveralls commented Nov 11, 2016

Coverage Status

Coverage increased (+1.4%) to 92.674% when pulling c20a61c on PureW:master into 087cee7 on kbknapp:master.

src/app/mod.rs Outdated
/// Similar to [`App::get_matches`] but also reads args from
/// env-var `env_var_name`.
///
/// If the same argument is specified both as program argument
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's change this to say, "The env var args are appended to the manually entered arguments (if any) and thus if an argument is provided both in the env var and manually, this could result in either a conflict, override, or additional value depending on the specific rules for said arg."

@kbknapp
Copy link
Member

kbknapp commented Nov 11, 2016

@purew looks great! Could you just change that one comment so that it's more explicit? This way people know no magic is happening, and they must be careful.

Once that is done, @tormol if you'd like to give any other advice I think this will be good for a merge.

Once this is merged I'd also be interested in adding in the splitting that @tormol wrote once we can add some tests which could be done as a separate PR.

@coveralls
Copy link

coveralls commented Nov 13, 2016

Coverage Status

Coverage decreased (-0.07%) to 91.181% when pulling b9ea3bf on PureW:master into 087cee7 on kbknapp:master.

@tormol
Copy link
Contributor

tormol commented Nov 13, 2016

@purew: safe/unsafe for who?
I wrote escape bytes because i got uncertain, but I don't see any problems with multibyte/non-ASCII characters.

@purew
Copy link
Author

purew commented Nov 13, 2016

@tormol I was merely curious about byte-sequences in env-vars and if common shell's allow that?

@tormol
Copy link
Contributor

tormol commented Nov 15, 2016

VAR=`echo -e '\xff'` works in bash.

@kbknapp
Copy link
Member

kbknapp commented Nov 15, 2016

@purew see #748 where @casey mentions having a MergeType enum which specifies how conflicts between these various "other ways" to specify arguments (command line, env, and files). I haven't decided on exactly how I want to go about that, and what sort of naming I want to use (which is kind of a 🚲 🏚️ ), but if I decide to use something like that, I'd want it to apply to this PR as well, and can't add that in after the fact. So I'm going to put this on hold for a short time until I solidify how I want that API to look. I'll post back here once I've got the details 😉

@homu
Copy link
Contributor

homu commented Dec 28, 2016

☔ The latest upstream changes (presumably #784) made this pull request unmergeable. Please resolve the merge conflicts.

@kbknapp
Copy link
Member

kbknapp commented May 16, 2017

I'm going to close this due to inactivity and punting on this issue until I have a little more free-time to work on it - feel free to re-open and we can re-energize it.

@kbknapp kbknapp closed this May 16, 2017
@purew
Copy link
Author

purew commented May 16, 2017

Ok, are all "upstream" blockers resolved? You mentioned solidifying a particular API before continuing on this PR:

So I'm going to put this on hold for a short time until I solidify how I want that API to look. I'll post back here once I've got the details

@kbknapp
Copy link
Member

kbknapp commented May 16, 2017

@purew for single args getting a value from the ENV all blockers are solved. That could be merged today. The precedence is CLI overrides all, ENV overrides default vals, and defaults override empty values. The Arg API could be Arg::env_var.

The allowing an ENV var to hold an arbitrary number of args (such as VAR='some --opt=val) I still haven't even decided on the specifics. It will be something along the lines of allowing an "external argv" but the merge semantics haven't been decided yet. This is mostly due to my not having had much time this year to work on side projects, which is unfortunate.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants