-
Notifications
You must be signed in to change notification settings - Fork 2.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve handling of bool parameters. #2427
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm cool with the motivation and functionality here.
Left a couple comments/questions.
Also, please have a look at the failing tests.
Thanks!
test/helpers.py
Outdated
@@ -160,7 +160,7 @@ def run_locally(self, args): | |||
temp = CmdlineParser._instance | |||
try: | |||
CmdlineParser._instance = None | |||
run_exit_status = luigi.run(['--local-scheduler', '--no-lock'] + args) | |||
run_exit_status = luigi.run(args + ['--local-scheduler', '--no-lock']) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason for the order swap here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I digged a bit deeper in the argparse
module and it turns out that cases like
program --optional-arg(nargs="?") positional-arg
cannot be properly interpreted as argparse consumes optional arguments prior to positionals. So internally, "positional-arg"
is considered the value of optional-arg
although it does not necessarily require a value.
I swapped the order above just to test something, but apparently the change in this PR would require all bool parameters to be placed behind the task name on the command line. This also causes some tests to fail. The "Running from the Command Line" docs do not encourage to have parameters before the task family, and I personally would never use the cli this way but still, prohibiting it in the first place is a major change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So you're fine with that change? If yes, I'll also add a note to the docs on that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, i'm fine with it - so long as tests pass (they aren't currently)
luigi/parameter.py
Outdated
parser_kwargs = super(BoolParameter, cls)._parser_kwargs(*args, **kwargs) | ||
parser_kwargs.update({ | ||
"nargs": "?", | ||
"const": True, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the importance of nargs
and const
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Argparse translates action="store_true"
into nargs=0, const=True
which means that no value (nargs=0
) is expected for that argument, and if the argument but no value is given, it should be interpreted as True
(const=True
). The proposed feature is almost identical, except for the nargs="?"
part.
Did you consider adding a different version of BoolParameter or a parameter to it's constructor? I'm so afraid of changing the existing behavior will cause breakage (even if we think it's backward compatible) |
@Tarrasch This is actually what I fear as well. I added a parameter luigi.BoolParameter.improved_parsing = True at an early stage to trigger the improved parsing for all parameters, including internal ones. |
Sounds good! Could we rename it to something like this?
Just an idea that could improve readability. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks fine, but can you just change all the examples back so that --local-scheduler can be passed first? Shouldn't be any issues now that the "explicit" way is optional and not default right?
test/cmdline_test.py
Outdated
@@ -125,12 +125,12 @@ def test_cmdline_local_scheduler(self, logger): | |||
|
|||
@mock.patch("logging.getLogger") | |||
def test_cmdline_other_task(self, logger): | |||
luigi.run(['--local-scheduler', '--no-lock', 'SomeTask', '--n', '1000']) | |||
luigi.run(['SomeTask', '--local-scheduler', '--no-lock', '--n', '1000']) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This shouldn't be needed to change right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
✔︎
test/cmdline_test.py
Outdated
self.assertEqual(dict(MockTarget.fs.get_all_data()), {'/tmp/test_1000': b'done'}) | ||
|
||
@mock.patch("logging.getLogger") | ||
def test_cmdline_ambiguous_class(self, logger): | ||
self.assertRaises(Exception, luigi.run, ['--local-scheduler', '--no-lock', 'AmbiguousClass']) | ||
self.assertRaises(Exception, luigi.run, ['AmbiguousClass', '--local-scheduler', '--no-lock']) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here, we don't want code who pass --local-scheduler
first to break.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
✔︎
Yes, you're right, they don't require any changes. |
The requested changes are included now, so feel free to review again :) |
Sorry for the late review. It looks all good but I would like some docs too. Simple class-documentation for |
See docs for usage.
See docs for usage.
Description, Motivation and Context
The current BoolParameter implementation does not allow to negate parameters on the command line. Example:
So right now, there is no way to set
param_b
to False from the cli. Of course, users can redefineparam_b
to mean the opposite e.g.no_param_b
, but this might become cumbersome in the long run (at least for us it kind of does). Also, there are some luigi-internal bool parameters (e.g.send_failure_email
,check_unfulfilled_deps
) which are true by default, so one has to toggle the value manually in the config file for running individual tasks.The behavior I have in mind is:
To achieve this, one needs to pass different values of
nargs
andconst
to the ArgumentParser, which requires only a minor change. I also updatedBoolParameter.normalize()
which was originally introduced in #1461. Until now it returnedbool(value)
orNone
. I think it is safe to return the result ofparse()
orNone
for back-compat in case of an error.Does this sound reasonable to you?
Have you tested this? If so, how?
I extended all existing tests that deal with BoolParamter parsing to cover the update behavior.