Improve handling of bool parameters. #2427

riga · 2018-05-24T18:03:41Z

Description, Motivation and Context

The current BoolParameter implementation does not allow negating parameters on the command line. Example:

import luigi

class MyTask(luigi.Task):
    param_a = luigi.BoolParameter(default=False)
    param_b = luigi.BoolParameter(default=True)  # cannot be set to False on the CLI
    ...

So right now, there is no way to set param_b to False from the CLI. Of course, users can redefine param_b to mean the opposite, e.g. no_param_b, but this becomes cumbersome in the long run (it does for us, at least). Also, some luigi-internal bool parameters (e.g. send_failure_email, check_unfulfilled_deps) are true by default, so one has to toggle the value manually in the config file when running individual tasks.

The behavior I have in mind is:

luigi MyTask
# -> param_a = False (as before)

luigi MyTask --param_a
# -> param_a = True (as before)

luigi MyTask --param_a true
# -> param_a = True (new)

luigi MyTask --param_a false
# -> param_a = False (new)

# same behavior for param_b except for the default as before
luigi MyTask
# -> param_b = True (as before)

To achieve this, one needs to pass different values of nargs and const to the ArgumentParser, which requires only a minor change. I also updated BoolParameter.normalize(), which was originally introduced in #1461. Until now it returned bool(value) or None. I think it is safe to return the result of parse() instead, falling back to None in case of an error for backward compatibility.
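
The nargs/const idea described above can be sketched with plain argparse. This is a minimal standalone illustration and not luigi's actual code: parse_bool and build_parser are hypothetical names, and accepting only "true"/"false" (case-insensitive) is an assumption.

```python
import argparse


def parse_bool(text):
    """Hypothetical converter: map 'true'/'false' (any case) to a bool."""
    lowered = text.lower()
    if lowered == "true":
        return True
    if lowered == "false":
        return False
    raise argparse.ArgumentTypeError("cannot interpret %r as a boolean" % text)


def build_parser():
    """Build a parser with two boolean flags whose value is optional."""
    parser = argparse.ArgumentParser(prog="demo")
    # nargs="?" makes the value optional. const is used when the flag is
    # given without a value, default when the flag is absent altogether.
    parser.add_argument("--param_a", nargs="?", const=True, default=False, type=parse_bool)
    parser.add_argument("--param_b", nargs="?", const=True, default=True, type=parse_bool)
    return parser


if __name__ == "__main__":
    parser = build_parser()
    for argv in ([], ["--param_a"], ["--param_a", "true"], ["--param_b", "false"]):
        print(argv, vars(parser.parse_args(argv)))
```

In this sketch the bare flag keeps its old meaning, while an explicit value makes it possible to switch off a flag that defaults to True.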

Does this sound reasonable to you?

Have you tested this? If so, how?

I extended all existing tests that deal with BoolParameter parsing to cover the updated behavior.

dlstadther

I'm cool with the motivation and functionality here.

Left a couple comments/questions.

Also, please have a look at the failing tests.

Thanks!

dlstadther · 2018-05-24T23:24:54Z

test/helpers.py

@@ -160,7 +160,7 @@ def run_locally(self, args):
        temp = CmdlineParser._instance
        try:
            CmdlineParser._instance = None
-            run_exit_status = luigi.run(['--local-scheduler', '--no-lock'] + args)
+            run_exit_status = luigi.run(args + ['--local-scheduler', '--no-lock'])


Is there a reason for the order swap here?

I dug a bit deeper into the argparse module and it turns out that cases like

program --optional-arg(nargs="?") positional-arg

cannot be properly interpreted as argparse consumes optional arguments prior to positionals. So internally, "positional-arg" is considered the value of optional-arg although it does not necessarily require a value.
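
The ordering problem can be reproduced with a small standalone parser. This is only a sketch with plain argparse, not luigi's command-line parser: build_parser and try_parse are hypothetical helpers, and the required positional "task" merely stands in for the task name.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(prog="demo")
    # Optional flag whose value may be omitted.
    parser.add_argument("--flag", nargs="?", const=True, default=False)
    # Required positional, standing in for the task name.
    parser.add_argument("task")
    return parser


def try_parse(argv):
    """Return the parsed namespace, or None if argparse rejects argv."""
    try:
        return build_parser().parse_args(argv)
    except SystemExit:
        # argparse reports the error on stderr and exits; treat as failure.
        return None


if __name__ == "__main__":
    print(try_parse(["MyTask", "--flag"]))  # flag after the positional
    print(try_parse(["--flag", "MyTask"]))  # flag before the positional
```

With the flag first, "MyTask" is taken as the value of --flag, so nothing is left for the positional and parsing fails.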

I swapped the order above just to test something, but apparently the change in this PR would require all bool parameters to be placed after the task name on the command line. This also causes some tests to fail. The "Running from the Command Line" docs do not encourage placing parameters before the task family, and I personally would never use the CLI this way, but prohibiting it outright is still a major change.

So you're fine with that change? If yes, I'll also add a note to the docs on that.

Yeah, I'm fine with it, so long as tests pass (they don't currently).

dlstadther · 2018-05-24T23:26:57Z

luigi/parameter.py

+        parser_kwargs = super(BoolParameter, cls)._parser_kwargs(*args, **kwargs)
+        parser_kwargs.update({
+            "nargs": "?",
+            "const": True,


What is the importance of nargs and const?

Argparse translates action="store_true" into nargs=0, const=True, which means that no value (nargs=0) is expected for that argument, and if the argument is given without a value, it should be interpreted as True (const=True). The proposed feature is almost identical, except for the nargs="?" part.
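
The equivalence can be checked directly on the Action objects that add_argument returns. A short sketch with plain argparse; the option names --implicit and --explicit are made up for illustration.

```python
import argparse

parser = argparse.ArgumentParser(prog="demo")

# add_argument returns the Action object, which exposes nargs and const.
implicit = parser.add_argument("--implicit", action="store_true")
explicit = parser.add_argument("--explicit", nargs="?", const=True, default=False)

print("store_true:", implicit.nargs, implicit.const)
print("nargs='?': ", explicit.nargs, explicit.const)
```

Both actions share const=True and differ only in nargs, which is what allows the second one to accept an optional value.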

Tarrasch · 2018-06-05T19:20:57Z

Did you consider adding a different version of BoolParameter or a parameter to its constructor? I'm afraid that changing the existing behavior will cause breakage (even if we think it's backward compatible).

riga · 2018-06-06T11:43:03Z

@Tarrasch This is actually what I fear as well. I added a parameter improved_parsing to the BoolParameter init which steers the parsing behavior. It defaults to the class attribute improved_parsing which itself defaults to False, so the current behavior is still standard. However, one can do

luigi.BoolParameter.improved_parsing = True

at an early stage to trigger the improved parsing for all parameters, including internal ones.
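
The fallback described above (constructor argument first, then class attribute) could look roughly like the sketch below. BoolOption is a hypothetical stand-in and not luigi's BoolParameter; in particular, resolving the setting at construction time is an assumption made here to show why the class attribute would have to be set early.

```python
class BoolOption:
    """Hypothetical stand-in for a boolean parameter class."""

    # Class-wide default; False keeps the old behavior.
    improved_parsing = False

    def __init__(self, default=False, improved_parsing=None):
        self.default = default
        # An explicit argument wins; otherwise use the class attribute
        # as it is at construction time.
        if improved_parsing is None:
            improved_parsing = type(self).improved_parsing
        self.improved_parsing = improved_parsing


if __name__ == "__main__":
    early = BoolOption()
    BoolOption.improved_parsing = True  # global opt-in
    late = BoolOption()
    explicit = BoolOption(improved_parsing=False)
    print(early.improved_parsing, late.improved_parsing, explicit.improved_parsing)
```

Under these assumptions, instances created before the class attribute is flipped keep the old behavior, and a per-instance argument always overrides the class-wide setting.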

Tarrasch · 2018-06-06T19:34:20Z

Sounds good! Could we rename it to something like this?

luigi.BoolParameter.parsing = luigi.BoolParameter.(IMPLICIT_PARSING/EXPLICIT_PARSING)

Just an idea that could improve readability.

Tarrasch

Looks fine, but can you just change all the examples back so that --local-scheduler can be passed first? There shouldn't be any issues now that the "explicit" way is optional and not the default, right?

Tarrasch · 2018-06-16T16:46:59Z

test/cmdline_test.py

@@ -125,12 +125,12 @@ def test_cmdline_local_scheduler(self, logger):

    @mock.patch("logging.getLogger")
    def test_cmdline_other_task(self, logger):
-        luigi.run(['--local-scheduler', '--no-lock', 'SomeTask', '--n', '1000'])
+        luigi.run(['SomeTask', '--local-scheduler', '--no-lock', '--n', '1000'])


This shouldn't need to change, right?

Tarrasch · 2018-06-16T16:47:20Z

test/cmdline_test.py

        self.assertEqual(dict(MockTarget.fs.get_all_data()), {'/tmp/test_1000': b'done'})

    @mock.patch("logging.getLogger")
    def test_cmdline_ambiguous_class(self, logger):
-        self.assertRaises(Exception, luigi.run, ['--local-scheduler', '--no-lock', 'AmbiguousClass'])
+        self.assertRaises(Exception, luigi.run, ['AmbiguousClass', '--local-scheduler', '--no-lock'])


Same here, we don't want code that passes --local-scheduler first to break.

riga · 2018-06-18T06:47:02Z

Yes, you're right, they don't require any changes.

riga · 2018-06-29T07:59:50Z

The requested changes are included now, so feel free to review again :)

Tarrasch · 2018-07-04T20:45:58Z

Sorry for the late review. It looks all good but I would like some docs too. Simple class-documentation for BoolParameter would be easiest I think.

See docs for usage.

Improve handling of bool parameters.

b535be6

dlstadther reviewed May 24, 2018

riga added 2 commits June 5, 2018 17:57

Fix tests.

618ef50

Append local scheduler flag in luigi.run.

ac552cc

Add switch to enable improved parsing of BoolParameter's.

a7c6d27

riga mentioned this pull request Jun 6, 2018

Update BoolParameter riga/law#50

Closed

Use constants to denote implicit/explicit parsing.

c729629

Tarrasch suggested changes Jun 16, 2018

Change back argument order in tests.

5cdcbc0

Add class docs to BoolParameter.

c7cab1c

Tarrasch approved these changes Jul 9, 2018

Tarrasch merged commit d994bca into spotify:master Jul 9, 2018

dlstadther mentioned this pull request Jul 31, 2018

Support inverted arguments for BoolParameter #1598

Closed

thisiscab pushed a commit to glossier/luigi that referenced this pull request Aug 3, 2018

Allow explicit parsing of BoolParameters (spotify#2427)

5ae7808

See docs for usage.

thisiscab pushed a commit to glossier/luigi that referenced this pull request Aug 8, 2018

Allow explicit parsing of BoolParameters (spotify#2427)

c03e32b

See docs for usage.
