-
Notifications
You must be signed in to change notification settings - Fork 914
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add plural flags --nodes, --tags, and --load-versions to kedro run
#2301
Conversation
Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>
Question for reviewers: Do we want to restrict users from using both |
Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>
Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>
Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>
Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>
Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>
env, | ||
runner, | ||
is_async, | ||
node_names, | ||
nodes_names, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is named this way for differentiation from the already existing node_names
- in #2245 I'd suggest keeping node_names
as the parameter name
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fine to keep node_names
for now in #2245 but as part of #2247 I think we should consider just calling this nodes
. (Just while I remember, Ivan's argument for why we might not want to do this: it's deliberately written as node_names
to deter people from trying to pass node objects themselves to the underlying session.run
. My argument for it is consistency and simplicity, particularly when we eventually return to #1423)
Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>
Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>
Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good! I think the one thing missing is a test (or tests) to check that providing both --node
and --nodes
and tags, and load versions, works properly as well. Because now we're not really checking that these lines work:
tag = tag + tags
node_names = node_names + nodes_names
load_version = {**load_version, **load_versions}
Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>
Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Really great PR 👍 Super clear explanations and excellent testing ⭐ I'm a little bit confused about _get_values_as_tuple
, but that's not really a result of your changes. I wonder if that could be simplified after (or as part of) the ticket to remove the old flags, since it will no longer need to handle anything more complicated than just a single string (?). Happy to chat more about this anyway so we can try to figure it out!
Question for reviewers: Do we want to restrict users from using both
--node
and--nodes
while both are available?
I think it's fine as it is. I don't think anyone will really try to do that, it's only going to be possible for a limited period of time, plus they will already get the deprecation warning and things will work ok when you use both. So not worth the extra code required to handle it - doing the +
to combine arguments together like you do now is more than enough I think.
env, | ||
runner, | ||
is_async, | ||
node_names, | ||
nodes_names, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fine to keep node_names
for now in #2245 but as part of #2247 I think we should consider just calling this nodes
. (Just while I remember, Ivan's argument for why we might not want to do this: it's deliberately written as node_names
to deter people from trying to pass node objects themselves to the underlying session.run
. My argument for it is consistency and simplicity, particularly when we eventually return to #1423)
kedro/framework/cli/project.py
Outdated
@@ -412,6 +438,16 @@ def run( | |||
tag = _get_values_as_tuple(tag) if tag else tag | |||
node_names = _get_values_as_tuple(node_names) if node_names else node_names | |||
|
|||
# temporary duplicates for the plural flags | |||
tags = _get_values_as_tuple(tags) if tags else tuple(tags) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor, but do we need these if/else
clauses? Since _get_values_as_tuple("") == tuple("") == tuple()
already?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch - was following convention from the earlier implementation but looks like they weren't needed there either. My suspicion is that multiple=True
implies a default value of an empty tuple 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was unable to find any documentation on the matter but manual testing does expose that values on options where multiple=True
will be tuples if the default is unspecified, even when a value is not provided. I'll remove the unnecessary if
s.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool, that's what I suspected about multiple=True
👍 Just to double check though, it's ok to remove the if/else
for both the multiple=True
and the multiple=False
versions of these arguments?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep! For the other flags their input is immediately handled by the callback function which goes on to return a list, and it's this list that get's passed to _get_values_as_tuple
, which splits them safely
@@ -468,6 +472,11 @@ def _split_params(ctx, param, value): | |||
return conf | |||
|
|||
|
|||
def _split_load_versions(ctx, param, value): | |||
lv_tuple = _get_values_as_tuple([value]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need the [value]
here rather than value
? Elsewhere we call _get_values_as_tuple
just with a single string. (Edit: is that true? Maybe it gets tuple-ified due to multiple=True
? But then that wouldn't make sense for the instances you've added 🤔 )
Also do we need the if/else
here? Wouldn't _reformat_load_versions
work the same just passing the empty value
into it directly?
After we've removed the old load_version
we should definitely just move _reformat_load_versions
code into this instead of having two separate functions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@AntonyMilneQB
Do we need the [value] here rather than value? Elsewhere we call _get_values_as_tuple just with a single string.
Without wrapping value
in a list _get_values_as_tuples
will treat each character in the string as a value. The function is never called on a single string as multiple=True
tuplifies the values of --node
and --tag
, and the callback functions for --nodes
and --tags
both return lists.
Also do we need the if/else here? Wouldn't _reformat_load_versions work the same just passing the empty value into it directly?
The if/else
isn't needed as _reformat_load_versions
would still work - the idea was to prevent calling another method if unnecessary. Which approach would be considered more pythonic?
After we've removed the old load_version we should definitely just move _reformat_load_versions code into this instead of having two separate functions.
Agreed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The
if/else
isn't needed as_reformat_load_versions
would still work - the idea was to prevent calling another method if unnecessary. Which approach would be considered more pythonic?
Overall I think without the if/else
. The performance overhead of calling another function here is very small compared to the (also small) additional complexity of branching in the code. It's a small thing, but I'd only skip calling the function here if it was particularly expensive.
kedro run
kedro run
Co-authored-by: Antony Milne <49395058+AntonyMilneQB@users.noreply.github.com>
Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>
Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>
@AntonyMilneQB
It should be able to be removed 🤔 it's functionality then wouldn't extend beyond |
This would be a nice simplification. It would also be nice if we could put all this argument-munging into the callbacks rather than needing to do them inside the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great work! 👍 Don't forget to update the release notes.
Yes, I would separately add that these flags have been added. That way it's also easier to trace which PR the changes were added in. |
Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>
04e8a32
to
917841d
Compare
Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>
Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>
…2301) * Add plural flags nodes, tags, load-versions Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com> * Lint Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com> * Lint Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com> * Trim whitespaces from nodes Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com> * Trim whitespaces from load versions Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com> * Write test for nodes Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com> * Write test for tags Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com> * Write test for load versions Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com> * Remove shorthand flag from plural Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com> * Fix tests after merge Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com> * Lint Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com> * Undo merge side-effects Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com> * Add tests to check behaviour when both singular and plural flags used Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com> * Lint Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com> * Lint again Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com> * Change deprecation condition Co-authored-by: Antony Milne <49395058+AntonyMilneQB@users.noreply.github.com> * Remove unnecessary ifs Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com> * Lint changes from main Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com> * Add changes to RELEASE.md Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com> * Try fix CI Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com> * Revert CLI change Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com> --------- Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com> Co-authored-by: Antony Milne <49395058+AntonyMilneQB@users.noreply.github.com> Signed-off-by: Jo Stichbury <jo_stichbury@mckinsey.com>
Signed-off-by: Ahdra Merali ahdra.merali@quantumblack.com
Description
This PR completes #2244 by adding pluralised flags for the node, tag, and load-version options on
kedro run
For Reviewers:
_split_load_versions
was added to handle input strings, as previously --load-version would process the input as tuples due tomultiple=True
..strip()
is used to trim the whitespaces from user input--load-versions
is subsumed under the test for_split_load_versions
just as testing--load-version
was bytest_reformat_load_versions
.kedro run
commands. Whilst I don't think they will, is it worth considering limiting this behaviour?Development notes
The helper function
_split_load_versions
was added as--load-version
and--load-versions
return different types, the singular returns a tuple and the plural a string. This is as a result of usingmultiple=True
for the singular flag.The both new and old tags have been tested manually to ensure they (still) behave as expected.
N/B: Tags are robust against whitespaces
Checklist
kedro run
suggestions and examples #2246RELEASE.md
file - to be handled in Add deprecation warning for--node
,--tag
and--load-version
flags #2303