Consistent arguments between `kedro run` CLI and the `--config` yaml and provide examples in documentation #1485

noklam · 2022-04-26T12:34:54Z

Description

Is your feature request related to a problem? A clear and concise description of what the problem is: "I'm always frustrated when ..."

https://discord.com/channels/778216384475693066/846330075535769601/968467444131848204
Currently, there are two ways to provide arguments to the kedro run CLI:

Using a CLI argument: kedro run --from-nodes=some_node
Using the --config argument and defining the arguments in a YAML file: kedro run --config=config.yml

# config.yml
run:
  from_nodes: some_node # Notice this is underscore instead of a dash

The inconsistent API and the lack of examples in the documentation could confuse users.
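To make the mismatch concrete, here is a minimal sketch of how a config loader could accept both spellings. The helper name normalise_keys is hypothetical and not part of Kedro; it only illustrates mapping CLI-style keys (from-nodes) onto the underscore form that the config file uses today.

```python
from typing import Any, Dict


def normalise_keys(run_config: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of run_config whose keys use underscores.

    Hypothetical helper, not part of Kedro: it lets the YAML file spell an
    option as on the CLI ("from-nodes") or as it is today ("from_nodes").
    """
    normalised: Dict[str, Any] = {}
    for key, value in run_config.items():
        new_key = key.replace("-", "_")
        if new_key in normalised:
            # Both spellings were given; refuse to guess which one wins.
            raise ValueError(f"Option given twice: {new_key!r}")
        normalised[new_key] = value
    return normalised


cli_style = {"from-nodes": "some_node"}
current_style = {"from_nodes": "some_node"}
assert normalise_keys(cli_style) == normalise_keys(current_style)
```

Rejecting a config that contains both spellings keeps the behaviour predictable, at the cost of being stricter than the current loader.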

A few proposed changes:

The YAML config should use the same arguments as the CLI, i.e. dashes instead of underscores.
Add examples of the YAML file to the documentation.
(Optional, open to discussion) Support native YAML syntax with types that match the arguments (i.e. a list for a list of nodes, a dict for params, etc.).

Currently params can be defined as a dict, but list-valued arguments are not accepted as YAML lists. They have to be a comma-separated string, e.g. from_nodes: xxxx,yyyy,zzzz (not native YAML syntax).
load_versions in the YAML file will be defined as a list of DATASET_NAME:VERSION.
- dataset_name:version_name
  https://discord.com/channels/778216384475693066/846330075535769601/968467444131848204

[BONUS] - Validate that the arguments are the ones the run function expects, and optionally print out the arguments that were parsed. Currently, if we provide an invalid argument, the run still goes ahead but the argument has no effect. This is hard to debug, especially because CLI arguments can override the YAML config.
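The validation idea could look roughly like the sketch below. It assumes a plain Python function stands in for the run command; the run stub, its parameter list, and the unknown_options helper are all hypothetical. Since the real command goes through click, an actual implementation would probably need to check against the command's click parameters rather than a function signature.

```python
import inspect
from typing import Any, Callable, Dict, List


def run(from_nodes: str = "", params: str = "", load_versions: str = "") -> None:
    """Stand-in for the real run command; the parameter list is illustrative."""


def unknown_options(run_config: Dict[str, Any], command: Callable[..., Any]) -> List[str]:
    """List config keys that do not match any parameter of command."""
    accepted = set(inspect.signature(command).parameters)
    return sorted(key for key in run_config if key not in accepted)


# One valid key, one misspelt key, and one key written with a dash.
config = {"from_nodes": "some_node", "form_nodes": "typo", "from-nodes": "dash"}
for key in unknown_options(config, run):
    print(f"Unrecognised run option in config: {key!r}")
```

Reporting every unknown key at once, instead of silently ignoring them, addresses the debugging problem described above.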

Context

Why is this change important to you? How would you use it? How can it benefit other users?
A consistent API will make users' lives easier, and there is no strong reason to have two different APIs.

Possible Implementation

(Optional) Suggest an idea for implementing the addition or change.

A fix in Kedro's run function should be doable, I expect it will be a small change that parse

Possible Alternatives

(Optional) Describe any alternative solutions or features you've considered.


noklam · 2022-07-08T10:59:25Z

This was linked incorrectly to the wrong PR and shouldn't be closed

datajoely · 2022-07-08T12:52:08Z

I would add kedro new --config here too

merelcht · 2022-08-17T15:07:45Z

We discussed this task in the Technical Design session and decided on the following:

Ideally the YAML config should use the same arguments as the CLI, i.e. using the dash instead of the underscore. However, it's not entirely clear why this doesn't work currently. The click context handles some of this logic, and so we need to find out if it is possible to use dash in yaml or not. If it is possible, then we'll implement it, but that would be a breaking change.
We need to document more clearly that the syntax for CLI arguments is different from the values in config.yml and why that's the case.
It's not clear why we don't support native YAML syntax with types that match the arguments (i.e. a list for a list of nodes, a dict for params, etc.). The reason seems to be that the YAML content gets passed directly to the CLI, but this doesn't stop us from adding some parsing logic for lists. @lorenabalan might know why it works like this, so we need to follow up with her.
Validating that the provided arguments are the ones the run function expects seems like a good idea. We need to do some investigation to check how feasible this is and what the current behaviour is (e.g. will it throw an error if you misspell a node in --from-nodes my_node1 or is it ignored?)
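As a rough illustration of the "parsing logic for lists" mentioned above, the sketch below converts native YAML lists into the comma-separated string the CLI path currently expects, and splits DATASET_NAME:VERSION entries into a mapping. Both helpers (as_comma_string and split_load_versions) are hypothetical names, not existing Kedro functions, and the sketch assumes dataset names contain no colon.

```python
from typing import Dict, List, Union


def as_comma_string(value: Union[str, List[str]]) -> str:
    """Accept a native YAML list or the current comma-separated string."""
    if isinstance(value, str):
        return value
    return ",".join(str(item) for item in value)


def split_load_versions(entries: List[str]) -> Dict[str, str]:
    """Turn ["dataset_name:version_name", ...] into {dataset_name: version_name}."""
    versions: Dict[str, str] = {}
    for entry in entries:
        # Split on the first colon only (assumes no colon in the dataset name).
        name, separator, version = entry.partition(":")
        if not separator or not name or not version:
            raise ValueError(f"Expected DATASET_NAME:VERSION, got {entry!r}")
        versions[name] = version
    return versions


# A YAML list and the current string form give the same value.
assert as_comma_string(["xxxx", "yyyy", "zzzz"]) == as_comma_string("xxxx,yyyy,zzzz")
print(split_load_versions(["dataset_name:version_name"]))
```

Converting lists before the values reach the CLI layer would keep existing string-based configs working while allowing the native YAML form.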

These issues are the follow up actions:

noklam added the Issue: Feature Request New feature or improvement to existing feature label Apr 26, 2022

merelcht added the Stage: Technical Design 🎨 Ticket needs to undergo technical design before implementation label May 4, 2022

noklam linked a pull request Jun 13, 2022 that will close this issue

Consistent node execution order by sorting node with Sequentialrunner #1604

Merged

5 tasks

noklam closed this as completed in #1604 Jun 16, 2022

noklam removed a link to a pull request Jul 8, 2022

Consistent node execution order by sorting node with Sequentialrunner #1604

Merged

5 tasks

noklam reopened this Jul 8, 2022

merelcht closed this as completed Aug 17, 2022
