Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Consistent arguments between kedro run CLI and the --config yaml and provide examples in documentation #1485

Closed
noklam opened this issue Apr 26, 2022 · 3 comments
Labels
Issue: Feature Request New feature or improvement to existing feature Stage: Technical Design 🎨 Ticket needs to undergo technical design before implementation

Comments

@noklam
Copy link
Contributor

noklam commented Apr 26, 2022

Description

Is your feature request related to a problem? A clear and concise description of what the problem is: "I'm always frustrated when ..."

https://discord.com/channels/778216384475693066/846330075535769601/968467444131848204
Currently, there are 2 ways to provide argument to kedro run CLI.

  1. Using the CLI argument kedro run --from-nodes=some_node
  2. Using the --config argument and define the arguments in a YAML file. kedro run --config=config.yml
# config.yml
run:
  from_nodes: some_node # Notice this is underscore instead of a dash

The inconsistent API and the lack of examples in documentation could confuse user.

Few proposed changes:

  1. The YAML config should use the same arguments as the CLI, i.e. using the dash instead of the underscore
  2. Add examples of YAML file in documentation
  3. (Optional, open to discuss) - Support native YAML syntax and more consistent with the type. (i.e. a list for list or node, dict for params etc.
  1. [BONUS] - validate the arguments are the expected arguments from the run function, optionally print out what's the argument it has parsed. Currently, if we provide an invalid argument it will still run but doesn't do anything, it is hard to debug, especially the CLI argument can override the YAML config.

Context

Why is this change important to you? How would you use it? How can it benefit other users?
A consistent API will make users' life easier and there is no strong reason why we want 2 different API.

Possible Implementation

(Optional) Suggest an idea for implementing the addition or change.

  • A fix in Kedro's run function should be doable, I expect it will be a small change that parse

Possible Alternatives

(Optional) Describe any alternative solutions or features you've considered.

@noklam noklam added the Issue: Feature Request New feature or improvement to existing feature label Apr 26, 2022
@merelcht merelcht added the Stage: Technical Design 🎨 Ticket needs to undergo technical design before implementation label May 4, 2022
@noklam noklam linked a pull request Jun 13, 2022 that will close this issue
5 tasks
@noklam
Copy link
Contributor Author

noklam commented Jul 8, 2022

This was linked incorrectly to the wrong PR and shouldn't be closed

@noklam noklam reopened this Jul 8, 2022
@datajoely
Copy link
Contributor

I would add kedro new --config here too

@merelcht
Copy link
Member

merelcht commented Aug 17, 2022

We discussed this task in the Technical Design session and decided on the following:

  1. Ideally the YAML config should use the same arguments as the CLI, i.e. using the dash instead of the underscore. However, it's not entirely clear why this doesn't work currently. The click context handles some of this logic, and so we need to find out if it is possible to use dash in yaml or not. If it is possible, then we'll implement it, but that would be a breaking change.
  2. We need to document more clearly that the syntax for CLI arguments is different from the values in config.yml and why that's the case.
  3. It's not clear why we don't support native YAML syntax and more consistent with the type. (i.e. a list for list or node, dict for params etc.) The reason seems to be because the yaml content gets passed directly to the CLI, but this doesn't stop us from adding some parsing logic for lists. @lorenabalan might know why it works like this, so we need to follow up with her
  4. Validating that the provided arguments are the expected arguments from the run function, seems like a good idea. We need to do some investigation to check how feasible this is and what the current behaviour is (e.g. will it throw an error if you misspell a node in --from-nodes my_node1 or is it ignored?)

These issues are the follow up actions:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Issue: Feature Request New feature or improvement to existing feature Stage: Technical Design 🎨 Ticket needs to undergo technical design before implementation
Projects
Archived in project
Development

No branches or pull requests

3 participants