Kedro new project creation ‐ how it works

Kedro Project Creation - The Developer Docs

The kedro new command allows users to create a new project. This project can be customised to suit the user's needs; they can provide their specifications through several different paths:

Argument	Through interactive flow	Through CLI flag	Through config file
Project name	Yes	Yes; if not provided, interactive flow will be triggered	Yes; if not provided, error is thrown.
Tools	Yes	Yes; if not provided, interactive flow will be triggered	Yes; if not provided, default value of none will be used
Example pipeline	Yes	Yes; if not provided, interactive flow will be triggered	Yes; if not provided, default value of no will be used
Starter	No	Yes; cannot be used with tools or example	No
Checkout	No	Yes; cannot be used without starter, project version used if not provided
Directory	No	Yes; cannot be used without a starter, cannot be used with Kedro starter alias
Config	No	Yes	No

Invoking the command will trigger the following execution path:

Link to the Miro board

Let's explore this in a little more detail.

Validate CLI flags

As noted in the table above, some CLI flags cannot be used in combination with each other. At this stage in the execution, we check for the presence of any of the following invalid CLI flag combinations:

--checkout AND NO --starter
--directory AND NO --starter
--starter AND (--tools OR --example)
--directory AND --starter IF starter provided is one of Kedro starters
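
The four checks above can be sketched as a single function. This is a minimal illustration, not Kedro's actual implementation: the function name, the error messages, and the `starter_aliases` parameter are all hypothetical, and the alias used in the usage example is a placeholder.

```python
from typing import Optional


def validate_flag_combinations(
    starter: Optional[str] = None,
    checkout: Optional[str] = None,
    directory: Optional[str] = None,
    tools: Optional[str] = None,
    example: Optional[str] = None,
    starter_aliases: frozenset = frozenset(),  # hypothetical: names of Kedro's own starters
) -> list:
    """Return one message per invalid flag combination (empty list if all is well)."""
    errors = []
    if checkout and not starter:
        errors.append("--checkout requires --starter")
    if directory and not starter:
        errors.append("--directory requires --starter")
    if starter and (tools or example):
        errors.append("--starter cannot be combined with --tools or --example")
    if directory and starter in starter_aliases:
        errors.append("--directory cannot be used with a Kedro starter alias")
    return errors


# Usage: a custom starter with checkout and directory is a valid combination.
ok = validate_flag_combinations(
    starter="git+https://example.com/repo.git", checkout="main", directory="sub"
)
# Usage: --checkout on its own is rejected.
bad = validate_flag_combinations(checkout="main")
```

Returning a list of messages rather than raising on the first problem is a design choice of this sketch; the real command may stop at the first invalid combination.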

After this validation, the directory and the path to the project template are updated according to the inputs, bringing us to the next step:

Setup cookiecutter

First, we fetch the path to the cookiecutter template project directory. If the template project contains a prompts.yml, we collect from it the prompts required for the project. If the user's desired project name, tools selection, or example code selection has already been provided through the command flags, we validate those values and delete the respective prompts from the collection.
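
The prompt filtering described above might look like the following sketch. It assumes prompts.yml has already been loaded into a dict keyed by prompt name. The key names (`project_name`, `tools`, `example_pipeline`) and the `title` field are assumptions for illustration, not taken from the real file.

```python
def remaining_prompts(prompts: dict, provided: dict) -> dict:
    """Keep only the prompts whose value was not already given via a CLI flag."""
    return {
        key: spec
        for key, spec in prompts.items()
        if provided.get(key) is None  # None means "flag not used"
    }


# Hypothetical prompt collection, as it might look after loading prompts.yml.
prompts = {
    "project_name": {"title": "Project Name"},
    "tools": {"title": "Project Tools"},
    "example_pipeline": {"title": "Example Pipeline"},
}

# The user passed a project name on the command line, so only two prompts remain.
to_ask = remaining_prompts(prompts, {"project_name": "My Project", "tools": None})
```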

With the collection of necessary prompts, the execution proceeds to the next step.

Get the cookiecutter context

To proceed, we must first check if a config file is included. If one is included, we don't need to launch the interactive flow.

If a config file is provided

Validate the file can be loaded
Validate tools or example_pipeline selection wasn't included in config if starter was provided
Validate all necessary prompt values are provided in the config file
Validate the output directory is valid, if specified
Validate the provided project name matches the format expected
Validate the example pipeline selection matches the format expected, and parse to either "True" or "False"
Validate the tools selected are all valid tools, and that if none or all were selected, they were not selected with any other tools
Parse the validated selection to full readable names
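
As an illustration of the last two checks, the sketch below validates a comma-separated tools selection from a config file and maps it to readable names. The short names, the readable names, and the function itself are hypothetical; returning `["None"]` for a "none" selection is an assumption chosen to match the `str(["None"])` default mentioned elsewhere on this page.

```python
# Hypothetical mapping from short name to readable name.
TOOL_NAMES = {"lint": "Linting", "test": "Testing", "docs": "Documentation"}


def parse_config_tools(selection: str) -> list:
    """Validate a tools selection such as 'lint, docs' and return readable names."""
    chosen = [item.strip().lower() for item in selection.split(",") if item.strip()]
    if not chosen:
        raise ValueError("No tools selection provided")

    unknown = [t for t in chosen if t not in TOOL_NAMES and t not in ("none", "all")]
    if unknown:
        raise ValueError(f"Invalid tools: {', '.join(unknown)}")

    # 'none' and 'all' only make sense on their own.
    if len(chosen) > 1 and ("none" in chosen or "all" in chosen):
        raise ValueError("'none' and 'all' cannot be combined with other tools")

    if chosen == ["none"]:
        return ["None"]
    if chosen == ["all"]:
        return list(TOOL_NAMES.values())
    return [TOOL_NAMES[t] for t in chosen]
```

For example, `parse_config_tools("lint, docs")` returns `["Linting", "Documentation"]`, while `parse_config_tools("all, lint")` raises `ValueError`.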

If a config file isn't provided

For each prompt, get the user's input. Each input is validated against the relevant regex specified in prompts.yml
If tools are provided, parse any ranges into a list of numbers, validating that any ranges are correctly specified (smaller to larger number), and that the end of the range isn't outside the range of available tools
Convert the list of numbers to tools names
Parse any example pipeline selection to either "True" or "False"
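
The range handling for an interactive tools selection can be sketched as follows. The function names and the tool list are hypothetical, and the sketch assumes the input has already passed the regex check from prompts.yml, so every token is either a number or a `start-end` range.

```python
# Hypothetical ordered tool list; position 1 is the first entry.
TOOLS = ["Linting", "Testing", "Custom Logging", "Documentation"]


def parse_tool_numbers(selection: str, n_tools: int) -> list:
    """Expand a selection such as '1-3, 5' into a sorted list of unique numbers."""
    numbers = []
    for part in selection.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
            if start > end:
                raise ValueError(f"Invalid range '{part}': start is larger than end")
            if end > n_tools:
                raise ValueError(f"Invalid range '{part}': only {n_tools} tools exist")
            numbers.extend(range(start, end + 1))
        else:
            numbers.append(int(part))
    return sorted(set(numbers))


def numbers_to_names(numbers: list) -> list:
    """Convert 1-based tool numbers to tool names."""
    return [TOOLS[n - 1] for n in numbers]


selected = numbers_to_names(parse_tool_numbers("1-2, 4", len(TOOLS)))
```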

Update cookiecutter's extra_context with CLI values

Currently, any value provided by CLI flag overwrites the corresponding value from the config file (recall that the interactive flow does not prompt for values already provided on the CLI). Tools provided via CLI are parsed into a list of the full tool names.
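
The precedence rule amounts to a dictionary update in which only the flags that were actually used take part. A minimal sketch, with a hypothetical function name and hypothetical keys:

```python
def merge_cli_over_config(config_values: dict, cli_values: dict) -> dict:
    """Return the config values with any CLI-provided values laid on top."""
    merged = dict(config_values)  # copy, so the caller's dict is left untouched
    merged.update({key: value for key, value in cli_values.items() if value is not None})
    return merged


extra_context = merge_cli_over_config(
    {"project_name": "From Config", "tools": "none"},
    {"project_name": "From CLI", "tools": None},  # --tools was not passed
)
```

Here `project_name` ends up as `"From CLI"`, while `tools` keeps the config value because no CLI value was given for it.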

Set default for required fields

Though not required by cookiecutter for our project creation, we require some values to be populated in the new project's pyproject.toml for telemetry purposes. This includes the project's Kedro version, the tools selection, and the example pipeline selection. As the user has no way to specify the former, and is not always required to specify the latter two, we set default values to be used instead.

Tip

When changing the values expected in pyproject.toml, make sure to update the expected values in ProjectMetadata() accordingly.

Note

The default value for tools, str(["None"]), may strike you as odd, and similarly, the values passed as the tools selection to cookiecutter are all string-wrapped lists. This is done because cookiecutter treats lists as possible options, only populating the placeholders in pyproject.toml with one item from the list. Instead, to pass the whole list through, we wrap it in a string, and unwrap it when it's populated in the placeholder.
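
The defaults and the string-wrapping trick can be illustrated with a small sketch. The key names, the `"False"` default for the example pipeline, and the placeholder version are assumptions. How Kedro actually unwraps the value is not described here; `ast.literal_eval` is simply one standard way to turn such a string back into a list.

```python
import ast


def with_defaults(extra_context: dict, kedro_version: str) -> dict:
    """Fill in the fields needed in pyproject.toml when no value was given."""
    filled = dict(extra_context)
    filled.setdefault("kedro_version", kedro_version)
    # The list is wrapped in a string so cookiecutter does not treat it
    # as a set of options and keep only one item.
    filled.setdefault("tools", str(["None"]))
    filled.setdefault("example_pipeline", "False")
    return filled


context = with_defaults({"project_name": "My Project"}, kedro_version="0.0.0")
wrapped = context["tools"]             # the string "['None']"
unwrapped = ast.literal_eval(wrapped)  # back to the list ["None"]
```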

Collect cookiecutter arguments and create project

After collecting all the project specifications, we make sure that, if a starter was selected, any specified directory and checkout values are passed to cookiecutter so that the correct project template is used. Additionally, the tools and example pipeline selections determine which template is used. We collect the path to the correct template project and the arguments for cookiecutter, then call cookiecutter() to create the project.
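
A sketch of how the arguments might be assembled is shown below. The keyword names `output_dir`, `no_input`, `extra_context`, `checkout`, and `directory` are real parameters of `cookiecutter()`. The helper function, its parameters, and the choice of `no_input=True` (assumed because all values were gathered in earlier steps) are illustrative only.

```python
from typing import Optional


def build_cookiecutter_args(
    output_dir: str,
    extra_context: dict,
    starter_given: bool,
    checkout: Optional[str] = None,
    directory: Optional[str] = None,
    fallback_checkout: Optional[str] = None,  # hypothetical: version used when no checkout is given
) -> dict:
    """Collect keyword arguments for a later call to cookiecutter()."""
    args = {
        "output_dir": output_dir,
        "no_input": True,  # assumption: values were already collected, so do not prompt again
        "extra_context": extra_context,
    }
    if starter_given:
        # Checkout and directory only apply when a starter is used.
        if checkout or fallback_checkout:
            args["checkout"] = checkout or fallback_checkout
        if directory:
            args["directory"] = directory
    return args


args = build_cookiecutter_args(
    "out", {"project_name": "X"}, starter_given=True,
    directory="sub", fallback_checkout="1.2.3",
)
# With cookiecutter installed, the call would then be:
#   from cookiecutter.main import cookiecutter
#   cookiecutter(template_path, **args)
```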

Post-project creation hook

With cookiecutter, you can specify hooks to run before or after project generation. We use the post-generation hook to modify the generated project. The template project includes all files and requirements needed for every tool we provide, so before project generation completes we must trim it down to what the user asked for:

We go through every tool option and check whether it is included in the user's selection. If it is not, we remove the related setup for that tool from the generated project
We sort the requirements in the generated project to be in alphabetical order
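
The two hook steps above can be sketched as follows. The tool-to-path mapping, the function names, and the `requirements.txt` file name are assumptions for illustration; the real hook may remove other kinds of setup (for example configuration entries) as well.

```python
import shutil
import tempfile
from pathlib import Path

# Hypothetical mapping from tool name to the paths that tool owns in the project.
TOOL_PATHS = {"Testing": ["tests"], "Documentation": ["docs"]}


def remove_unselected_tools(project_dir: Path, selected: list) -> None:
    """Delete the files and directories of every tool the user did not select."""
    for tool, relative_paths in TOOL_PATHS.items():
        if tool in selected:
            continue
        for relative in relative_paths:
            target = project_dir / relative
            if target.is_dir():
                shutil.rmtree(target)
            elif target.exists():
                target.unlink()


def sort_requirements(requirements_file: Path) -> None:
    """Rewrite the requirements file with its lines in case-insensitive alphabetical order."""
    lines = [line.strip() for line in requirements_file.read_text().splitlines() if line.strip()]
    requirements_file.write_text("\n".join(sorted(lines, key=str.lower)) + "\n")


def demo() -> tuple:
    """Run both steps on a throwaway project and report what is left."""
    with tempfile.TemporaryDirectory() as tmp:
        project = Path(tmp)
        (project / "tests").mkdir()
        (project / "docs").mkdir()
        requirements = project / "requirements.txt"
        requirements.write_text("pytest\nipython\nKedro\n")

        remove_unselected_tools(project, selected=["Testing"])
        sort_requirements(requirements)

        remaining = sorted(path.name for path in project.iterdir())
        return remaining, requirements.read_text()
```

In `demo()`, only "Testing" is selected, so the `docs` directory is removed while `tests` is kept.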

Note

The requirements sorting step was originally added because the first iteration injected the necessary requirements into the project. Now that we remove requirements instead, is this step still necessary?

Print success messages

Finally, our generated project is now ready and suited to the user's specifications. We print a success message. If no starter was used, we also print the selections for tools and example pipeline. The process then finishes here.
