Improve get started docs and guide to working with notebooks (#2031)

* Revised the Introduction to make it short and sweet.
* Revised the Get Started section. Gone is "Hello Kedro". Gone are the installation pre-requisites (that's just part of the Install Kedro page now). Gone is the "Standalone use of the data catalog - woot woot" and GONE is the page on Kedro starters.
* Reordered the create project material to put the project structure breakdown in the section that introduces key concepts and shorten the Iris tutorial to the bare minimum. I did add visualisation at this point though, to highlight Kedro Viz, as I felt it was coming far too late in the spaceflights tutorial and needed to be more prominent as a feature.
* Added a TL;DR page to Get Started which some people could probably just use as-is and ignore the rest of the section.
* Starters material has moved into a new section all about "Kedro project setup". Much of that section still needs review/revision but I have updated the Starters page so it reads more clearly.
* Improved the Kedro-Viz page somewhat (still more to come for Plotly)
* Notebooks/IPython materials now merged and simplified
kedro-org · Nov 23, 2022 · ebf3d64
1 parent 82d69f4
commit ebf3d64

Showing 45 changed files with 850 additions and 880 deletions.
diff --git a/README.md b/README.md
@@ -12,11 +12,13 @@
 
 ## What is Kedro?
 
-Kedro is an open-source Python framework for creating reproducible, maintainable and modular data science code. It borrows concepts from software engineering and applies them to machine-learning code; applied concepts include modularity, separation of concerns and versioning. Kedro is hosted by the [LF AI & Data Foundation](https://lfaidata.foundation/).
+Kedro is an open-source Python framework to create reproducible, maintainable, and modular data science code. It uses software engineering best practices to help you build production-ready data engineering and data science pipelines.
+
+Kedro is hosted by the [LF AI & Data Foundation](https://lfaidata.foundation/).
 
 ## How do I install Kedro?
 
-To install Kedro from the Python Package Index (PyPI) simply run:
+To install Kedro from the Python Package Index (PyPI) run:
 
 ```
 pip install kedro
@@ -28,7 +30,7 @@ It is also possible to install Kedro using `conda`:
 conda install -c conda-forge kedro
 ```
 
-Our [Get Started guide](https://kedro.readthedocs.io/en/stable/get_started/prerequisites.html) contains full installation instructions, and includes how to set up Python virtual environments.
+Our [Get Started guide](https://kedro.readthedocs.io/en/stable/get_started/install.html) contains full installation instructions, and includes how to set up Python virtual environments.
 
 
 ## What are the main features of Kedro?
@@ -48,10 +50,14 @@ Our [Get Started guide](https://kedro.readthedocs.io/en/stable/get_started/prere
 
 ## How do I use Kedro?
 
-The [Kedro documentation](https://kedro.readthedocs.io/en/stable/) includes three examples to help get you started:
-- A typical "Hello World" example, for an [entry-level description of the main Kedro concepts](https://kedro.readthedocs.io/en/stable/get_started/hello_kedro.html)
-- An [introduction to the project template](https://kedro.readthedocs.io/en/stable/get_started/example_project.html) using the Iris dataset
-- A more detailed [spaceflights tutorial](https://kedro.readthedocs.io/en/stable/tutorial/tutorial_template.html) to give you hands-on experience
+The [Kedro documentation](https://kedro.readthedocs.io/en/stable/) first explains [how to install Kedro](https://kedro.readthedocs.io/en/stable/get_started/install.html) and then introduces [key Kedro concepts](https://kedro.readthedocs.io/en/stable/get_started/kedro_concepts.html).
+
+- The first example illustrates the [basics of a Kedro project](https://kedro.readthedocs.io/en/stable/get_started/new_project.html) using the Iris dataset
+- You can then review the [spaceflights tutorial](https://kedro.readthedocs.io/en/stable/tutorial/tutorial_template.html) to build a Kedro project for hands-on experience
+
+For new and intermediate Kedro users, there's a comprehensive section on [how to visualise Kedro projects using Kedro-Viz](https://kedro.readthedocs.io/en/stable/visualisation/kedro-viz_visualisation.html) and [how to work with Kedro and Jupyter notebooks](https://kedro.readthedocs.io/en/stable/notebooks_and_ipython/kedro_and_notebooks).
+
+Further documentation is available for more advanced Kedro usage and deployment. We also recommend the [glossary](https://kedro.readthedocs.io/en/stable/resources/glossary.html) and the [API reference documentation](/kedro) for additional information.
 
 
 ## Why does Kedro exist?

diff --git a/docs/build-docs.sh b/docs/build-docs.sh
@@ -17,9 +17,9 @@ mkdir docs/build/
 cp -r docs/_templates docs/conf.py docs/*.svg docs/*.json  docs/build/
 
 if [ "$action" == "linkcheck" ]; then
-  sphinx-build -c docs/ -WETan -j auto -D language=en -b linkcheck docs/build/ docs/build/html
+  sphinx-build -c docs/ -ETan -j auto -D language=en -b linkcheck docs/build/ docs/build/html
 elif [ "$action" == "docs" ]; then
-  sphinx-build -c docs/ -WETa -j auto -D language=en docs/build/ docs/build/html
+  sphinx-build -c docs/ -ETa -j auto -D language=en docs/build/ docs/build/html
 fi
 
 # Clean up build artefacts

diff --git a/docs/source/contribution/backwards_compatibility.md b/docs/source/contribution/backwards_compatibility.md
@@ -12,7 +12,7 @@ Your change is **not** considered a breaking change, and so is backwards compati
 
 We aim to minimise the number of breaking changes to keep Kedro software stable and reduce the overhead for users as they migrate their projects. However, there are cases where a breaking change brings considerable value or increases the maintainability of the codebase. In these cases, breaking backwards compatibility can make sense.
 
-Before you contribute a breaking change, you should create a [Github Issue](https://github.com/kedro-org/kedro/issues) that describes the change and justifies the value gained by breaking backwards compatibility.
+Before you contribute a breaking change, you should create a [GitHub Issue](https://github.com/kedro-org/kedro/issues) that describes the change and justifies the value gained by breaking backwards compatibility.
 
 ## The Kedro release model
 
@@ -22,4 +22,4 @@ All breaking changes go into `develop`, from which a major release can be deploy
 
 ![Kedro Gitflow Diagram](../meta/images/kedro_gitflow.svg)
 
-Please check the Q&A on [GitHub discussions](https://github.com/kedro-org/kedro/discussions) and ask any new questions about the development process there too!
+Got a question about the development process? Ask the community on [Slack](https://slack.kedro.org) if you need to!
diff --git a/docs/source/contribution/contribute_to_kedro.md b/docs/source/contribution/contribute_to_kedro.md
@@ -5,7 +5,7 @@ We welcome any and all contributions to Kedro, at whatever level you can manage.
 - Join the community on [Slack](https://slack.kedro.org)
 - Review Kedro's [GitHub isusses](https://github.com/kedro-org/kedro/issues) or raise your own issue to report a bug or feature request
 - Start a conversation about the Kedro project on [GitHub discussions](https://github.com/kedro-org/kedro/discussions)
-- Make a pull request on the [Kedro-Community Github repo](https://github.com/kedro-org/kedro-community) to update the curated list of Kedro community content.
+- Make a pull request on the [Kedro-Community GitHub repo](https://github.com/kedro-org/kedro-community) to update the curated list of Kedro community content.
 - Report a bug or propose a new feature on [GitHub issues](https://github.com/kedro-org/kedro/issues)
 - [Review other contributors' PRs](https://github.com/kedro-org/kedro/pulls)
 - [Contribute code](./developer_contributor_guidelines.md), for example to fix a bug or add a feature

diff --git a/docs/source/contribution/developer_contributor_guidelines.md b/docs/source/contribution/developer_contributor_guidelines.md
@@ -26,14 +26,14 @@ To work on the Kedro codebase, you will need to be set up with Git, and Make.
 If your development environment is Windows, you can use the `win_setup_conda` and `win_setup_env` commands from [Circle CI configuration](https://github.com/kedro-org/kedro/blob/main/.circleci/config.yml) to guide you in the correct way to do this.
 ```
 
-You will also need to create and activate virtual environment. If this is unfamiliar to you, read through our [pre-requisites documentation](../get_started/prerequisites.md).
+You will also need to create and activate virtual environment. If this is unfamiliar to you, read through our [pre-requisites documentation](../get_started/install.md#installation-prerequisites).
 
-Next, you'll need to fork the [Kedro source code from the Github repository](https://github.com/kedro-org/kedro):
+Next, you'll need to fork the [Kedro source code from the GitHub repository](https://github.com/kedro-org/kedro):
 
 * Fork the project by clicking **Fork** in the top-right corner of the [Kedro GitHub repository](https://github.com/kedro-org/kedro)
 * Choose your target account
 
-If you need further guidance, consult the [Github documentation about forking a repo](https://docs.github.com/en/get-started/quickstart/fork-a-repo#forking-a-repository).
+If you need further guidance, consult the [GitHub documentation about forking a repo](https://docs.github.com/en/get-started/quickstart/fork-a-repo#forking-a-repository).
 
 You are almost ready to go. In your terminal, navigate to the folder into which you forked the Kedro code.
 
@@ -194,4 +194,4 @@ Working on your first pull request? You can learn how from these resources:
 * [First timers only](https://www.firsttimersonly.com/)
 * [How to contribute to an open source project on GitHub](https://egghead.io/courses/how-to-contribute-to-an-open-source-project-on-github)
 
-Please check the Q&A on [GitHub discussions](https://github.com/kedro-org/kedro/discussions) and ask any new questions about the development process there too!
+Previous Q&A on [GitHub discussions](https://github.com/kedro-org/kedro/discussions) and our [searchable archive of Discord discussions](https://linen-discord.kedro.org). You can ask new questions about the development process on [Slack](https://slack.kedro.org) too!
diff --git a/docs/source/contribution/documentation_contributor_guidelines.md b/docs/source/contribution/documentation_contributor_guidelines.md
@@ -127,7 +127,8 @@ Do not pass "Go", do not collect £200.
 
 * You will need to use restructured text formatting within the box. Aim to keep the formatting of the callout text plain, although you can include bold, italic, code and links.
 * Keep the amount of text (and the number of callouts used) to a minimum.
-* Prefer to use `note`, `warning` and `important` only, rather than a number of different colours/types of callout.
+* Prefer to use `note`, `warning` and `important` only, rather than a larger range of callout.
+
     * Use `note` for notable information
     * Use `warning` to indicate a potential `gotcha`
     * Use `important` when highlighting a key point that cannot be ignored

diff --git a/docs/source/contribution/technical_steering_committee.md b/docs/source/contribution/technical_steering_committee.md
@@ -21,11 +21,10 @@ In this section, we detail:
 
 - Make sure that ongoing pull requests are moving forward at the right pace or closing them
 - Guide the community to use the right channel:
-  - [Github](https://github.com/kedro-org/kedro/) for feature requests and bug reports
-  - [GitHub discussions](https://github.com/kedro-org/kedro/discussions) to discuss
-    the Kedro project
-  - [Slack](https://slack.kedro.org/)
-    for questions and to support other users
+
+  - [GitHub issues](https://github.com/kedro-org/kedro/issues) for feature requests and bug reports
+  - [GitHub discussions](https://github.com/kedro-org/kedro/discussions) to discuss the future of the Kedro project
+  - [Slack](https://slack.kedro.org) for questions and to support other users
 
 ## Requirements to become a maintainer
 
@@ -52,11 +51,11 @@ and the `kedro-team` channel on the Kedro Slack organisation.
 
 ## Voting process
 
-Voting can change project maintainers and decide on the future of Kedro. The TSC leads it as voting maintainers of Kedro. The voting period is one week and is either performed on GitHub Discussions or through a pull request.
+Voting can change project maintainers and decide on the future of Kedro. The TSC leads it as voting maintainers of Kedro. The voting period is one week and is either performed on GitHub discussions or through a pull request.
 
 ### Other issues or proposals
 
-Open Github Discussions host votes on issues, proposals and changes affecting the future of Kedro, including amendments to our ways of working described in this document. These votes require **a 1/2 majority**.
+GitHub discussions is used to host votes on issues, proposals and changes affecting the future of Kedro, including amendments to our ways of working described on this page. These votes require **a 1/2 majority**.
 
 ### Adding or removing maintainers
 

diff --git a/docs/source/data/data_catalog.md b/docs/source/data/data_catalog.md
@@ -535,7 +535,7 @@ The code API allows you to:
 
 ### Configure a Data Catalog
 
-In a file like `catalog.py`, you can construct a `DataCatalog` object programmatically. In the following, we are using a number of pre-built data loaders documented in the [API reference documentation](/kedro.extras.datasets).
+In a file like `catalog.py`, you can construct a `DataCatalog` object programmatically. In the following, we are using several pre-built data loaders documented in the [API reference documentation](/kedro.extras.datasets).
 
 ```python
 from kedro.io import DataCatalog

diff --git a/docs/source/deployment/airflow_astronomer.md b/docs/source/deployment/airflow_astronomer.md
@@ -2,7 +2,7 @@
 
 This tutorial explains how to deploy a Kedro project on [Apache Airflow](https://airflow.apache.org/) with [Astronomer](https://www.astronomer.io/). Apache Airflow is an extremely popular open-source workflow management platform. Workflows in Airflow are modelled and organised as [DAGs](https://en.wikipedia.org/wiki/Directed_acyclic_graph), making it a suitable engine to orchestrate and execute a pipeline authored with Kedro. [Astronomer](https://docs.astronomer.io/astro/install-cli) is a managed Airflow platform which allows users to spin up and run an Airflow cluster easily in production. Additionally, it also provides a set of tools to help users get started with Airflow locally in the easiest way possible.
 
-The following discusses how to run the [example Iris classification pipeline](../get_started/example_project.md) on a local Airflow cluster with Astronomer.
+The following discusses how to run the [example Iris classification pipeline](../get_started/new_project.md#create-the-example-project) on a local Airflow cluster with Astronomer.
 
 ## Strategy
 

diff --git a/docs/source/deployment/aws_batch.md b/docs/source/deployment/aws_batch.md
@@ -3,7 +3,7 @@
 ## Why would you use AWS Batch?
 [AWS Batch](https://aws.amazon.com/batch/) is optimised for batch computing and applications that scale with the number of jobs running in parallel. It manages job execution and compute resources, and dynamically provisions the optimal quantity and type. AWS Batch can assist with planning, scheduling, and executing your batch computing workloads, using [Amazon EC2](https://aws.amazon.com/ec2/) On-Demand and [Spot Instances](https://aws.amazon.com/ec2/spot/), and it has native integration with [CloudWatch](https://aws.amazon.com/cloudwatch/) for log collection.
 
-AWS Batch helps you run massively parallel Kedro pipelines in a cost-effective way, and allows you to parallelise the pipeline execution across a number of compute instances. Each Batch job is run in an isolated Docker container environment.
+AWS Batch helps you run massively parallel Kedro pipelines in a cost-effective way, and allows you to parallelise the pipeline execution across multiple compute instances. Each Batch job is run in an isolated Docker container environment.
 
 The following sections are a guide on how to deploy a Kedro project to AWS Batch, and uses the [spaceflights tutorial](../tutorial/spaceflights_tutorial.md) as primary example. The guide assumes that you have already completed the tutorial, and that the project was created with the project name **Kedro Tutorial**.
 

diff --git a/docs/source/deployment/dask.md b/docs/source/deployment/dask.md
@@ -10,7 +10,7 @@ Dask offers both a default, single-machine scheduler and a more sophisticated, d
 
 ## Prerequisites
 
-The only additional requirement, beyond what was already required by your Kedro pipeline, is to [install `dask.distributed`](http://distributed.dask.org/en/stable/install.html). To review the full installation instructions, including how to set up Python virtual environments, see our [Get Started guide](../get_started/prerequisites.md).
+The only additional requirement, beyond what was already required by your Kedro pipeline, is to [install `dask.distributed`](http://distributed.dask.org/en/stable/install.html). To review the full installation instructions, including how to set up Python virtual environments, see our [Get Started guide](../get_started/install.md#installation-prerequisites).
 
 ## How to distribute your Kedro pipeline using Dask
 

diff --git a/docs/source/deployment/databricks.md b/docs/source/deployment/databricks.md
@@ -234,7 +234,7 @@ You can interact with Kedro in Databricks through the Kedro [IPython extension](
 
 The Kedro IPython extension launches a [Kedro session](../kedro_project_setup/session.md) and makes available the useful Kedro variables `catalog`, `context`, `pipelines` and `session`. It also provides the `%reload_kedro` [line magic](https://ipython.readthedocs.io/en/stable/interactive/magics.html) that reloads these variables (for example, if you need to update `catalog` following changes to your Data Catalog).
 
-The IPython extension can be used in a Databricks notebook in a similar way to how it is used in [Jupyter notebooks](../tools_integration/ipython.md).
+The IPython extension can be used in a Databricks notebook in a similar way to how it is used in [Jupyter notebooks](../notebooks_and_ipython/kedro_and_notebooks.md).
 
 If you encounter a `ContextualVersionConflictError`, it is likely caused by Databricks using an old version of `pip`. Hence there's one additional step you need to do in the Databricks notebook to make use of the IPython extension. After you load the IPython extension using the below command:
 

diff --git a/docs/source/deployment/deployment_guide.md b/docs/source/deployment/deployment_guide.md
@@ -2,7 +2,7 @@
 
 ## Deployment choices
 
-Your choice of deployment method will depend on a number of factors. In this section we provide a number of guides for different approaches.
+Your choice of deployment method will depend on various factors. In this section we provide guides for different approaches.
 
 If you decide to deploy your Kedro project on a single machine, you should consult our [guide to single-machine deployment](single_machine.md), and decide whether to [use Docker for container-based deployment](./single_machine.md#container-based) or to use [package-based deployment](./single_machine.md#package-based) or to [use the CLI to clone and deploy](./single_machine.md#cli-based) your codebase to a server.
 

diff --git a/docs/source/deployment/distributed.md b/docs/source/deployment/distributed.md
@@ -40,4 +40,4 @@ We encourage you to play with different ways of parameterising your runs as you
 
 ## 4. (Optional) Create starters
 
-This is an optional step, but it may speed up your work in the long term. If you find yourself having to deploy in a similar environment or to a similar platform fairly often, you may want to [build your own Kedro starter](../get_started/starters.md). That way you will be able to re-use any deployment scripts written as part of step 2.
+This is an optional step, but it may speed up your work in the long term. If you find yourself having to deploy in a similar environment or to a similar platform fairly often, you may want to [build your own Kedro starter](../kedro_project_setup/starters.md). That way you will be able to re-use any deployment scripts written as part of step 2.
diff --git a/docs/source/development/commands_reference.md b/docs/source/development/commands_reference.md
@@ -506,7 +506,7 @@ To start an IPython shell:
 kedro ipython
 ```
 
-The [Kedro IPython extension](../tools_integration/ipython.md) will make the following variables available in your IPython or Jupyter session:
+The [Kedro IPython extension](../notebooks_and_ipython/kedro_and_notebooks.md#a-custom-kedro-kernel) makes the following variables available in your IPython or Jupyter session:
 
 * `catalog` (type `DataCatalog`): [Data Catalog](../data/data_catalog.md) instance that contains all defined datasets; this is a shortcut for `context.catalog`
 * `context` (type `KedroContext`): Kedro project context that provides access to Kedro's library components

diff --git a/docs/source/development/set_up_pycharm.md b/docs/source/development/set_up_pycharm.md
@@ -153,9 +153,11 @@ You can configure Pycharm's IPython to load Kedro's Extension.
 
 Click **PyCharm | Preferences** for macOS or **File | Settings**, inside **Build, Execution, Deployment** and **Console**, enter the **Python Console** configuration.
 
-You can append the configuration necessary to use Kedro IPython to the **Starting script** as described in the [IPython configuring documentation](../tools_integration/ipython.md).
+You can append the configuration necessary to use Kedro IPython to the **Starting script**:
 
-![](../meta/images/pycharm_ipython_starting_script.png)
+```
+%load_ext kedro.ipython
+```
 
 With this configuration, when you create a Python Console you should be able to use context, session and catalog.
 

diff --git a/docs/source/extend_kedro/common_use_cases.md b/docs/source/extend_kedro/common_use_cases.md
@@ -39,4 +39,4 @@ Your plugin's implementation can take advantage of other extension mechanisms su
 
 ## Use Case 4: How to customise the initial boilerplate of your project
 
-Sometimes you might want to tailor the starting boilerplate of a Kedro project to your specific needs. For example, your organisation might have a standard CI script that you want to include in every new Kedro project. To this end, please visit our [guide to create Kedro starters](./create_kedro_starters.md) to solve this extension requirement.
+Sometimes you might want to tailor the starting boilerplate of a Kedro project to your specific needs. For example, your organisation might have a standard CI script that you want to include in every new Kedro project. To this end, please visit the [guide for creating Kedro starters](../kedro_project_setup/starters.md#how-to-create-a-kedro-starter) to solve this extension requirement.