Create spaceflights-pandas-viz starter (#152)
Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com>
merelcht authored Oct 6, 2023
1 parent a2a38ac commit 0c56523
Showing 51 changed files with 155,403 additions and 1 deletion.
1 change: 1 addition & 0 deletions features/environment.py
@@ -53,6 +53,7 @@ def before_scenario(context, scenario):
"pyspark-iris",
"spaceflights",
"spaceflights-pyspark",
"spaceflights-pandas-viz",
"spaceflights-pyspark-viz",
]
starters_paths = {
9 changes: 8 additions & 1 deletion features/lint.feature
@@ -35,13 +35,20 @@ Feature: Lint all starters
When I lint the project
Then I should get a successful exit code

Scenario: Lint spaceflights-pandas-viz starter
Given I have prepared a config file
And I have run a non-interactive kedro new with the starter spaceflights-pandas-viz
And I have installed the Kedro project's dependencies
When I lint the project
Then I should get a successful exit code

Scenario: Lint spaceflights-pyspark starter
Given I have prepared a config file
And I have run a non-interactive kedro new with the starter spaceflights-pyspark
And I have installed the Kedro project's dependencies
When I lint the project
Then I should get a successful exit code

Scenario: Lint spaceflights-pyspark-viz starter
Given I have prepared a config file
And I have run a non-interactive kedro new with the starter spaceflights-pyspark-viz
7 changes: 7 additions & 0 deletions features/run.feature
@@ -37,6 +37,13 @@ Feature: Run all starters
When I run the Kedro pipeline
Then I should get a successful exit code

Scenario: Run a Kedro project created from spaceflights-pandas-viz
Given I have prepared a config file
And I have run a non-interactive kedro new with the starter spaceflights-pandas-viz
And I have installed the Kedro project's dependencies
When I run the Kedro pipeline
Then I should get a successful exit code

Scenario: Run a Kedro project created from spaceflights-pyspark
Given I have prepared a config file
And I have run a non-interactive kedro new with the starter spaceflights-pyspark
43 changes: 43 additions & 0 deletions spaceflights-pandas-viz/README.md
@@ -0,0 +1,43 @@
# The `spaceflights-pandas-viz` Kedro starter

## Overview

This is a completed version of the [spaceflights tutorial project](https://docs.kedro.org/en/stable/tutorial/spaceflights_tutorial.html) described in the [online Kedro documentation](https://docs.kedro.org) and the extra tutorial sections on [visualisation with Kedro-Viz](https://docs.kedro.org/en/stable/visualisation/index.html) and [experiment tracking with Kedro-Viz](https://docs.kedro.org/en/stable/experiment_tracking/index.html). This project includes the data required to run it.

The tutorial works through the steps necessary to create this project. To learn the most about Kedro, we recommend that you start with a blank template as the tutorial describes, and follow the workflow. However, if you prefer to read swiftly through the documentation and get to work on the code, you may want to generate a new Kedro project using this [starter](https://docs.kedro.org/en/stable/kedro_project_setup/starters.html) because the steps have been done for you.

To use this starter, create a new Kedro project using the commands below. To make sure you have the required dependencies, run them in your virtual environment (see [our documentation about virtual environments](https://docs.kedro.org/en/stable/get_started/install.html#virtual-environments) for guidance on how to get set up):

```bash
pip install kedro
kedro new --starter=spaceflights-pandas-viz
cd <my-project-name> # change directory into newly created project directory
```

This will give you the complete project and project template. If you would prefer a reduced project template, you can use the `add-ons` option instead and select `Kedro-Viz` as the add-on, with an example:
```bash
pip install kedro
kedro new --add-ons=XXX
cd <my-project-name> # change directory into newly created project directory
```

Install the required dependencies:

```bash
pip install -r src/requirements.txt
```

Now you can run the project:

```bash
kedro run
```

To visualise the default pipeline, run:
```bash
kedro viz
```

This will open the default browser and display the following pipeline visualisation:

![](./images/pipeline_visualisation_with_layers.png)
6 changes: 6 additions & 0 deletions spaceflights-pandas-viz/cookiecutter.json
@@ -0,0 +1,6 @@
{
"project_name": "Spaceflights Pandas Viz",
"repo_name": "{{ cookiecutter.project_name.strip().replace(' ', '-').replace('_', '-').lower() }}",
"python_package": "{{ cookiecutter.project_name.strip().replace(' ', '_').replace('-', '_').lower() }}",
"kedro_version": "{{ cookiecutter.kedro_version }}"
}
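The two derived fields in `cookiecutter.json` are plain Python string-method chains evaluated by the template engine. As a rough illustration only, the same transformations can be reproduced in standalone Python; the function name `derive_names` is hypothetical and is not part of Kedro or cookiecutter:

```python
from typing import Tuple


def derive_names(project_name: str) -> Tuple[str, str]:
    """Mirror the string transformations used in cookiecutter.json.

    `derive_names` is a hypothetical helper written for illustration;
    the real values are computed by the template engine.
    """
    stripped = project_name.strip()
    # repo_name: spaces and underscores become hyphens, then lower-case
    repo_name = stripped.replace(" ", "-").replace("_", "-").lower()
    # python_package: spaces and hyphens become underscores, then lower-case
    python_package = stripped.replace(" ", "_").replace("-", "_").lower()
    return repo_name, python_package


repo_name, python_package = derive_names("Spaceflights Pandas Viz")
print(repo_name)       # spaceflights-pandas-viz
print(python_package)  # spaceflights_pandas_viz
```

With the default project name, the generated directory would therefore be named `spaceflights-pandas-viz` and the importable package `spaceflights_pandas_viz`.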
9 changes: 9 additions & 0 deletions spaceflights-pandas-viz/prompts.yml
@@ -0,0 +1,9 @@
project_name:
title: "Project Name"
text: |
Please enter a human readable name for your new project.
Spaces, hyphens, and underscores are allowed.
regex_validator: "^[\\w -]{2,}$"
error_message: |
It must contain only alphanumeric symbols, spaces, underscores and hyphens and
be at least 2 characters long.
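To see what the `regex_validator` above accepts, the pattern can be tried out with Python's `re` module. This is a minimal sketch: the helper `is_valid_project_name` is hypothetical, and how Kedro applies the validator internally is not shown in this commit. Note that the YAML string `"^[\\w -]{2,}$"` unescapes to the regular expression `^[\w -]{2,}$`.

```python
import re

# The pattern from prompts.yml after YAML unescaping.
PROJECT_NAME_PATTERN = re.compile(r"^[\w -]{2,}$")


def is_valid_project_name(name: str) -> bool:
    """Return True if `name` matches the prompts.yml pattern.

    Hypothetical helper for illustration; not Kedro's own code.
    """
    return PROJECT_NAME_PATTERN.match(name) is not None


for candidate in ["Spaceflights Pandas Viz", "my_project-2", "a", "my/project"]:
    print(f"{candidate!r}: {is_valid_project_name(candidate)}")
```

The first two candidates pass. `"a"` is rejected because it is shorter than two characters, and `"my/project"` because `/` is not a word character, space, or hyphen.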
151 changes: 151 additions & 0 deletions spaceflights-pandas-viz/{{ cookiecutter.repo_name }}/.gitignore
@@ -0,0 +1,151 @@
##########################
# KEDRO PROJECT

# ignore all local configuration
conf/local/**
!conf/local/.gitkeep

# ignore potentially sensitive credentials files
conf/**/*credentials*

# ignore everything in the following folders
data/**

# except their sub-folders
!data/**/

# also keep all .gitkeep files
!.gitkeep

# keep also the example dataset
!data/01_raw/*


##########################
# Common files

# IntelliJ
.idea/
*.iml
out/
.idea_modules/

### macOS
*.DS_Store
.AppleDouble
.LSOverride
.Trashes

# Vim
*~
.*.swo
.*.swp

# emacs
*~
\#*\#
/.emacs.desktop
/.emacs.desktop.lock
*.elc

# JIRA plugin
atlassian-ide-plugin.xml

# C extensions
*.so

### Python template
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST

# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
.hypothesis/

# Translations
*.mo
*.pot

# Django stuff:
*.log
.static_storage/
.media/
local_settings.py

# Flask stuff:
instance/
.webassets-cache

# Scrapy stuff:
.scrapy

# Sphinx documentation
docs/_build/

# PyBuilder
target/

# Jupyter Notebook
.ipynb_checkpoints

# pyenv
.python-version

# celery beat schedule file
celerybeat-schedule

# SageMath parsed files
*.sage.py

# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/

# mkdocs documentation
/site

# mypy
.mypy_cache/
34 changes: 34 additions & 0 deletions spaceflights-pandas-viz/{{ cookiecutter.repo_name }}/README.md
@@ -0,0 +1,34 @@
# {{ cookiecutter.project_name }}

## Overview

This is your new Kedro project, which was generated using `Kedro {{ cookiecutter.kedro_version }}`.

Take a look at the [Kedro documentation](https://docs.kedro.org) to get started.

## Rules and guidelines

In order to get the best out of the template:

* Don't remove any lines from the `.gitignore` file we provide
* Make sure your results can be reproduced by following a [data engineering convention](https://docs.kedro.org/en/stable/faq/faq.html#what-is-data-engineering-convention)
* Don't commit data to your repository
* Don't commit any credentials or your local configuration to your repository. Keep all your credentials and local configuration in `conf/local/`

## How to install dependencies

Declare any dependencies in `src/requirements.txt` for `pip` installation.

To install them, run:

```
pip install -r src/requirements.txt
```

## How to run your Kedro pipeline

You can run your Kedro project with:

```
kedro run
```
@@ -0,0 +1,22 @@
# What is this for?

This folder should be used to store configuration files used by Kedro or by separate tools.

This file can be used to provide users with instructions for how to reproduce local configuration with their own credentials. You can edit the file however you like, but you may wish to retain the information below and add your own content under the section titled **Instructions**.

## Local configuration

The `local` folder should be used for configuration that is either user-specific (e.g. IDE configuration) or protected (e.g. security keys).

> *Note:* Please do not check in any local configuration to version control.

## Base configuration

The `base` folder is for shared configuration, such as non-sensitive and project-related configuration that may be shared across team members.

WARNING: Please do not put access credentials in the base configuration folder.

## Instructions

## Find out more
You can find out more about configuration from the [user guide documentation](https://docs.kedro.org/en/stable/configuration/configuration_basics.html).