The project uses a Kaggle e-commerce dataset to demonstrate a simple ELT pipeline built with Apache Airflow and Google Cloud Platform.
Follow the instructions in Development Environment to set up the development environment.
The source data structure is documented in dataset.md.
To set up a local virtualenv for IDE suggestions, use the following commands:
python3.10 -m venv venv/
source venv/bin/activate
pip install --upgrade pip
pip install 'apache-airflow==2.8.1' \
--constraint "https://raw.githubusercontent.com/apache/airflow/constraints-2.8.1/constraints-3.10.txt"
pip install -r dev/requirements.txt
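
The Airflow version and the Python version appear twice in the install command above: once in the package pin and once in the constraints URL. If either is changed, the other must be kept in step. As a minimal sketch, the URL can be derived from two variables instead of being typed by hand. The function name `constraint_url` and the variable names are illustrative assumptions, not part of this project.

```shell
# Hypothetical helper (not part of this repo): build the Airflow constraints
# URL from an Airflow version and a Python major.minor version.
constraint_url() {
  echo "https://raw.githubusercontent.com/apache/airflow/constraints-${1}/constraints-${2}.txt"
}

# Values taken from the install command above.
AIRFLOW_VERSION="2.8.1"
PYTHON_VERSION="3.10"

CONSTRAINT_URL="$(constraint_url "$AIRFLOW_VERSION" "$PYTHON_VERSION")"
echo "$CONSTRAINT_URL"

# The install command would then read:
#   pip install "apache-airflow==${AIRFLOW_VERSION}" --constraint "${CONSTRAINT_URL}"
```

With the values shown, the echoed URL is the same one used in the install command above, so the two forms should be interchangeable.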