An Airflow scheduler that pulls data from Dune, stores it in a Postgres database, and creates Grafana dashboards.
- Quick Start
- Contributing
- Prerequisites
- Manual Setup
- Project Structure
- Customization
- Testing
- Accessing Fetched Data
- Troubleshooting
- TODO
- License
## Quick Start

To set up and run the entire project, simply execute:

```bash
chmod +x setup.sh
./setup.sh
```

This will set up the Docker images, initialize Airflow, start the Airflow services, and run the DAGs, with:

- Airflow on http://localhost:8080/
- Grafana on http://localhost:3000/
After the scheduler has run the `copy_csv_to_postgres` and `user_operations_analysis` DAGs, you can access the data in the Grafana dashboards and/or query the final view:

```sql
SELECT hour, category, operation_count FROM view_final_results ORDER BY hour;
```
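The DAGs run on their schedule once Airflow is up, but you can also queue them by hand. The snippet below is a minimal sketch using Airflow's stable REST API; it assumes the basic-auth API backend is enabled and that the webserver still uses the default `airflow`/`airflow` login from the Docker setup, so adjust the credentials and DAG ids to your environment.

```python
import requests

AIRFLOW_URL = "http://localhost:8080/api/v1"
AUTH = ("airflow", "airflow")  # assumed default credentials; change if customized


def trigger_dag(dag_id: str) -> None:
    """Queue a new run of the given DAG via the Airflow REST API."""
    resp = requests.post(
        f"{AIRFLOW_URL}/dags/{dag_id}/dagRuns",
        json={"conf": {}},
        auth=AUTH,
        timeout=30,
    )
    resp.raise_for_status()
    print(f"{dag_id}: run {resp.json()['dag_run_id']} queued")


if __name__ == "__main__":
    for dag_id in ("copy_csv_to_postgres", "user_operations_analysis"):
        trigger_dag(dag_id)
```

Triggering the DAGs from the Airflow UI at http://localhost:8080/ works just as well.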
## Contributing

- Clone the repository
- Create a new branch
- Run `make pre-commit` to run the pre-commit hooks
- Commit messages should be in the following format:
  - `feat: {description}`
  - `fix: {description}`
  - `docs: {description}`
  - `style: {description}`
  - `refactor: {description}`
  - `test: {description}`
- Before merging to `master`, the `develop` branch should be merged into `master`
## Prerequisites

- Docker
- Docker Compose
- Make

```bash
cp .env-example .env
```

Make sure you set `.env` with:

```
DUNE_API_KEY={YOUR_KEY}
```
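How the key is consumed is up to the DAG code; as a rough, hypothetical illustration (not this project's actual implementation), a task could resolve it from the environment first and fall back to an Airflow Variable:

```python
import os

from airflow.models import Variable


def get_dune_api_key() -> str:
    """Resolve the Dune API key from the environment, falling back to an
    Airflow Variable. The variable names are assumptions; match your setup."""
    key = os.environ.get("DUNE_API_KEY")
    if not key:
        # default_var avoids raising if the Variable is not defined
        key = Variable.get("DUNE_API_KEY", default_var=None)
    if not key:
        raise RuntimeError("DUNE_API_KEY is not set in .env or as an Airflow Variable")
    return key
```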
## Manual Setup

If you prefer to run the commands manually, you can use the following Make commands:

- `make build`: Build the Docker images
- `make init`: Initialize Airflow
- `make up`: Start Airflow services
- `make down`: Stop Airflow services
- `make logs`: View logs
- `make shell`: Access the Airflow shell
## Project Structure

- `dags/user_operations_analysis.py`: The main Airflow DAG file that pulls off-chain and on-chain data (a rough sketch of its shape follows this list)
- `dags/validate_dags.py`: The helper script to validate the DAGs
- `dags/copy_csv_to_postgres.py`: The helper script to copy the CSV files to the DB
- `dags/google_sheet_to_postgres.py`: The helper script to copy data from Google Sheets to the DB
- `Dockerfile`: Defines the Docker image for Airflow
- `docker-compose.yml`: Defines the services (Airflow, PostgreSQL)
- `requirements.txt`: Lists the Python dependencies
- `Makefile`: Contains shortcuts for common commands
- `setup.sh`: Script to automate the entire setup process
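To make the file descriptions above concrete, here is a small, hypothetical sketch of the general shape a DAG like `user_operations_analysis.py` could take: one task fetches rows from a Dune query (Dune API v1) and another loads them into Postgres. The query id, task ids, table and column names, and the `postgres_default` connection id are placeholders, not this project's actual values.

```python
import os

import pendulum
import requests
from airflow import DAG
from airflow.operators.python import PythonOperator
from airflow.providers.postgres.hooks.postgres import PostgresHook

DUNE_QUERY_ID = 1234567  # placeholder query id


def fetch_from_dune(ti):
    """Fetch the latest results of a Dune query via the Dune API v1."""
    resp = requests.get(
        f"https://api.dune.com/api/v1/query/{DUNE_QUERY_ID}/results",
        headers={"X-DUNE-API-KEY": os.environ["DUNE_API_KEY"]},
        timeout=60,
    )
    resp.raise_for_status()
    ti.xcom_push(key="rows", value=resp.json()["result"]["rows"])


def load_to_postgres(ti):
    """Insert the fetched rows into a hypothetical user_operations table."""
    rows = ti.xcom_pull(key="rows", task_ids="fetch_from_dune")
    hook = PostgresHook(postgres_conn_id="postgres_default")  # assumed connection id
    hook.insert_rows(
        table="user_operations",
        rows=[(r["hour"], r["category"], r["operation_count"]) for r in rows],
        target_fields=["hour", "category", "operation_count"],
    )


with DAG(
    dag_id="user_operations_analysis",
    start_date=pendulum.datetime(2024, 1, 1, tz="UTC"),
    schedule="@daily",  # "schedule_interval" on Airflow < 2.4
    catchup=False,
) as dag:
    fetch = PythonOperator(task_id="fetch_from_dune", python_callable=fetch_from_dune)
    load = PythonOperator(task_id="load_to_postgres", python_callable=load_to_postgres)
    fetch >> load
```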
## Customization

To modify the analysis or add new features:

- Edit the `dags/daily_etl.py` file (after editing, you can sanity-check that the DAGs still load; see the sketch after this list)
- Rebuild the Docker images and restart the services using `make sync-dags`
- If you updated dependencies in `requirements.txt`, make sure Airflow has the latest dependencies by running:

  ```bash
  make build
  make down
  make up
  ```
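After editing a DAG it is worth confirming that it still imports cleanly before rebuilding. The check below (roughly what a helper like `dags/validate_dags.py` is for) parses the `dags/` folder with Airflow's `DagBag` and fails on import errors; run it where the Airflow dependencies are installed, e.g. inside `make shell`.

```python
from airflow.models import DagBag

# Parse everything under dags/ without loading Airflow's bundled examples
dag_bag = DagBag(dag_folder="dags/", include_examples=False)

if dag_bag.import_errors:
    for path, error in dag_bag.import_errors.items():
        print(f"FAILED: {path}\n{error}")
    raise SystemExit(1)

print(f"OK: {len(dag_bag.dags)} DAG(s) loaded: {sorted(dag_bag.dags)}")
```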
## Testing

This project includes both unit tests and end-to-end tests for the Airflow DAG (a sketch of a typical unit test follows the list below).

- `make test`: Run all tests
- `make test-unit`: Run only unit tests
- `make test-e2e`: Run only end-to-end tests
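If you add tests of your own, a common pattern for DAG unit tests (shown here as a generic sketch, not a copy of this project's test suite) is to assert that the DAGs import without errors and contain the expected tasks; the task ids below are placeholders.

```python
import pytest
from airflow.models import DagBag


@pytest.fixture(scope="module")
def dag_bag():
    return DagBag(dag_folder="dags/", include_examples=False)


def test_dags_import_without_errors(dag_bag):
    assert dag_bag.import_errors == {}


def test_user_operations_analysis_structure(dag_bag):
    dag = dag_bag.get_dag("user_operations_analysis")
    assert dag is not None
    # Placeholder task ids -- replace with the DAG's real task ids
    assert {"fetch_from_dune", "load_to_postgres"} <= set(dag.task_ids)
```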
## Accessing Fetched Data

To access the DB data:

```bash
docker exec -it yuza-sosa-postgres-1 bash
psql -U airflow -d airflow
```

or

```bash
psql -h localhost -p 5433 -U airflow -d airflow
```

```sql
SELECT * FROM user_operations LIMIT 10;
SELECT * FROM view_final_results;
```
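The same data can also be read from a script. Below is a minimal `psycopg2` sketch using the host port and `airflow` user shown above; the password is assumed to be the docker-compose default, so adjust it to your configuration.

```python
import psycopg2

# Connection details assumed from the docker-compose setup; adjust as needed
conn = psycopg2.connect(
    host="localhost",
    port=5433,
    user="airflow",
    password="airflow",  # assumed default; check docker-compose.yml / .env
    dbname="airflow",
)

with conn, conn.cursor() as cur:
    cur.execute(
        "SELECT hour, category, operation_count FROM view_final_results ORDER BY hour"
    )
    for hour, category, operation_count in cur.fetchall():
        print(hour, category, operation_count)

conn.close()
```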
## Troubleshooting

If you encounter any issues:

- Check the logs using `make logs`
- Ensure all required ports are available (8080 for the Airflow webserver)
- Try stopping all services with `make down`, then start again with `make up`
- Create the `dags`, `logs` and `plugins` folders inside the project directory
- Set user permissions for Airflow to your current user, e.g. `sudo chown -R airflow:airflow /opt/airflow`
- If that fails, set `DUNE_API_KEY` manually; this can be done in `airflow.cfg` or the console (see the sketch after this list)
- If modules are not recognised, check the export paths: `export PYTHONPATH=dags/:$PYTHONPATH`
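If you choose to set the key from inside the Airflow environment (e.g. via `make shell`), one option is to store it as an Airflow Variable, as in the sketch below; the Variable name is an assumption and must match whatever the DAG code reads. The CLI equivalent is `airflow variables set DUNE_API_KEY <your-key>`.

```python
import sys

from airflow.models import Variable

# Usage (inside the Airflow container): python set_dune_key.py <your-key>
# The Variable name "DUNE_API_KEY" is an assumption -- match it to what the DAGs read.
if __name__ == "__main__":
    Variable.set("DUNE_API_KEY", sys.argv[1])
    print("DUNE_API_KEY stored as an Airflow Variable")
```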
## TODO

- Set up `google_sheet_to_postgres` with a `service_account_key` (see the sketch after this list)
- Better naming convention for DAGs and tables
- Validate and add more tests
- Add more dashboards
- Partition tables
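For the first item, a typical approach (sketched below with the `gspread` library; the key path, spreadsheet name, target table, and connection id are placeholders rather than this project's actual configuration) is to authenticate with the service-account JSON key and bulk-insert the sheet rows into Postgres.

```python
import gspread
from airflow.providers.postgres.hooks.postgres import PostgresHook

# Placeholders -- replace with real values for your sheet and table
SERVICE_ACCOUNT_KEY = "/opt/airflow/service_account_key.json"
SPREADSHEET_NAME = "user_operations_sheet"
TARGET_TABLE = "sheet_user_operations"


def sheet_to_postgres() -> None:
    """Read all rows from the first worksheet and insert them into Postgres."""
    gc = gspread.service_account(filename=SERVICE_ACCOUNT_KEY)
    worksheet = gc.open(SPREADSHEET_NAME).sheet1
    records = worksheet.get_all_records()  # list of dicts keyed by the header row
    if not records:
        return

    hook = PostgresHook(postgres_conn_id="postgres_default")  # assumed connection id
    hook.insert_rows(
        table=TARGET_TABLE,
        rows=[tuple(r.values()) for r in records],
        target_fields=list(records[0].keys()),
    )
```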
## License

MIT License