[Update] 3 worker setup + README
staging (deeplearning) committed Jun 11, 2020
1 parent 5a75708 commit 87e705b
Showing 2 changed files with 109 additions and 4 deletions.
51 changes: 51 additions & 0 deletions README.md
@@ -0,0 +1,51 @@
# Airflow Production docker-compose setup

## Introduction

Production-grade Airflow docker-compose setup:
- Celery backend
- RabbitMQ (with management dashboard)
- scalable number of worker nodes
- PostgreSQL (expects an external database, but you can configure a service for a local instance)

You can copy and paste the worker definition (in `docker-compose.yml`) as many times as you want, giving each copy its own service name, hostname and `-cn` worker name. Everything else should work automatically, provided
you've followed the steps below.
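
The copy-and-paste step can also be scripted. The sketch below is a minimal, hypothetical helper (`worker_stanza` is not part of this repository) that prints a worker service block for a given index, using the `worker_0` definition from `docker-compose.yml` as the template. It assumes the copies differ only in the service name, hostname and `-cn` value.

```bash
# Hypothetical helper, not part of the repo: print the worker service
# definition for index $1, based on the worker_0 block in docker-compose.yml.
worker_stanza() {
  # The quoted heredoc keeps $EXECUTOR etc. literal so that docker-compose
  # substitutes them from .env; only the worker_N placeholder is replaced.
  sed "s/worker_N/worker_$1/g" <<'EOF'
  worker_N:
    build: .
    hostname: worker_N
    command: worker -cn worker_N
    restart: always
    depends_on:
      - rabbit
    volumes:
      - ./dags:/usr/local/airflow/dags
    networks:
      - task_network
    environment:
      - EXECUTOR=$EXECUTOR
      - AIRFLOW__CELERY__BROKER_URL=$AIRFLOW__CELERY__BROKER_URL
      - POSTGRES_USER=$POSTGRES_USER
      - POSTGRES_PASSWORD=$POSTGRES_PASSWORD
      - POSTGRES_HOST=$POSTGRES_HOST
      - POSTGRES_PORT=$POSTGRES_PORT
      - POSTGRES_DB=$POSTGRES_DB
      - POSTGRES_EXTRAS=$POSTGRES_EXTRAS
    healthcheck:
      test: ["CMD-SHELL", "[ -f /usr/local/airflow/airflow-worker.pid ]"]
      interval: 30s
      timeout: 30s
      retries: 3
EOF
}

# Example: print the definition for a fourth worker
worker_stanza 3
```

Paste the output under `services:` in `docker-compose.yml`. If another service lists the workers under `depends_on`, add the new worker name there as well.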

## Setup

- First, copy `env.example` to `.env` and populate it with the relevant values.

```bash
cp -rv env.example .env # populate .env
```

- Then build and start the docker-compose stack:

```bash
docker-compose up --build -d
```
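
Because the services read their settings from `.env`, it can help to check that file before starting the stack. The sketch below is a hypothetical helper (`check_env` is not part of this repository). The key list is taken from the worker `environment` block in `docker-compose.yml` and is an assumption, since `env.example` may define more variables; `POSTGRES_EXTRAS` is left out on the assumption that it may legitimately be empty.

```bash
# Hypothetical helper, not part of the repo: report keys that are missing
# or empty in an env file (default: .env). Returns non-zero if any are.
check_env() {
  env_file="${1:-.env}"
  missing=0
  for key in EXECUTOR AIRFLOW__CELERY__BROKER_URL \
             POSTGRES_USER POSTGRES_PASSWORD POSTGRES_HOST \
             POSTGRES_PORT POSTGRES_DB; do
    # A key counts as set only if its line reads KEY=<non-empty value>
    if ! grep -Eq "^${key}=.+" "$env_file"; then
      echo "missing or empty: ${key}" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

For example, `check_env .env && docker-compose up --build -d` only starts the stack when every listed key has a value.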


## UI

- Airflow Dashboard: `/airflow/`
- Flower UI: `/flower/`
- RabbitMQ Management Dashboard: `/` (mounting it under a URL prefix currently fails)

## Issues

### RabbitMQ

Sometimes the workers and other nodes cannot connect to the RabbitMQ instance in spite of using the exact same credentials. In that situation,
reset the password for the user you've created and you should be good to go.

```bash
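# $RABBIT_USER and $RABBIT_PASSWORD are expanded by your local shell, so set them there first.
# The container ID can be found with `docker-compose ps -q rabbit` (assuming the service is named `rabbit`).
docker exec -it <rabbitmq_container_id> rabbitmqctl change_password "$RABBIT_USER" "$RABBIT_PASSWORD"
# verify that the new credentials are accepted
docker exec -it <rabbitmq_container_id> rabbitmqctl authenticate_user "$RABBIT_USER" "$RABBIT_PASSWORD"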
```

### `airflow.exceptions.AirflowTaskTimeout` error

WIP. Raising the relevant timeout setting will probably help; please submit a PR if you've been able to fix this!
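
One setting that may be worth trying is Airflow's `dagbag_import_timeout` option in the `[core]` section, which can be overridden through the `AIRFLOW__CORE__DAGBAG_IMPORT_TIMEOUT` environment variable. This is a guess, not a confirmed fix: it is not known which timeout is actually being hit here, and the value `120` below is purely illustrative.

```bash
# .env (assumption: untested for this issue; illustrative value, in seconds)
AIRFLOW__CORE__DAGBAG_IMPORT_TIMEOUT=120
```

For the variable to reach the containers, it would also need an entry in each service's `environment` list in `docker-compose.yml`, in the same form as the existing ones (`- AIRFLOW__CORE__DAGBAG_IMPORT_TIMEOUT=$AIRFLOW__CORE__DAGBAG_IMPORT_TIMEOUT`).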

62 changes: 58 additions & 4 deletions docker-compose.yml
@@ -46,7 +46,9 @@ services:
     restart: always
     depends_on:
       - webserver
-      - worker_i
+      - worker_0
+      - worker_1
+      - worker_2
     networks:
       - task_network
     volumes:
@@ -99,10 +101,62 @@ services:
       timeout: 30s
       retries: 3
 
-  worker_i:
+  worker_0:
     build: .
-    hostname: worker_i
-    command: worker -cn worker_i
+    hostname: worker_0
+    command: worker -cn worker_0
     restart: always
     depends_on:
       - rabbit
+    volumes:
+      - ./dags:/usr/local/airflow/dags
+    networks:
+      - task_network
+    environment:
+      - EXECUTOR=$EXECUTOR
+      - AIRFLOW__CELERY__BROKER_URL=$AIRFLOW__CELERY__BROKER_URL
+      - POSTGRES_USER=$POSTGRES_USER
+      - POSTGRES_PASSWORD=$POSTGRES_PASSWORD
+      - POSTGRES_HOST=$POSTGRES_HOST
+      - POSTGRES_PORT=$POSTGRES_PORT
+      - POSTGRES_DB=$POSTGRES_DB
+      - POSTGRES_EXTRAS=$POSTGRES_EXTRAS
+    healthcheck:
+      test: ["CMD-SHELL", "[ -f /usr/local/airflow/airflow-worker.pid ]"]
+      interval: 30s
+      timeout: 30s
+      retries: 3
+
+  worker_1:
+    build: .
+    hostname: worker_1
+    command: worker -cn worker_1
+    restart: always
+    depends_on:
+      - rabbit
+    volumes:
+      - ./dags:/usr/local/airflow/dags
+    networks:
+      - task_network
+    environment:
+      - EXECUTOR=$EXECUTOR
+      - AIRFLOW__CELERY__BROKER_URL=$AIRFLOW__CELERY__BROKER_URL
+      - POSTGRES_USER=$POSTGRES_USER
+      - POSTGRES_PASSWORD=$POSTGRES_PASSWORD
+      - POSTGRES_HOST=$POSTGRES_HOST
+      - POSTGRES_PORT=$POSTGRES_PORT
+      - POSTGRES_DB=$POSTGRES_DB
+      - POSTGRES_EXTRAS=$POSTGRES_EXTRAS
+    healthcheck:
+      test: ["CMD-SHELL", "[ -f /usr/local/airflow/airflow-worker.pid ]"]
+      interval: 30s
+      timeout: 30s
+      retries: 3
+
+  worker_2:
+    build: .
+    hostname: worker_2
+    command: worker -cn worker_2
+    restart: always
+    depends_on:
+      - rabbit