Add Production-ready docker compose for the production image #8605
Comments
This one duplicates #8548 a bit - but I want to leave it for a while as I wanted to split it into smaller functional pieces. |
It would be nice to have this in "Quick Start Guide when using Docker Image" too. WDYT |
Absolutely. It's already planned in #8542 :) |
Added missing label :) |
Here is another example of a Docker Compose setup that I've been working on. The Compose file defines multiple services to run Airflow. Extension fields are used for the Airflow environment variables to reduce code duplication. I added a Makefile alongside the docker-compose.yml in my repo, so all you have to do to run the stack is a single make command:
version: "3.7"
x-airflow-environment: &airflow-environment
AIRFLOW__CORE__EXECUTOR: CeleryExecutor
AIRFLOW__WEBSERVER__RBAC: "True"
AIRFLOW__CORE__LOAD_EXAMPLES: "False"
AIRFLOW__CELERY__BROKER_URL: "redis://:@redis:6379/0"
AIRFLOW__CORE__SQL_ALCHEMY_CONN: postgresql+psycopg2://airflow:airflow@postgres:5432/airflow
services:
postgres:
image: postgres:11.5
environment:
POSTGRES_USER: airflow
POSTGRES_DB: airflow
POSTGRES_PASSWORD: airflow
redis:
image: redis:5
environment:
REDIS_HOST: redis
REDIS_PORT: 6379
ports:
- 6379:6379
init:
image: apache/airflow:1.10.10
environment:
<<: *airflow-environment
depends_on:
- redis
- postgres
volumes:
- ./dags:/opt/airflow/dags
entrypoint: /bin/bash
command: >
-c "airflow list_users || (airflow initdb
&& airflow create_user --role Admin --username airflow --password airflow -e airflow@airflow.com -f airflow -l airflow)"
restart: on-failure
webserver:
image: apache/airflow:1.10.10
ports:
- 8080:8080
environment:
<<: *airflow-environment
depends_on:
- init
volumes:
- ./dags:/opt/airflow/dags
command: "webserver"
restart: always
flower:
image: apache/airflow:1.10.10
ports:
- 5555:5555
environment:
<<: *airflow-environment
depends_on:
- redis
command: flower
restart: always
scheduler:
image: apache/airflow:1.10.10
environment:
<<: *airflow-environment
depends_on:
- webserver
volumes:
- ./dags:/opt/airflow/dags
command: scheduler
restart: always
worker:
image: apache/airflow:1.10.10
environment:
<<: *airflow-environment
depends_on:
- scheduler
volumes:
- ./dags:/opt/airflow/dags
command: worker
restart: always |
Here's my docker-compose config using LocalExecutor.
docker-compose.airflow.yml:
version: '2.1'
services:
airflow:
# image: apache/airflow:1.10.10
build:
context: .
args:
- DOCKER_UID=${DOCKER_UID-1000}
dockerfile: Dockerfile
restart: always
environment:
- AIRFLOW__CORE__SQL_ALCHEMY_CONN=postgres://airflow:${POSTGRES_PW-airflow}@postgres:5432/airflow
- AIRFLOW__CORE__FERNET_KEY=${AF_FERNET_KEY-GUYoGcG5xdn5K3ysGG3LQzOt3cc0UBOEibEPxugDwas=}
- AIRFLOW__CORE__EXECUTOR=LocalExecutor
- AIRFLOW__CORE__AIRFLOW_HOME=/opt/airflow/
- AIRFLOW__CORE__LOAD_EXAMPLES=False
- AIRFLOW__CORE__LOAD_DEFAULT_CONNECTIONS=False
- AIRFLOW__CORE__LOGGING_LEVEL=${AF_LOGGING_LEVEL-info}
volumes:
- ../airflow/dags:/opt/airflow/dags:z
- ../airflow/plugins:/opt/airflow/plugins:z
- ./volumes/airflow_data_dump:/opt/airflow/data_dump:z
- ./volumes/airflow_logs:/opt/airflow/logs:z
healthcheck:
test: ["CMD-SHELL", "[ -f /opt/airflow/airflow-webserver.pid ]"]
interval: 30s
timeout: 30s
retries: 3
docker-compose.yml:
version: '2.1'
services:
postgres:
image: postgres:9.6
container_name: af_postgres
environment:
- POSTGRES_USER=airflow
- POSTGRES_PASSWORD=${POSTGRES_PW-airflow}
- POSTGRES_DB=airflow
- PGDATA=/var/lib/postgresql/data/pgdata
volumes:
- ./volumes/postgres_data:/var/lib/postgresql/data/pgdata:Z
ports:
- 127.0.0.1:5432:5432
webserver:
extends:
file: docker-compose.airflow.yml
service: airflow
container_name: af_webserver
command: webserver
depends_on:
- postgres
ports:
- ${DOCKER_PORTS-8080}
networks:
- proxy
- default
environment:
# Web Server Config
- AIRFLOW__WEBSERVER__DAG_DEFAULT_VIEW=graph
- AIRFLOW__WEBSERVER__HIDE_PAUSED_DAGS_BY_DEFAULT=true
- AIRFLOW__WEBSERVER__RBAC=true
# Web Server Performance tweaks
# 2 * NUM_CPU_CORES + 1
- AIRFLOW__WEBSERVER__WORKERS=${AF_WORKERS-2}
# Restart workers every 30 min instead of every 30 seconds
- AIRFLOW__WEBSERVER__WORKER_REFRESH_INTERVAL=1800
labels:
- "traefik.enable=true"
- "traefik.http.routers.airflow.rule=Host(`af.example.com`)"
- "traefik.http.routers.airflow.middlewares=admin-auth@file"
scheduler:
extends:
file: docker-compose.airflow.yml
service: airflow
container_name: af_scheduler
command: scheduler
depends_on:
- postgres
environment:
# Performance Tweaks
# Reduce how often DAGs are reloaded to dramatically reduce CPU use
- AIRFLOW__SCHEDULER__MIN_FILE_PROCESS_INTERVAL=${AF_MIN_FILE_PROCESS_INTERVAL-60}
- AIRFLOW__SCHEDULER__MAX_THREADS=${AF_THREADS-1}
networks:
proxy:
external: true
Dockerfile:
# Custom Dockerfile
FROM apache/airflow:1.10.10
# Install mssql support & dag dependencies
USER root
RUN apt-get update -yqq \
&& apt-get install -y gcc freetds-dev \
&& apt-get install -y git procps \
&& apt-get install -y vim
RUN pip install 'apache-airflow[mssql,ssh,s3,slack]'
RUN pip install azure-storage-blob sshtunnel google-api-python-client oauth2client \
&& pip install git+https://github.com/infusionsoft/Official-API-Python-Library.git \
&& pip install rocketchat_API
# This fixes permission issues on linux.
# The airflow user should have the same UID as the user running docker on the host system.
# make build adjusts this value automatically
ARG DOCKER_UID
RUN \
: "${DOCKER_UID:?Build argument DOCKER_UID needs to be set and non-empty. Use 'make build' to set it automatically.}" \
&& usermod -u ${DOCKER_UID} airflow \
&& find / -path /proc -prune -o -user 50000 -exec chown -h airflow {} \; \
&& echo "Set airflow's uid to ${DOCKER_UID}"
USER airflow
And here's my Makefile to control the containers:
Makefile:
SERVICE = "scheduler"
TITLE = "airflow containers"
ACCESS = "http://af.example.com"
.PHONY: run
build:
docker-compose build
run:
@echo "Starting $(TITLE)"
docker-compose up -d
@echo "$(TITLE) running on $(ACCESS)"
runf:
@echo "Starting $(TITLE)"
docker-compose up
stop:
@echo "Stopping $(TITLE)"
docker-compose down
restart: stop print-newline run
tty:
docker-compose run --rm --entrypoint='' $(SERVICE) bash
ttyr:
docker-compose run --rm --entrypoint='' -u root $(SERVICE) bash
attach:
docker-compose exec $(SERVICE) bash
attachr:
docker-compose exec -u root $(SERVICE) bash
logs:
docker-compose logs --tail 50 --follow $(SERVICE)
conf:
docker-compose config
initdb:
docker-compose run --rm $(SERVICE) initdb
upgradedb:
docker-compose run --rm $(SERVICE) upgradedb
print-newline:
@echo ""
@echo "" |
@potiuk Is this the preferred way to add dependencies (airflow-mssql)?
|
I think the preferred way will be to set the AIRFLOW_EXTRAS variable properly and pass it as --build-arg. They are defined like that in the Dockerfile:
and when building the image you can set them accordingly. I think it might be worth having "additional extras" that get appended, though. |
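For illustration, here is a hedged sketch of how that could look when driving the build from docker-compose. It assumes the build context is a checked-out copy of the Airflow sources, whose production Dockerfile declares ARG AIRFLOW_EXTRAS; the extras list and the image tag below are made up for the example:
services:
  airflow:
    build:
      context: .                      # checked-out Airflow sources
      dockerfile: Dockerfile
      args:
        AIRFLOW_EXTRAS: "celery,postgres,redis,ssh,mssql"
    image: my-company/airflow:custom  # hypothetical tag for the customized image
The same value can also be passed directly on the command line with docker build --build-arg AIRFLOW_EXTRAS=... when not going through compose.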
Oh, that's super cool. |
You should also be able to build a new image using the ONBUILD feature, for building images that depend on the base one. Added a separate issue here: #8872 |
The same applies to additional Python packages.
|
Moved to gist |
I see it there (even in incognito mode). Must have been a temporary glitch of DockerHub.
As mentioned in the docs above, if you want to customize the image you need to check out the Airflow sources and run the docker command inside them. As with most Dockerfiles, they need a context ("." in the command) and some extra files (for example entrypoint scripts) that have to be available in that context, and the easiest way to get them is to check out the Airflow sources at the right version and customize the image from there. You can find a nice description here: https://airflow.readthedocs.io/en/latest/production-deployment.html - we moved the documentation to "docs" and it has not yet been released (it will be in 1.10.13), but you can use the "latest" version. It contains a detailed description of customizing vs. extending, and even a nice table showing the differences - one point there is that you need to use the Airflow sources to customize the image.
See above - you need to run it inside the checked-out sources of Airflow.
Yes. That's the whole point - customisation only works if you have sources of Airflow checked out. |
I think we should get this one in sooner rather than later, before 2.0.0rc1. Is someone willing to work on it?
Also, I don't think the docker-compose files need to be production-ready. They should just be meant for local development, or for quickly starting and working on Airflow locally with different executors.
Agree. Starting small is good. |
@potiuk should we move milestone to 2.1 for this? |
Yep. Just did :). |
My docker compose:
version: '3'
x-airflow-common:
&airflow-common
image: apache/airflow:1.10.12
environment:
- AIRFLOW__CORE__EXECUTOR=CeleryExecutor
- AIRFLOW__CORE__SQL_ALCHEMY_CONN=mysql://root@mysql/airflow?charset=utf8mb4
- AIRFLOW__CORE__SQL_ENGINE_COLLATION_FOR_IDS=utf8mb3_general_ci
- AIRFLOW__CELERY__BROKER_URL=redis://:@redis:6379/0
- AIRFLOW__CELERY__RESULT_BACKEND=redis://:@redis:6379/0
- AIRFLOW__CORE__FERNET_KEY=FB0o_zt4e3Ziq3LdUUO7F2Z95cvFFx16hU8jTeR1ASM=
- AIRFLOW__CORE__LOAD_EXAMPLES=False
- AIRFLOW__CORE__LOGGING_LEVEL=Debug
- AIRFLOW__CORE__DAGS_ARE_PAUSED_AT_CREATION=False
- AIRFLOW__WEBSERVER__RBAC=True
- AIRFLOW__CORE__STORE_SERIALIZED_DAGS=True
- AIRFLOW__CORE__STORE_DAG_CODE=True
volumes:
- ./dags:/opt/airflow/dags
- ./airflow-data/logs:/opt/airflow/logs
- ./airflow-data/plugins:/opt/airflow/plugins
depends_on:
- redis
- mysql
services:
mysql:
image: mysql:5.7
environment:
- MYSQL_ALLOW_EMPTY_PASSWORD=true
- MYSQL_ROOT_HOST=%
- MYSQL_DATABASE=airflow
volumes:
- ./mysql/conf.d:/etc/mysql/conf.d:ro
- /dev/urandom:/dev/random # Required to get non-blocking entropy source
- ./airflow-data/mysql-db-volume:/var/lib/mysql
ports:
- "3306:3306"
command:
- mysqld
- --character-set-server=utf8mb4
- --collation-server=utf8mb4_unicode_ci
redis:
image: redis:latest
ports:
- 6379:6379
flower:
<< : *airflow-common
command: flower
ports:
- 5555:5555
airflow-init:
<< : *airflow-common
container_name: airflow_init
entrypoint: /bin/bash
command:
- -c
- airflow list_users || (
airflow initdb &&
airflow create_user
--role Admin
--username airflow
--password airflow
--email airflow@airflow.com
--firstname airflow
--lastname airflow
)
restart: on-failure
airflow-webserver:
<< : *airflow-common
command: webserver
ports:
- 8080:8080
restart: always
airflow-scheduler:
<< : *airflow-common
container_name: airflow_scheduler
command:
- scheduler
- --run-duration
- '30'
restart: always
airflow-worker:
<< : *airflow-common
container_name: airflow_worker1
command: worker
restart: always
|
@BasPH shared on Slack: one-line command to start Airflow in docker:
https://apache-airflow.slack.com/archives/CQAMHKWSJ/p1608152276070500 |
I have prepared some docker-compose files with some common configurations.
Postgres - Redis - Airflow 2.0
version: '3'
x-airflow-common:
&airflow-common
image: apache/airflow:2.0.0
environment:
- AIRFLOW__CORE__EXECUTOR=CeleryExecutor
- AIRFLOW__CORE__SQL_ALCHEMY_CONN=postgresql+psycopg2://airflow:airflow@postgres/airflow
- AIRFLOW__CELERY__RESULT_BACKEND=db+postgresql://airflow:airflow@postgres/airflow
#- AIRFLOW__CELERY__RESULT_BACKEND=redis://:@redis:6379/0
- AIRFLOW__CELERY__BROKER_URL=redis://:@redis:6379/0
- AIRFLOW__WEBSERVER__RBAC=True
- AIRFLOW__CORE__FERNET_KEY=FB0o_zt4e3Ziq3LdUUO7F2Z95cvFFx16hU8jTeR1ASM=
- AIRFLOW__CORE__DAGS_ARE_PAUSED_AT_CREATION=True
volumes:
- ./dags:/opt/airflow/dags
- ./airflow-data/logs:/opt/airflow/logs
- ./airflow-data/plugins:/opt/airflow/plugins
depends_on:
redis:
condition: service_healthy
postgres:
condition: service_healthy
services:
postgres:
image: postgres:9.5
environment:
POSTGRES_USER: airflow
POSTGRES_PASSWORD: airflow
POSTGRES_DB: airflow
volumes:
- ./airflow-data/postgres-db-volume:/var/lib/postgresql/data
healthcheck:
test: ["CMD", "pg_isready", "-U", "airflow"]
interval: 30s
retries: 5
restart: always
redis:
image: redis:latest
ports:
- 6379:6379
healthcheck:
test: ["CMD", "redis-cli", "ping"]
interval: 5s
timeout: 30s
retries: 50
restart: always
airflow-webserver:
<< : *airflow-common
command: webserver
ports:
- 8080:8080
healthcheck:
test: ["CMD", "curl", "--fail", "http://localhost:8080/health"]
interval: 10s
timeout: 10s
retries: 5
restart: always
airflow-scheduler:
<< : *airflow-common
command: scheduler
restart: always
airflow-worker:
<< : *airflow-common
command: celery worker
restart: always
airflow-init:
<< : *airflow-common
entrypoint: /bin/bash
command:
- -c
- airflow users list || (
airflow db init &&
airflow users create
--role Admin
--username airflow
--password airflow
--email airflow@airflow.com
--firstname airflow
--lastname airflow
)
restart: on-failure
flower:
<< : *airflow-common
command: celery flower
ports:
- 5555:5555
healthcheck:
test: ["CMD", "curl", "--fail", "http://localhost:5555/"]
interval: 10s
timeout: 10s
retries: 5
restart: always
Postgres - Redis - Airflow 1.10.14
version: '3'
x-airflow-common:
&airflow-common
image: apache/airflow:1.10.14
environment:
- AIRFLOW__CORE__EXECUTOR=CeleryExecutor
- AIRFLOW__CORE__SQL_ALCHEMY_CONN=postgresql+psycopg2://airflow:airflow@postgres/airflow
- AIRFLOW__CELERY__RESULT_BACKEND=db+postgresql://airflow:airflow@postgres/airflow
- AIRFLOW__CELERY__BROKER_URL=redis://:@redis:6379/0
- AIRFLOW__CORE__FERNET_KEY=FB0o_zt4e3Ziq3LdUUO7F2Z95cvFFx16hU8jTeR1ASM=
- AIRFLOW__CORE__DAGS_ARE_PAUSED_AT_CREATION=True
volumes:
- ./dags:/opt/airflow/dags
- ./airflow-data/logs:/opt/airflow/logs
- ./airflow-data/plugins:/opt/airflow/plugins
depends_on:
redis:
condition: service_healthy
postgres:
condition: service_healthy
services:
postgres:
image: postgres:9.5
environment:
POSTGRES_USER: airflow
POSTGRES_PASSWORD: airflow
POSTGRES_DB: airflow
volumes:
- ./airflow-data/postgres-db-volume:/var/lib/postgresql/data
healthcheck:
test: ["CMD", "pg_isready", "-U", "airflow"]
interval: 30s
retries: 5
restart: always
redis:
image: redis:latest
ports:
- 6379:6379
healthcheck:
test: ["CMD", "redis-cli", "ping"]
interval: 5s
timeout: 30s
retries: 50
restart: always
airflow-webserver:
<< : *airflow-common
command: webserver
ports:
- 8080:8080
healthcheck:
test: ["CMD", "curl", "--fail", "http://localhost:8080/health"]
interval: 10s
timeout: 10s
retries: 5
restart: always
airflow-scheduler:
<< : *airflow-common
command: scheduler
restart: always
airflow-worker:
<< : *airflow-common
command: worker
restart: always
airflow-init:
<< : *airflow-common
entrypoint: /bin/bash
command:
- -c
- airflow list_users || (
airflow initdb &&
airflow create_user
--role Admin
--username airflow
--password airflow
--email airflow@airflow.com
--firstname airflow
--lastname airflow
)
restart: on-failure
flower:
<< : *airflow-common
command: flower
ports:
- 5555:5555
healthcheck:
test: ["CMD", "curl", "--fail", "http://localhost:5555/healthcheck"]
interval: 10s
timeout: 10s
retries: 5
restart: always
Mysql 8.0 - Redis - Airflow 2.0
# Migrations are broken.
Mysql 8.0 - Redis - Airflow 1.10.14
version: '3'
x-airflow-common:
&airflow-common
image: apache/airflow:1.10.14
environment:
- AIRFLOW__CORE__EXECUTOR=CeleryExecutor
- AIRFLOW__CORE__SQL_ALCHEMY_CONN=mysql://root:airflow@mysql/airflow?charset=utf8mb4
- AIRFLOW__CORE__SQL_ENGINE_COLLATION_FOR_IDS=utf8mb3_general_ci
- AIRFLOW__CELERY__BROKER_URL=redis://:@redis:6379/0
- AIRFLOW__CORE__FERNET_KEY=FB0o_zt4e3Ziq3LdUUO7F2Z95cvFFx16hU8jTeR1ASM=
- AIRFLOW__CORE__DAGS_ARE_PAUSED_AT_CREATION=True
volumes:
- ./dags:/opt/airflow/dags
- ./airflow-data/logs:/opt/airflow/logs
- ./airflow-data/plugins:/opt/airflow/plugins
depends_on:
redis:
condition: service_healthy
mysql:
condition: service_healthy
services:
mysql:
image: mysql:8.0
environment:
- MYSQL_ROOT_PASSWORD=airflow
- MYSQL_ROOT_HOST=%
- MYSQL_DATABASE=airflow
volumes:
- ./airflow-data/mysql-db-volume:/var/lib/mysql
ports:
- "3306:3306"
command:
- mysqld
- --explicit-defaults-for-timestamp
- --default-authentication-plugin=mysql_native_password
- --character-set-server=utf8mb4
- --collation-server=utf8mb4_unicode_ci
healthcheck:
test: ["CMD-SHELL", "mysql -h localhost -P 3306 -u root -pairflow -e 'SELECT 1'"]
interval: 10s
timeout: 10s
retries: 5
restart: always
redis:
image: redis:latest
ports:
- 6379:6379
healthcheck:
test: ["CMD", "redis-cli", "ping"]
interval: 5s
timeout: 30s
retries: 50
restart: always
airflow-webserver:
<< : *airflow-common
command: webserver
ports:
- 8080:8080
healthcheck:
test: ["CMD", "curl", "--fail", "http://localhost:8080/health"]
interval: 10s
timeout: 10s
retries: 5
restart: always
airflow-scheduler:
<< : *airflow-common
command: scheduler
restart: always
airflow-worker:
<< : *airflow-common
command: worker
restart: always
airflow-init:
<< : *airflow-common
entrypoint: /bin/bash
command:
- -c
- airflow list_users || (
airflow initdb &&
airflow create_user
--role Admin
--username airflow
--password airflow
--email airflow@airflow.com
--firstname airflow
--lastname airflow
)
restart: on-failure
flower:
<< : *airflow-common
command: flower
ports:
- 5555:5555
healthcheck:
test: ["CMD", "curl", "--fail", "http://localhost:5555/"]
interval: 10s
timeout: 10s
retries: 5
restart: always
I added health checks where it was simple. Does anyone have an idea for health checks for the remaining services, such as the scheduler and the worker? Besides that, I am planning to prepare a tool that generates docker-compose files using a simple wizard. I am thinking of something similar to the PyTorch project. |
Very good idea! ❤️ |
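On the health-check question above: for the Celery worker, one possible approach is to ping the worker through the Celery CLI. This is only a sketch, assuming the Airflow 2.0 image (where the Celery app is airflow.executors.celery_executor.app) and the default worker hostname; it is untested here:
airflow-worker:
  <<: *airflow-common
  command: celery worker
  healthcheck:
    test: ["CMD-SHELL", 'celery --app airflow.executors.celery_executor.app inspect ping -d "celery@$${HOSTNAME}"']
    interval: 30s
    timeout: 30s
    retries: 5
  restart: always
The doubled $$ stops Compose from interpolating HOSTNAME on the host, so the variable is resolved inside the container instead.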
Has anyone successfully gotten turbodbc installed using pip? I have had to install miniconda and use conda-forge to get turbodbc + pyarrow working correctly. This adds a little complication to my Dockerfile, although I do kind of like the conda-env.yml file approach.
@mik-laj wow, I knew I could use common environment variables, but I had no idea you could also share the volumes and image definitions like that; it is super clean. Any reason why you have the scheduler restart every 30 seconds like that? |
Thank you all for the
@mik-laj I also have a working healthcheck on the scheduler. It is not the most expressive, but it works. This configuration relies on an existing and initialized database.
External database - LocalExecutor - Airflow 2.0.0 - Traefik - DAGs mostly based on DockerOperator.
version: "3.7"
x-airflow-environment: &airflow-environment
AIRFLOW__CORE__EXECUTOR: LocalExecutor
AIRFLOW__CORE__LOAD_EXAMPLES: "False"
AIRFLOW__CORE__LOAD_DEFAULT_CONNECTIONS: "False"
AIRFLOW__CORE__SQL_ALCHEMY_CONN: ${DB_CONNECTION_STRING}
AIRFLOW__CORE__FERNET_KEY: ${ENCRYPTION_KEY}
AIRFLOW__CORE__DAGS_FOLDER: /opt/airflow/sync/git/dags
AIRFLOW__CORE__ENABLE_XCOM_PICKLING: "True" # because of https://github.com/apache/airflow/issues/13487
AIRFLOW__WEBSERVER__BASE_URL: https://airflow.example.com
AIRFLOW__WEBSERVER__ENABLE_PROXY_FIX: "True"
AIRFLOW__WEBSERVER__RBAC: "True"
services:
traefik:
image: traefik:v2.4
container_name: traefik
command:
- --ping=true
- --providers.docker=true
- --providers.docker.exposedbydefault=false
- --entrypoints.web.address=:80
- --entrypoints.websecure.address=:443
# HTTP -> HTTPS redirect
- --entrypoints.web.http.redirections.entrypoint.to=websecure
- --entrypoints.web.http.redirections.entrypoint.scheme=https
# TLS config
- --certificatesresolvers.myresolver.acme.dnschallenge=true
- --certificatesresolvers.myresolver.acme.storage=/letsencrypt/acme.json
## Comment following line for a production deployment
- --certificatesresolvers.myresolver.acme.caserver=https://acme-staging-v02.api.letsencrypt.org/directory
## See https://doc.traefik.io/traefik/https/acme/#providers for other providers
- --certificatesresolvers.myresolver.acme.dnschallenge.provider=digitalocean
- --certificatesresolvers.myresolver.acme.email=user@example.com
ports:
- 80:80
- 443:443
environment:
# See https://doc.traefik.io/traefik/https/acme/#providers for other providers
DO_AUTH_TOKEN:
restart: always
healthcheck:
test: ["CMD", "traefik", "healthcheck", "--ping"]
interval: 10s
timeout: 10s
retries: 5
volumes:
- certs:/letsencrypt
- /var/run/docker.sock:/var/run/docker.sock:ro
# Required because of DockerOperator, for secure access to the Docker socket and permission handling.
docker-socket-proxy:
image: tecnativa/docker-socket-proxy:0.1.1
environment:
CONTAINERS: 1
IMAGES: 1
AUTH: 1
POST: 1
privileged: true
volumes:
- /var/run/docker.sock:/var/run/docker.sock:ro
restart: always
# Allows deploying DAGs on pushes to master
git-sync:
image: k8s.gcr.io/git-sync/git-sync:v3.2.2
container_name: dags-sync
environment:
GIT_SYNC_USERNAME:
GIT_SYNC_PASSWORD:
GIT_SYNC_REPO: https://example.com/my/repo.git
GIT_SYNC_DEST: dags
GIT_SYNC_BRANCH: master
GIT_SYNC_WAIT: 60
volumes:
- dags:/tmp:rw
restart: always
webserver:
image: apache/airflow:2.0.0
container_name: airflow_webserver
environment:
<<: *airflow-environment
command: webserver
healthcheck:
test: ["CMD", "curl", "--fail", "http://localhost:8080/health"]
interval: 10s
timeout: 10s
retries: 5
restart: always
volumes:
- dags:/opt/airflow/sync
- logs:/opt/airflow/logs
depends_on:
- git-sync
- traefik
labels:
- traefik.enable=true
- traefik.http.routers.webserver.rule=Host(`airflow.example.com`)
- traefik.http.routers.webserver.entrypoints=websecure
- traefik.http.routers.webserver.tls.certresolver=myresolver
- traefik.http.services.webserver.loadbalancer.server.port=8080
scheduler:
image: apache/airflow:2.0.0
container_name: airflow_scheduler
environment:
<<: *airflow-environment
command: scheduler
restart: always
healthcheck:
test: ["CMD-SHELL", 'curl --silent http://airflow_webserver:8080/health | grep -A 1 scheduler | grep \"healthy\"']
interval: 10s
timeout: 10s
retries: 5
volumes:
- dags:/opt/airflow/sync
- logs:/opt/airflow/logs
depends_on:
- git-sync
- webserver
volumes:
dags:
logs:
certs:
I have an extra container (not shown) to handle rotating the logs that are written directly to files. It is based on logrotate. I am not sharing it here because it is a custom image and is beyond the scope of this thread, but if anybody is interested, message me. Hope it helps! |
@mik-laj Can we close this one since we already added the docker-compose files? |
@kaxil -> I believe so. I do not think "production-ready" docker-compose is even a thing :) |
Description
In order to use the production image, we are already working on a Helm chart, but we might also want to add a production-ready Docker Compose setup that will be able to run an Airflow installation.
Use case / motivation
For local tests and small deployments, having such a docker-compose environment would be really nice.
We seem to have reached consensus that we need several docker-compose "sets" of files:
They should come in variants and make it possible to specify a number of parameters:
Depending on the setup, those Docker Compose files should do proper DB initialisation.
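As a rough illustration, that parametrisation could be driven by environment variables with defaults (a sketch only; the variable names below are made up for the example and could live in an .env file next to the compose file):
x-airflow-common: &airflow-common
  image: apache/airflow:${AIRFLOW_IMAGE_TAG:-2.0.0}
  environment:
    AIRFLOW__CORE__EXECUTOR: ${AIRFLOW_EXECUTOR:-LocalExecutor}
    AIRFLOW__CORE__SQL_ALCHEMY_CONN: ${AIRFLOW_DB_CONN:-postgresql+psycopg2://airflow:airflow@postgres/airflow}
    AIRFLOW__CORE__LOAD_EXAMPLES: ${AIRFLOW_LOAD_EXAMPLES:-False}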
Example Docker Compose (from https://apache-airflow.slack.com/archives/CQAMHKWSJ/p1587748008106000) that we might use as a base, together with #8548. This is just an example, so this issue will not implement all of it; we will likely split those docker-compose files into separate postgres/sqlite/mysql variants, similarly to what we do in the CI scripts, which is why I wanted to keep this as a separate issue. We will deal with user creation in #8606.
Another example from https://apache-airflow.slack.com/archives/CQAMHKWSJ/p1587679356095400:
Related issues
The initial user creation #8606, #8548
Quick start documentation planned in #8542