Fixes warm shutdown for celery worker. (#18068)
By default, dumb-init propagated signals in a way that prevented
the celery worker from handling termination well.

The default behaviour of dumb-init is to propagate signals to the
process group rather than to the single child it spawns. This is
protective behaviour for the case where a user runs a 'bash -c'
command without 'exec': signals should then be sent not only
to bash but also to the process(es) it creates, otherwise
bash exits without propagating the signal and you need a second
signal to kill all processes.

However, some airflow processes (in particular the airflow celery
worker) behave responsibly and handle signals appropriately:
when the first signal is received, the worker switches to offline
mode and lets all workers terminate (until the grace period expires),
resulting in a Warm Shutdown.

Therefore, in the Helm Chart, we can disable the protection of
dumb-init and let it propagate the signal only to the single child
it spawns. The documentation of the image was also updated to include
an explanation of signal propagation. For explicitness, the
DUMB_INIT_SETSID variable has been set to 1 in the image as well.

Fixes #18066

(cherry picked from commit 9e13e45)
potiuk authored and kaxil committed Sep 10, 2021
1 parent c81aa2b commit 45eb384
Showing 3 changed files with 45 additions and 0 deletions.
1 change: 1 addition & 0 deletions Dockerfile
@@ -479,6 +479,7 @@ LABEL org.apache.airflow.distro="debian" \
org.opencontainers.image.title="Production Airflow Image" \
org.opencontainers.image.description="Reference, production-ready Apache Airflow image"

ENV DUMB_INIT_SETSID="1"

ENTRYPOINT ["/usr/bin/dumb-init", "--", "/entrypoint"]
CMD []
3 changes: 3 additions & 0 deletions chart/templates/workers/worker-deployment.yaml
@@ -169,6 +169,9 @@ spec:
envFrom:
{{- include "custom_airflow_environment_from" . | default "\n []" | indent 10 }}
env:
# Only signal the main process, not the process group, to make Warm Shutdown work properly
- name: DUMB_INIT_SETSID
  value: "0"
{{- include "custom_airflow_environment" . | indent 10 }}
{{- include "standard_airflow_environment" . | indent 10 }}
{{- if .Values.workers.kerberosSidecar.enabled }}
41 changes: 41 additions & 0 deletions docs/docker-stack/entrypoint.rst
@@ -161,6 +161,47 @@ If there are any other arguments - they are simply passed to the "airflow" comma
> docker run -it apache/airflow:2.1.0-python3.6 version
2.1.0

Signal propagation
------------------

Airflow uses ``dumb-init`` to run as "init" in the entrypoint, in order to propagate
signals and reap child processes properly. This means that the process you run does not have
to install signal handlers to work properly and be killed when the container is gracefully terminated.
Signal propagation is configured by the ``DUMB_INIT_SETSID`` variable, which is set to
``1`` by default, meaning that signals are propagated to the whole process group. You can
set it to ``0`` to enable the ``single-child`` behaviour of ``dumb-init``, which propagates
signals to the single child process only.
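
The difference between the two modes can be illustrated outside of a container. The following
is a minimal Python sketch and not how ``dumb-init`` itself is implemented: it assumes a POSIX
system and uses ``os.killpg`` as a stand-in for group-wide propagation (``DUMB_INIT_SETSID=1``)
and ``os.kill`` as a stand-in for single-child propagation (``DUMB_INIT_SETSID=0``). The ``CHILD``
script and the ``grandchild_survives`` helper are hypothetical names introduced for this illustration.

```python
import os
import select
import signal
import subprocess
import sys

# Stand-in for `bash -c "some-command"` without `exec`: the child starts a
# grandchild, reports readiness on stdout, and then both simply sleep.
CHILD = (
    "import subprocess, sys, time\n"
    "subprocess.Popen([sys.executable, '-c', 'import time; time.sleep(60)'])\n"
    "print('ready', flush=True)\n"
    "time.sleep(60)\n"
)


def grandchild_survives(signal_whole_group: bool) -> bool:
    """Send SIGTERM to the child or to its whole group; report whether the grandchild still runs."""
    proc = subprocess.Popen(
        [sys.executable, "-c", CHILD],
        stdout=subprocess.PIPE,
        start_new_session=True,  # the child becomes the leader of a new process group
    )
    proc.stdout.readline()  # wait until the grandchild has been started
    if signal_whole_group:
        os.killpg(proc.pid, signal.SIGTERM)  # analogous to DUMB_INIT_SETSID=1
    else:
        os.kill(proc.pid, signal.SIGTERM)  # analogous to DUMB_INIT_SETSID=0
    proc.wait()
    # The grandchild inherited the write end of the pipe, so end-of-file is
    # only reported once the grandchild has exited as well.
    readable, _, _ = select.select([proc.stdout], [], [], 2.0)
    survived = not readable
    if survived:
        os.killpg(proc.pid, signal.SIGKILL)  # clean up the orphaned grandchild
    proc.stdout.close()
    return survived
```

With group-wide signalling the grandchild is terminated together with its parent. With
single-child signalling it keeps running, which is why a process started without ``exec``
needs the group-wide mode.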

The table below summarizes the possible values of ``DUMB_INIT_SETSID`` and their use cases.

+----------------+----------------------------------------------------------------------+
| Variable value | Use case                                                             |
+----------------+----------------------------------------------------------------------+
| 1 (default)    | Propagates signals to all processes in the process group of the main |
|                | process running in the container.                                    |
|                |                                                                      |
|                | If you run your processes via a ``["bash", "-c"]`` command and bash  |
|                | spawns new processes without ``exec``, this helps to terminate       |
|                | your container gracefully, as all processes receive the signal.      |
+----------------+----------------------------------------------------------------------+
| 0              | Propagates signals to the main process only.                         |
|                |                                                                      |
|                | This is useful if your main process handles signals gracefully.      |
|                | A good example is the warm shutdown of Celery workers. In this case  |
|                | ``dumb-init`` propagates the signals only to the main process, not   |
|                | to the other processes spawned in the same process group. In the     |
|                | case of Celery, the main process puts the worker in "offline" mode,  |
|                | waits until all running tasks complete, and only then terminates     |
|                | all processes.                                                       |
|                |                                                                      |
|                | For Airflow's Celery worker, you should set the variable to 0 and    |
|                | either use the ``["celery", "worker"]`` command or, if you run it    |
|                | through a ``["bash", "-c"]`` command, start the worker via           |
|                | ``exec airflow celery worker`` as the last command executed.         |
+----------------+----------------------------------------------------------------------+
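
What "handles signals gracefully" means for a main process can be sketched with a toy worker.
This is an illustration only and not Celery's implementation: ``ToyWorker`` and ``demo`` are
hypothetical names, and a real worker runs its tasks in child processes rather than inline.

```python
import os
import signal


class ToyWorker:
    """Hypothetical worker: after SIGTERM it finishes the running task and accepts no new ones."""

    def __init__(self, tasks):
        self.pending = list(tasks)
        self.completed = []
        self.offline = False

    def _on_sigterm(self, signum, frame):
        # First signal: switch to "offline" mode instead of exiting immediately.
        self.offline = True

    def run(self):
        previous = signal.signal(signal.SIGTERM, self._on_sigterm)
        try:
            while self.pending and not self.offline:
                task = self.pending.pop(0)
                # A task that is already running is allowed to complete.
                self.completed.append(task())
        finally:
            signal.signal(signal.SIGTERM, previous)  # restore the earlier handler
        return self.completed


def demo():
    """Run three tasks; the second one receives SIGTERM while it is executing."""

    def first():
        return "first"

    def second():
        os.kill(os.getpid(), signal.SIGTERM)  # the signal arrives mid-task
        return "second"

    def third():
        return "third"

    worker = ToyWorker([first, second, third])
    return worker.run(), worker.pending
```

If the signal were also delivered directly to the processes executing the tasks, as happens with
group-wide propagation, they could be terminated before the main process had a chance to wait
for them. Single-child propagation leaves that decision to the main process.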

Additional quick test options
-----------------------------
