- There are two ways you can run the Airflow dev env on your machine:
- With a Docker Container
- With a local virtual environment
Before deciding which method to choose, there are a couple of factors to consider. Running Airflow in a container is the most reliable way: it provides a more consistent environment and allows integration tests with a number of integrations (cassandra, mongo, mysql, etc.). However, it also requires 4GB RAM, 40GB disk space and at least 2 cores. If you are working on a basic feature, installing Airflow in a local environment might be sufficient.
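As a rough aid for this decision, the thresholds above can be checked with a small shell function. This is a minimal sketch: the helper name check_resources and its argument order are assumptions made for illustration and are not part of Airflow or Breeze.

```shell
# Thresholds quoted in the text above for the container-based setup.
MIN_RAM_GB=4
MIN_DISK_GB=40
MIN_CORES=2

# check_resources RAM_GB DISK_GB CORES  (hypothetical helper, integers only)
# Prints one line per unmet requirement and returns non-zero if any is unmet.
check_resources() {
    cr_status=0
    if [ "$1" -lt "$MIN_RAM_GB" ]; then
        echo "RAM: $1 GB available, $MIN_RAM_GB GB required"
        cr_status=1
    fi
    if [ "$2" -lt "$MIN_DISK_GB" ]; then
        echo "Disk: $2 GB available, $MIN_DISK_GB GB required"
        cr_status=1
    fi
    if [ "$3" -lt "$MIN_CORES" ]; then
        echo "Cores: $3 available, $MIN_CORES required"
        cr_status=1
    fi
    return "$cr_status"
}
```

On Ubuntu the three arguments could be read from tools such as free -g, df and nproc. Keep in mind that free -g rounds down, so a machine with exactly 4GB of RAM may report 3.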
- For a comprehensive venv tutorial - visit Virtual Env Guide
- Docker Community Edition
- Docker Compose
- pyenv (you can also use pyenv-virtualenv or virtualenvwrapper)
- jq
- Installing required packages for Docker and setting up docker repo
$ sudo apt-get update
$ sudo apt-get install \
apt-transport-https \
ca-certificates \
curl \
gnupg-agent \
software-properties-common
$ curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
$ sudo add-apt-repository \
"deb [arch=amd64] https://download.docker.com/linux/ubuntu \
$(lsb_release -cs) \
stable"
- Install Docker
$ sudo apt-get update
$ sudo apt-get install docker-ce docker-ce-cli containerd.io
- Creating group for docker and adding current user to it.
$ sudo groupadd docker
$ sudo usermod -aG docker $USER
Note: After adding the user to the docker group, log out and log in again so that group membership is re-evaluated.
- Test Docker installation
$ docker run hello-world
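If docker run fails with a permission error, the group change may not have taken effect yet. The sketch below shows one way to check this; has_group and docker_group_hint are hypothetical helper names introduced here for illustration.

```shell
# has_group GROUP_LIST GROUP  (hypothetical helper)
# Succeeds if GROUP appears in the space-separated GROUP_LIST.
has_group() {
    case " $1 " in
        *" $2 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# docker_group_hint GROUP_LIST  (hypothetical helper)
# Prints what to do next, based on the group list it is given.
docker_group_hint() {
    if has_group "$1" docker; then
        echo "docker group active: docker can be run without sudo"
    else
        echo "docker group not active yet: log out and log in again"
    fi
}
```

A typical call would be docker_group_hint "$(id -nG)". Without a user argument, id reports the groups of the current session, which is what matters after the usermod step.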
- Installing latest version of Docker Compose
$ COMPOSE_VERSION="$(curl -s https://api.github.com/repos/docker/compose/releases/latest | grep '"tag_name":'\
| cut -d '"' -f 4)"
$ COMPOSE_URL="https://github.com/docker/compose/releases/download/${COMPOSE_VERSION}/\
docker-compose-$(uname -s)-$(uname -m)"
$ sudo curl -L "${COMPOSE_URL}" -o /usr/local/bin/docker-compose
$ sudo chmod +x /usr/local/bin/docker-compose
- Verifying installation
$ docker-compose --version
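The download commands above build the release URL from three pieces: the release tag, the kernel name and the machine architecture. The sketch below separates those steps into two functions so each can be inspected on its own. The function names are assumptions for illustration; the grep/cut extraction and the URL layout are the ones used in the commands above.

```shell
# tag_from_release_json  (hypothetical helper)
# Reads GitHub release JSON on stdin and prints the value of "tag_name",
# using the same grep/cut approach as the commands above.
tag_from_release_json() {
    grep '"tag_name":' | cut -d '"' -f 4
}

# compose_url VERSION KERNEL MACHINE  (hypothetical helper)
# Prints the download URL in the layout used by COMPOSE_URL above.
compose_url() {
    printf 'https://github.com/docker/compose/releases/download/%s/docker-compose-%s-%s\n' "$1" "$2" "$3"
}
```

For example, compose_url 1.29.2 "$(uname -s)" "$(uname -m)" prints the URL for that release on the current machine without downloading anything, which makes it easy to check the URL before passing it to curl.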
Note: You might have issues with pyenv if you have a Mac with an M1 chip. Consider using virtualenv as an alternative.
- Install pyenv and configure your shell's environment for Pyenv as suggested in Pyenv README
- After installing pyenv, you need to install a few more required packages for Airflow
$ sudo apt-get install openssl sqlite default-libmysqlclient-dev libmysqlclient-dev postgresql
- Restart your shell so the path changes take effect, then verify the installation
$ exec $SHELL
$ pyenv --version
- Checking available version, installing required Python version to pyenv and verifying it
$ pyenv install --list
$ pyenv install 3.8.5
$ pyenv versions
- Creating a new virtual environment named airflow-env for the installed Python version. In the next chapter, the virtual environment airflow-env will be used for installing Airflow.
$ pyenv virtualenv 3.8.5 airflow-env
- Entering virtual environment
$ pyenv activate airflow-env
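pyenv works with a full patch version (3.8.5 above), while the Breeze commands later in this guide take only the major.minor part (--python 3.8). A minimal sketch for deriving one from the other follows; python_minor is a hypothetical helper name and it assumes the input has the form X.Y.Z.

```shell
# python_minor VERSION  (hypothetical helper, expects X.Y.Z)
# Strips the trailing patch component, e.g. 3.8.5 becomes 3.8.
python_minor() {
    printf '%s\n' "${1%.*}"
}
```

With this, breeze --python "$(python_minor 3.8.5)" --backend mysql is equivalent to the breeze --python 3.8 --backend mysql command used below.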
jq is a lightweight and flexible command-line JSON processor. Install jq with the following command:
$ sudo apt install jq
Setup and develop using PyCharm
Go to https://github.com/apache/airflow/ and fork the project.
Go to your GitHub account's fork of airflow, click on Code and copy the clone link.
Open your IDE or source code editor and select the option to clone the repository.
Paste the copied clone link in the URL field and submit.
- Open a terminal, enter the virtual environment airflow-env and go to the project directory
$ pyenv activate airflow-env
$ cd ~/Projects/airflow/
- Initializing breeze autocomplete
$ breeze setup-autocomplete
- Initialize breeze environment with required python version and backend. This may take a while the first time.
$ breeze --python 3.8 --backend mysql
If you encounter an error like "docker.credentials.errors.InitializationError: docker-credential-secretservice not installed or not available in PATH", you may execute the following command to fix it:
$ sudo apt install golang-docker-credential-helper
Once the package is installed, execute the breeze command again to resume image building.
- Once the breeze environment is initialized, create airflow tables and users from the breeze CLI. airflow db reset must be executed at least once for Airflow Breeze to get the database/tables created.
root@b76fcb399bb6:/opt/airflow# airflow db reset
root@b76fcb399bb6:/opt/airflow# airflow users create --role Admin --username admin --password admin \
--email admin@example.com --firstname foo --lastname bar
- Closing Breeze environment. After the above commands finish successfully, you are still inside the container. To exit the container and stop Breeze:
root@b76fcb399bb6:/opt/airflow# exit
$ breeze stop
- It may require some packages to be installed; watch the output of the command to see which ones are missing.
$ sudo apt-get install sqlite libsqlite3-dev default-libmysqlclient-dev postgresql
- Initialize virtual environment with breeze.
$ ./scripts/tools/initialize_virtualenv.py
- Add following line to ~/.bashrc in order to call breeze command from anywhere.
export PATH=${PATH}:"/home/${USER}/Projects/airflow"
source ~/.bashrc
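Appending to PATH in ~/.bashrc adds the directory again every time the file is sourced. The sketch below shows one way to keep the entry unique; path_with is a hypothetical helper name, and the directory is the one from the export line above.

```shell
# path_with PATH_VALUE DIR  (hypothetical helper)
# Prints PATH_VALUE with DIR appended, unless DIR is already one of its entries.
path_with() {
    case ":$1:" in
        *":$2:"*) printf '%s\n' "$1" ;;
        *) printf '%s:%s\n' "$1" "$2" ;;
    esac
}
```

In ~/.bashrc this could be used as export PATH="$(path_with "$PATH" "/home/${USER}/Projects/airflow")", which leaves PATH unchanged when the entry is already present.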
- Starting breeze environment using breeze start-airflow starts the Breeze environment with the last configuration run (in this case, python and backend will be picked up from the last execution, breeze --python 3.8 --backend mysql). It also automatically starts webserver, backend and scheduler. It drops you in tmux with scheduler in bottom left and webserver in bottom right. Use [Ctrl + B] and Arrow keys to navigate.
$ breeze start-airflow
Use CI image.
Branch name: main
Docker image: ghcr.io/apache/airflow/main/ci/python3.8:latest
Airflow source version: 2.4.0.dev0
Python version: 3.8
Backend: mysql 5.7
Port forwarding:
Ports are forwarded to the running docker containers for webserver and database
* 12322 -> forwarded to Airflow ssh server -> airflow:22
* 28080 -> forwarded to Airflow webserver -> airflow:8080
* 25555 -> forwarded to Flower dashboard -> airflow:5555
* 25433 -> forwarded to Postgres database -> postgres:5432
* 23306 -> forwarded to MySQL database -> mysql:3306
* 21433 -> forwarded to MSSQL database -> mssql:1443
* 26379 -> forwarded to Redis broker -> redis:6379
Here are links to those services that you can use on host:
* ssh connection for remote debugging: ssh -p 12322 airflow@ pw: airflow
* Webserver:
* Flower:
* Postgres: jdbc:postgresql://
* Mysql: jdbc:mysql://
* MSSQL: jdbc:sqlserver://;databaseName=airflow;user=sa;password=Airflow123
* Redis: redis://
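The port forwarding list in the output above can be captured as a small lookup, which is handy when scripting connections from the host. This is a sketch only: forwarded_port and the short service names are assumptions introduced here, while the port numbers are copied from the output above.

```shell
# forwarded_port SERVICE  (hypothetical helper)
# Prints the host port that Breeze forwards to the given service,
# as listed in the "Port forwarding" output above.
forwarded_port() {
    case "$1" in
        ssh)       echo 12322 ;;
        webserver) echo 28080 ;;
        flower)    echo 25555 ;;
        postgres)  echo 25433 ;;
        mysql)     echo 23306 ;;
        mssql)     echo 21433 ;;
        redis)     echo 26379 ;;
        *) echo "unknown service: $1" >&2; return 1 ;;
    esac
}
```

For example, forwarded_port mysql prints the port to enter in a database client running on the host.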
Alternatively you can start the same using following commands
- Start Breeze
$ breeze --python 3.8 --backend mysql
- Open tmux
root@0c6e4ff0ab3d:/opt/airflow# tmux
- Press Ctrl + B and "
root@0c6e4ff0ab3d:/opt/airflow# airflow scheduler
- Press Ctrl + B and %
root@0c6e4ff0ab3d:/opt/airflow# airflow webserver
Now you can access the Airflow web interface on your local machine through the forwarded webserver port 28080, with user name admin and password admin.
Set up the MySQL database in MySQL Workbench with the host set to your local machine, port 23306, user root, a blank password (leave empty) and default schema airflow.
If you cannot connect to MySQL, refer to the Prerequisites section in the Breeze documentation and try increasing Docker disk space.
Stopping breeze
root@f3619b74c59a:/opt/airflow# stop_airflow
root@f3619b74c59a:/opt/airflow# exit
$ breeze stop
- Knowing more about Breeze
$ breeze --help
For more information, see the Breeze documentation.
The following are some important topics in the Breeze documentation:
Click on the branch symbol in the status bar
Give a name to a branch and checkout
Setup and develop using Visual Studio Code
Go to https://github.com/apache/airflow/ and fork the project.
Go to your GitHub account's fork of airflow, click on Code and copy the clone link.
Open your IDE or source code editor and select the option to clone the repository.
Paste the copied clone link in the URL field and submit.
- Open a terminal, enter the virtual environment airflow-env and go to the project directory
$ pyenv activate airflow-env
$ cd ~/Projects/airflow/
- Initializing breeze autocomplete
$ breeze setup-autocomplete
$ source ~/.bash_completion.d/breeze-complete
- Initialize breeze environment with required python version and backend. This may take a while the first time.
$ breeze --python 3.8 --backend mysql
If you encounter an error like "docker.credentials.errors.InitializationError: docker-credential-secretservice not installed or not available in PATH", you may execute the following command to fix it:
$ sudo apt install golang-docker-credential-helper
Once the package is installed, execute the breeze command again to resume image building.
- Once the breeze environment is initialized, create airflow tables and users from the breeze CLI. airflow db reset must be executed at least once for Airflow Breeze to get the database/tables created.
root@b76fcb399bb6:/opt/airflow# airflow db reset
root@b76fcb399bb6:/opt/airflow# airflow users create --role Admin --username admin --password admin \
--email admin@example.com --firstname foo --lastname bar
- Closing Breeze environment. After the above commands finish successfully, you are still inside the container. To exit the container and stop Breeze:
root@b76fcb399bb6:/opt/airflow# exit
$ breeze stop
- It may require some packages to be installed; watch the output of the command to see which ones are missing.
$ sudo apt-get install sqlite libsqlite3-dev default-libmysqlclient-dev postgresql
$ ./scripts/tools/initialize_virtualenv.py
- Add following line to ~/.bashrc in order to call breeze command from anywhere.
export PATH=${PATH}:"/home/${USER}/Projects/airflow"
source ~/.bashrc
- Starting breeze environment using breeze start-airflow starts the Breeze environment with the last configuration run (in this case, python and backend will be picked up from the last execution, breeze --python 3.8 --backend mysql). It also automatically starts webserver, backend and scheduler. It drops you in tmux with scheduler in bottom left and webserver in bottom right. Use [Ctrl + B] and Arrow keys to navigate.
$ breeze start-airflow
Use CI image.
Branch name: main
Docker image: ghcr.io/apache/airflow/main/ci/python3.8:latest
Airflow source version: 2.4.0.dev0
Python version: 3.8
Backend: mysql 5.7
Port forwarding:
Ports are forwarded to the running docker containers for webserver and database
* 12322 -> forwarded to Airflow ssh server -> airflow:22
* 28080 -> forwarded to Airflow webserver -> airflow:8080
* 25555 -> forwarded to Flower dashboard -> airflow:5555
* 25433 -> forwarded to Postgres database -> postgres:5432
* 23306 -> forwarded to MySQL database -> mysql:3306
* 21433 -> forwarded to MSSQL database -> mssql:1443
* 26379 -> forwarded to Redis broker -> redis:6379
Here are links to those services that you can use on host:
* ssh connection for remote debugging: ssh -p 12322 airflow@ pw: airflow
* Webserver:
* Flower:
* Postgres: jdbc:postgresql://
* Mysql: jdbc:mysql://
* MSSQL: jdbc:sqlserver://;databaseName=airflow;user=sa;password=Airflow123
* Redis: redis://
Alternatively you can start the same using following commands
- Start Breeze
$ breeze --python 3.8 --backend mysql
- Open tmux
root@0c6e4ff0ab3d:/opt/airflow# tmux
- Press Ctrl + B and "
root@0c6e4ff0ab3d:/opt/airflow# airflow scheduler
- Press Ctrl + B and %
root@0c6e4ff0ab3d:/opt/airflow# airflow webserver
Now you can access the Airflow web interface on your local machine through the forwarded webserver port 28080, with user name admin and password admin.
Set up the MySQL database in MySQL Workbench with the host set to your local machine, port 23306, user root, a blank password (leave empty) and default schema airflow.
Stopping breeze
root@f3619b74c59a:/opt/airflow# stop_airflow
root@f3619b74c59a:/opt/airflow# exit
$ breeze stop
- Knowing more about Breeze
$ breeze --help
For more information, see the Breeze documentation.
The following are some important topics in the Breeze documentation:
- Debugging an example DAG
In Visual Studio Code, open the airflow project. The directory /files/dags of the local machine is by default mounted to the docker machine when breeze airflow is started, so any DAG file present in this directory will be picked up automatically by the scheduler running in the docker machine and can be seen in the Airflow web interface.
Copy any example DAG from the example DAGs directory to /files/dags/.
Add a __main__ block at the end of your DAG file to make it runnable. It will run a back_fill job:
if __name__ == "__main__":
    dag.clear()
    dag.run()
Add "AIRFLOW__CORE__EXECUTOR": "DebugExecutor" to the "env" field of the Debug configuration.
Using the Run view, click on Create a launch.json file, point the new configuration at an example DAG and add "env" fields to the new Python configuration.
Now debug an example DAG and view the entries in tables such as dag_run, xcom etc. in MySQL Workbench.
Click on the branch symbol in the status bar
Give a name to a branch and checkout
Setup and develop using Gitpod online workspaces
Go to https://github.com/apache/airflow/ and fork the project.
Go to your GitHub account's fork of airflow, click on Code and copy the clone link.
Then go to https://gitpod.io/#<copied-url>.
- Breeze is already initialized in one of the terminals in Gitpod
- Once the breeze environment is initialized, create airflow tables and users from the breeze CLI. airflow db reset must be executed at least once for Airflow Breeze to get the database/tables created. This step is needed when you would like to run/use the webserver.
root@b76fcb399bb6:/opt/airflow# airflow db reset
root@b76fcb399bb6:/opt/airflow# airflow users create --role Admin --username admin --password admin \
--email admin@example.com --firstname foo --lastname bar
- Closing Breeze environment. After the above commands finish successfully, you are still inside the container. To exit the container and stop Breeze:
root@b76fcb399bb6:/opt/airflow# exit
$ breeze stop
The Gitpod default image has all the required packages installed.
- Add following line to ~/.bashrc in order to call breeze command from anywhere.
export PATH=${PATH}:"/workspace/airflow"
source ~/.bashrc
- Starting breeze environment using breeze start-airflow starts the Breeze environment with the last configuration run. It also automatically starts webserver, backend and scheduler. It drops you in tmux with scheduler in bottom left and webserver in bottom right. Use [Ctrl + B] and Arrow keys to navigate.
$ breeze start-airflow
Use CI image.
Branch name: main
Docker image: ghcr.io/apache/airflow/main/ci/python3.8:latest
Airflow source version: 2.4.0.dev0
Python version: 3.8
Backend: mysql 5.7
Port forwarding:
Ports are forwarded to the running docker containers for webserver and database
* 12322 -> forwarded to Airflow ssh server -> airflow:22
* 28080 -> forwarded to Airflow webserver -> airflow:8080
* 25555 -> forwarded to Flower dashboard -> airflow:5555
* 25433 -> forwarded to Postgres database -> postgres:5432
* 23306 -> forwarded to MySQL database -> mysql:3306
* 21433 -> forwarded to MSSQL database -> mssql:1443
* 26379 -> forwarded to Redis broker -> redis:6379
Here are links to those services that you can use on host:
* ssh connection for remote debugging: ssh -p 12322 airflow@ pw: airflow
* Webserver:
* Flower:
* Postgres: jdbc:postgresql://
* Mysql: jdbc:mysql://
* MSSQL: jdbc:sqlserver://;databaseName=airflow;user=sa;password=Airflow123
* Redis: redis://
- You can access the ports as shown
Click on the branch symbol in the status bar
Give a name to a branch and checkout
All Tests are inside ./tests
Running Unit tests inside Breeze environment.
Just run pytest filepath+filename to run the tests.
root@4a2143c17426:/opt/airflow# pytest tests/utils/test_session.py
======================================= test session starts =======================================
platform linux -- Python 3.7.12, pytest-6.2.5, py-1.11.0, pluggy-1.0.0 -- /usr/local/bin/python
cachedir: .pytest_cache
rootdir: /opt/airflow, configfile: pytest.ini
plugins: anyio-3.3.4, flaky-3.7.0, asyncio-0.16.0, cov-3.0.0, forked-1.3.0, httpx-0.15.0, instafail-0.4.2, rerunfailures-9.1.1, timeouts-1.2.1, xdist-2.4.0, requests-mock-1.9.3
setup timeout: 0.0s, execution timeout: 0.0s, teardown timeout: 0.0s
collected 4 items
tests/utils/test_session.py::TestSession::test_raised_provide_session PASSED [ 25%]
tests/utils/test_session.py::TestSession::test_provide_session_without_args_and_kwargs PASSED [ 50%]
tests/utils/test_session.py::TestSession::test_provide_session_with_args PASSED [ 75%]
tests/utils/test_session.py::TestSession::test_provide_session_with_kwargs PASSED [100%]
====================================== 4 passed, 11 warnings in 33.14s ======================================
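The PASSED lines in the output above show pytest node IDs in the form file::Class::test. Passing such an ID to pytest runs only that class or test, which is useful when iterating on a single failure. The sketch below assembles a node ID from its parts; pytest_node_id is a hypothetical helper name.

```shell
# pytest_node_id FILE [CLASS] [TEST]  (hypothetical helper)
# Joins the given parts with "::" to form a pytest node ID.
pytest_node_id() {
    pni_node="$1"
    if [ -n "${2:-}" ]; then
        pni_node="${pni_node}::$2"
    fi
    if [ -n "${3:-}" ]; then
        pni_node="${pni_node}::$3"
    fi
    printf '%s\n' "$pni_node"
}
```

Inside the Breeze container this could be used as pytest "$(pytest_node_id tests/utils/test_session.py TestSession test_raised_provide_session)" to run just the first test listed above.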
- Running all the tests with Breeze by specifying the required Python version, backend and backend version
$ ./breeze-legacy --backend mysql --mysql-version 5.7 --python 3.8 --db-reset --test-type All tests
- Running a specific test in the container using shell scripts. The in-container test scripts are located in ./scripts/in_container/
root@4a2143c17426:/opt/airflow# ls ./scripts/in_container/
_in_container_script_init.sh quarantine_issue_header.md run_mypy.sh
_in_container_utils.sh run_anything.sh run_prepare_airflow_packages.sh
airflow_ci.cfg run_ci_tests.sh run_prepare_provider_documentation.sh
bin run_docs_build.sh run_prepare_provider_packages.sh
check_environment.sh run_extract_tests.sh run_resource_check.sh
check_junitxml_result.py run_fix_ownership.sh run_system_tests.sh
configure_environment.sh run_flake8.sh run_tmux_welcome.sh
entrypoint_ci.sh run_generate_constraints.sh stop_tmux_airflow.sh
entrypoint_exec.sh run_init_script.sh update_quarantined_test_status.py
prod run_install_and_test_provider_packages.sh
root@df8927308887:/opt/airflow# ./scripts/in_container/run_docs_build.sh
Running specific type of test
- Types of tests
- Running specific type of test
Before starting a new instance, let's clear the volume and databases "fresh like a daisy". You can do this by:
$ ./breeze-legacy --backend mysql --mysql-version 5.7 --python 3.8 --db-reset --test-type Core
Running Integration test for specific test type
- Running an Integration Test
$ ./breeze-legacy --backend mysql --mysql-version 5.7 --python 3.8 --db-reset --test-type All --integration mongo
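The test commands in this section differ only in the test type and the optional integration. The sketch below prints the matching command line so the variants can be compared before running one. breeze_test_cmd is a hypothetical helper; it uses only the flags that appear in the commands above and assumes the MySQL 5.7 backend from those examples.

```shell
# breeze_test_cmd PYTHON TEST_TYPE [INTEGRATION]  (hypothetical helper)
# Prints, but does not run, a breeze-legacy command for a MySQL 5.7 backend.
breeze_test_cmd() {
    btc_cmd="./breeze-legacy --backend mysql --mysql-version 5.7 --python $1 --db-reset --test-type $2"
    if [ -n "${3:-}" ]; then
        btc_cmd="$btc_cmd --integration $3"
    fi
    printf '%s\n' "$btc_cmd"
}
```

For example, breeze_test_cmd 3.8 Core prints the Core command shown earlier, and breeze_test_cmd 3.8 All mongo prints the integration variant.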