Commit
Add pip caching for faster build (#35026)
---------

Co-authored-by: Arthur Volant <arthur.volant@adevinta.com>
Co-authored-by: Jarek Potiuk <jarek@potiuk.com>
3 people authored Oct 31, 2023
1 parent ab80ca8 commit 66871a0
Showing 7 changed files with 42 additions and 13 deletions.
26 changes: 18 additions & 8 deletions Dockerfile
@@ -553,9 +553,9 @@ function common::install_pip_version() {
 echo "${COLOR_BLUE}Installing pip version ${AIRFLOW_PIP_VERSION}${COLOR_RESET}"
 echo
 if [[ ${AIRFLOW_PIP_VERSION} =~ .*https.* ]]; then
-pip install --disable-pip-version-check --no-cache-dir "pip @ ${AIRFLOW_PIP_VERSION}"
+pip install --disable-pip-version-check "pip @ ${AIRFLOW_PIP_VERSION}"
 else
-pip install --disable-pip-version-check --no-cache-dir "pip==${AIRFLOW_PIP_VERSION}"
+pip install --disable-pip-version-check "pip==${AIRFLOW_PIP_VERSION}"
 fi
 mkdir -p "${HOME}/.local/bin"
 }
@@ -1123,7 +1123,7 @@ if [[ -n "${_PIP_ADDITIONAL_REQUIREMENTS=}" ]] ; then
 >&2 echo " the container starts, so it is only useful for testing and trying out"
 >&2 echo " of adding dependencies."
 >&2 echo
-pip install --root-user-action ignore --no-cache-dir ${_PIP_ADDITIONAL_REQUIREMENTS}
+pip install --root-user-action ignore ${_PIP_ADDITIONAL_REQUIREMENTS}
 fi


@@ -1201,7 +1201,8 @@ SHELL ["/bin/bash", "-o", "pipefail", "-o", "errexit", "-o", "nounset", "-o", "n
 ARG PYTHON_BASE_IMAGE
 ENV PYTHON_BASE_IMAGE=${PYTHON_BASE_IMAGE} \
 DEBIAN_FRONTEND=noninteractive LANGUAGE=C.UTF-8 LANG=C.UTF-8 LC_ALL=C.UTF-8 \
-LC_CTYPE=C.UTF-8 LC_MESSAGES=C.UTF-8
+LC_CTYPE=C.UTF-8 LC_MESSAGES=C.UTF-8 \
+PIP_CACHE_DIR=/tmp/.cache/pip

 ARG DEV_APT_DEPS=""
 ARG ADDITIONAL_DEV_APT_DEPS=""
@@ -1386,8 +1387,15 @@ WORKDIR ${AIRFLOW_HOME}
 COPY --from=scripts install_from_docker_context_files.sh install_airflow.sh \
 install_additional_dependencies.sh /scripts/docker/

-# hadolint ignore=SC2086, SC2010
-RUN if [[ ${INSTALL_PACKAGES_FROM_CONTEXT} == "true" ]]; then \
+# Useful for creating a cache id based on the underlying architecture, preventing the use of cached python packages from
+# an incorrect architecture.
+ARG TARGETARCH
+# Value to be able to easily change cache id and therefore use a bare new cache
+ARG PIP_CACHE_EPOCH="0"
+
+# hadolint ignore=SC2086, SC2010, DL3042
+RUN --mount=type=cache,id=$PYTHON_BASE_IMAGE-$AIRFLOW_PIP_VERSION-$TARGETARCH-$PIP_CACHE_EPOCH,target=/tmp/.cache/pip,uid=${AIRFLOW_UID} \
+if [[ ${INSTALL_PACKAGES_FROM_CONTEXT} == "true" ]]; then \
 bash /scripts/docker/install_from_docker_context_files.sh; \
 fi; \
 if ! airflow version 2>/dev/null >/dev/null; then \
@@ -1405,8 +1413,10 @@ RUN if [[ ${INSTALL_PACKAGES_FROM_CONTEXT} == "true" ]]; then \
 # In case there is a requirements.txt file in "docker-context-files" it will be installed
 # during the build additionally to whatever has been installed so far. It is recommended that
 # the requirements.txt contains only dependencies with == version specification
-RUN if [[ -f /docker-context-files/requirements.txt ]]; then \
-pip install --no-cache-dir --user -r /docker-context-files/requirements.txt; \
+# hadolint ignore=DL3042
+RUN --mount=type=cache,id=additional-requirements-$PYTHON_BASE_IMAGE-$AIRFLOW_PIP_VERSION-$TARGETARCH-$PIP_CACHE_EPOCH,target=/tmp/.cache/pip,uid=${AIRFLOW_UID} \
+if [[ -f /docker-context-files/requirements.txt ]]; then \
+pip install --user -r /docker-context-files/requirements.txt; \
 fi

 ##############################################################################################
4 changes: 2 additions & 2 deletions Dockerfile.ci
@@ -513,9 +513,9 @@ function common::install_pip_version() {
 echo "${COLOR_BLUE}Installing pip version ${AIRFLOW_PIP_VERSION}${COLOR_RESET}"
 echo
 if [[ ${AIRFLOW_PIP_VERSION} =~ .*https.* ]]; then
-pip install --disable-pip-version-check --no-cache-dir "pip @ ${AIRFLOW_PIP_VERSION}"
+pip install --disable-pip-version-check "pip @ ${AIRFLOW_PIP_VERSION}"
 else
-pip install --disable-pip-version-check --no-cache-dir "pip==${AIRFLOW_PIP_VERSION}"
+pip install --disable-pip-version-check "pip==${AIRFLOW_PIP_VERSION}"
 fi
 mkdir -p "${HOME}/.local/bin"
 }
3 changes: 3 additions & 0 deletions docs/docker-stack/build-arg-ref.rst
@@ -278,3 +278,6 @@ Docker context files.
 | | | This allows to optimize iterations for |
 | | | Image builds and speeds up CI builds. |
 +------------------------------------------+------------------------------------------+------------------------------------------+
+| ``PIP_CACHE_EPOCH`` | ``"0"`` | Allow to invalidate cache by passing a |
+| | | new argument. |
++------------------------------------------+------------------------------------------+------------------------------------------+
12 changes: 12 additions & 0 deletions docs/docker-stack/build.rst
@@ -972,3 +972,15 @@ The architecture of the images

 You can read more details about the images - the context, their parameters and internal structure in the
 `IMAGES.rst <https://github.com/apache/airflow/blob/main/IMAGES.rst>`_ document.
+
+
+Pip packages caching
+....................
+
+To enable faster iteration when building the image locally (especially if you are testing different combination of
+python packages), pip caching has been enabled. The caching id is based on four different parameters:
+
+1. ``PYTHON_BASE_IMAGE``: Avoid sharing same cache based on python version and target os
+2. ``AIRFLOW_PIP_VERSION``
+3. ``TARGETARCH``: Avoid sharing architecture specific cached package
+4. ``PIP_CACHE_EPOCH``: Enable changing cache id by passing ``PIP_CACHE_EPOCH`` as ``--build-arg``
4 changes: 4 additions & 0 deletions docs/docker-stack/changelog.rst
@@ -58,6 +58,10 @@ here so that users affected can find the reason for the changes.
 Airflow 2.7
 ~~~~~~~~~~~

+* 2.7.4
+
+* PIP caching for local builds has been enabled to speed up local custom image building
+
 * 2.7.3

 * Add experimental feature for select type of MySQL Client libraries during the build custom image via ``INSTALL_MYSQL_CLIENT_TYPE``
4 changes: 2 additions & 2 deletions scripts/docker/common.sh
@@ -77,9 +77,9 @@ function common::install_pip_version() {
 echo "${COLOR_BLUE}Installing pip version ${AIRFLOW_PIP_VERSION}${COLOR_RESET}"
 echo
 if [[ ${AIRFLOW_PIP_VERSION} =~ .*https.* ]]; then
-pip install --disable-pip-version-check --no-cache-dir "pip @ ${AIRFLOW_PIP_VERSION}"
+pip install --disable-pip-version-check "pip @ ${AIRFLOW_PIP_VERSION}"
 else
-pip install --disable-pip-version-check --no-cache-dir "pip==${AIRFLOW_PIP_VERSION}"
+pip install --disable-pip-version-check "pip==${AIRFLOW_PIP_VERSION}"
 fi
 mkdir -p "${HOME}/.local/bin"
 }
2 changes: 1 addition & 1 deletion scripts/docker/entrypoint_prod.sh
@@ -308,7 +308,7 @@ if [[ -n "${_PIP_ADDITIONAL_REQUIREMENTS=}" ]] ; then
 >&2 echo " the container starts, so it is only useful for testing and trying out"
 >&2 echo " of adding dependencies."
 >&2 echo
-pip install --root-user-action ignore --no-cache-dir ${_PIP_ADDITIONAL_REQUIREMENTS}
+pip install --root-user-action ignore ${_PIP_ADDITIONAL_REQUIREMENTS}
 fi
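
Note on the cache id: the two `RUN --mount=type=cache` instructions in the Dockerfile hunks of this commit build their cache id by joining `PYTHON_BASE_IMAGE`, `AIRFLOW_PIP_VERSION`, `TARGETARCH` and `PIP_CACHE_EPOCH` with dashes (the second one adds an `additional-requirements-` prefix). The sketch below only illustrates that string composition in plain bash. The helper name `pip_cache_id` and the example image, pip version and architecture values are assumptions made for illustration and are not part of the commit.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not part of the commit): joins the four values the
# Dockerfile uses for the cache mount id, in the same order and with the
# same "-" separator. The epoch defaults to "0", like the build arg.
pip_cache_id() {
    local python_base_image="$1"
    local pip_version="$2"
    local target_arch="$3"
    local cache_epoch="${4:-0}"
    printf '%s-%s-%s-%s\n' \
        "${python_base_image}" "${pip_version}" "${target_arch}" "${cache_epoch}"
}

# Example values below are assumptions, chosen only to show the effect.
# Default epoch: the id a first build would use.
pip_cache_id "python:3.8-slim-bookworm" "23.3.1" "amd64"
# A different epoch yields a different id, so the build starts from an empty cache.
pip_cache_id "python:3.8-slim-bookworm" "23.3.1" "amd64" "1"
# A different architecture also yields a different id.
pip_cache_id "python:3.8-slim-bookworm" "23.3.1" "arm64" "1"
```

Because any change to one of the four values changes the id, passing a new epoch at build time (for example `--build-arg PIP_CACHE_EPOCH="1"`, as the build.rst hunk describes) should be enough to start from a fresh pip cache. Cache mounts require a BuildKit-enabled build.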

