
feat(dockerfile): Add pip caching for faster build #35026

Merged - 11 commits - Oct 31, 2023

Conversation

V0lantis
Contributor

Pip caching

At our company we do a custom build, and I noticed that Docker was downloading the Python packages again and again on every build, which felt a bit too slow. I might want to extend this use of pip caching to our CI as well.

wdyt?



@boring-cyborg boring-cyborg bot added the area:production-image Production image improvements and fixes label Oct 18, 2023
@V0lantis V0lantis force-pushed the feat/add-caching-while-building-image branch from 8753e0c to bceb2fc Compare October 18, 2023 15:35
Member

@potiuk potiuk left a comment


I would love to learn more about your use case and see how and whether it can help at all.

I am not sure it changes anything, and it adds the assumption that the image will be built several times on the same machine while, for some reason, not reusing the already cached layers.

Could you please show some usage where it actually helps and improves things? Some benchmarks showing what you've done, with before/after numbers, would be good.

Caching while installing pip itself (the first part of your change) almost certainly has no effect.

This is something that is only done once, to upgrade pip to the latest version; after that (regardless of caching) pip does not get re-installed, because it is already at the target version. And (unless you have some other scenario in mind) re-using the cache will only actually work when you use the same Docker engine. In that case - unless you change your Dockerfile parameters - it is already handled by Docker layer caching. I.e. if you build your Docker image twice on the same Docker engine with the same parameters, caching is done at the layer level. So when you run your build again on the same Docker engine, you should not see it trying to install pip again; you should instead see that the CACHED layer is used. And all of this works without having to mount a cache volume.

So I wonder why you would like to cache the pip upgrade step.

The second case might be a bit more interesting - but there, too, you will only see reinstallation happening when you change requirements.txt. And yes - in this case it will be a bit faster when you just add one requirement, but that only helps the same user rebuilding the image over and over again on the same machine, which is generally not what you should use Airflow's Dockerfile for. Usually this is done by extending the image (FROM apache/airflow) and adding lines in your own Dockerfile, where you could indeed use --mount=type=cache etc. (see the sketch below).
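For illustration, a minimal sketch of that "extend the image" approach with a BuildKit cache mount (the added package and the pinned tag are placeholders; the cache target and uid assume the default airflow user in the image):

# syntax=docker/dockerfile:1.4
FROM apache/airflow:2.7.2
# Keep a persistent pip cache on the local builder; only this layer re-runs
# when the requirement below changes, and downloads are reused across rebuilds.
RUN --mount=type=cache,target=/home/airflow/.cache/pip,uid=50000 \
    pip install --user some-extra-package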

I wonder if you could describe your case in more detail and show some benchmark example of how much time is saved for particular scenarios.

I am not totally against it, but I would love to understand your case, because a) maybe you do not gain as much as you think, or b) maybe you are doing something that prevents you from using Docker layer caching, or c) maybe you are using the Dockerfile of Airflow in an unintended way.

Looking forward to a more detailed explanation of your case.

So caching pip install makes no sense at all.

@hussein-awala
Member

but there, too, you will only see reinstallation happening when you change requirements.txt. And yes - in this case it will be a bit faster when you just add one requirement, but that only helps the same user rebuilding the image over and over again on the same machine

I agree. It will be useful in CI once moby/buildkit#1512 is implemented; in that case we can combine it with the GitHub cache to persist the mounted cache. This could work now if we replaced --mount=type=cache,target=<path> with --mount=type=bind,source=<path>,target=<path>, but that would impact users who don't have a cache folder on their host (a rough sketch of the two variants is below).
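For illustration only, a rough sketch of the substitution being described (paths and file names are placeholders; note that a bind mount is read-only by default, so an extra rw option would likely be needed for pip to write to it):

# BuildKit-managed cache, persisted only on the local builder:
RUN --mount=type=cache,target=/root/.cache/pip \
    pip install -r requirements.txt

# Alternative described above - bind a directory from the build context so CI
# could save/restore it between runs; it assumes pip-cache/ exists on the host:
RUN --mount=type=bind,source=pip-cache,target=/root/.cache/pip,rw \
    pip install -r requirements.txt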

I am not totally against it, but I would love to understand your case, because a) maybe you do not gain as much as you think or b) maybe you are doing somethign that prevents you from using docker layer caching or c) maybe you are using the Dockerfile of Airlfow in unintended way.

This might be useful to someone playing with the requirements file to resolve a conflict, so if there is no potential risk from this change I would be OK with merging it.

@potiuk
Member

potiuk commented Oct 18, 2023

This might be useful to someone playing with the requirements file to resolve a conflict, so if there is no potential risk from this change I would be OK with merging it.

Yep, I think this one is not a bad idea if there are no side effects. I would still love to see the scenarios and the ways it is used.

@V0lantis
Contributor Author

V0lantis commented Oct 19, 2023

I would love to learn more about your use case and see how and whether it can help at all.

The goal here is to try every possible way to improve Airflow's build time. We use a custom build, and in our CI the build doesn't use the cached layers 😭 so we have an average build time of 8 minutes. That is not huge, but a lot of people work on our repository and we would like to improve their developer experience as a whole.

I am not sure if it changes anything and it adds some assumptions that the image will be built several times on the same machine while also for some reason it will not use the already cached layers.

Yes, in our CI the layers are not cached. We are trying to use the buildkit experimental feature:

      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v3
        with:
          version: latest
          driver-opts: image=moby/buildkit:latest
          buildkitd-flags: --debug

and in our build command:

docker buildx build .... \
  --cache-to=type=gha,scope=platform-$ARCH,mode=max \
  --cache-from=type=gha,scope=platform-$ARCH

but GitHub Actions is still not re-using the cached layers 😞

Could you please show some usage where it actually helps and improve things ? Some benchmarks showing what you've done and before/after would be good.

OK, it's not a proper benchmark - I just used the bash time command - but it can give you an idea.

Without mounting cache:

$ time  docker build \
  --build-arg PYTHON_BASE_IMAGE=python:3.11-slim-bullseye \
  --build-arg AIRFLOW_VERSION=2.6.3 \
  --build-arg AIRFLOW_HOME=/usr/local/airflow \
  --build-arg AIRFLOW_EXTRAS=async,celery,redis,google_auth,hashicorp,statsd \
  --build-arg 'ADDITIONAL_PYTHON_DEPS=apache-airflow-providers-amazon==8.3.1 apache-airflow-providers-celery==3.2.1 apache-airflow-providers-common-sql==1.6.0 apache-airflow-providers-cncf-kubernetes==7.4.2 apache-airflow-providers-datadog==3.3.1 apache-airflow-providers-docker==3.7.1 apache-airflow-providers-ftp==3.4.2 apache-airflow-providers-github==2.3.1 apache-airflow-providers-google==10.3.0 apache-airflow-providers-http==4.4.2 apache-airflow-providers-imap==3.2.2 apache-airflow-providers-mysql==5.1.1 apache-airflow-providers-postgres==5.5.2 apache-airflow-providers-sftp==4.3.1 apache-airflow-providers-slack==7.3.1 apache-airflow-providers-ssh==3.7.1 apache-airflow-providers-tableau==4.2.1 apache-airflow-providers-zendesk==4.3.1 apache-airflow-providers-salesforce==5.4.1' \
  --build-arg 'ADDITIONAL_RUNTIME_APT_DEPS=groff gettext git' \
  --build-arg AIRFLOW_CONSTRAINTS_LOCATION=/docker-context-files/constraints-airflow.txt \
  --build-arg DOCKER_CONTEXT_FILES=docker-context-files \
  --build-arg INSTALL_MYSQL_CLIENT=true \
  --build-arg INSTALL_MSSQL_CLIENT=false \
  --build-arg ADDITIONAL_DEV_APT_DEPS=pkgconf \
  --no-cache

33.74s user 15.55s system 11% cpu 7:01.16 total

In the above, all pip requirements are downloaded before being installed. At the moment I don't have a good internet connection, so it is even slower for me than it would be for someone with a good connection.

WITH mounting cache:

$ time  docker build
  ...
  --no-cache

17.52s user 7.62s system 7% cpu 5:16.20 total

The --no-cache argument above doesn't affect the mounted pip cache.

Quite for sure caching when installing pip (first part of your change) has no effect.

Sure, that was a mistake on my side - I removed the line in 77c6813.

This is something that is only done once, to upgrade pip to the latest version; after that (regardless of caching) pip does not get re-installed, because it is already at the target version. And (unless you have some other scenario in mind) re-using the cache will only actually work when you use the same Docker engine. In that case - unless you change your Dockerfile parameters - it is already handled by Docker layer caching. I.e. if you build your Docker image twice on the same Docker engine with the same parameters, caching is done at the layer level. So when you run your build again on the same Docker engine, you should not see it trying to install pip again; you should instead see that the CACHED layer is used. And all of this works without having to mount a cache volume.

So I wonder why you would like to cache the pip upgrade step.

The second case might be a bit more interesting - but there, too, you will only see reinstallation happening when you change requirements.txt. And yes - in this case it will be a bit faster when you just add one requirement, but that only helps the same user rebuilding the image over and over again on the same machine, which is generally not what you should use Airflow's Dockerfile for. Usually this is done by extending the image (FROM apache/airflow) and adding lines in your own Dockerfile, where you could indeed use --mount=type=cache etc.

I wonder if you could describe your case in more detail and show some benchmark example of how much time is saved for particular scenarios.

I am not totally against it, but I would love to understand your case, because a) maybe you do not gain as much as you think, or b) maybe you are doing something that prevents you from using Docker layer caching, or c) maybe you are using the Dockerfile of Airflow in an unintended way.

Looking forward to a more detailed explanation of your case.

So caching pip install makes no sense at all.

This is more about optimising the build for people trying to fix constraints, or for when we want to rebuild the image after adding one more requirement (as you said and as @hussein-awala pointed out).

In my opinion it is always nicer when things are optimised to the fullest, but I understand that the trouble might not be worth it.
This pull request is also more of a proposal and a discussion - maybe it would have been better to just open an issue - but since I was testing these optimisations on my local machine, I wanted to show them directly here.

@V0lantis
Contributor Author

Also, I have the feeling that your CI is not using cached layers even though it should. As @hussein-awala said, once buildkit implements a way to use a bind-mounted cache in CI, it would also help improve build speed.

@potiuk
Member

potiuk commented Oct 19, 2023

Also, I have the feeling that your CI is not using cached layers even though it should. As @hussein-awala said, once buildkit implements a way to use a bind-mounted cache in CI, it would also help improve build speed.

It's because you were extremely unlucky when checking it. Python just released a new base image 4 hours ago, and our cache will only refresh with the new base image after we merge something to main and the build in main succeeds.

The way we use remote caching is specifically designed to handle such cache refreshes automatically. Our whole CI is designed to automatically upgrade to the latest versions of the base images, apt dependencies and Python dependencies as soon as:

  • they are released
  • all our tests pass in the main (also called canary) run

This allows us to automatically upgrade and test our software and test suite against the "latest", "most secure" and "consistent" set of dependencies (including the base dependencies for the image), while our PRs keep using the "stable and confirmed to work in the past" set of Python dependencies.

This way we can detect breaking changes in incoming dependencies early - hence the "canary" name. We have detected many breaking changes this way, including even a couple of breaking changes in Python patchlevel releases.

Indeed, the build you see does not use the cache corresponding to the latest Python version - that's why it takes 14 minutes: the cache is not yet refreshed. But it still works (which is great) while the cache is being rebuilt and refreshed.

You can see the same job from 5 hours ago https://github.com/apache/airflow/actions/runs/6565914251/job/17835531939#step:6:1070 which lasted 1 minute and actually used the cache.

@potiuk
Member

potiuk commented Oct 19, 2023

Looking at the discussion @V0lantis @hussein-awala and thinking a bit - yeah, I think it's actually quite fine to use the local cache (even for the pip installation) as long as it does not interfere with any of our caching mechanisms or with the ability to build different images (especially for multiple Python versions).

As long as the remote cache works as it did (it should), adding those local caches should not impact the build, and it can possibly speed up some local build cases.

I have one worry though, knowing that "cache invalidation is the most difficult thing": I am not 100% sure how it will work - how does the local cache get allocated when different arguments are passed? Do you know how the local cache is determined when we change parameters/arguments and use the same Dockerfile?

To be honest, I am quite worried about the case where the base Python image changes (see my previous comment), i.e. about this case:

docker build --build-arg PYTHON_BASE_IMAGE=python:3.11-slim-bullseye ... Dockerfile

Then (note Python version change):

docker build --build-arg PYTHON_BASE_IMAGE=python:3.10-slim-bullseye ... Dockerfile

And what about the case where the pip version changes? Will the pip cache still be usable? Will the caches interfere? Do we need a mechanism to invalidate the cache in this case?

I am asking all these questions because, if any problem like that happens, someone who hits it will inevitably open an issue in our repo and (rightfully) complain and ask us what to do.

I do recall cases in the past where pip installation broke when the same cache was used for different Python versions - including different Python patchlevel versions. I recall a CI failure after a Python upgrade from 3.6.12 to 3.6.13 which required us to bump the cache key, see #14430. That was of course an extreme case, but it can happen. I also recall issues when new releases of pip introduced breaking changes or caused irrecoverable errors when the cache got broken - an example case is pypa/pip#11985. Even very recently pip changed the format of its cache storage - see the 23.3 release notes (https://pip.pypa.io/en/stable/news/). The new format is kept in a different directory, so it should be safe to upgrade, but it still makes me a little worried.

The main reason we have --no-cache there is that I did not want to worry about all those cases and scenarios.

Curious about your thoughts on whether this is something we should worry about. Maybe the cache location could depend on the Python version and PIP_VERSION to mitigate those cases simply? Maybe we should have a way to clear the cache when things get broken (other than nuking the whole Docker installation - see the sketch below). And finally, we should possibly document somewhere what to do when it gets broken.
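As a side note, a possible way to clear local RUN cache mounts without nuking the whole Docker installation could be something like this (an assumption on my side - exact filter support depends on the BuildKit/buildx version):

# Remove only the BuildKit cache mounts (e.g. the pip caches), keeping layer caches:
docker builder prune --filter type=exec.cachemount

# Or wipe all build cache on the builder if things are badly broken:
docker builder prune --all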

WDYT?

@hterik
Contributor

hterik commented Oct 20, 2023

The second case might be a bit more interesting - but there also you will see reinstallation only happening if you change requirements.txt.

Or changes to content used in any of the layers above the pip install. It seems there could be quite a few, e.g. COPY scripts and COPY ${AIRFLOW_SOURCES_FROM} happen above the pip install.
Refactoring the Dockerfile to COPY only the requirements first and the frequently changed sources later is another good optimization technique (see the generic sketch below), but I haven't looked at the Dockerfile in detail to check whether that is already done.
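A generic illustration of that ordering (not Airflow's actual Dockerfile - the file names are placeholders):

# Copy only the dependency manifest first, so the expensive install layer is
# invalidated only when requirements.txt changes.
COPY requirements.txt /tmp/requirements.txt
RUN pip install --no-cache-dir -r /tmp/requirements.txt

# Copy the frequently changing sources last; editing them reuses the layer above.
COPY src/ /app/src/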

I have one worry though. Knowing that "cache invalidation is the most difficult thing". I am not 100% sure how it will work - how does the local cache gets allocated when you have differnet arguments passed. Do you know how the local cache is determined in case we change parameters/arguments and use the same Dockerfile?

To be honest I am quite worried about the case where base python changes (see my previous comment), about this case:

docker build --build-arg PYTHON_BASE_IMAGE=python:3.11-slim-bullseye ... Dockerfile

Then (note Python version change):

docker build --build-arg PYTHON_BASE_IMAGE=python:3.10-slim-bullseye ... Dockerfile

YES! Good observation, this is a very possible issue. When doing similar caching with apt install, it's very easy to mix up the base-image Ubuntu versions across caches and end up in a bad situation.
To solve this, we usually follow the convention of suffixing the version in the cache key, e.g. for Ubuntu 22.04:
RUN --mount=type=cache,sharing=locked,target=/var/cache/apt,id=aptcache2204 --mount=type=cache,sharing=locked,target=/var/lib/apt,id=aptlib2204 \
You can probably do this with variables so you don't forget it anywhere (see the sketch below).
For pip, I don't know whether the cache is reusable across Python versions.
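A rough sketch of that variable-based convention (the ARG name is hypothetical, and real apt caching needs a bit more setup than shown here):

ARG BASE_VERSION="2204"
# The version suffix in the id keeps caches from different base images separate.
RUN --mount=type=cache,sharing=locked,target=/var/cache/apt,id=aptcache${BASE_VERSION} \
    --mount=type=cache,sharing=locked,target=/var/lib/apt,id=aptlib${BASE_VERSION} \
    apt-get update && apt-get install -y --no-install-recommends curl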

@potiuk
Member

potiuk commented Oct 20, 2023

Refactoring the dockerfile to only COPY requirements first and frequently changed sources later is another good optimization technique. But i haven't looked at the dockerfile in detail to see if that is already done.

There were probably tens of iterations of getting the right sequence of COPY/RUN there, and in the main stage the number of layers is minimised. But maybe some of it could still be improved - it's open to contributions :).

The whole Dockerfile is now designed with "incremental rebuilds" in mind - it selectively COPYs only the things that are needed in the next step. I think what --mount=type=cache solves is the case where you really only iterate over requirements.txt. Similarly, an apt cache could be used (and is only useful) when someone experiments and iterates over apt dependencies.

Actually, the final main segment is really small and simple because it is optimized for production use - i.e. size. What the final (main) segment does (apart from ARG/ENV settings) is literally three steps:

  • installing OS dependencies (runtime only - no build-essentials and dev libraries needed for compilation)
RUN bash /scripts/docker/install_os_dependencies.sh runtime
  • installing DB clients (parameterized with ARGs - only those that are needed) + setting up users
RUN bash /scripts/docker/install_mysql.sh prod \
    && bash /scripts/docker/install_mssql.sh prod \
    && bash /scripts/docker/install_postgres.sh prod \
    && adduser --gecos "First Last,RoomNumber,WorkPhone,HomePhone" --disabled-password \
           --quiet "airflow" --uid "${AIRFLOW_UID}" --gid "0" --home "${AIRFLOW_USER_HOME_DIR}" \
# Make Airflow files belong to the root group and are accessible. This is to accommodate the guidelines from
# OpenShift https://docs.openshift.com/enterprise/3.0/creating_images/guidelines.html
    && mkdir -pv "${AIRFLOW_HOME}" \
    && mkdir -pv "${AIRFLOW_HOME}/dags" \
    && mkdir -pv "${AIRFLOW_HOME}/logs" \
    && chown -R airflow:0 "${AIRFLOW_USER_HOME_DIR}" "${AIRFLOW_HOME}" \
    && chmod -R g+rw "${AIRFLOW_USER_HOME_DIR}" "${AIRFLOW_HOME}" \
    && find "${AIRFLOW_HOME}" -executable -print0 | xargs --null chmod g+x \
    && find "${AIRFLOW_USER_HOME_DIR}" -executable -print0 | xargs --null chmod g+x
  • copying the installed Airflow (including all dependencies, the .so libraries they need, etc.) from the build segment - we never install in the main segment; we always copy the installation from the build segment, which saves ~200 MB
COPY --from=airflow-build-image --chown=airflow:0 \
     "${AIRFLOW_USER_HOME_DIR}/.local" "${AIRFLOW_USER_HOME_DIR}/.local"

And then we add the startup scripts.

I believe we could add --mount=type=cache for all those steps - provided that we account for those cache-validity problems.

@potiuk
Member

potiuk commented Oct 20, 2023

Just to complete the picture - explaining the build segment.

The airflow-build-image segment is slightly more complicated - it allows for more sophisticated caching (used during our CI build) - but essentially it does this:

  • Installing OS dependencies (dev) = build-essentials + dev libraries
RUN bash /scripts/docker/install_os_dependencies.sh dev
  • Installing database clients (parameterized with args), including the dev libraries for those
RUN bash /scripts/docker/install_mysql.sh dev && \
    bash /scripts/docker/install_mssql.sh dev && \
    bash /scripts/docker/install_postgres.sh dev
  • Branch-tip caching (skipped by default in regular builds) - pre-installing pip dependencies from the tip of the branch, used in CI only. This is a "poor man's" version of incremental caching for CI (what @hussein-awala mentioned might eventually be done via "Allow controlling cache mounts storage location", moby/buildkit#1512). It allows our CI to use the Docker layer cache instead of a local cache for incremental builds when new requirements are added.

This layer is not invalidated when setup.py/requirements change, so we use it to pre-install the dependencies in main, so that if someone adds a new dependency, it does not reinstall everything from scratch.

RUN bash /scripts/docker/install_pip_version.sh; \
    if [[ ${AIRFLOW_PRE_CACHED_PIP_PACKAGES} == "true" && \
        ${INSTALL_PACKAGES_FROM_CONTEXT} == "false" && \
        ${UPGRADE_TO_NEWER_DEPENDENCIES} == "false" ]]; then \
        bash /scripts/docker/install_airflow_dependencies_from_branch_tip.sh; \
    fi
  • Installing Airflow and packages (depending on args, it will install them either from PyPI or from packages in docker-context-files; it also allows specifying ADDITIONAL_PYTHON_DEPS)
RUN if [[ ${INSTALL_PACKAGES_FROM_CONTEXT} == "true" ]]; then \
        bash /scripts/docker/install_from_docker_context_files.sh; \
    fi; \
    if ! airflow version 2>/dev/null >/dev/null; then \
        bash /scripts/docker/install_airflow.sh; \
    fi; \
    if [[ -n "${ADDITIONAL_PYTHON_DEPS}" ]]; then \
        bash /scripts/docker/install_additional_dependencies.sh; \
    fi; \
    find "${AIRFLOW_USER_HOME_DIR}/.local/" -name '*.pyc' -print0 | xargs -0 rm -f || true ; \
    find "${AIRFLOW_USER_HOME_DIR}/.local/" -type d -name '__pycache__' -print0 | xargs -0 rm -rf || true ; \
    # make sure that all directories and files in .local are also group accessible
    find "${AIRFLOW_USER_HOME_DIR}/.local" -executable -print0 | xargs --null chmod g+x; \
    find "${AIRFLOW_USER_HOME_DIR}/.local" -print0 | xargs --null chmod g+rw
  • Installing from requirements.txt placed in docker-context-files
RUN if [[ -f /docker-context-files/requirements.txt ]]; then \
        pip install --no-cache-dir --user -r /docker-context-files/requirements.txt; \
    fi

So any "local" caching of python installation should be done here - across the "airflow-build-image" segment @V0lantis - if you want to pursue it and attempt to do it in the way that we avoid the potential cache invalidation issues I mentioned (we could also add apt-caching as suggested by @hterik - those adding/iterating with new apt dependencies is less likely.

@V0lantis
Contributor Author

Thank you @potiuk for this detailed comment on how the Dockerfile is built. I came up with the following (d952c9c5b1):

To invalidate the cache between two different Python versions, I am using the id param of the --mount=type=cache argument. This allows specifying the Python version and the architecture on which the image is built. I hope this solves the cache-invalidation issue you mentioned above.

Result :

=> [airflow-build-image 15/16] RUN --mount=type=cache,id=dockerhub.containers.mpi-internal.com/python:3.11-slim-bullseye-arm64,target=/home/airflow/.cache/pip,uid=50000   if [[ false == "true" ]]; then bash /scr  7.7s

@potiuk
Member

potiuk commented Oct 23, 2023

Thank you @potiuk for this detailed comment on how the Dockerfile is built. I came up with the following (d952c9c):

To invalidate the cache between two different Python versions, I am using the id param of the --mount=type=cache argument. This allows specifying the Python version and the architecture on which the image is built. I hope this solves the cache-invalidation issue you mentioned above.

Result :

=> [airflow-build-image 15/16] RUN --mount=type=cache,id=dockerhub.containers.mpi-internal.com/python:3.11-slim-bullseye-arm64,target=/home/airflow/.cache/pip,uid=50000   if [[ false == "true" ]]; then bash /scr  7.7s

Yep. I like this approach a lot.

If we make it a little more elaborate, I am perfectly OK with it, and it solves all my caching problems. There are two things I would like to add:

  • We should also make the AIRFLOW_PIP_VERSION ARG part of the id.
  • Also, similarly to
    ARG DEPENDENCIES_EPOCH_NUMBER="10"
    we can add (for example) a "PIP_CACHE_EPOCH" arg to the Dockerfile and also make it part of the id - this way we will be able to invalidate all the caches simply by bumping the PIP_CACHE_EPOCH default value or passing a new value with --build-arg (see the sketch below).
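For example, the id could hypothetically be composed along these lines (a sketch - the exact set of ARGs and the installed script are illustrative, not the final implementation):

ARG PIP_CACHE_EPOCH="1"
# Bumping PIP_CACHE_EPOCH (or passing --build-arg PIP_CACHE_EPOCH=...) invalidates all pip caches.
RUN --mount=type=cache,id=${PYTHON_BASE_IMAGE}-${TARGETARCH}-${AIRFLOW_PIP_VERSION}-${PIP_CACHE_EPOCH},target=/home/airflow/.cache/pip,uid=50000 \
    bash /scripts/docker/install_airflow.sh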

And of course we need to add documentation to https://github.com/apache/airflow/blob/main/docs/docker-stack/build.rst and https://github.com/apache/airflow/blob/main/docs/docker-stack/build-arg-ref.rst, and release notes here: https://github.com/apache/airflow/blob/main/docs/docker-stack/changelog.rst (targeting 2.7.3).

@potiuk
Member

potiuk commented Oct 23, 2023

Also, re architecture - I think it is not really needed. Building the prod image for a different architecture is unlikely to happen on the same machine (building the x86 image on ARM under emulation is ~16 times slower, so it literally takes hours).

@potiuk
Member

potiuk commented Oct 23, 2023

Also, re architecture - I think it is not really needed. Building the prod image for a different architecture is unlikely to happen on the same machine (building the x86 image on ARM under emulation is ~16 times slower, so it literally takes hours).

On the other hand: I read moby/buildkit#2598, and yeah - since the TARGETARCH ARG is conveniently set by default, it's perfectly OK to leave it in.

@V0lantis
Contributor Author

On the other hand: I read moby/buildkit#2598, and yeah - since the TARGETARCH ARG is conveniently set by default, it's perfectly OK to leave it in.

Good. Thanks for the hint

@V0lantis V0lantis force-pushed the feat/add-caching-while-building-image branch from fbdcf34 to b14f42e Compare October 30, 2023 15:06
@V0lantis
Contributor Author

I updated the documentation in b14f42e784, but I am not very familiar with changelog updates. Would you be so kind as to show me the way?

Thanks so much for all your hard work and hints, @potiuk 💪

Member

@potiuk potiuk left a comment


Fantastic. Looks great. Thanks for being responsive. For the changelog - I think we just need a new line here: https://github.com/apache/airflow/blob/main/docs/docker-stack/changelog.rst#airflow-27

Please add it for 2.7.3 so that this change is cherry-pickable - this is the last chance to merge it (2.7.3 is just being cut).

UPDATE: I will add it myself to make it faster.

Arthur Volant and others added 6 commits October 30, 2023 16:29
@potiuk potiuk force-pushed the feat/add-caching-while-building-image branch from b14f42e to eafb6be Compare October 30, 2023 15:34
@potiuk potiuk added this to the Airflow 2.7.3 milestone Oct 30, 2023
@V0lantis
Contributor Author

I guess similar changes should then be applied to the CI image too. It might be beneficial in some cases as well (the same scripts are used in both).

Yes, but I don't understand what makes the CI fail right now.

  docker: Error response from daemon: No such image: ghcr.io/apache/airflow/main/ci/python3.8:4db4408536c0cb4d6dae098cb0fcba574887e7cf.

@potiuk
Member

potiuk commented Oct 30, 2023

I guess similar changes should then be applied to the CI image too. It might be beneficial in some cases as well (the same scripts are used in both).

Yes, but I don't understand what makes the CI fail right now.

  docker: Error response from daemon: No such image: ghcr.io/apache/airflow/main/ci/python3.8:4db4408536c0cb4d6dae098cb0fcba574887e7cf.

Unfold the previous step and see the output of the docker build command - it fails to build. Independently of that, there is possibly a bug somewhere, because it's the previous step that should fail, not this one.

Try running breeze ci-image build locally to reproduce the failure.

@potiuk
Member

potiuk commented Oct 31, 2023

FYI @V0lantis -> you found a bug in build-ci-image with timeout: #35282 . This is why "Build CI image" did not report the problem - only the next step did.

@V0lantis
Contributor Author

FYI @V0lantis -> you found a bug in build-ci-image with timeout: #35282 . This is why "Build CI image" did not report the problem - only the next step did.

Nice catch!

@V0lantis
Contributor Author

Hey! I came up with a better solution in 65318ca:

I am setting the pip cache directory directly in an env variable (PIP_CACHE_DIR=/tmp/.cache/pip). This avoids having to pass the --cache-dir=/tmp/.cache/pip argument to every pip install command: I could forget one and lose the benefit of caching, whereas with the global ENV definition there is nothing to forget 😲. I tested it and it seems to be working fine. A rough sketch of the idea is below.
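Roughly, the idea looks like this (a sketch only - the mount id and uid are illustrative; pip honours the PIP_CACHE_DIR environment variable, so every pip install in the stage picks up the same cache directory):

# One global cache location for every pip invocation - no per-command --cache-dir flag to forget.
ENV PIP_CACHE_DIR=/tmp/.cache/pip

RUN --mount=type=cache,id=pip-cache,target=/tmp/.cache/pip,uid=50000 \
    pip install --user apache-airflow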

Wdyt guys?

@potiuk
Member

potiuk commented Oct 31, 2023

Yep.

Hey! I came up with a better solution in 65318ca:

I am setting the pip cache directory directly in an env variable (PIP_CACHE_DIR=/tmp/.cache/pip). This avoids having to pass the --cache-dir=/tmp/.cache/pip argument to every pip install command: I could forget one and lose the benefit of caching, whereas with the global ENV definition there is nothing to forget 😲. I tested it and it seems to be working fine.

Wdyt guys?

Yep. Good idea.

@potiuk potiuk merged commit 66871a0 into apache:main Oct 31, 2023
62 checks passed
@potiuk
Member

potiuk commented Oct 31, 2023

Congrats 🎉

@potiuk potiuk modified the milestones: Airflow 2.7.4, Airflow 2.7.3 Nov 1, 2023
@potiuk potiuk added the type:improvement Changelog: Improvements label Nov 1, 2023
potiuk added a commit to potiuk/airflow that referenced this pull request Nov 1, 2023
The apache#35026 has been cherry-picked to 2.7.3 (with updated changelog)
so we need to also update changelog in main.
@ephraimbuddy ephraimbuddy added type:misc/internal Changelog: Misc changes that should appear in change log and removed type:improvement Changelog: Improvements labels Nov 1, 2023
ephraimbuddy pushed a commit that referenced this pull request Nov 1, 2023
---------

Co-authored-by: Arthur Volant <arthur.volant@adevinta.com>
Co-authored-by: Jarek Potiuk <jarek@potiuk.com>
(cherry picked from commit 66871a0)
potiuk added a commit that referenced this pull request Nov 1, 2023
The #35026 has been cherry-picked to 2.7.3 (with updated changelog)
so we need to also update changelog in main.
romsharon98 pushed a commit to romsharon98/airflow that referenced this pull request Nov 10, 2023
The apache#35026 has been cherry-picked to 2.7.3 (with updated changelog)
so we need to also update changelog in main.