
Cache not found from package installs in previous multistage stages #3246

Open
joeauty opened this issue Jul 11, 2024 · 10 comments
Labels

  • area/apt: all bugs for apt install related commands
  • area/caching: for all bugs related to cache issues
  • area/multi-stage builds: issues related to kaniko multi-stage builds
  • kind/bug: something isn't working
  • priority/p1: basic need feature compatibility with docker build; we should be working on this next

Comments


joeauty commented Jul 11, 2024

Actual behavior
Apt install commands are not being cached when running Kaniko as a Kubernetes job, making image builds very slow

Expected behavior
The cache layer should be hit/found since, as far as I can tell, there are no filesystem differences between builds

To Reproduce
Steps to reproduce the behavior:

  1. Run the Kubernetes job:
image: gcr.io/kaniko-project/executor:v1.23.1-debug
command: ["/kaniko/executor"]
args:
  - "--context=/workspace/build/buildkite/${BUILDKITE_ORGANIZATION_SLUG}/${BUILDKITE_PIPELINE_SLUG}"
  - "--destination=[repo URL]"
  - "--cache-repo=[cache repo URL]"
  - "--cache=true"
  - "--cleanup=true"
  - "--ignore-path=.git"

Additional Information

  • Dockerfile
# ---- BASE IMAGE ----
FROM ruby:3.3.3-slim-bullseye as base-image

ENV INSTALL_PATH /data/go
ENV GETTEXT_LOCALES_PATH  $INSTALL_PATH/config/gettext_locales
ENV GETTEXT_CLIENT_LOCALES_PATH $INSTALL_PATH/client/locales
WORKDIR $INSTALL_PATH

RUN apt-get update && apt-get install -y libicu-dev libpq-dev python3-pip python-dev build-essential --no-install-recommends && apt-get clean \
  && pip install --upgrade setuptools pip \
  && pip install awscli \
  && gem update --system 3.5.13 \
  && gem install bundler:2.5.13

# ---- BUILD DEPENDENCIES ----
FROM base-image as build-dependencies

ENV INSTALL_PATH /data/go
ENV NODE_MAJOR 18
WORKDIR $INSTALL_PATH

SHELL ["/bin/bash", "-lc"]

RUN apt-get update && apt-get install -y curl gnupg ca-certificates --no-install-recommends && apt-get clean
COPY ./.tool-versions $INSTALL_PATH

The apt-get update && apt-get install -y curl gnupg ca-certificates --no-install-recommends && apt-get clean command is not being loaded from cache; all other commands up to this point are loaded from cache. The logs show:

No cached layer found for cmd RUN apt-get update && apt-get install -y curl gnupg ca-certificates --no-install-recommends && apt-get clean

This same Dockerfile built in Docker makes use of the cache, making build times much faster.

Triage Notes for the Maintainers

Description (Yes/No)
  • Please check if this is a new feature you are proposing
  • Please check if the build works in docker but not in kaniko
  • Please check if this error is seen when you use --cache flag
  • Please check if your dockerfile is a multistage dockerfile

joeauty commented Jul 11, 2024

Perhaps what might help me here is a more fundamental understanding of how caching works. How does the cache algorithm know, before a Dockerfile command is run, whether the cached layer is valid for that command?

I'm wondering if the issue here has something to do with apt, pip, or gem package lists or the like.
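For context on that question: Docker-style layer caches (kaniko included) do not predict a command's output at all; they key the cache on the command's inputs, i.e. the command text plus the state it starts from. The sketch below is a simplification for illustration only, not kaniko's actual key derivation (kaniko's real composition key also folds in build args and the contents of files referenced by COPY/ADD):

```python
import hashlib

def layer_cache_key(parent_key: str, command: str) -> str:
    """A layer's cache key is derived from its *inputs* (the parent
    layer's key and the command text), never from the command's output."""
    h = hashlib.sha256()
    h.update(parent_key.encode())
    h.update(command.encode())
    return h.hexdigest()

# Rebuilding the same Dockerfile reproduces the same chain of keys,
# so each RUN can be looked up in the cache repo *before* it executes:
base = layer_cache_key("sha256:ruby-3.3.3-slim", "ENV INSTALL_PATH /data/go")
run_key = layer_cache_key(base, "RUN apt-get update && apt-get install -y curl")
assert run_key == layer_cache_key(base, "RUN apt-get update && apt-get install -y curl")

# Any upstream change (base image, an earlier layer) changes every
# downstream key, so later commands miss the cache even if unchanged:
base2 = layer_cache_key("sha256:ruby-3.3.4-slim", "ENV INSTALL_PATH /data/go")
assert run_key != layer_cache_key(base2, "RUN apt-get update && apt-get install -y curl")
```

This is why a cache miss can appear on a RUN line that has not changed: it only takes one differing input somewhere earlier in the chain.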

@rcaillon-Iliad

Over the past few days, I've also encountered this cache problem in multi-stage builds. It worked fine for months before and I haven't changed anything.

@aaron-prindle added the area/apt, area/caching, kind/bug, priority/p1, and area/multi-stage builds labels on Jul 16, 2024

ukd1 commented Aug 12, 2024

I'm also getting this, and I can see the cache hash it's looking for does exist, but it's not pulled:

...
INFO[2024-08-12T02:44:10Z] Checking for cached layer 
registry.gitlab.xxxx.xxxx/xxxx/cache:68130d05ac234eaae199bd7052a9898bb4df73c3517b830fb2e98923e488fcc3... 
INFO[2024-08-12T02:44:10Z] No cached layer found for cmd RUN apt-get update -qq &&     apt-get install --no-install-recommends -y build-essential curl git libpq-dev libvips pkg-config unzip 
...

68130d05ac234eaae199bd7052a9898bb4df73c3517b830fb2e98923e488fcc3 exists, but isn't used:

(screenshot of the cache repo showing the tag)


leeeunsang-tmobi commented Aug 21, 2024

The tag keeps changing even though there are no changes.
When I run the docker build command with the same Dockerfile, the cache layer works as expected.

  • kaniko-project/executor:v1.18.0-debug
INFO[0017] Executing 0 build triggers
INFO[0018] Building stage 'base' [idx: '1', base-idx: '0']
INFO[0018] Checking for cached layer xxxxxxxxxxxxxxxxxxxxxxxxxxxx:39478ea256ca812a762b7e6c93725c317e9f646dd50a3d105f91bc87cc690958...
INFO[0018] No cached layer found for cmd RUN xxxxxxxxxxxxxxxxxxxxxxxxxxxx

The analysis is wrong. Ignore the following.

  • kaniko-project/executor:v1.18.0-debug
  • aws ecr
INFO[0022] Checking for cached layer xxxxxxxxxxxxxxxxxxxxxxxxxxxx:afcca8762ebd17a73cf848ea14994828b0a71fa6bf103135a70b3e1844ebdb2d...
INFO[0022] No cached layer found for xxxxxxxxxxxxxxxxxxxxxxxx
(screenshot: 2024-08-21 11:31 AM)


hleal18 commented Aug 27, 2024

Seems related to #3254

Our tests revealed that using WORKDIR in a multi-stage build causes this issue, especially with RUN instructions for apt-get update/install commands like:

...
WORKDIR /app

FROM base as build

RUN apt-get update -qq && \
    apt-get install --no-install-recommends -y build-essential curl git libpq-dev node-gyp pkg-config python-is-python3

...

A workaround that worked for us was to either remove the WORKDIR directive or duplicate it across all stages. After that, the RUN instruction started using the cache correctly; previously, a different hash was being generated even when nothing had changed.

This was definitely not an issue before.


joeauty commented Aug 27, 2024

> A workaround that worked for us was to either remove the workdir directive, or duplicate it across all stages.

Unfortunately this did not fix my issue, unless there is some problem with using a variable as the WORKDIR value?

# ---- BASE IMAGE ----
FROM ruby:3.3.4-slim-bullseye as base-image

ENV INSTALL_PATH /data/go
ENV GETTEXT_LOCALES_PATH  $INSTALL_PATH/config/gettext_locales
ENV GETTEXT_CLIENT_LOCALES_PATH $INSTALL_PATH/client/locales
WORKDIR $INSTALL_PATH

RUN apt-get update && apt-get install -y libicu-dev libpq-dev python3-pip python-dev build-essential --no-install-recommends && apt-get clean \
  && rm -rf /var/lib/apt/lists/* \
  && pip install --upgrade setuptools pip \
  && pip install awscli \
  && pip cache purge \
  && gem update --system 3.5.13 \
  && gem install bundler:2.5.13

# ---- BUILD DEPENDENCIES ----
FROM base-image as build-dependencies

ENV INSTALL_PATH /data/go
ENV NODE_MAJOR 20
WORKDIR $INSTALL_PATH

SHELL ["/bin/bash", "-lc"]

RUN apt-get update && apt-get install -y curl gnupg ca-certificates --no-install-recommends && apt-get clean

@nielsavonds

We ended up moving the WORKDIR directive after any RUN directive wherever possible, and that resolved it for us.


joeauty commented Oct 8, 2024

> We ended up moving the WORKDIR directive after any RUN directive wherever possible and it resolved it for us.

Unfortunately, that does not work for me. Of course, I'm stating the obvious, but being able to drop in Kaniko without touching the Dockerfile at all would be ideal.


mzihlmann commented Oct 11, 2024

Make sure that the directory exists before calling WORKDIR.

If the directory does not exist, kaniko is kind enough to create it for you, but not kind enough to also put that layer into the cache (come to think of it, I should probably open a bug ticket for that). That means a new layer is emitted every time the WORKDIR instruction runs. Inside the same build this is not immediately obvious, as you will get a 100% cache hit rate; however, all the layers are new, so the push will be slower and you will pull a completely new image afterwards. In multi-stage builds, or builds that run on top of other images created with kaniko, this causes huge problems, as the cache then gets invalidated for them.

The workaround is simple enough:

RUN mkdir $INSTALL_PATH
WORKDIR $INSTALL_PATH
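The cascade described above can be illustrated with a toy hash chain. This is a simplification for illustration only, not kaniko's actual key derivation; the short strings stand in for real layer tarball digests:

```python
import hashlib

def key(parent: str, data: str) -> str:
    # Each cache key folds in its parent, so one differing layer
    # propagates into every key that follows it.
    return hashlib.sha256((parent + "\n" + data).encode()).hexdigest()

def stage_keys(workdir_layer: str) -> list[str]:
    """Key chain for: WORKDIR $INSTALL_PATH -> RUN apt-get install ..."""
    k_workdir = key("sha256:base-image", workdir_layer)
    k_run = key(k_workdir, "RUN apt-get update && apt-get install -y curl")
    return [k_workdir, k_run]

# If the auto-created WORKDIR layer is reproduced byte-for-byte,
# the downstream RUN hits the cache:
a = stage_keys("dir /data/go mtime=0")
b = stage_keys("dir /data/go mtime=0")
assert a == b

# If the directory is re-created with fresh metadata on every build,
# the WORKDIR layer differs, and the unchanged RUN misses the cache:
c = stage_keys("dir /data/go mtime=1728600000")
assert a[1] != c[1]
```

Creating the directory explicitly with RUN mkdir, as in the workaround above, gives that step its own cacheable layer, so nothing downstream gets invalidated.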

@mzihlmann

There you go: #3340
