# Mount cache directory inside build container #969

Trying to figure out some way of sharing a cache between builds; I thought about mounting some directory like /cache inside the build container, so we can then have a shared cache there for things like pip, npm, or cargo.

Would it be possible to implement something like that?

## Comments
We do have a cache warmer feature for caching base layers, and the intermediate layers are cached remotely. Are you looking to cache layers locally? Would that help?
The layer cache is invalidated on any change, so I wanted to mount a shared directory and keep, for example, the Python pip cache there, so that it doesn't have to fetch everything from the network when even one dependency has changed. Something like `-v` for docker to mount a directory/image (which is also not supported for `docker build`).
@urbaniak going on the example of pip caching (which I don't know a ton about), I'm assuming pip looks for certain directories to check if a cache already exists. I would imagine you could just mount a volume at that directory into the kaniko docker container. IIRC kaniko has some special handling for mounted volumes, so I don't think it would cause issues for the image build if those cache files aren't directly referenced by the docker build, but I'm not positive. In any case, I see no reason why it shouldn't work, so if it doesn't we can certainly look into it.
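I imagine that would look something like this (a rough sketch, untested; the paths are illustrative, and `--no-push` is just for trying it out locally):

```sh
# Run the kaniko executor with the host's pip cache mounted at the
# path pip checks inside the build (/root/.cache/pip)
docker run --rm \
  -v "$PWD":/workspace \
  -v "$HOME/.cache/pip":/root/.cache/pip \
  gcr.io/kaniko-project/executor:latest \
  --context=/workspace \
  --dockerfile=/workspace/Dockerfile \
  --no-push
```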
I second this need; what you explained @cvgw is correct:
When using kaniko, the build itself has no access to the host filesystem, so it would be neat to be able to instruct kaniko to mount a certain directory into the build container so that pip and co. can take advantage of it.
What about supporting `RUN --mount=type=cache`?
@glen-84 I'm not exactly sure how this feature interacts with the host filesystem. The use case I had in mind takes advantage of the fact that the host filesystem (i.e., the CI machine) will have a directory with the cache available. I'm not sure if `--mount=type=cache` covers that.
If I understand correctly, the Docker engine on the host would manage the directory (sort of like a named volume). If you follow the link, they show an example of caching Go packages. See also https://stackoverflow.com/a/57934504/221528. (PS: you might want to consider upvoting this issue to indicate your interest.)
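The linked Go example looks roughly like this (a sketch; the image tag is illustrative, and the target path is Go's default build cache location):

```dockerfile
# syntax=docker/dockerfile:1
FROM golang:1.21 AS build
WORKDIR /src
COPY . .
# BuildKit keeps /root/.cache/go-build in a cache volume that survives
# across builds, even when the layer itself is invalidated
RUN --mount=type=cache,target=/root/.cache/go-build \
    go build -o /out/app .
```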
@glen-84 ok, so in the case of kaniko, since there is no Docker engine on the host, I suspect it would mean adding an option to the kaniko CLI to specify which directory to use when `RUN --mount=type=cache` is encountered. Still, there would be some thinking to do as to how this interacts with the existing layer cache.
My thoughts were to "reuse" the syntax, as opposed to designing something completely specific to Kaniko. Kaniko could be updated to understand these types of mounts and to manage the directories outside of the image (i.e. inside the Kaniko container).
@glen-84 yes, that was my point too :)
Podman supports mounting a volume during build: `podman build -v /tmp/cache:/cache .` Using this with podman works well for us. I would love to see this supported in kaniko.
I've indicated my interest. Locally we build using Docker (it is just easier, sorry), for use in an e2e test using Docker Compose. Using BuildKit's `--mount=type=cache` there works well for us. Contrarily, on CI/CD we build using Kaniko before we ship. However, if our Dockerfile specifies this experimental syntax, it breaks the Kaniko build.
Therefore, to use BuildKit, I need to maintain 2 separate Dockerfiles, one for Kaniko and one for Docker BuildKit. This is cumbersome. A great first step would be if Kaniko didn't choke on the syntax, and adding support would be even greater!
At our project, we're also interested in this. We have a similar scenario to @hermanbanken's: we build using Docker locally and build with Kaniko before we ship, in our CI/CD pipeline. In our case, we also use the `--mount=type=cache` syntax. As @hermanbanken said, it'd be great if Kaniko didn't choke on the syntax -- also suggested in #1568 (comment) -- and even better if it supported it. This is a very relevant issue for us and I can imagine that there are a lot more scenarios that would benefit from this 😄
Also interested in this feature; that's what's missing for me to move from buildkit to kaniko atm.
Also interested in this feature; it would speed up our builds.
This is a must-have feature. At minimum, it should not error when BuildKit flags are present. Right now I use a step that runs sed on the Dockerfile to remove those options, but I cannot do this from skaffold (skaffold generates a single step).
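For anyone doing the same, the sed step could look something like this (a rough sketch; the regex assumes each `--mount=...` flag contains no spaces):

```sh
# Strip BuildKit-only --mount flags so kaniko's parser accepts the Dockerfile
sed -E 's/--mount=[^ ]+ ?//g' Dockerfile > Dockerfile.kaniko
/kaniko/executor --context . --dockerfile Dockerfile.kaniko --no-push
```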
Hey folks, kaniko is in maintenance mode and I am not actively working on it.
I'm also very interested in this feature.
Mounting a local directory as a volume inside Kaniko is a valid use case and can be achieved with a little bit of overhead. In Google Cloud Build, the working directory is already automatically mounted into the Kaniko container (and any other build step) under `/workspace`. If you want to mount additional directories (e.g. because you want the cache in a specific location), you can always mount a volume in Kaniko.
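For example (a sketch; the volume name `cache`, the image name, and the paths are illustrative):

```yaml
steps:
  - name: gcr.io/kaniko-project/executor:latest
    args:
      - --context=/workspace
      - --destination=gcr.io/$PROJECT_ID/my-image
    volumes:
      # Cloud Build mounts this named volume at /cache inside the step
      - name: cache
        path: /cache
```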
At this point we just need to pre-populate the volume with the cache content, for example by downloading it from a bucket in a preceding step.
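Roughly like this, I think (a sketch; the bucket name is illustrative):

```yaml
  # Add before the kaniko step, sharing the same named volume
  - name: gcr.io/cloud-builders/gsutil
    args: ["-m", "rsync", "-r", "gs://my-bucket/pip-cache", "/cache"]
    volumes:
      - name: cache
        path: /cache
```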
Or it can be seeded from a local directory instead.
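Along these lines (a sketch; this assumes the cache lives in the checked-out workspace):

```yaml
  # Add before the kaniko step; /workspace is the checked-out source
  - name: ubuntu
    args: ["cp", "-a", "/workspace/.cache/.", "/cache/"]
    volumes:
      - name: cache
        path: /cache
```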
Hope this helps until Cloud Build allows mounting a local directory as a volume (maybe it already does, but I wasn't able to find a way).
@tejal29 I'm motivated to do something about this, but I'm a bit lost in the code base.
FYI Buildah >= 1.24 (which is shipped with Podman >= 4) supports `RUN --mount=type=cache`.
Is there a solution for this? It looks like it's been a year since the last comment, but I can't imagine why there wouldn't be support for cache mounts.
I was able to get around this limitation using `ONBUILD` and a `BUILD_ENV` build arg:

```dockerfile
ARG BUILD_ENV=local

# Install poetry and set up the environment.
# This is done in an upstream stage because the ONBUILD COPY will be inserted directly after
# the FROM of the downstream build and we don't want to have to re-install poetry every time
# the cache is downloaded (every build).
# see https://docs.docker.com/engine/reference/builder/#onbuild
FROM python:3.7-slim-bullseye as poetry

ENV POETRY_VERSION=1.5.0
RUN pip install poetry==${POETRY_VERSION}

ENV VIRTUAL_ENV /venv
RUN python -m venv ${VIRTUAL_ENV}
ENV PATH="${VIRTUAL_ENV}/bin:${PATH}"

ENV POETRY_NO_INTERACTION=true \
    POETRY_VIRTUALENVS_IN_PROJECT=false \
    POETRY_VIRTUALENVS_PATH=${VIRTUAL_ENV} \
    POETRY_VIRTUALENVS_CREATE=false

# If running on CI, copy the poetry cache from the GitLab project root to the container
FROM poetry as poetry_ci
ONBUILD COPY .cache/pypoetry /root/.cache/pypoetry/

# If running locally, don't do anything
FROM poetry as poetry_local

# Install the project
FROM poetry_${BUILD_ENV} as venv
COPY pyproject.toml poetry.lock ./
RUN touch README.md && \
    poetry install --only main

# Build final image
FROM python:3.7-slim-bullseye as final
ENV PATH="/venv/bin:${PATH}"
COPY --from=venv /venv /venv
# Copy in the app, set user, entrypoint, etc.
```

In GitLab:

```yaml
build:
  stage: build
  image:
    name: gcr.io/kaniko-project/executor:debug
    entrypoint: [""]
  variables:
    POETRY_CACHE_DIR: /root/.cache/pypoetry
    PROJECT_POETRY_CACHE_DIR: ${CI_PROJECT_DIR}/.cache/pypoetry
  cache:
    - key: ${CI_JOB_NAME}
      paths:
        - ${PROJECT_POETRY_CACHE_DIR}
  script:
    - mkdir -p ${PROJECT_POETRY_CACHE_DIR}
    - /kaniko/executor --skip-unused-stages=true --cache=true --context ${CI_PROJECT_DIR} --dockerfile ${CI_PROJECT_DIR}/Dockerfile --destination <destination> --build-arg BUILD_ENV=ci --ignore-path ${POETRY_CACHE_DIR}
    - rm -rf ${PROJECT_POETRY_CACHE_DIR}
    - cp -a ${POETRY_CACHE_DIR} ${PROJECT_POETRY_CACHE_DIR}
```

The `--ignore-path ${POETRY_CACHE_DIR}` flag keeps the in-container poetry cache out of the image snapshot, and the final `rm`/`cp` copies it back into the project directory so the GitLab runner can cache it for the next build.
Another alternative that we are using for per-project caching, which doesn't require the cache to be filled in advance (it does not work for the first install of each project, but we are OK with that, and it can be mixed with @trevorlauder's approach): we store the last image built for each project somewhere (SSM parameters), and then, using a multi-stage COPY, we copy the cache folder (poetry in our case) from the latest-known valid image (in our case we just copy the venv to speed it up even more). PREVIOUS_IMAGE can be `scratch` or the name of the previously built image for that project.
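A sketch of the idea (image and path names are illustrative; as noted, the very first build of a project, when no previous image exists yet, needs separate handling):

```dockerfile
# PREVIOUS_IMAGE is resolved by the pipeline (e.g. read from an SSM
# parameter); it points at the last successfully built image
ARG PREVIOUS_IMAGE
FROM ${PREVIOUS_IMAGE} AS previous

FROM python:3.11-slim AS build
# Seed the cache from the previous image so the installer only
# fetches what changed since the last build
COPY --from=previous /root/.cache/pypoetry /root/.cache/pypoetry
# ...install dependencies and build as usual...
```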
Need support for `RUN --mount=type=cache`.
Any update on this?
This is a much-needed feature that is still missing, which makes it hard to choose Kaniko.
I don't think we actually need to copy the cache folder explicitly as in @trevorlauder's approach above.
All in all, to make it work, just tell GitLab CI to mount the cache as usual into the host container, then use it inside the Dockerfile during the build (I am using build args to pass the cache path). Here is my stripped-down config for employing a pip cache during a kaniko build within GitLab CI.

`gitlab-ci.yml`:

```yaml
build_and_publish_container:
  stage: build
  image:
    name: gcr.io/kaniko-project/executor:v1.23.1-debug
    entrypoint: [""]
  cache:
    paths:
      - .cache/pip
  script:
    # build and push container, pass cache folder as build arg
    - /kaniko/executor
      --context "${CI_PROJECT_DIR}"
      --build-arg "PIP_CACHE_DIR=${CI_PROJECT_DIR}/.cache/pip"
      --destination "${CI_REGISTRY_IMAGE}:latest"
```

`Dockerfile`:

```dockerfile
FROM python:3.12-slim

# Path to pip cache on host container
ARG PIP_CACHE_DIR

COPY requirements.txt /app/
WORKDIR /app
RUN pip install -r requirements.txt
```

For compatibility with local docker builds, I believe one could add a default value for the `PIP_CACHE_DIR` build arg.
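A minimal sketch of that (the default matches pip's standard cache location for root):

```dockerfile
# Fall back to pip's default cache dir when the build arg is not
# passed, e.g. in a plain local `docker build`
ARG PIP_CACHE_DIR=/root/.cache/pip
```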
Thanks to @pe224, I followed his approach and it worked. Let me add one more thing for the case where your source code is located in the root of ${CI_PROJECT_DIR} in GitLab, you also have, for instance, a test job before the build job, and you want the m2 cache to be shared between jobs. If you do something like this for a Java Maven project:

```dockerfile
ARG CI_PROJECT_DIR
COPY . /app
WORKDIR /app
RUN mvn clean package -Dmaven.repo.local=${CI_PROJECT_DIR}/.m2/repository
```

then you will also copy the m2 cache into a docker layer and not use the host one. This means it will not be updated on the host and not cached by the GitLab runner. What I did on my side is move my code into a `code` folder and do this instead:

```dockerfile
ARG CI_PROJECT_DIR
COPY code/ /app
WORKDIR /app
RUN mvn clean package -Dmaven.repo.local=${CI_PROJECT_DIR}/.m2/repository
```

If you don't want to have a `code` folder, I assume you can still do something like this (note the trailing slash on `/app/`, which `COPY` requires when given multiple sources):

```dockerfile
ARG CI_PROJECT_DIR
COPY pom.xml settings.xml /app/
COPY src/ /app/src
WORKDIR /app
RUN mvn clean package -Dmaven.repo.local=${CI_PROJECT_DIR}/.m2/repository
```

This way the cache stays on the host and will be updated during the build.