cmd/go: unclear how to cache transitive dependencies in a Docker image #27719
Comments
I don't think all makes sense in a non GOPATH world; previously all
expanded to GOPATH/src/..., taking into account that GOPATH may be a list.
I think you should use ./...
…On 18 September 2018 at 07:40, Greg Wedow ***@***.***> wrote:
What version of Go are you using (go version)?
go version go1.11 linux/amd64
Does this issue reproduce with the latest release?
yes
What did you do?
I'm attempting to populate a Docker cache layer with compiled dependencies
based on the contents of go.mod. The general recommendation with Docker
is to use go mod download however this only provides caching of sources.
go build all can be used to compile these sources but instead of relying
on go.mod contents, it requires my application source to be present to
determine which deps to build. This causes a cache invalidation on every
code change and renders the step useless.
Here's a Dockerfile demonstrating my issue:
FROM golang:1.11-alpine
RUN apk add git
ENV CGO_ENABLED=0 GOOS=linux
WORKDIR /app
COPY go.mod go.sum ./
RUN go mod download
# this fails
RUN go build all
# => go: warning: "all" matched no packages
COPY . .
# this now works but isn't needed
RUN go build all
# compile app along with any unbuilt deps
RUN go build
From package lists and patterns
<https://golang.org/cmd/go/#hdr-Package_lists_and_patterns>:
When using modules, "all" expands to all packages in the main module and
their dependencies, including dependencies needed by tests of any of those.
where the main module
<https://golang.org/cmd/go/#hdr-The_main_module_and_the_build_list> is
defined by the contents of go.mod (if I'm understanding this correctly).
Since "the main module's go.mod file defines the precise set of packages
available for use by the go command", I would expect go build all to rely
on go.mod and build any packages listed within.
Other actions which support "all" have this issue but some have flags
which resolve it (go list -m all).
|
Thanks Dave, |
Can you explain more about your use case? Perhaps with an example repo to demonstrate what you are doing. Thanks.
… On 18 Sep 2018, at 13:27, Greg Wedow ***@***.***> wrote:
Thanks Dave, go build ./... is a bit of an improvement since it doesn't include the test dependencies that all does. However it still requires my application source to be present and gives go: warning: "./..." matched no packages if run with only go.mod and go.sum present.
|
For sure. I've found in most previous projects that dependency build times are fast enough to not be an issue so in the end the existing behaviour is probably fine. Part of my current project is the creation of a custom Terraform Provider for managing some of our internal systems. Building the Terraform packages only happens once locally so not a big deal, but they need to be rebuilt every time a new docker image is built. When these packages are already compiled, Some time can be saved by using Based on the existing module documentation, I would expect the We do similar things with projects in other languages for building Docker images. The flow is generally:
This lets us avoid having to rebuild dependencies on every commit. It would be nice if this could be replicated with the Go module system. Here's an example repo: https://github.com/wedow/docker-go-build To see the issue we're having, clone it and run |
I think what you're after here is:
This will populate the build cache ( If you want to install
|
Yes, that is working as designed: in module mode,
If the code changes are only in your The build artifact cache is separate from the module cache: the former is controlled by Can you confirm that both the build cache and the module cache are present and populated in your docker image after the first |
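One way to answer the question above is to look at the two cache directories from inside the image. The following is only a sketch: `dir_populated` is a hypothetical helper name, not something the go tool provides.

```shell
#!/bin/sh
# dir_populated DIR: succeed only if DIR exists and contains at least one entry.
# (Hypothetical helper for illustration; not part of the go tool.)
dir_populated() {
    [ -d "$1" ] && [ -n "$(ls -A "$1" 2>/dev/null)" ]
}
```

Inside the image, something like `dir_populated "$(go env GOCACHE)"` would check the build cache and `dir_populated "$(go env GOPATH)/pkg/mod"` the module cache. The second path assumes GOPATH is a single directory rather than a list.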
Thanks guys, I think there may be some confusion about which caches are being affected and when. The issue is in how docker caches layers after each operation. When my source files are changed, all side effects which occur after the The I'm looking for a command which can compile the dependencies listed in the |
I'm unclear why you say it must come after the copy - please can you explain?
|
@myitcv, note that |
You have to export both Example: FROM golang:1.11-alpine AS mod
RUN apk add -U git
WORKDIR /src
COPY go.mod .
COPY go.sum .
RUN go mod download
FROM golang:1.11-alpine
COPY --from=mod $GOCACHE $GOCACHE
COPY --from=mod $GOPATH/pkg/mod $GOPATH/pkg/mod
WORKDIR /src
COPY . .
RUN go build |
this has the disadvantage of being a large image (because it's based on golang:alpine and not alpine, plus the fact that building the project creates various cache files), though these cache files are the things we need in order to speed up compiling the project on the main go build line. most of the compilation time is spent on the dependent libraries (e.g. discordgo, crypto, stdlib, etc). there is no functional method of compiling dependencies from a bare go.mod and go.sum file[1]; you must have a valid go project in the directory for go build all to work, at which point the docker layer cache has been invalidated by `COPY . /app`. the proposed `go list -export $(go list -m)/...` does not compile all dependencies either, as shown by checking the size of $GOCACHE before and after running `go build all`. without doing funky stuff like bind-mounting a volume container into the build container[2], inflating the image size for faster compiling seems to be the best tradeoff, as the image will only stay local anyway. [1] golang/go#27719 [2] https://github.com/banzaicloud/docker-golang
@myitcv the so we're looking for a way to get the stuff listed in think of it as a 2 phase build |
@dbudworth it doesn't really seem possible to do what we're looking to do with the currently available tooling. I came up with a hacky workaround to get the results I was looking for and just updated my example repo to illustrate it. The basic idea is the use of a dummy import file which can trigger the compilation of dependencies when run through While I'd much prefer a way to compile dependencies separate from application code as part of the official toolchain, this method does dramatically reduce subsequent docker image build times for our project and has really sped up our CI process. |
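The dummy import file idea can be sketched roughly as follows. This is an assumption about what such a file might look like, not the contents of the example repo: `gen_dummy_imports` is a hypothetical helper, and the import paths fed to it are placeholders that would have to be importable (non-main) packages from modules listed in go.mod.

```shell
#!/bin/sh
# gen_dummy_imports: read one import path per line on stdin and print a Go
# source file that blank-imports each of them, so that building the file
# compiles those packages. (Hypothetical helper for illustration.)
gen_dummy_imports() {
    printf 'package main\n\n'
    while IFS= read -r pkg; do
        # Skip empty lines in the input list.
        if [ -n "$pkg" ]; then
            printf 'import _ "%s"\n' "$pkg"
        fi
    done
    printf '\nfunc main() {}\n'
}
```

The generated file would be copied into the image together with go.mod and go.sum and built before `COPY . .`, so that layer depends only on the dependency list and not on the application source.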
Same issue here. This would be somewhat easier to handle if |
Would it be possible to add a |
@benweissmann, that seems like it would have significant overlap with |
The Go build cache is content-addressed, and contains intermediate artifacts. If you are correctly storing the build cache (as @hinshun describes), then it should not recompile dependencies whose sources are unchanged.
You can use |
Please try the above approach (saving both |
@bcmills I'm kind of at a loss on how to explain the issue in a different way. The Similarly, @hinshun's approach of copying You mention an Personally, |
Yes, you'd need to prime the cache in your Docker image from a specific version of your application source, and changing that source would invalidate the image caching. (I suspect that you could discard that source from the final image, but I don't use Docker much so I'm a bit fuzzy on the details.) You could also use
|
Following the advice given by @hinshun, @nicollecastrog and @arjunpur, I made a PR to Kubeapps that I think solves this exact problem. Here is the # syntax = docker/dockerfile:experimental
FROM golang:1.13 as builder
WORKDIR /go/src/github.com/kubeapps/kubeapps
COPY go.mod go.sum ./
COPY vendor vendor
COPY pkg pkg
COPY cmd cmd
ARG VERSION
# With the trick below, Go's build cache is kept between builds.
# https://github.com/golang/go/issues/27719#issuecomment-514747274
RUN --mount=type=cache,target=/go/pkg/mod \
--mount=type=cache,target=/root/.cache/go-build \
CGO_ENABLED=0 go build -installsuffix cgo -ldflags "-X main.version=$VERSION" ./cmd/tiller-proxy
FROM scratch
COPY --from=builder /etc/ssl/certs/ca-certificates.crt /etc/ssl/certs/
COPY --from=builder /go/src/github.com/kubeapps/kubeapps/tiller-proxy /proxy
EXPOSE 8080
CMD ["/proxy"] Compared to having |
A simpler way:
FROM golang:1.13-alpine
RUN apk update \
&& apk add --no-cache git
WORKDIR /attacker
COPY ./go.mod .
RUN go mod graph | cut -d '@' -f 1 | cut -d ' ' -f 2 | sort | uniq | tr '\n' ' ' | xargs go get -v
COPY . .
RUN CGO_ENABLED=0 go test -c
CMD ./attacker.test |
Took me a while to find out that such a simple thing that all other languages that I ever used made me take for granted can't be done in Go natively. It is quite a pain having to switch our entire CI ecosystem because of a missing command option, but it seems like the only way for the modules and caching to have any meaning besides allowing code outside of |
@Fryuni, what is the “missing command option” to which you refer? Most of the recent progress on this issue has been folks figuring out the proper |
@bcmills A command to build the dependency cache. Or a flag to do it with In Python, for example,
The same is equally simple in Node, Java, Ruby, etc.
Everyone is figuring out a way to use other Docker features to compensate for this missing feature: using experimental features to sidestep Docker's layer architecture just to inject a cache along with a RUN command. That is exactly why we are having to change our CI ecosystem. We currently use managed solutions, but those (very wisely) do not allow experimental features to be enabled on dockerd in your CI pipeline. We are changing to a self-hosted solution in order to use them. |
Agree with @Fryuni. FWIW my first attempt was to do this with |
I'd like to avoid experimental docker features to cache go module compilation artifacts, so I tried @Feresey's approach of using
However, this failed on this
To repro without a go.mod file: $ go get gopkg.in/DataDog/dd-trace-go.v1
go get gopkg.in/DataDog/dd-trace-go.v1: no Go source files I'm not sure what's going on here. All the actual imports are |
I also posted this to StackOverflow: https://stackoverflow.com/questions/60200363/create-docker-container-to-run-go-test-with-all-module-dependencies-downloaded |
@jschaf, Packages are not 1:1 with modules: a module contains packages — often many of them, and often many that are not going to be relevant to building the packages in your module. That's why much of the discussion above (for example, #27719 (comment)) focuses on packages rather than modules. It's also why a flag to |
Honestly, prebuilding a cache that has more than what I'm going to need is way better than not building any cache at all. After all, that is the build image; having extra data there is not a problem, it is expected. The final binary should be moved to another image in a multi-stage build, as per best practices, to get small docker images at the end. Also, I never expected it to build only the cache of what I'm going to use, but the cache of the dependencies declared, whether my code uses them or not. This cache is built before the code is added to the image, so it obviously cannot optimize for the code. It is similar to what happens with TypeScript: you install all your dependencies and transitive dependencies entirely, but when you compile to JS it only includes what is actually used. |
I'm running into this now that we've switched to using modules, whereas before we could use:
Now it seems our only option is to use experimental docker engine that isn't supported by our CI or some of our devs machines or live with slow builds. I think there's been a lot of confusion about docker cache vs build cache and go module source cache vs go module build cache. To reiterate the issue for @bcmills what we all really want is:
or
This would allow us to leverage the non-experimental layer caching that existing Docker versions have had forever, the same way we use it to avoid re-downloading the module sources every time we build an image. For example here is an example FROM golang:1-alpine AS build
ARG COMMIT_HASH
WORKDIR /example-app
COPY ./go.mod ./go.mod
COPY ./go.sum ./go.sum
RUN go mod download
COPY ./*.go ./
ENV GOARCH=amd64
ENV CGO_ENABLED=0
ENV GOOS=linux
RUN go build -o example .
FROM scratch
WORKDIR /app
ENV PATH=/bin/
COPY --from=build /example-app/example ./example
ENTRYPOINT ["./example"] |
Correct me if I'm wrong but isn't the requested extension to the go mod download --json | jq -r '"\(.Path)@\(.Version)"' | xargs go get -v This obviously depends on jq to transform the JSON output so it would still be nice to have it built into the Example of usage in a Dockerfile: FROM golang:1.14-alpine AS build
WORKDIR /go/src/app
ENV CGO_ENABLED=0
RUN apk add --no-cache jq
COPY go.mod go.sum ./
RUN go mod download --json | jq -r '"\(.Path)@\(.Version)"' | xargs go get -v
COPY . .
RUN go build -o /go/bin/app
FROM gcr.io/distroless/base
COPY --from=build /go/bin/app /
ENTRYPOINT ["/app"] |
@futek unfortunately that doesn't always work and results in this:
reproduction repo: https://github.com/montanaflynn/golang-docker-cache |
@montanaflynn That looks like broken software you're trying to build, not an issue with the jq kludge.
That would be very simple to write as a |
@tv42 If you remove the line:
Then it works and actually downloads far fewer dependencies, presumably just what's needed for the resulting binary. I think that there are edge cases and associated logic that is included in the Example Dockerfile and There are other one-liner shell solutions in this comment thread as well that try to use the dependencies from
Which kind of worked for my reproduction, except while it installed even more dependencies than Example Dockerfile and I think these one-liner combinations of |
Right, it appears that passing all indirect dependencies to go mod graph | grep "^$(go mod edit -json | jq -r .Module.Path) " | cut -d ' ' -f 2 | xargs go get -v (i.e. grab the module name from go.mod and use it to filter direct dependencies in the output of
Now that I look at this one again it seems like it's trying to do exactly the same by relying on the fact that the root package doesn't have a version suffix ( go mod graph | cut -d '@' -f 1 | cut -s -d ' ' -f 2 | xargs go get -v Building on that, this is the "simplest" version I can come up with that also retains the version: go mod graph | grep -v '@.*@' | cut -d ' ' -f 2 | xargs go get -v I'm sure there are a lot of ways to do this (which could break in various subtle ways) so I'm still voting for an official |
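To make the difference between the last two filters concrete, here is a small sketch that runs them on made-up `go mod graph` output. The module names, versions, and function names are all hypothetical; the only assumption taken from the discussion above is that the main module is printed without an `@version` suffix.

```shell
#!/bin/sh
# Made-up `go mod graph` output: the main module (example.com/app) appears
# without a version; every other module carries an @version suffix.
sample_graph() {
    cat <<'EOF'
example.com/app example.com/liba@v1.2.0
example.com/app example.com/libb@v0.3.1
example.com/liba@v1.2.0 example.com/libc@v0.1.0
EOF
}

# Variant 1: cut everything from the first @ onward, then use cut -s to drop
# lines that no longer contain a space (edges not starting at the main module).
direct_deps_unversioned() {
    cut -d '@' -f 1 | cut -s -d ' ' -f 2
}

# Variant 2: drop edges containing two @ signs (both ends versioned) and keep
# the version of each remaining dependency.
direct_deps_versioned() {
    grep -v '@.*@' | cut -d ' ' -f 2
}

sample_graph | direct_deps_unversioned
sample_graph | direct_deps_versioned
```

On this sample both variants keep only `liba` and `libb`, the direct dependencies of the main module; the second one additionally retains the `@v1.2.0` and `@v0.3.1` suffixes. Neither says anything about which packages inside those modules are actually imported.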
@futek I appreciate the thought but that command fails for drone's dependencies.
Even after installing
When removing For some projects it can certainly improve the docker build time dramatically but it doesn't work everywhere or for every project. I'll still be using it for a few projects where I know it works with their dependencies; in some cases I'm seeing a 10x docker image build speedup! By the way I think this might be a little simpler to understand and only requires a single pipe to
|
I just spent time on this issue as well, and none of the workarounds works. Can't we just have a command to do this? |
Significantly speeds up repeated go builds. Ref: golang/go#27719
I'm in the same boat. I'm currently using the By the way, the go mod graph | awk '$1 !~ /@/ { print $2 }' | xargs -r go get I added the |
Isn't |
If you're using COPY go.mod go.sum .
COPY vendor vendor
RUN go build ./vendor/... |
While Having the ability to produce a layer with just (That said, the other thing that would help would be enabling sharing the |