# Allow leaf-node image in dependency graph to be built/tested/published on its own (#186)
This is not fixed. For example, you cannot build only https://github.com/dotnet/dotnet-docker/blob/master/3.0/sdk/alpine3.9/amd64/Dockerfile in an official build. It will attempt to reference the runtime image from the staging location instead of from MCR.

---
The changes needed to implement this in Image Builder would require a complicated command interface. The ability to build a partial dependency graph would be better achieved using Docker's build cache. We'll be deferring to that feature to get this functionality instead.

---
Reopening this because I feel this is important to have working as part of dotnet/dotnet-docker#2122. Without these changes, a release of 3.1 would require republishing 5.0 and vice versa.

---
## Options

### Option 1: Use Image Info Data

This option makes use of the image info data that is available from prior builds to determine whether a Dockerfile needs to be built. Specifically, the Dockerfile commit SHA and base image digest values contained in this data indicate whether the output of the build would produce a different image compared to the previously built version of the image.

Pros:

Cons:
Implementation cost: large

### Option 2: Use External Cache Source Feature

External cache sources allow the builder to reuse layers that were generated from previous builds of an image stored in a registry. (A sketch of how this could apply to the .NET images follows Option 2b below.)

Implementation cost: small

#### Option 2a: BuildKit builder engine

Usage:

```
$ export DOCKER_BUILDKIT=1
$ docker build -t myname/myapp --build-arg BUILDKIT_INLINE_CACHE=1 .
$ docker push myname/myapp

# on another machine
$ docker build --cache-from myname/myapp .
```

Pros:
Cons:
#### Option 2b: Docker builder engine

Usage:

```
$ docker push myname/myapp

# on another machine
$ docker build --cache-from myname/myapp .
```

Pros:
Cons:
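As a rough sketch of how option 2 could apply to the .NET images (the repository names, tags, and build context here are illustrative, not the actual pipeline commands), the previously published image on MCR would serve as the external cache source:

```
# Hypothetical: rebuild the SDK image while reusing unchanged layers from the
# previously published image on MCR. For the cache to be usable, the published
# image must have been built with inline cache metadata (BUILDKIT_INLINE_CACHE).
export DOCKER_BUILDKIT=1
docker build \
  --build-arg BUILDKIT_INLINE_CACHE=1 \
  --cache-from mcr.microsoft.com/dotnet/core/sdk:3.1-alpine3.9 \
  -t dotnet/core/sdk:3.1-alpine3.9 \
  -f 3.1/sdk/alpine3.9/amd64/Dockerfile \
  3.1/sdk/alpine3.9/amd64
```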
### Option 3: Partial Build Graph Support

This option allows builds to be executed that explicitly define which portions of the Docker image graph are to be built.

While conceptually this is a simple process, making it a clean implementation actually requires a significant change to the infrastructure in order to express this kind of partial graph.

Pros:
Cons:
Implementation cost: large

## Package Differences

In all of these options, there's one aspect that's missing: detecting changes to installed packages. One of the benefits of rebuilding runtime-deps for each release of .NET Core is that it ensures the latest package versions are available in that image. None of the options presented above can detect this without actually rebuilding the layer that installs the packages, which defeats the whole purpose being sought here. Basically, this is an orthogonal issue that would be made worse by implementing these changes. There's a separate issue for tracking the package update problem: dotnet/dotnet-docker#1455.

## Proposed Solution

I'm proposing that we go with option 1. Compared to option 3, it certainly seems like the better option because the detection of diffs is automated, it would have a cleaner UX, and the implementation cost would probably be a wash between the two options. For options 2a and 2b, the lack of Windows support is problematic. Even though progress is being made to add support to Windows, that's still a ways off because of dependencies needed in Windows containers. Option 1 seems to have the least downside.
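To make the option 1 check concrete, here is a minimal sketch for a single Dockerfile, assuming a simplified image info schema (`commit`, `baseImageDigest`) and illustrative paths and tags; the real schema is whatever Image Builder records in its image info output:

```
# Hypothetical cache check: rebuild only if the Dockerfile's commit or the
# base image's digest differs from what the prior build's image info recorded.
dockerfile=3.1/sdk/alpine3.9/amd64/Dockerfile
base_image=mcr.microsoft.com/dotnet/core/aspnet:3.1-alpine3.9  # illustrative

cached_commit=$(jq -r '.commit' prior-image-info.json)           # assumed schema
cached_digest=$(jq -r '.baseImageDigest' prior-image-info.json)  # assumed schema

current_commit=$(git log -1 --format=%H -- "$dockerfile")
docker pull "$base_image" > /dev/null
current_digest=$(docker image inspect --format '{{index .RepoDigests 0}}' "$base_image")

if [ "$cached_commit" = "$current_commit" ] && [ "$cached_digest" = "$current_digest" ]; then
  echo "Cache hit: pull and reuse the previously published image"
else
  echo "Cache miss: rebuild $dockerfile"
fi
```

---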
Nice proposal write-up @mthalman. I have a couple of questions/comments.

---
I've clarified the cons as you pointed out. You're right about BuildKit not being experimental; I misread the documentation on that.

---
@mthalman - Can you provide a more detailed breakdown of the cost of option 1 and whether there are options within it?

---
Option 1 could be implemented in 3 phases. The first phase would provide the complete level of caching that we're looking to obtain from this option, but it would have some inefficiency due to potential wasted time spent doing unnecessary testing. The second phase would optimize the testing to only test what is needed. A third phase would optimize the build to trim any build jobs that would not produce any new images.

## Proposed Design

### Phase 1

#### Build

#### Test

This phase proposes no changes to the amount of testing that is done. This means that any images that were the result of a cache hit are still tested. This amounts to wasted testing effort since those images were tested when they were originally published. But in the interest of getting a working caching solution up and running quickly, this seems like a reasonable compromise in the short term. Phase 2 details how to resolve this testing inefficiency.

#### Publish

The publish stage should continue to work without any changes since it will only be processing the images that are included in the image info file.

#### Estimate

2-3 days

### Phase 2

This phase of work is just about resolving the issue of testing images that do not need to be tested. Phase 1 allowed for previously published images to be pulled and reused, but the testing stage would continue to test all images. This is inefficient because those images had already been tested when they were originally published.

One of the goals in resolving the issue is to determine what needs to be tested at the time the test matrix is generated, in order to avoid spinning up test jobs that may end up testing no images at all. In order to do that, the matrix generation logic will require the image info that was output by the build. That data determines which images were built and, when compared against the dependency graph of the Dockerfiles, which images are actually cached versions. By providing that data to the matrix generation, the test legs can be scoped to only the images that actually need testing.

For example, consider a call to the test matrix generation.
And let's say that the only new image produced by the build was sdk. The test matrix generation uses the image info to determine that only the sdk image was built by this build. The matrix output that it generates specifies a test-categories field indicating which categories should be passed to and executed by the test script, in this case "sdk".

Example representation of a leg of the test matrix:
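As a hypothetical illustration (the leg name and field names here are illustrative, not the actual matrix schema):

```
{
  "3.1-alpine3.9-amd64": {
    "imageBuilderPaths": "3.1/sdk/alpine3.9/amd64",
    "testCategories": "sdk"
  }
}
```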
If a test leg results in there being no test categories due to all of the images being the result of cache hits, that test leg is excluded entirely from the matrix. This prevents the execution of test jobs that end up not needing to test anything. At this point, the test logic executes as it always has since it already supports the ability to test subsets of images based on test categories.

#### Estimate

2 days

### Phase 3

This phase would optimize the build for cases where a particular build job would end up not producing any new images because all the images it would have built would be the result of cache hits. To do this, the same logic that trims empty legs from the test matrix in phase 2 would be applied when generating the build matrix.

#### Estimate

1 day

---
Currently all the images in a dependency graph must be built, tested, and published as a set. As an example, this means we can't make a change to the SDK image without also building the runtime-deps image and that means the runtime-deps image will also get published even though it hasn't necessarily changed. Similarly, the SDK image can't be tested without also building the ASP.NET image because the tests require it. This prevents us from making more fine-grained releases. It forces us to publish images that haven't even been changed.
This should be fixed by allowing a leaf-node image to be built on its own. Any dependencies it has should be retrieved from the publicly released images in MCR.
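As a sketch of the desired behavior (the tags and build context here are illustrative), the leaf image's base would be pulled from its public MCR repo rather than resolved from the staging location, so nothing else in the graph has to be rebuilt or republished:

```
# Hypothetical: build only the SDK leaf image; its runtime base image comes
# from the publicly released repo on MCR instead of the staging location.
docker pull mcr.microsoft.com/dotnet/core/runtime:3.0-alpine3.9  # illustrative tag
docker build -t sdk:3.0-alpine3.9 3.0/sdk/alpine3.9/amd64
```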