This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

Fix ci/docker_cache.py #18056

Merged · 2 commits · Apr 15, 2020
ci/build.py (1 addition, 1 deletion)
@@ -71,7 +71,7 @@ def get_docker_binary(use_nvidia_docker: bool) -> str:


 def build_docker(platform: str, docker_binary: str, registry: str, num_retries: int, no_cache: bool,
-                 cache_intermediate: bool) -> str:
+                 cache_intermediate: bool = False) -> str:
@marcoabreu (Contributor), Apr 14, 2020:
This has to be true. Otherwise, a single change will result in a full rebuild, drastically increasing build times.

Contributor Author:

I don't mind changing it to True, but the existing behaviour before introducing the flag was False.

Contributor Author:

One issue I noted during local development when setting this to True is that my disk can fill up very quickly, because Docker does not clean any of the cached layers automatically.

Contributor:

Well, there has been some back and forth between refactors - maybe somebody removed it :/. Initially, I designed it so that intermediary layers were kept, since that's the whole point of the partial cache. This test validates that assumption: https://github.com/apache/incubator-mxnet/blob/master/ci/test_docker_cache.py#L184

Note that this test suite is not run in CI...

Contributor:

Well, you can't have everything: either partial caching with manual cleanup, or automatic cleanup with no partial caching. I think it is fine to expect that people run docker purge every now and then.

Contributor:

Yeah, it should be possible. It will just cause issues if the test fails, since that might mean that stuff gets left dangling; but due to the mentioned lifecycle, that shouldn't have a big impact.

Feel free to enable it.

Contributor:

Good catch! I can't fully recall what the thing around --rm=false was, but I recall that I wrote tests to cover all these unknowns. The rule of thumb is that if the tests pass, we're good to go (given they use the same parameters, of course).

Contributor Author:

CI does not run docker purge automatically. But since the lifecycle of the Unix slaves is very short, that's not an issue.

If the intermediate containers are only kept on the slave, and we only generate them in PRs that change the Dockerfiles, I think we don't need them at all on CI. --rm=false then seems useful only for development.

Contributor:

Yep exactly. But please make sure that the tests pass before merging - otherwise we might break the partial caching.

Contributor Author:

This PR does not change the status quo. There is no risk.

"""
Build a container for the given platform
:param platform: Platform
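The cache_intermediate flag discussed above ultimately decides whether docker build is invoked with --rm=false, keeping intermediate containers around as a partial layer cache at the cost of disk space. A minimal sketch of that mapping; the helper name and exact argument handling here are assumptions, not the PR's actual code:

```python
from typing import List

def docker_build_cmd(docker_binary: str, platform: str, tag: str,
                     no_cache: bool = False,
                     cache_intermediate: bool = False) -> List[str]:
    """Assemble a `docker build` command line (hypothetical helper).

    cache_intermediate=True maps to --rm=false: intermediate containers
    are kept and can seed the partial layer cache, at the cost of disk
    space that Docker never reclaims on its own.
    """
    cmd = [docker_binary, "build",
           "-f", f"docker/Dockerfile.build.{platform}",
           "-t", tag]
    if no_cache:
        cmd.append("--no-cache")
    if cache_intermediate:
        cmd.append("--rm=false")
    return cmd
```

With the default of False, the command line is unchanged from the pre-flag behaviour, which is the status-quo argument made in the thread above.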
ci/docker/Dockerfile.build.ubuntu_nightly_cpu (3 additions)
@@ -42,6 +42,9 @@ RUN /work/ubuntu_perl.sh
 COPY install/ubuntu_clang.sh /work/
 RUN /work/ubuntu_clang.sh
 
+COPY install/ubuntu_gcc8.sh /work/
+RUN /work/ubuntu_gcc8.sh
+
 COPY install/ubuntu_caffe.sh /work/
 RUN /work/ubuntu_caffe.sh
 
ci/docker/Dockerfile.build.ubuntu_nightly_gpu (3 additions)
@@ -42,6 +42,9 @@ RUN /work/ubuntu_perl.sh
 COPY install/ubuntu_clang.sh /work/
 RUN /work/ubuntu_clang.sh
 
+COPY install/ubuntu_gcc8.sh /work/
+RUN /work/ubuntu_gcc8.sh
+
 COPY install/ubuntu_tvm.sh /work/
 RUN /work/ubuntu_tvm.sh
 
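Appending the gcc8 steps after the existing ones, rather than inserting them earlier in the Dockerfile, matters for the caching discussed above: Docker keys each layer on its parent layer plus the instruction, so a changed step invalidates every step after it but none before it. A toy model of that keying, purely illustrative and not Docker's actual hashing scheme:

```python
import hashlib

def layer_ids(instructions):
    """Toy model of Docker's layer cache key: each layer id derives
    from the parent id plus the instruction text, so changing one step
    invalidates that layer and every later one, but none before it."""
    ids, parent = [], "scratch"
    for ins in instructions:
        parent = hashlib.sha256((parent + "\n" + ins).encode()).hexdigest()[:12]
        ids.append(parent)
    return ids

base = ["COPY install/ubuntu_clang.sh /work/", "RUN /work/ubuntu_clang.sh"]
extended = base + ["COPY install/ubuntu_gcc8.sh /work/", "RUN /work/ubuntu_gcc8.sh"]
# The shared prefix keeps identical ids, so those layers stay cached.
```

Under this model, extending the file only adds new layer ids at the end; all earlier ids, and therefore their cached layers, are untouched.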
ci/docker/runtime_functions.sh (1 addition)
@@ -2084,6 +2084,7 @@ publish_scala_build() {
 pushd .
 scala_prepare
 source /opt/rh/devtoolset-7/enable
+export USE_SYSTEM_CUDA=1
 ./ci/publish/scala/build.sh
 popd
 }
julia/deps/build.jl (1 addition, 1 deletion)
@@ -33,7 +33,7 @@ if haskey(ENV, "MXNET_HOME")
 # In case of macOS, if user build libmxnet from source and set the MXNET_HOME,
 # the output is still named as `libmxnet.so`.
 lib = Libdl.find_library(["libmxnet.$(Libdl.dlext)", "libmxnet.so"],
-                         [joinpath(MXNET_HOME, "lib"), MXNET_HOME])
+                         [joinpath(MXNET_HOME, "lib"), joinpath(MXNET_HOME, "build"), MXNET_HOME])
 if !isempty(lib)
     @info("Existing libmxnet detected at $lib, skip building...")
     libmxnet_detected = true
julia/src/base.jl (1 addition)
@@ -42,6 +42,7 @@ const grad_req_map = Dict{Symbol,GRAD_REQ}(
 ################################################################################
 const MXNET_LIB = Libdl.find_library(["libmxnet.$(Libdl.dlext)", "libmxnet.so"], # see build.jl
                                      [joinpath(get(ENV, "MXNET_HOME", ""), "lib"),
+                                     joinpath(get(ENV, "MXNET_HOME", ""), "build"),
                                      get(ENV, "MXNET_HOME", ""),
                                      joinpath(@__DIR__, "..",
                                               "deps", "usr", "lib")])
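The two Julia changes add MXNET_HOME/build to the library search order, so a CMake build tree is found without first copying libmxnet into lib/. The same first-match search can be sketched in Python; the function name and the candidate file names here are illustrative assumptions:

```python
import os
from typing import Iterable, Optional

def find_libmxnet(mxnet_home: str,
                  names: Iterable[str] = ("libmxnet.so", "libmxnet.dylib")) -> Optional[str]:
    # Same search order as build.jl/base.jl after this PR:
    # lib/ first, then the CMake build/ tree, then MXNET_HOME itself.
    for d in (os.path.join(mxnet_home, "lib"),
              os.path.join(mxnet_home, "build"),
              mxnet_home):
        for name in names:
            candidate = os.path.join(d, name)
            if os.path.isfile(candidate):
                return candidate
    return None
```

Because build/ is searched after lib/, an installed library still wins over a build-tree one when both exist, so the change is additive rather than a behaviour change for existing setups.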