-
Notifications
You must be signed in to change notification settings - Fork 6.8k
Conversation
Hey @leezu , Thanks for submitting the PR
CI supported jobs: [edge, windows-cpu, centos-gpu, unix-cpu, unix-gpu, website, centos-cpu, sanity, windows-gpu, clang, miscellaneous] Note: |
@@ -71,7 +71,7 @@ def get_docker_binary(use_nvidia_docker: bool) -> str: | |||
|
|||
|
|||
def build_docker(platform: str, docker_binary: str, registry: str, num_retries: int, no_cache: bool, | |||
cache_intermediate: bool) -> str: | |||
cache_intermediate: bool = False) -> str: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This has to be true. Otherwise, a single change will result in a full rebuild, drastically increasing build times.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't mind changing it to True
, but the existing behaviour before introducing the flag was False
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One issue I noted during local development when setting this to True
, is that my disk can fill up very quickly because Docker will not clean any of the cached layers automatically.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well there has been some back and forth between refactors - maybe somebody removed it :/. Initially, I designed it that way that intermediary layers were kept since that's the whole point of the partial cache. This test validates that assumption: https://github.com/apache/incubator-mxnet/blob/master/ci/test_docker_cache.py#L184
Note that this test suite is not run in CI...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well you can't have everything - either partial caching and manual cleanup or automatic cleanup but no partial caching. I think it is fine to expect that people use docker purge every now and then.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah it should be possible. It will just cause issues if the test fails since that might mean that stuff gets left dangling, but due to the mentioned lifecycle, that shouldn't have a big impact.
Feel free to enable it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch! I can't fully recall what was the thing around --rm=false, but I recall that I wrote tests to cover all these unknowns. Rule of thumb is that if the tests pass, we're good to go (given they use the same parameters of course).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CI does not docker purge automatically. But since the lifecycle of unix slaves is very very short, it's not an issue
If the intermediate containers are only kept on the slave, and we only generate them in PRs that change the Dockerfile, I think we don't need to use them at all on CI. --rm=false
then seems only useful for development.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep exactly. But please make sure that the tests pass before merging - otherwise we might break the partial caching.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR does not change the status quo. There is no risk.
* Fix Julia tests always testing master branch instead of PR branches. * Fix ci/docker_cache.py * Fix Dockerfiles for nightly tests
Description
#17984 added a
cache_intermediate
option tobuild_docker
enabling developers to speed up local docker builds during development. As no default value was provided, this broke the use ofbuild_docker
function in other places.Reference: Failing
ci/jenkins/restricted-docker-cache-refresh
on master after merge: http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/restricted-docker-cache-refresh/detail/master/2789/pipelineAlso
Fixes #18061