Proof of concept using docker buildx to build streamlined images without ONBUILD, which simplifies things and makes the recipe for creating images more transparent.
goals:
- better instruction caching (if only 'start' changes, don't reinstall the conda environment)
- cache conda packages
- if one package changes, don't re-download everything
- reuse downloaded packages from pangeo-notebook when creating ml-notebook
- remote Dockerfile / buildcontext for different repositories?
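The caching goals above can be sketched with instruction ordering plus a BuildKit cache mount. This is a minimal sketch under stated assumptions, not the Dockerfile used in this repository: the base image `condaforge/miniforge3`, the package cache path `/opt/conda/pkgs`, the cache id `conda-pkgs`, and the contents of `environment.yml` and `start` are all illustrative. The script only writes files into a scratch directory.

```shell
#!/bin/sh
set -eu

# Scratch directory so the sketch does not touch a real repository.
workdir="$(mktemp -d)"

# Hypothetical environment file; the package list is illustrative only.
cat > "$workdir/environment.yml" <<'EOF'
name: base
channels:
  - conda-forge
dependencies:
  - xarray
EOF

# Hypothetical startup script standing in for 'start'.
printf '#!/bin/sh\nexec "$@"\n' > "$workdir/start"
chmod +x "$workdir/start"

cat > "$workdir/Dockerfile" <<'EOF'
# syntax=docker/dockerfile:1
# Assumed base image with conda installed under /opt/conda.
FROM condaforge/miniforge3

# Copy only the environment file first, so editing 'start' later does not
# invalidate the cached conda layer below.
COPY environment.yml /tmp/environment.yml

# Cache mount: downloaded packages persist on the builder between builds
# and are shared by every build that uses the same id, so only changed
# packages need to be fetched again.
RUN --mount=type=cache,id=conda-pkgs,target=/opt/conda/pkgs,sharing=locked \
    conda env update --name base --file /tmp/environment.yml

# Files that change often go last.
COPY start /usr/local/bin/start
EOF

echo "wrote $workdir/Dockerfile"
```

To try it, run `docker buildx build -t pangeo-notebook:test "$workdir"`. Because the cache mount is keyed by `id`, a later ml-notebook build on the same builder that uses the same `id` should be able to reuse packages already downloaded for pangeo-notebook.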
take advantage of new Dockerfile features:
- https://www.docker.com/blog/dockerfiles-now-support-multiple-build-contexts/
- https://www.docker.com/blog/image-rebase-and-improved-remote-cache-support-in-new-buildkit/
- https://www.docker.com/blog/advanced-dockerfiles-faster-builds-and-smaller-images-using-buildkit-and-multistage-builds/
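As a rough illustration of the multiple build contexts feature from the first link, the sketch below writes a Dockerfile that copies a file from a named context supplied on the command line. The context name `shared`, the base image `ubuntu:22.04`, and the file contents are assumptions made for illustration. The script writes files and prints the build command rather than running it.

```shell
#!/bin/sh
set -eu

# Scratch layout mimicking two image folders; contents are placeholders.
workdir="$(mktemp -d)"
mkdir -p "$workdir/base-notebook" "$workdir/pangeo-notebook"
printf '#!/bin/sh\nexec "$@"\n' > "$workdir/base-notebook/start"
printf 'name: base\n' > "$workdir/pangeo-notebook/environment.yml"

cat > "$workdir/Dockerfile" <<'EOF'
# syntax=docker/dockerfile:1
FROM ubuntu:22.04

# 'shared' is not a stage in this file. It is a named build context that
# must be supplied at build time with --build-context shared=<path>.
COPY --from=shared start /usr/local/bin/start

# This file comes from the main build context as usual.
COPY environment.yml /tmp/environment.yml
EOF

# Print the command that would build pangeo-notebook while pulling
# 'start' from the base-notebook folder.
echo "docker buildx build -f $workdir/Dockerfile" \
     "--build-context shared=$workdir/base-notebook" \
     "$workdir/pangeo-notebook"
```

A named context can also point at a Git repository or another image, which may be one way to approach the "different repositories" goal.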
important Dockerfile syntax documentation: https://github.com/moby/buildkit/blob/master/frontend/dockerfile/docs/syntax.md
# base-notebook
docker buildx build -f Dockerfile ./base-notebook -t base-notebook:test
# pangeo-notebook
docker buildx build -f Dockerfile ./pangeo-notebook -t pangeo-notebook:test --progress=plain
# remote dockerfile, local context
docker buildx build -f https://raw.githubusercontent.com/scottyhq/pangeo-buildx/main/Dockerfile ./base-notebook
# remote dockerfile, remote context?
docker buildx build -f https://raw.githubusercontent.com/scottyhq/pangeo-buildx/main/Dockerfile https://github.com/scottyhq/pangeo-buildx.git#main
# build everything
docker buildx bake
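`docker buildx bake` reads its targets from a bake file such as `docker-bake.hcl`. The sketch below writes a hypothetical one that mirrors the two single-image commands above. The target names and tags are taken from those commands, but the file itself is an assumption and not necessarily what this repository contains.

```shell
#!/bin/sh
set -eu

# Write the sketch into a scratch directory instead of the repository.
workdir="$(mktemp -d)"

cat > "$workdir/docker-bake.hcl" <<'EOF'
# Built when `docker buildx bake` is run with no target argument.
group "default" {
  targets = ["base-notebook", "pangeo-notebook"]
}

# A relative dockerfile path is resolved against the context, so
# "../Dockerfile" points at the shared Dockerfile one level up.
target "base-notebook" {
  context    = "./base-notebook"
  dockerfile = "../Dockerfile"
  tags       = ["base-notebook:test"]
}

target "pangeo-notebook" {
  context    = "./pangeo-notebook"
  dockerfile = "../Dockerfile"
  tags       = ["pangeo-notebook:test"]
}
EOF

echo "wrote $workdir/docker-bake.hcl"
```

Running `docker buildx bake --print` next to such a file shows the resolved configuration without building anything, which is a cheap way to check the context and dockerfile paths before a full build.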
taking advantage of a local volume cache does not seem to be an option yet (docker/setup-buildx-action#138)
can build a bunch of images in parallel in a single step with https://github.com/docker/bake-action
repo2docker would need to update its build machinery to support buildkit (jupyterhub/repo2docker#875)
possible solutions:
- use an alternative to docker-py that supports buildkit: https://github.com/gabrieldemarmiesse/python-on-whales
- allow the Docker-savvy to bypass repo2docker altogether (for binderhubs) and pull images directly from public registries (jupyterhub/binderhub#1298). Basically, pre-build images from Dockerfiles without repo2docker and point to them. This is effectively what https://github.com/jupyterhub/repo2docker-action is doing.