Use Docker to package the job environment dependencies

The system launches a deep learning job in one or more Docker containers, so a Docker image is required in advance. The system provides a base Docker image with HDFS, CUDA and cuDNN support, on top of which users can build their own custom Docker images.

To build a base Docker image from, for example, Dockerfile.build.base, run:

docker build -f Dockerfiles/Dockerfile.build.base -t pai.build.base:hadoop2.7.2-cuda8.0-cudnn6-devel-ubuntu16.04 Dockerfiles/

A custom Docker image can then be built on top of it by adding FROM pai.build.base:hadoop2.7.2-cuda8.0-cudnn6-devel-ubuntu16.04 to the Dockerfile.
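
As a minimal sketch of what such a custom Dockerfile might look like, the snippet below writes one to disk. The file name Dockerfile.custom and the package it installs are hypothetical and only illustrate the FROM line; replace them with whatever your job actually needs.

```shell
# Write a hypothetical custom Dockerfile that extends the base image.
# The file name and the installed package are illustrative assumptions.
cat > Dockerfile.custom <<'EOF'
FROM pai.build.base:hadoop2.7.2-cuda8.0-cudnn6-devel-ubuntu16.04

# Example only: install job-specific dependencies on top of the base image.
RUN apt-get update && apt-get install -y --no-install-recommends git \
    && rm -rf /var/lib/apt/lists/*
EOF

# Print the first line to confirm the base image reference.
head -n 1 Dockerfile.custom
```

Such a file could then be built the same way as the base image, by passing it to docker build with -f and choosing a tag with -t.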

As an example, we customize a TensorFlow Docker image using Dockerfile.run.tensorflow:

docker build -f Dockerfiles/Dockerfile.run.tensorflow -t pai.run.tensorflow Dockerfiles/

Next, push the built image to a Docker registry so that every node in the system can access it:

docker tag pai.run.tensorflow your_docker_registry/pai.run.tensorflow
docker push your_docker_registry/pai.run.tensorflow

The image is now ready to serve. Note that the script above assumes the Docker registry is deployed locally. The actual script can vary depending on how the Docker registry is configured.
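
Because the registry address differs between deployments, one possible approach is to keep it in a variable and derive the full image name from it. The sketch below is a dry run that only prints the tag and push commands. The REGISTRY variable and its default value localhost:5000 are assumptions for illustration, not settings taken from the system.

```shell
# Hypothetical registry address; override it by exporting REGISTRY beforehand.
REGISTRY="${REGISTRY:-localhost:5000}"
IMAGE="pai.run.tensorflow"

# Full image reference that nodes would pull from the registry.
TARGET="${REGISTRY}/${IMAGE}"

# Dry run: print the commands instead of executing them.
echo "docker tag ${IMAGE} ${TARGET}"
echo "docker push ${TARGET}"
```

Once the printed commands look right for your registry, remove the echo wrappers to run them for real.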
