who needs custom docker anyway
casperdcl committed May 18, 2022
1 parent 230682d commit f86e796
Showing 1 changed file with 18 additions and 112 deletions.
content/blog/2022-05-24-local-experiments-to-cloud-with-tpi-docker.md: 130 changes (18 additions & 112 deletions)
@@ -2,15 +2,15 @@
title:
Moving Local Experiments to the Cloud with Terraform Provider Iterative (TPI)
and Docker
date: 2022-02-28
date: 2022-05-24
description: >
Tutorial for easily running experiments in the cloud with the help of
Terraform Provider Iterative (TPI) and Docker.
descriptionLong: >
In this tutorial, learn how to build a Docker image and use it to run
experiments in the cloud with Terraform Provider Iterative (TPI).
In this tutorial, learn how to use Docker images to run experiments in the
cloud with Terraform Provider Iterative (TPI).
picture: 2022-02-28/unsplash-containers.jpg
author: maria_khalusova
author: casper_dcl
# todo: commentsUrl:
tags:
- MLOps
Expand Down Expand Up @@ -54,9 +54,9 @@ Platform, and Kubernetes. Your Docker image is cloud provider-agnostic. There
are thousands of [pre-defined Docker images online](https://hub.docker.com/)
too.

In the first part of this tutorial, we'll use an existing Docker image. The
second part of this tutorial will then cover some basics for building and using
your own Docker images.
In this tutorial, we'll use an existing Docker image that comes with most of our
requirements already installed. We'll then add a few more dependencies on
top and run our training pipeline in the cloud as before!
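
As a rough sketch of that idea (the image name here is illustrative, not the
exact one used below), extending a ready-made image at runtime can be as simple
as:

```bash
# Illustrative only: run training inside an existing image, installing
# a few extra dependencies on top before the script starts
docker run --rm -v "$PWD":/tpi -w /tpi pytorch/pytorch \
  bash -c "pip install -r requirements.txt && python3 src/train.py"
```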

## Run GPU-enabled Docker containers

@@ -113,114 +113,20 @@
infrastructure, sync data and code to it, set up the environment, and run the
training process. If you'd like to tinker with this example you can
[find it on GitHub](https://github.com/iterative/blog-tpi-bees/tree/docker).

## Building your own Docker image
<admon type="tip">

Suppose we wanted our Docker container to include all the Python libraries
required to run our training pipeline. Well, these would be quite specific to
our project, so we'd have to build our own Docker image. Luckily, we don't have
to build one completely from scratch; instead, we can use one of the existing
publicly available images as a base to build on top of.
Don't forget to `terraform refresh && terraform show` to check the status, and
`terraform destroy` to download results & shut everything down.
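
Putting the full lifecycle together, a typical session looks like this
(`terraform init` is only needed once per project):

```bash
terraform init                        # install the required providers (first run only)
terraform apply                       # provision resources & launch the task
terraform refresh && terraform show   # check the status
terraform destroy                     # download results & shut everything down
```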

For this example, let's use `nvidia/cuda:11.3.0-cudnn8-runtime-ubuntu20.04` from
NVIDIA. Earlier, we didn't need to install Docker on the local machine because
it was only used on the cloud machine, but if you're going to build your own
image, you'll need to start by
[installing Docker locally](https://docs.docker.com/get-docker/).
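
Once Docker is installed, you can sanity-check the setup by running, for
instance:

```bash
# Pulls a tiny test image and prints a hello message if everything works
docker run --rm hello-world
```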

To build an image, create a file called `Dockerfile` in your project's root
directory. This is a text-based script of instructions that will be used to
create a container image. In our case it'll be quite simple, as most of the
complex setup is taken care of by the NVIDIA image we're using as a base:

```dockerfile
FROM --platform=linux/amd64 nvidia/cuda:11.3.0-cudnn8-runtime-ubuntu20.04
# Install wget and Python
RUN apt-get update && apt-get install -y wget python3-pip
# Add the required packages to the environment
COPY requirements.txt .
RUN pip3 install -r requirements.txt && rm requirements.txt
# Run everything from /tpi inside the container
WORKDIR /tpi
```

Unlike Terraform, where you had a declarative description of the desired
infrastructure in `main.tf`, a `Dockerfile` is a script, so the order of the
steps matters. The `FROM` instruction sets the base image for subsequent
instructions, and it is a required first instruction. To execute commands in a
new layer on top of the current image and commit the results, we use the `RUN`
instruction. The resulting committed image will be used for the next step in the
`Dockerfile`.

To install the project-specific Python packages, we'll need to copy the
`requirements.txt` to the filesystem of the container, and we can do this with
the `COPY` instruction. Then we'll use `RUN` again to install them.
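
For illustration, here's one way to create a hypothetical `requirements.txt`
(the package names are made up for this sketch; use your project's real
dependencies):

```bash
# Hypothetical contents for illustration only
cat > requirements.txt <<'EOF'
torch
torchvision
pandas
EOF
```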

Finally, with `WORKDIR` we specify the directory inside the container from
which we'll run things.

To build your own image locally, use the following command:
`docker build -t bees:latest .`, where `bees` is the name of our Docker image,
but you can name yours however you like.
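
For example, to build and then smoke-test the image locally (the
`python3 --version` check is just an illustrative stand-in for your own checks):

```bash
docker build -t bees:latest .
# Confirm the image runs and Python is available inside it
docker run --rm bees:latest python3 --version
```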

Hooray! We've got our own Docker image, but we're not done just yet. To use this
image from the Terraform config file, we need to publish it on
[Docker Hub](https://hub.docker.com/). Your company may already have a business
account for Docker Hub, but if you're learning Docker and want to try it as an
individual, you can create your own free
[personal account here](https://hub.docker.com/).

Once you have your account,

1. Log in to the Docker public registry on your local machine from the command
   line with `docker login`.
2. Tag your image (aka give it a name):
   `docker tag bees [YOUR_DOCKERHUB_ID]/bees:bees`.
3. Publish the image: `docker push [YOUR_DOCKERHUB_ID]/bees:bees` (the full
   sequence is shown below).
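
Put together, the whole publishing flow looks like this (replace
`[YOUR_DOCKERHUB_ID]` with your Docker Hub username):

```bash
docker login
docker tag bees [YOUR_DOCKERHUB_ID]/bees:bees
docker push [YOUR_DOCKERHUB_ID]/bees:bees
```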
</admon>

Once complete, this `bees` image becomes publicly available, which means we can
now use it in our Terraform config. The config is going to be nearly identical
to the previous example:
Now you know the basics of using convenient Docker images together with
[TPI][tpi] for provisioning your MLOps infrastructure!

```hcl
terraform {
  required_providers { iterative = { source = "iterative/iterative" } }
}
provider "iterative" {}
resource "iterative_task" "tpi-docker-examples" {
  name    = "tpi-docker-examples"
  cloud   = "aws"
  region  = "us-east-2"
  machine = "m+k80"
  workdir { input = "." }
  script  = <<-END
    #!/bin/bash
    sudo apt update -qq && sudo apt install -yqq software-properties-common build-essential ubuntu-drivers-common
    sudo ubuntu-drivers autoinstall
    sudo curl -fsSL https://get.docker.com | sudo sh -
    sudo usermod -aG docker ubuntu
    sudo setfacl --modify user:ubuntu:rw /var/run/docker.sock
    curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
    curl -s -L https://nvidia.github.io/nvidia-docker/ubuntu20.04/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
    sudo apt update -qq && sudo apt install -yqq nvidia-docker2
    sudo systemctl restart docker
    nvidia-smi
    docker run --rm --gpus all -v "$PWD":/tpi [YOUR_DOCKERHUB_ID]/bees:bees python3 src/train.py
  END
}
```
<admon type="tip">

Note how we are mounting the volume at the working directory (`/tpi`) that we
specified when creating the image. This is not strictly required, but it keeps
things neat and tidy. Find this example together with the Dockerfile in this
[GitHub repo](https://github.com/iterative/blog-tpi-bees/tree/own-docker-image).
If you have a lot of custom dependencies that rarely change (e.g. a large
`requirements.txt`), it's a good idea to build them into your own custom Docker
image. Let us know if you'd like a tutorial on this!

Now you know the basics of using Docker images together with Terraform Provider
Iterative for provisioning your MLOps infrastructure, and you can start building
your own Docker images too! We hope you found this tutorial useful; now it's
time to build great things!
</admon>
