Update instructions to build with nvidia cuda runtime image for ONNX #2435

Merged
merged 19 commits into from
Jul 29, 2023
Changes from 11 commits
16 changes: 14 additions & 2 deletions docker/README.md
@@ -34,6 +34,7 @@ Use `build_image.sh` script to build the docker images. The script builds the `p
|-h, --help|Show script help|
|-b, --branch_name|Specify a branch name to use. Default: master |
|-g, --gpu|Build image with GPU based ubuntu base image|
|-bi, --baseimage|Specify the base Docker image. Example: nvidia/cuda:11.7.0-cudnn8-runtime-ubuntu20.04|
|-bt, --buildtype|Which type of docker image to build. Can be one of: production, dev, ci, codebuild|
|-t, --tag|Tag name for image. If not specified, script uses torchserve default tag names.|
|-cv, --cudaversion|Specify the CUDA version to use. Supported values: `cu92`, `cu101`, `cu102`, `cu111`, `cu113`, `cu116`, `cu117`, `cu118`. Default: `cu117`|
@@ -52,10 +53,12 @@ Creates a docker image with publicly available `torchserve` and `torch-model-arc
./build_image.sh
```

-- To create a GPU based image with cuda 10.2. Options are `cu92`, `cu101`, `cu102`, `cu111`, `cu113`, `cu116`, `cu117`
+- To create a GPU based image with cuda 10.2. Options are `cu92`, `cu101`, `cu102`, `cu111`, `cu113`, `cu116`, `cu117`, `cu118`

- GPU images are built with NVIDIA CUDA base image. If you want to use ONNX, please specify the base image as shown in the next section.

```bash
-./build_image.sh -g -cv cu102
+./build_image.sh -g -cv cu117
```

- To create an image with a custom tag
@@ -64,6 +67,15 @@ Creates a docker image with publicly available `torchserve` and `torch-model-arc
./build_image.sh -t torchserve:1.0
```

**NVIDIA CUDA RUNTIME BASE IMAGE**

To use ONNX, specify [NVIDIA CUDA runtime](https://github.com/NVIDIA/nvidia-docker/wiki/CUDA) as the base image.
This increases the size of the Docker image.

```bash
./build_image.sh -bi nvidia/cuda:11.7.0-cudnn8-runtime-ubuntu20.04 -g -cv cu117
```
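
As far as this diff shows, the `-bi` image and the `-cv` value are passed independently, so the command above would also accept a mismatched pair (for example a CUDA 11.8 base image with `-cv cu117`). A minimal sketch of a pre-flight check follows. `cuda_tag_matches` is a hypothetical helper, not part of `build_image.sh`, and it assumes the image tag begins with the CUDA version in the `major.minor[.patch]-...` form used by the example tag.

```bash
#!/usr/bin/env bash
# Hypothetical helper, not part of build_image.sh.
# Succeeds when the CUDA version at the start of an image tag matches a
# -cv value such as cu117. Assumes tags shaped like 11.7.0-cudnn8-runtime-...
cuda_tag_matches() {
    local cv="$1" image="$2"
    local tag="${image##*:}"       # text after the last ':'  -> 11.7.0-cudnn8-...
    local version="${tag%%-*}"     # text before the first '-' -> 11.7.0
    local major="${version%%.*}"   # -> 11
    local rest="${version#*.}"     # -> 7.0
    local minor="${rest%%.*}"      # -> 7
    [ "cu${major}${minor}" = "$cv" ]
}
```

It could be called before the build, for example `cuda_tag_matches cu117 "$IMAGE" || echo "base image and -cv disagree" >&2`, where `IMAGE` is whatever value is about to be passed to `-bi`.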

**DEVELOPER ENVIRONMENT IMAGES**

Creates a docker image with `torchserve` and `torch-model-archiver` installed from source.
14 changes: 14 additions & 0 deletions docker/build_image.sh
@@ -7,6 +7,8 @@ BRANCH_NAME="master"
DOCKER_TAG="pytorch/torchserve:latest-cpu"
BUILD_TYPE="production"
BASE_IMAGE="ubuntu:20.04"
USER_BASE_IMAGE="ubuntu:20.04"
UPDATE_BASE_IMAGE=false
USE_CUSTOM_TAG=false
CUDA_VERSION=""
USE_LOCAL_SERVE_FOLDER=false
@@ -21,6 +23,7 @@ do
echo "-h, --help show brief help"
echo "-b, --branch_name=BRANCH_NAME specify a branch_name to use"
echo "-g, --gpu specify to use gpu"
echo "-bi, --baseimage specify base docker image. Example: nvidia/cuda:11.7.0-cudnn8-runtime-ubuntu20.04"
echo "-bt, --buildtype specify the type of image to build. Possible values: production, dev, codebuild."
echo "-cv, --cudaversion specify the cuda version to use"
echo "-t, --tag specify tag name for docker image"
@@ -47,6 +50,12 @@ do
CUDA_VERSION="cu117"
shift
;;
-bi|--baseimage)
USER_BASE_IMAGE="$2"
UPDATE_BASE_IMAGE=true
shift
shift
;;
-bt|--buildtype)
BUILD_TYPE="$2"
shift
@@ -135,6 +144,11 @@ then
DOCKER_TAG=${CUSTOM_TAG}
fi

if [ "$UPDATE_BASE_IMAGE" = true ]
then
BASE_IMAGE=${USER_BASE_IMAGE}
fi

if [ "${BUILD_TYPE}" == "production" ]
then
DOCKER_BUILDKIT=1 docker build --file Dockerfile --build-arg BASE_IMAGE="${BASE_IMAGE}" --build-arg CUDA_VERSION="${CUDA_VERSION}" --build-arg PYTHON_VERSION="${PYTHON_VERSION}" -t "${DOCKER_TAG}" --target production-image .
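
The hunks above record the `-bi` value in `USER_BASE_IMAGE` during parsing and copy it into `BASE_IMAGE` only after the option loop ends. One plausible reason is that other options may also assign `BASE_IMAGE`, so deferring the copy lets the user's choice win regardless of flag order. The standalone sketch below illustrates that pattern. `resolve_base_image`, the missing-value guard, and the placeholder image set by `-g` are assumptions for illustration, not code from `build_image.sh`.

```bash
#!/usr/bin/env bash
# Sketch of the deferred-override pattern. Names and the placeholder GPU
# image are illustrative assumptions, not taken from build_image.sh.
resolve_base_image() {
    local base_image="ubuntu:20.04"
    local user_base_image=""
    local update_base_image=false

    while [ "$#" -gt 0 ]; do
        case "$1" in
            -g|--gpu)
                # Placeholder value: stands in for whatever -g assigns.
                base_image="example/gpu-base:placeholder"
                shift
                ;;
            -bi|--baseimage)
                # Guard (an addition in this sketch): -bi needs a value.
                if [ "$#" -lt 2 ]; then
                    echo "error: $1 requires an image name" >&2
                    return 1
                fi
                user_base_image="$2"
                update_base_image=true
                shift 2
                ;;
            *)
                # Ignore options this sketch does not model.
                shift
                ;;
        esac
    done

    # Apply the user's image last so it overrides any default set above.
    if [ "$update_base_image" = true ]; then
        base_image="$user_base_image"
    fi
    echo "$base_image"
}
```

With this shape, `resolve_base_image -bi IMAGE -g` and `resolve_base_image -g -bi IMAGE` both print `IMAGE`, which is the order-independence the deferred copy is meant to give.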
2 changes: 2 additions & 0 deletions docs/performance_guide.md
@@ -16,6 +16,8 @@ At a high level what TorchServe allows you to do is
2. Load those weights from `base_handler.py` using `ort_session = ort.InferenceSession(self.model_pt_path, providers=providers, sess_options=sess_options)` which supports reasonable defaults for both CPU and GPU inference
3. Allow you to define custom pre- and post-processing functions to pass in data in the format your ONNX model expects with a custom handler

To use ONNX with GPU on TorchServe Docker, build an image with [NVIDIA CUDA runtime](https://github.com/NVIDIA/nvidia-docker/wiki/CUDA) as the base image, as shown [here](https://github.com/pytorch/serve/blob/master/docker/README.md#create-torchserve-docker-image).

<h4>TensorRT</h4>

TorchServe also supports models optimized via TensorRT. To leverage the TensorRT runtime you can convert your model by [following these instructions](https://github.com/pytorch/TensorRT) and once you're done you'll have serialized weights which you can load with [`torch.jit.load()`](https://pytorch.org/TensorRT/getting_started/getting_started_with_python_api.html#getting-started-with-python-api).
9 changes: 9 additions & 0 deletions examples/large_models/deepspeed/Readme.md
@@ -44,3 +44,12 @@ torchserve --start --ncs --model-store model_store --models opt.tar.gz
```bash
curl "http://localhost:8080/predictions/opt" -T sample_text.txt
```

### Running using TorchServe Docker Image

To use DeepSpeed with GPU on TorchServe Docker, build an image with [NVIDIA CUDA devel](https://github.com/NVIDIA/nvidia-docker/wiki/CUDA) as the base image, as shown [here](https://github.com/pytorch/serve/blob/master/docker/README.md#create-torchserve-docker-image).

Example:
```bash
./build_image.sh -bi nvidia/cuda:11.7.0-devel-ubuntu20.04 -g -cv cu117 -t pytorch/torchserve:latest-gpu
```
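
This PR quotes two CUDA 11.7 base images: a `cudnn8-runtime` tag for ONNX and a `devel` tag for DeepSpeed. A small lookup, sketched below, keeps that choice in one place. `base_image_for` and `build_command_for` are hypothetical helpers. The tags and flags are the ones shown in this PR, and the second function only prints the command rather than running it.

```bash
#!/usr/bin/env bash
# Hypothetical lookup. The two image tags are the examples quoted in this PR.
base_image_for() {
    case "$1" in
        onnx)      echo "nvidia/cuda:11.7.0-cudnn8-runtime-ubuntu20.04" ;;
        deepspeed) echo "nvidia/cuda:11.7.0-devel-ubuntu20.04" ;;
        *)         echo "unknown workload: $1" >&2; return 1 ;;
    esac
}

# Print (do not run) the build command for a workload.
build_command_for() {
    local image
    image="$(base_image_for "$1")" || return 1
    echo "./build_image.sh -bi ${image} -g -cv cu117"
}
```

Printing the command first makes it easy to review the chosen base image before starting a long build. A custom tag can still be appended with `-t`, as in the example above.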
1 change: 1 addition & 0 deletions ts_scripts/spellcheck_conf/wordlist.txt
@@ -1064,3 +1064,4 @@ ActionSLAM
statins
ci
chatGPT
baseimage