This module contains the Dockerfiles used to build djl-serving Docker images. These images are compatible with SageMaker hosting.
- Install Amazon Corretto 17
- Install Docker and the Docker Compose plugin. For example, on AL2023:

```shell
sudo yum install -y docker
sudo mkdir -p /usr/local/lib/docker/cli-plugins/
sudo curl -SL https://github.com/docker/compose/releases/latest/download/docker-compose-linux-x86_64 -o /usr/local/lib/docker/cli-plugins/docker-compose
sudo chmod +x /usr/local/lib/docker/cli-plugins/docker-compose
```
We provide a Docker Compose configuration to simplify the build experience. Just run:
```shell
./gradlew --refresh-dependencies :serving:dockerDeb -Psnapshot
cd serving/docker
export DJL_VERSION=$(awk -F '=' '/djl / {gsub(/ ?"/, "", $2); print $2}' ../../gradle/libs.versions.toml)
export SERVING_VERSION=$(awk -F '=' '/serving / {gsub(/ ?"/, "", $2); print $2}' ../../gradle/libs.versions.toml)
docker compose build --build-arg djl_version=${DJL_VERSION} --build-arg djl_serving_version=${SERVING_VERSION} <compose-target>
```
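The two `export` lines above scrape version numbers out of `gradle/libs.versions.toml` with awk. A minimal sketch of what they do, run against a hypothetical copy of that file (the version values here are made up):

```shell
# Hypothetical libs.versions.toml fragment (values are illustrative).
cat > /tmp/libs.versions.toml <<'EOF'
[versions]
djl = "0.30.0"
serving = "0.30.0"
EOF

# Same awk one-liner as above: split on '=', strip spaces and quotes
# from the value, and print it.
DJL_VERSION=$(awk -F '=' '/djl / {gsub(/ ?"/, "", $2); print $2}' /tmp/libs.versions.toml)
echo "$DJL_VERSION"   # -> 0.30.0
```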
You can find the available compose targets, such as `cpu` and `lmi`, in `docker-compose.yml`.
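Each compose target is simply a service name in that file. A quick sketch of pulling the names out of a hypothetical, cut-down `docker-compose.yml` (the real file has more services):

```shell
# Hypothetical, cut-down docker-compose.yml for illustration only.
cat > /tmp/docker-compose.yml <<'EOF'
services:
  cpu:
    build: .
  lmi:
    build: .
EOF

# Top-level service names are the two-space-indented keys.
TARGETS=$(awk '/^  [a-z0-9-]+:$/ {gsub(/[ :]/, ""); print}' /tmp/docker-compose.yml)
echo "$TARGETS"
```

With the Docker CLI available, `docker compose config --services` gives you the same list directly.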
You can find the latest DJL release Docker images on DockerHub. DJLServing also publishes nightly builds to DockerHub, so you can pull whichever image you need from there.
djl-serving loads all models stored in the `/opt/ml/model` folder. You only need to download your model files and mount the model folder to `/opt/ml/model` in the Docker container.
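As the examples below show, the mounted folder can hold a model archive (`.tar.gz`) downloaded into it. A sketch of a toy layout, assuming a hypothetical model directory and an illustrative `serving.properties` (this is not a real model):

```shell
# Build a toy model folder; names and the engine value are illustrative.
mkdir -p models/my_model
# Optional per-model configuration read by djl-serving.
echo "engine=PyTorch" > models/my_model/serving.properties

# Archives dropped into the mounted folder also work; package a toy one.
tar -czf models/my_model.tar.gz -C models my_model
ls models
```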
Here are a few examples of running the djl-serving Docker image:
CPU:

```shell
docker pull deepjavalibrary/djl-serving:0.30.0
mkdir models
cd models
curl -O https://resources.djl.ai/test-models/pytorch/bert_qa_jit.tar.gz
docker run -it --rm -v $PWD:/opt/ml/model -p 8080:8080 deepjavalibrary/djl-serving:0.30.0
```
GPU:

```shell
docker pull deepjavalibrary/djl-serving:0.30.0-pytorch-gpu
mkdir models
cd models
curl -O https://resources.djl.ai/test-models/pytorch/bert_qa_jit.tar.gz
docker run -it --runtime=nvidia --shm-size 2g -v $PWD:/opt/ml/model -p 8080:8080 deepjavalibrary/djl-serving:0.30.0-pytorch-gpu
```
AWS Inferentia (inf2):

```shell
docker pull deepjavalibrary/djl-serving:0.30.0-pytorch-inf2
mkdir models
cd models
curl -O https://resources.djl.ai/test-models/pytorch/resnet18_inf2_2_4.tar.gz
docker run --device /dev/neuron0 -it --rm -v $PWD:/opt/ml/model -p 8080:8080 deepjavalibrary/djl-serving:0.30.0-pytorch-inf2
```
aarch64:

```shell
docker pull deepjavalibrary/djl-serving:0.30.0-aarch64
mkdir models
cd models
curl -O https://resources.djl.ai/test-models/pytorch/bert_qa_jit.tar.gz
docker run -it --rm -v $PWD:/opt/ml/model -p 8080:8080 deepjavalibrary/djl-serving:0.30.0-aarch64
```
You can pass command-line arguments to `djl-serving` directly when using `docker run`:

```shell
docker run -it --rm -p 8080:8080 deepjavalibrary/djl-serving:0.30.0 djl-serving -m "djl://ai.djl.huggingface.pytorch/sentence-transformers/all-MiniLM-L6-v2"
```
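The `djl://` URL above points at a model in the DJL model zoo; informally, the first path segment is the model zoo group id and the rest identifies the model. A parsing sketch (this reading of the fields is an assumption, not a spec):

```shell
# Pull the pieces out of the example URL from above.
URL="djl://ai.djl.huggingface.pytorch/sentence-transformers/all-MiniLM-L6-v2"
GROUP_ID=$(echo "$URL" | cut -d/ -f3)   # segment right after djl://
MODEL=$(echo "$URL" | cut -d/ -f4-)     # remaining path
echo "$GROUP_ID"
echo "$MODEL"
```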