Added sample to deploy online endpoint running inference (#2)
* Added inference samples.

* Use latest mlflow versions

* Use latest ultralytics version.

* quick cleanup.

* cleanup.

* Updated libraries version in notebook.

* renamed inference environment folder.

* Added info about request_timeout_ms.
ouphi authored Sep 21, 2023
1 parent 7c302bf commit 59c5bb6
Showing 12 changed files with 597 additions and 419 deletions.
5 changes: 4 additions & 1 deletion .gitignore
@@ -1,2 +1,5 @@
yolov8n.pt
coco128/*
coco128/*
.vscode
.idea
__pycache__
64 changes: 64 additions & 0 deletions README.md
@@ -9,3 +9,67 @@ You can find the detailed instructions to train the yolov8 model with the az cli
## Azure machine learning python SDK

[Here is a notebook](instructions-python-sdk.ipynb) showing how to train the yolov8 model with the python SDK.

## Deploy model for inference

### Register the model from the workspace UI
You can register the model resulting from a training job.
Go to your job Overview and select "Register model".
Select "Unspecified type" as the model type, enable "Show all default outputs", and select `best.pt`.
(Note that your training environment needs azureml-mlflow==1.52.0 and mlflow==2.4.2 to enable MLflow logging and make the trained model retrievable.)
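
If you prefer the CLI over the UI, a minimal sketch of registering the model from a job output looks like this (the job name, artifact path, and model name are placeholders, not values from this repo):

```bash
# Sketch: register best.pt directly from a completed training job's outputs.
# <job-name> and the artifact path are assumptions -- check your job's
# "Outputs + logs" tab for the actual location of best.pt.
az ml model create \
  --name yolov8-best \
  --type custom_model \
  --path "azureml://jobs/<job-name>/outputs/artifacts/paths/weights/best.pt"
```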

### Create the deployment
In `azureml/deployment.yaml`, specify your model.

You can either reference a registered model:

```yaml
model: azureml:<your-model-name>:<version>
```
Or specify the relative path of a local .pt file:
```yaml
model:
  path: <model-relative-path-to-azureml-folder>
```
Note that you might need to increase `request_timeout_ms` in `deployment.yaml` if running the inference takes time.
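
If you change the timeout after the deployment has already been created, re-applying the YAML should roll it out; this is a sketch, assuming the `blue` deployment created by `deploy-endpoint.sh` below:

```bash
# Sketch: apply an edited azureml/deployment.yaml to an existing deployment.
# Assumes the endpoint already exists and the deployment is named "blue".
az ml online-deployment update \
  --name blue \
  --endpoint-name $ENDPOINT_NAME \
  --file azureml/deployment.yaml
```
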
### Deploy your model for inference
To deploy the endpoint in your Azure ML workspace, first configure your default resource group and workspace:
```bash
az configure --defaults group=$YOUR_RESOURCE_GROUP workspace=$YOUR_AZ_ML_WORKSPACE
```

Then run the deployment script:

```bash
./deploy-endpoint.sh
```

Note your endpoint name and scoring URI (you can retrieve them from the Azure ML workspace).
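
You can also retrieve the scoring URI with the CLI, which is what `deploy-endpoint.sh` already does:

```bash
# Query the scoring URI of the endpoint created by deploy-endpoint.sh.
SCORING_URI=$(az ml online-endpoint show -n $ENDPOINT_NAME -o tsv --query scoring_uri)
echo "Scoring uri: $SCORING_URI"
```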

### Test the endpoint and allocate traffic

To invoke the endpoint with an HTTP client, you need to allocate traffic to it (for more information, [see this doc](https://learn.microsoft.com/en-us/azure/machine-learning/how-to-safely-rollout-online-endpoints?view=azureml-api-2&tabs=azure-cli#confirm-your-existing-deployment)). First, check the current allocation:

```bash
az ml online-endpoint show -n $ENDPOINT_NAME --query traffic
```

You can see that 0% of the traffic is allocated to the blue deployment, so allocate 100% of the traffic to this single blue deployment:

```bash
az ml online-endpoint update --name $ENDPOINT_NAME --traffic "blue=100"
```

Now you should be able to call your endpoint with curl.
You can retrieve your endpoint key from the Azure ML workspace under Endpoints > Consume > Basic consumption info.
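
Alternatively, here is a sketch of fetching the key with the CLI (assuming key-based authentication, which `azureml/endpoint.yaml` configures):

```bash
# Fetch the primary key for the endpoint (auth_mode: key is set in azureml/endpoint.yaml).
ENDPOINT_KEY=$(az ml online-endpoint get-credentials -n $ENDPOINT_NAME -o tsv --query primaryKey)
```

With the key and the scoring URI in hand, call the endpoint: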

```bash
ENDPOINT_KEY=$YOUR_ENDPOINT_KEY
curl --request POST "$SCORING_URI" --header "Authorization: Bearer $ENDPOINT_KEY" --header 'Content-Type: application/json' --data '{"image_url": "https://ultralytics.com/images/bus.jpg"}'
```
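
You can also invoke the endpoint through the CLI without handling the key yourself, reusing the sample request file from this repo:

```bash
# Invoke the deployed endpoint with the checked-in sample request.
az ml online-endpoint invoke --name $ENDPOINT_NAME --request-file inference-sample-request.json
```
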
6 changes: 3 additions & 3 deletions azureml-environment/Dockerfile
@@ -13,6 +13,6 @@ RUN apt install --no-install-recommends -y gcc git zip curl htop libgl1-mesa-glx
# https://security.snyk.io/vuln/SNYK-UBUNTU1804-OPENSSL-3314796
RUN apt upgrade --no-install-recommends -y openssl tar

RUN pip install ultralytics==8.0.88
RUN pip install azureml-mlflow==1.50.0
RUN pip install mlflow==2.2.2
RUN pip install ultralytics==8.0.180
RUN pip install azureml-mlflow==1.52.0
RUN pip install mlflow==2.4.2
17 changes: 17 additions & 0 deletions azureml/deployment.yaml
@@ -0,0 +1,17 @@
$schema: https://azuremlschemas.azureedge.net/latest/managedOnlineDeployment.schema.json
name: yolodeployment
endpoint_name: yolovendpoint
model:
  #azureml:<your-model-name>:1
  #path: <relative local path to your .pt model from the azureml folder>
code_configuration:
  code: ../inference-code
  scoring_script: score.py
environment:
  build:
    path: ../inference-environment
    dockerfile_path: Dockerfile
instance_type: Standard_DS3_v2
instance_count: 1
# Note that you might need to increase the request_timeout_ms if running the inference takes time.
# request_timeout_ms: 10000
3 changes: 3 additions & 0 deletions azureml/endpoint.yaml
@@ -0,0 +1,3 @@
$schema: https://azuremlschemas.azureedge.net/latest/managedOnlineEndpoint.schema.json
name: my-endpoint
auth_mode: key
9 changes: 9 additions & 0 deletions deploy-endpoint.sh
@@ -0,0 +1,9 @@
set -ex

export ENDPOINT_NAME=endpt-`echo $RANDOM`
az ml online-endpoint create -n $ENDPOINT_NAME -f azureml/endpoint.yaml
az ml online-deployment create -n blue --endpoint $ENDPOINT_NAME -f azureml/deployment.yaml
az ml online-endpoint show -n $ENDPOINT_NAME
SCORING_URI=$(az ml online-endpoint show -n $ENDPOINT_NAME -o tsv --query scoring_uri)
echo "Endpoint name: $ENDPOINT_NAME"
echo "Scoring uri: $SCORING_URI"
52 changes: 52 additions & 0 deletions deploy-local-endpoint.sh
@@ -0,0 +1,52 @@
set -ex

export ENDPOINT_NAME=endpt-`echo $RANDOM`
az ml online-endpoint create --local -n $ENDPOINT_NAME -f azureml/endpoint.yaml

# <create_deployment>
az ml online-deployment create --local -n blue --endpoint $ENDPOINT_NAME -f azureml/deployment.yaml
# </create_deployment>

# <get_status>
az ml online-endpoint show -n $ENDPOINT_NAME --local
# </get_status>

# check if create was successful
endpoint_status=`az ml online-endpoint show --local --name $ENDPOINT_NAME --query "provisioning_state" -o tsv`
echo $endpoint_status
if [[ $endpoint_status == "Succeeded" ]]
then
  echo "Endpoint created successfully"
else
  echo "Endpoint creation failed"
  exit 1
fi

deploy_status=`az ml online-deployment show --local --name blue --endpoint $ENDPOINT_NAME --query "provisioning_state" -o tsv`
echo $deploy_status
if [[ $deploy_status == "Succeeded" ]]
then
  echo "Deployment completed successfully"
else
  echo "Deployment failed"
  exit 1
fi

# <test_endpoint>
az ml online-endpoint invoke --local --name $ENDPOINT_NAME --request-file inference-sample-request.json
# </test_endpoint>

# <test_endpoint_using_curl>
SCORING_URI=$(az ml online-endpoint show --local -n $ENDPOINT_NAME -o tsv --query scoring_uri)

curl --request POST "$SCORING_URI" --header 'Content-Type: application/json' --data @inference-sample-request.json

# <get_logs>
#az ml online-deployment get-logs --local -n blue --endpoint $ENDPOINT_NAME
# </get_logs>

curl -X POST -H "Content-Type: application/json" -d '{"image_url": "https://ultralytics.com/images/bus.jpg"}' $SCORING_URI

# <delete_endpoint>
#az ml online-endpoint delete --local --name $ENDPOINT_NAME --yes
# </delete_endpoint>
18 changes: 18 additions & 0 deletions inference-code/score.py
@@ -0,0 +1,18 @@
import os
import json
from ultralytics import YOLO

def init():
    # Called once when the container starts: load the YOLO model from the
    # directory where AzureML mounts the registered model.
    global model
    model_path = os.path.join(os.getenv("AZUREML_MODEL_DIR"), "best.pt")
    model = YOLO(model_path)


def run(raw_data):
    # Called for every scoring request: run inference on the image URL from the
    # request body and return the first result, serialized to JSON-compatible types.
    image_url = json.loads(raw_data)["image_url"]
    results = model(image_url)
    result = results[0]
    serialized_result = json.loads(result.tojson())
    return serialized_result
9 changes: 9 additions & 0 deletions inference-environment/Dockerfile
@@ -0,0 +1,9 @@
FROM mcr.microsoft.com/azureml/openmpi4.1.0-ubuntu20.04:latest

ENV DEBIAN_FRONTEND noninteractive
RUN apt update
RUN TZ=Etc/UTC apt install -y tzdata
RUN apt install --no-install-recommends -y gcc git zip curl htop libgl1-mesa-glx libglib2.0-0 libpython3-dev gnupg g++
RUN apt upgrade --no-install-recommends -y openssl tar
RUN pip install azureml-inference-server-http
RUN pip install ultralytics==8.0.180
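
Because this environment installs `azureml-inference-server-http`, you can also exercise `score.py` locally before deploying. This is a rough sketch, assuming the same packages are installed in a local Python environment and that `best.pt` sits in a local `model/` folder:

```bash
# Sketch: serve score.py locally with the AzureML inference HTTP server.
# AZUREML_MODEL_DIR and the model/ folder are assumptions for local testing.
export AZUREML_MODEL_DIR=$(pwd)/model
azmlinfsrv --entry_script inference-code/score.py --port 5001

# In another shell, post the sample request to the local server:
curl -X POST -H "Content-Type: application/json" \
  -d @inference-sample-request.json http://localhost:5001/score
```
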
3 changes: 3 additions & 0 deletions inference-sample-request.json
@@ -0,0 +1,3 @@
{
  "image_url": "https://ultralytics.com/images/bus.jpg"
}
6 changes: 3 additions & 3 deletions instructions-az-cli.md
@@ -47,9 +47,9 @@ RUN apt install --no-install-recommends -y gcc git zip curl htop libgl1-mesa-glx
# https://security.snyk.io/vuln/SNYK-UBUNTU1804-OPENSSL-3314796
RUN apt upgrade --no-install-recommends -y openssl tar

RUN pip install ultralytics==8.0.88
RUN pip install azureml-mlflow==1.50.0
RUN pip install mlflow==2.2.2
RUN pip install ultralytics==8.0.132
RUN pip install azureml-mlflow==1.52.0
RUN pip install mlflow==2.4.2
```

Note that Ultralytics provides [Dockerfiles for different platforms](https://github.com/ultralytics/ultralytics/tree/main/docker). Here we used the same base image and installed the same Linux dependencies as the [amd64 Dockerfile](https://github.com/ultralytics/ultralytics/blob/main/docker/Dockerfile), but we installed the ultralytics package with pip install to control the version we install and make sure the package version is deterministic. To track hyperparameters and metrics in AzureML, we installed [mlflow](https://pypi.org/project/mlflow/) and [azureml-mlflow](https://pypi.org/project/azureml-mlflow/). This lets us evaluate model performance easily and compare models from different training runs in AzureML studio.