Added sample to deploy online endpoint running inference (#2)
* Added inference samples.

* Use latest mlflow versions

* Use latest ultralytics version.

* quick cleanup.

* cleanup.

* Updated libraries version in notebook.

* renamed inference environment folder.

* Added info about request_timeout_ms.
ouphi authored Sep 21, 2023
1 parent 7c302bf commit 59c5bb6
Showing 12 changed files with 597 additions and 419 deletions.
5 changes: 4 additions & 1 deletion .gitignore
@@ -1,2 +1,5 @@
yolov8n.pt
coco128/*
coco128/*
.vscode
.idea
__pycache__
64 changes: 64 additions & 0 deletions README.md
@@ -9,3 +9,67 @@ You can find the detailed instructions to train the yolov8 model with the az cli
## Azure machine learning python SDK

[Here is a notebook](instructions-python-sdk.ipynb) showing how to train the yolov8 model with the python SDK.

## Deploy model for inference

### Register the model from the workspace UI
You can register the model resulting from a training job.
Go to your job Overview and select "Register model".
Select "Unspecified type" as the model type, enable "Show all default outputs", and select `best.pt`.
(Note that your training environment needs azureml-mlflow==1.52.0 and mlflow==2.4.2 to enable MLflow logging and make the trained model retrievable.)
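
If you prefer the CLI over the UI, a minimal sketch of registering the model from a job output looks like this (the job name, artifact path, and model name are placeholders, not values from this repo):

```bash
# Sketch: register best.pt directly from a completed training job's outputs.
# <job-name> and the artifact path are assumptions -- check your job's
# "Outputs + logs" tab for the actual location of best.pt.
az ml model create \
  --name yolov8-best \
  --type custom_model \
  --path "azureml://jobs/<job-name>/outputs/artifacts/paths/weights/best.pt"
```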

### Create the deployment
In `azureml/deployment.yaml`, specify your model.

You can either reference a registered model:

```yaml
model: azureml:<your-model-name>:<version>
```
Or specify the relative path of a local .pt file:
```yaml
model:
  path: <model-relative-path-to-azureml-folder>
```
Note that you might need to increase `request_timeout_ms` in `deployment.yaml` if running the inference takes time.
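
If you change the timeout after the deployment has already been created, re-applying the YAML should roll it out; this is a sketch, assuming the `blue` deployment created by `deploy-endpoint.sh` below:

```bash
# Sketch: apply an edited azureml/deployment.yaml to an existing deployment.
# Assumes the endpoint already exists and the deployment is named "blue".
az ml online-deployment update \
  --name blue \
  --endpoint-name $ENDPOINT_NAME \
  --file azureml/deployment.yaml
```
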
### Deploy your model for inference
To deploy the endpoint in your Azure ML workspace, first configure your default resource group and workspace:
```bash
az configure --defaults group=$YOUR_RESOURCE_GROUP workspace=$YOUR_AZ_ML_WORKSPACE
```

Then run the deployment script:

```bash
./deploy-endpoint.sh
```

Note your endpoint name and scoring URI (you can retrieve them from the Azure ML workspace).
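
You can also retrieve the scoring URI with the CLI, which is what `deploy-endpoint.sh` already does:

```bash
# Query the scoring URI of the endpoint created by deploy-endpoint.sh.
SCORING_URI=$(az ml online-endpoint show -n $ENDPOINT_NAME -o tsv --query scoring_uri)
echo "Scoring uri: $SCORING_URI"
```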

### Test the endpoint and allocate traffic

To invoke the endpoint with an HTTP client, you need to allocate traffic to it (for more information, [see this doc](https://learn.microsoft.com/en-us/azure/machine-learning/how-to-safely-rollout-online-endpoints?view=azureml-api-2&tabs=azure-cli#confirm-your-existing-deployment)). First, check the current allocation:

```bash
az ml online-endpoint show -n $ENDPOINT_NAME --query traffic
```

You can see that 0% of the traffic is allocated to the blue deployment, so allocate 100% of the traffic to this single blue deployment:

```bash
az ml online-endpoint update --name $ENDPOINT_NAME --traffic "blue=100"
```

Now you should be able to call your endpoint with curl.
You can retrieve your endpoint key from the Azure ML workspace under Endpoints > Consume > Basic consumption info.
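
Alternatively, here is a sketch of fetching the key with the CLI (assuming key-based authentication, which `azureml/endpoint.yaml` configures):

```bash
# Fetch the primary key for the endpoint (auth_mode: key is set in azureml/endpoint.yaml).
ENDPOINT_KEY=$(az ml online-endpoint get-credentials -n $ENDPOINT_NAME -o tsv --query primaryKey)
```

With the key and the scoring URI in hand, call the endpoint: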

```bash
ENDPOINT_KEY=$YOUR_ENDPOINT_KEY
curl --request POST "$SCORING_URI" --header "Authorization: Bearer $ENDPOINT_KEY" --header 'Content-Type: application/json' --data '{"image_url": "https://ultralytics.com/images/bus.jpg"}'
```
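
You can also invoke the endpoint through the CLI without handling the key yourself, reusing the sample request file from this repo:

```bash
# Invoke the deployed endpoint with the checked-in sample request.
az ml online-endpoint invoke --name $ENDPOINT_NAME --request-file inference-sample-request.json
```
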
6 changes: 3 additions & 3 deletions azureml-environment/Dockerfile
@@ -13,6 +13,6 @@ RUN apt install --no-install-recommends -y gcc git zip curl htop libgl1-mesa-glx
# https://security.snyk.io/vuln/SNYK-UBUNTU1804-OPENSSL-3314796
RUN apt upgrade --no-install-recommends -y openssl tar

RUN pip install ultralytics==8.0.88
RUN pip install azureml-mlflow==1.50.0
RUN pip install mlflow==2.2.2
RUN pip install ultralytics==8.0.180
RUN pip install azureml-mlflow==1.52.0
RUN pip install mlflow==2.4.2
17 changes: 17 additions & 0 deletions azureml/deployment.yaml
@@ -0,0 +1,17 @@
$schema: https://azuremlschemas.azureedge.net/latest/managedOnlineDeployment.schema.json
name: yolodeployment
endpoint_name: yolovendpoint
model:
  #azureml:<your-model-name>:1
  #path: <relative local path to your .pt model from the azureml folder>
code_configuration:
  code: ../inference-code
  scoring_script: score.py
environment:
  build:
    path: ../inference-environment
    dockerfile_path: Dockerfile
instance_type: Standard_DS3_v2
instance_count: 1
# Note that you might need to increase the request_timeout_ms if running the inference takes time.
# request_timeout_ms: 10000
3 changes: 3 additions & 0 deletions azureml/endpoint.yaml
@@ -0,0 +1,3 @@
$schema: https://azuremlschemas.azureedge.net/latest/managedOnlineEndpoint.schema.json
name: my-endpoint
auth_mode: key
9 changes: 9 additions & 0 deletions deploy-endpoint.sh
@@ -0,0 +1,9 @@
set -ex

export ENDPOINT_NAME=endpt-`echo $RANDOM`
az ml online-endpoint create -n $ENDPOINT_NAME -f azureml/endpoint.yaml
az ml online-deployment create -n blue --endpoint $ENDPOINT_NAME -f azureml/deployment.yaml
az ml online-endpoint show -n $ENDPOINT_NAME
SCORING_URI=$(az ml online-endpoint show -n $ENDPOINT_NAME -o tsv --query scoring_uri)
echo "Endpoint name: $ENDPOINT_NAME"
echo "Scoring uri: $SCORING_URI"
52 changes: 52 additions & 0 deletions deploy-local-endpoint.sh
@@ -0,0 +1,52 @@
set -ex

export ENDPOINT_NAME=endpt-`echo $RANDOM`
az ml online-endpoint create --local -n $ENDPOINT_NAME -f azureml/endpoint.yaml

# <create_deployment>
az ml online-deployment create --local -n blue --endpoint $ENDPOINT_NAME -f azureml/deployment.yaml
# </create_deployment>

# <get_status>
az ml online-endpoint show -n $ENDPOINT_NAME --local
# </get_status>

# check if create was successful
endpoint_status=`az ml online-endpoint show --local --name $ENDPOINT_NAME --query "provisioning_state" -o tsv`
echo $endpoint_status
if [[ $endpoint_status == "Succeeded" ]]
then
  echo "Endpoint created successfully"
else
  echo "Endpoint creation failed"
  exit 1
fi

deploy_status=`az ml online-deployment show --local --name blue --endpoint $ENDPOINT_NAME --query "provisioning_state" -o tsv`
echo $deploy_status
if [[ $deploy_status == "Succeeded" ]]
then
  echo "Deployment completed successfully"
else
  echo "Deployment failed"
  exit 1
fi

# <test_endpoint>
az ml online-endpoint invoke --local --name $ENDPOINT_NAME --request-file inference-sample-request.json
# </test_endpoint>

# <test_endpoint_using_curl>
SCORING_URI=$(az ml online-endpoint show --local -n $ENDPOINT_NAME -o tsv --query scoring_uri)

curl --request POST "$SCORING_URI" --header 'Content-Type: application/json' --data @inference-sample-request.json

# <get_logs>
#az ml online-deployment get-logs --local -n blue --endpoint $ENDPOINT_NAME
# </get_logs>

curl -X POST -H "Content-Type: application/json" -d '{"image_url": "https://ultralytics.com/images/bus.jpg"}' $SCORING_URI

# <delete_endpoint>
#az ml online-endpoint delete --local --name $ENDPOINT_NAME --yes
# </delete_endpoint>
18 changes: 18 additions & 0 deletions inference-code/score.py
@@ -0,0 +1,18 @@
import os
import json
from ultralytics import YOLO

def init():
    # Called once when the container starts: load the YOLO model from the
    # directory where AzureML mounts the registered model.
    global model
    model_path = os.path.join(os.getenv("AZUREML_MODEL_DIR"), "best.pt")
    model = YOLO(model_path)


def run(raw_data):
    # Called for every scoring request: run inference on the image URL from the
    # request body and return the first result, serialized to JSON-compatible types.
    image_url = json.loads(raw_data)["image_url"]
    results = model(image_url)
    result = results[0]
    serialized_result = json.loads(result.tojson())
    return serialized_result
9 changes: 9 additions & 0 deletions inference-environment/Dockerfile
@@ -0,0 +1,9 @@
FROM mcr.microsoft.com/azureml/openmpi4.1.0-ubuntu20.04:latest

ENV DEBIAN_FRONTEND noninteractive
RUN apt update
RUN TZ=Etc/UTC apt install -y tzdata
RUN apt install --no-install-recommends -y gcc git zip curl htop libgl1-mesa-glx libglib2.0-0 libpython3-dev gnupg g++
RUN apt upgrade --no-install-recommends -y openssl tar
RUN pip install azureml-inference-server-http
RUN pip install ultralytics==8.0.180
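
Because this environment installs `azureml-inference-server-http`, you can also exercise `score.py` locally before deploying. This is a rough sketch, assuming the same packages are installed in a local Python environment and that `best.pt` sits in a local `model/` folder:

```bash
# Sketch: serve score.py locally with the AzureML inference HTTP server.
# AZUREML_MODEL_DIR and the model/ folder are assumptions for local testing.
export AZUREML_MODEL_DIR=$(pwd)/model
azmlinfsrv --entry_script inference-code/score.py --port 5001

# In another shell, post the sample request to the local server:
curl -X POST -H "Content-Type: application/json" \
  -d @inference-sample-request.json http://localhost:5001/score
```
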
3 changes: 3 additions & 0 deletions inference-sample-request.json
@@ -0,0 +1,3 @@
{
  "image_url": "https://ultralytics.com/images/bus.jpg"
}
6 changes: 3 additions & 3 deletions instructions-az-cli.md
@@ -47,9 +47,9 @@ RUN apt install --no-install-recommends -y gcc git zip curl htop libgl1-mesa-glx
# https://security.snyk.io/vuln/SNYK-UBUNTU1804-OPENSSL-3314796
RUN apt upgrade --no-install-recommends -y openssl tar

RUN pip install ultralytics==8.0.88
RUN pip install azureml-mlflow==1.50.0
RUN pip install mlflow==2.2.2
RUN pip install ultralytics==8.0.132
RUN pip install azureml-mlflow==1.52.0
RUN pip install mlflow==2.4.2
```

Note that Ultralytics provides [Dockerfiles for different platforms](https://github.com/ultralytics/ultralytics/tree/main/docker). Here we used the same base image and installed the same Linux dependencies as the [amd64 Dockerfile](https://github.com/ultralytics/ultralytics/blob/main/docker/Dockerfile), but we installed the ultralytics package with pip install to control the version we install and make sure the package version is deterministic. To track hyperparameters and metrics in AzureML, we installed [mlflow](https://pypi.org/project/mlflow/) and [azureml-mlflow](https://pypi.org/project/azureml-mlflow/). This lets us evaluate model performance easily and compare models from different training runs in AzureML studio.