# Digit recognition model with MNIST dataset using a Kubernetes cluster

In this example, we show how to use a pre-trained custom MNIST model to perform real-time digit recognition with TorchServe.
The model is served with KServe, deployed on [minikube](https://minikube.sigs.k8s.io/docs/start/).

The inference service returns the digit that the model infers from the input image.


## Install KServe

Start a minikube cluster:

```
minikube start
```
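
To confirm the cluster is up before installing anything else, you can run an optional sanity check:

```
# Optional: verify that minikube is running and the node is Ready
minikube status
kubectl get nodes
```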

For this example, we need to clone the [kserve](https://github.com/kserve/kserve) repository.
Run the commands in the following steps from the root of the cloned repository. For example, if you cloned the repository into `/home/my_path/kserve`, run the steps from `/home/my_path/kserve`.

Run the following script for a quick install of KServe:
```
./hack/quick_install.sh
```

Verify that KServe is installed on the minikube cluster:

```
kubectl get pods -n kserve
```

This should produce output similar to the following:
```
NAME READY STATUS RESTARTS AGE
kserve-controller-manager-57574b4878-rnsjn 2/2 Running 0 17s
```
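
If the controller pod is still starting, you can block until it is ready instead of polling. This sketch assumes the controller pods carry KServe's usual `control-plane=kserve-controller-manager` label:

```
# Wait for the KServe controller manager to become Ready (timeout is arbitrary)
kubectl wait --namespace kserve --for=condition=Ready pod \
  -l control-plane=kserve-controller-manager --timeout=300s
```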

TorchServe supports both the KServe V1 and V2 protocols. We show how to deploy the MNIST model with each.

## KServe V1 protocol

Deploy an `InferenceService` using the KServe V1 protocol:

```
kubectl apply -f docs/samples/v1beta1/torchserve/v1/torchserve.yaml
```

This results in:

```
inferenceservice.serving.kserve.io/torchserve created
```

Wait until the predictor pod is up and running:

```
kubectl get pods
NAME READY STATUS RESTARTS AGE
torchserve-predictor-00001-deployment-8d66f9c-dkdhr 2/2 Running 0 8m19s
```
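
Instead of polling `kubectl get pods`, you can also wait on the `InferenceService` itself to report readiness (a sketch; the timeout is arbitrary):

```
# Block until the torchserve InferenceService reports the Ready condition
kubectl wait --for=condition=Ready inferenceservice torchserve --timeout=600s
```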

Set the following environment variables:

```
MODEL_NAME=mnist
SERVICE_HOSTNAME=$(kubectl get inferenceservice torchserve -o jsonpath='{.status.url}' | cut -d "/" -f 3)
```

```
export INGRESS_HOST=localhost
export INGRESS_PORT=8080
```

```
INGRESS_GATEWAY_SERVICE=$(kubectl get svc --namespace istio-system --selector="app=istio-ingressgateway" --output jsonpath='{.items[0].metadata.name}')
kubectl port-forward --namespace istio-system svc/${INGRESS_GATEWAY_SERVICE} 8080:80 &
```

Make an inference request

```
curl -H "Content-Type: application/json" -H "Host: ${SERVICE_HOSTNAME}" http://${INGRESS_HOST}:${INGRESS_PORT}/v1/models/${MODEL_NAME}:predict -d @./docs/samples/v1beta1/torchserve/v1/mnist.json
```

The expected output is:

```
{"predictions":[2]}
```
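
The sample `mnist.json` follows the KServe V1 `instances` format with a base64-encoded image. If you want to try your own digit, a similar payload can be built from a 28x28 grayscale image. This is only a sketch: `my_digit.png` is a placeholder file name, and the payload shape is an assumption based on the bundled sample.

```
# Assumption: the handler expects {"instances": [{"data": "<base64 image>"}]},
# matching the format of the bundled mnist.json sample.
printf '{"instances": [{"data": "%s"}]}' "$(base64 < my_digit.png | tr -d '\n')" > my_request.json
curl -H "Content-Type: application/json" -H "Host: ${SERVICE_HOSTNAME}" \
  http://${INGRESS_HOST}:${INGRESS_PORT}/v1/models/${MODEL_NAME}:predict -d @./my_request.json
```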

## KServe V2 protocol

Deploy an `InferenceService` using the KServe V2 protocol:

```
kubectl apply -f docs/samples/v1beta1/torchserve/v2/mnist.yaml
```

This results in:

```
inferenceservice.serving.kserve.io/torchserve-mnist-v2 created
```

Check that the predictor pod is running:

```
kubectl get pods
NAME READY STATUS RESTARTS AGE
torchserve-mnist-v2-predictor-00001-deployment-6c8c684dcb-4mfmr 2/2 Running 0 2m37s
```
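
As with V1, you can wait on readiness instead of polling. The label selector below assumes KServe's standard `serving.kserve.io/inferenceservice` pod label:

```
# Block until the V2 predictor pod reports Ready
kubectl wait --for=condition=Ready pod \
  -l serving.kserve.io/inferenceservice=torchserve-mnist-v2 --timeout=600s
```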

Inspect the pod logs to check the TorchServe version:

```
kubectl logs torchserve-mnist-v2-predictor-00001-deployment-6c8c684dcb-4mfmr
Defaulted container "kserve-container" out of: kserve-container, queue-proxy, storage-initializer (init)
WARNING: sun.reflect.Reflection.getCallerClass is not supported. This will impact performance.
2023-10-12T20:50:39,466 [WARN ] main org.pytorch.serve.util.ConfigManager - Your torchserve instance can access any URL to load models. When deploying to production, make sure to limit the set of allowed_urls in config.properties
2023-10-12T20:50:39,468 [INFO ] main org.pytorch.serve.servingsdk.impl.PluginsManager - Initializing plugins manager...
2023-10-12T20:50:39,659 [INFO ] main org.pytorch.serve.metrics.configuration.MetricConfiguration - Successfully loaded metrics configuration from /home/venv/lib/python3.9/site-packages/ts/configs/metrics.yaml
2023-10-12T20:50:39,779 [INFO ] main org.pytorch.serve.ModelServer -
Torchserve version: 0.8.2
TS Home: /home/venv/lib/python3.9/site-packages
Current directory: /home/model-server
Temp directory: /home/model-server/tmp
Metrics config path: /home/venv/lib/python3.9/site-packages/ts/configs/metrics.yaml

```

Set the following environment variables:

```
MODEL_NAME=mnist
SERVICE_HOSTNAME=$(kubectl get inferenceservice torchserve-mnist-v2 -o jsonpath='{.status.url}' | cut -d "/" -f 3)
```

```
export INGRESS_HOST=localhost
export INGRESS_PORT=8080
```

```
INGRESS_GATEWAY_SERVICE=$(kubectl get svc --namespace istio-system --selector="app=istio-ingressgateway" --output jsonpath='{.items[0].metadata.name}')
kubectl port-forward --namespace istio-system svc/${INGRESS_GATEWAY_SERVICE} 8080:80 &
```
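
The V2 request body uses the open inference protocol's `inputs` format (name, shape, datatype, and a flattened `data` array). If you have `jq` installed, you can optionally inspect the sample payload's structure without printing the whole pixel array (this assumes the sample file follows the standard V2 schema):

```
# Show the name, shape, and datatype of the first input tensor in the sample request
jq '.inputs[0] | {name, shape, datatype}' ./docs/samples/v1beta1/torchserve/v2/tensor_conv/mnist_v2.json
```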

Make an inference request with tensor input

```
curl -v -H "Content-Type: application/json" -H "Host: ${SERVICE_HOSTNAME}" http://${INGRESS_HOST}:${INGRESS_PORT}/v2/models/${MODEL_NAME}/infer -d @./docs/samples/v1beta1/torchserve/v2/tensor_conv/mnist_v2.json
```

The expected output is:

```
{"model_name":"mnist","model_version":null,"id":"d3b15cad-50a2-4eaf-80ce-8b0a428bd298","parameters":null,"outputs":[{"name":"input-0","shape":[1],"datatype":"INT64","parameters":null,"data":[1]}]}
```

## Stop and Delete the cluster
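
Optionally, delete the two `InferenceService` resources created above before tearing down the cluster (the names match the YAMLs applied earlier):

```
kubectl delete inferenceservice torchserve torchserve-mnist-v2
```

Then stop and delete the minikube cluster: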

```
minikube stop
minikube delete
```