This tutorial shows how to use TensorFlow Serving components running in Docker containers to serve the TensorFlow ResNet model and how to deploy the serving cluster with Kubernetes.
To learn more about TensorFlow Serving, we recommend the TensorFlow Serving basic tutorial and the TensorFlow Serving advanced tutorial.
To learn more about the TensorFlow ResNet model, we recommend reading ResNet in TensorFlow.
- Part 1 gets your environment set up
- Part 2 shows how to run the local Docker serving image
- Part 3 shows how to deploy in Kubernetes
Before getting started, first install Docker.
Let's clear our local models directory in case we already have one:
rm -rf /tmp/resnet
Deep residual networks, or ResNets for short, provided the breakthrough idea of identity mappings in order to enable training of very deep convolutional neural networks. For our example, we will download a TensorFlow SavedModel of ResNet for the ImageNet dataset.
mkdir /tmp/resnet
curl -s http://download.tensorflow.org/models/official/20181001_resnet/savedmodels/resnet_v2_fp32_savedmodel_NHWC_jpg.tar.gz | \
tar --strip-components=2 -C /tmp/resnet -xvz
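To make the effect of `--strip-components=2` concrete, here is a minimal Python sketch that lists the names an archive's members would end up with after dropping leading path components. It is an illustration under stated assumptions, not a replacement for the `tar` command above: the helper names `strip_components` and `stripped_names` are hypothetical, and this simplified version counts every slash-separated piece (including a leading `.`) as one component.

```python
import io
import tarfile


def strip_components(name, count):
    """Drop the first `count` slash-separated components of a member name.

    Returns None when nothing is left, meaning the member would be skipped.
    """
    remaining = name.split("/")[count:]
    remaining = [part for part in remaining if part]  # ignore trailing slashes
    return "/".join(remaining) if remaining else None


def stripped_names(archive_bytes, count):
    """List member names of a .tar.gz archive after stripping leading components."""
    names = []
    with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode="r:gz") as archive:
        for member in archive.getmembers():
            new_name = strip_components(member.name, count)
            if new_name is not None:
                names.append(new_name)
    return names
```

For example, `strip_components("a/b/1/saved_model.pb", 2)` yields `1/saved_model.pb`, so the two outer directories never appear under the target directory.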
We can verify we have the SavedModel:
$ ls /tmp/resnet/*
saved_model.pb variables
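The same check can be scripted. TensorFlow Serving expects the model directory to contain numbered version subdirectories, each holding a `saved_model.pb` file. The sketch below looks for that layout; the function name `find_saved_model_versions` is hypothetical and the check is deliberately shallow (it does not validate the file contents).

```python
from pathlib import Path


def find_saved_model_versions(model_dir):
    """Return the numeric version directories under model_dir that hold saved_model.pb."""
    versions = []
    for child in Path(model_dir).iterdir():
        has_model = (child / "saved_model.pb").is_file()
        if child.is_dir() and child.name.isdigit() and has_model:
            versions.append(int(child.name))
    return sorted(versions)
```

Calling `find_saved_model_versions("/tmp/resnet")` should return a non-empty list if the download and extraction succeeded.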
Now we want to take a serving image and commit all changes to a new image $USER/resnet_serving for Kubernetes deployment.
First we run a serving image as a daemon:
docker run -d --name serving_base tensorflow/serving
Next, we copy the ResNet model data to the container's model folder:
docker cp /tmp/resnet serving_base:/models/resnet
Finally, we commit the container so that it serves the ResNet model:
docker commit --change "ENV MODEL_NAME resnet" serving_base \
$USER/resnet_serving
Now let's stop and remove the serving base container:
docker kill serving_base
docker rm serving_base
Now let's start the container with the ResNet model so it's ready for serving, exposing the gRPC port 8500:
docker run -p 8500:8500 -t $USER/resnet_serving &
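Because the container starts in the background, the port may not accept connections immediately. The following is a minimal sketch of a readiness wait, assuming the server is reachable on `localhost:8500` as in the command above. The helper `wait_for_port` is hypothetical, and a successful TCP connection only shows that the port is open, not that the model has finished loading.

```python
import socket
import time


def wait_for_port(host, port, timeout_s=30.0, interval_s=0.5):
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            with socket.create_connection((host, port), timeout=interval_s):
                return True
        except OSError:
            # Connection refused or timed out: retry until the deadline passes.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval_s)
```

A call such as `wait_for_port("localhost", 8500)` could be placed before running the client.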
For the client, we will need to clone the TensorFlow Serving GitHub repo:
git clone https://github.com/tensorflow/serving
cd serving
Query the server with resnet_client_grpc.py. The client downloads an image and sends it over gRPC for classification into ImageNet categories.
tools/run_in_docker.sh python tensorflow_serving/example/resnet_client_grpc.py
This should result in output like:
outputs {
key: "classes"
value {
dtype: DT_INT64
tensor_shape {
dim {
size: 1
}
}
int64_val: 286
}
}
outputs {
key: "probabilities"
value {
dtype: DT_FLOAT
tensor_shape {
dim {
size: 1
}
dim {
size: 1001
}
}
float_val: 2.41628322328e-06
float_val: 1.90121829746e-06
float_val: 2.72477100225e-05
float_val: 4.42638565801e-07
float_val: 8.98362372936e-07
float_val: 6.84421956976e-06
float_val: 1.66555237229e-05
...
float_val: 1.59407863976e-06
float_val: 1.2315689446e-06
float_val: 1.17812135159e-06
float_val: 1.46365800902e-05
float_val: 5.81210713335e-07
float_val: 6.59980651108e-05
float_val: 0.00129527016543
}
}
model_spec {
name: "resnet"
version {
value: 1538687457
}
signature_name: "serving_default"
}
It works! The server successfully classifies a cat image!
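The response above carries a "classes" tensor with a single class index (286 here) and a "probabilities" tensor with 1001 values. As a rough sketch of how one might inspect such a result, the helper below ranks a plain list of probabilities. It assumes the values have already been copied out of the response into a Python list; `top_k` is a hypothetical helper, and the sample numbers in the usage note are made up.

```python
def top_k(probabilities, k=5):
    """Return the k (index, probability) pairs with the highest probability."""
    ranked = sorted(enumerate(probabilities), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]
```

For instance, `top_k([0.1, 0.7, 0.2], k=2)` returns `[(1, 0.7), (2, 0.2)]`.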
In this section we use the container image built in Part 2 to deploy a serving cluster with Kubernetes on Google Cloud Platform.
Here we assume you have created and logged in to a gcloud project named tensorflow-serving.
gcloud auth login --project tensorflow-serving
First we create a Google Kubernetes Engine cluster for service deployment.
$ gcloud container clusters create resnet-serving-cluster --num-nodes 5
Which should output something like:
Creating cluster resnet-serving-cluster...done.
Created [https://container.googleapis.com/v1/projects/tensorflow-serving/zones/us-central1-f/clusters/resnet-serving-cluster].
kubeconfig entry generated for resnet-serving-cluster.
NAME ZONE MASTER_VERSION MASTER_IP MACHINE_TYPE NODE_VERSION NUM_NODES STATUS
resnet-serving-cluster us-central1-f 1.1.8 104.197.163.119 n1-standard-1 1.1.8 5 RUNNING
Set the default cluster for the gcloud container command and pass the cluster credentials to kubectl.
gcloud config set container/cluster resnet-serving-cluster
gcloud container clusters get-credentials resnet-serving-cluster
This should result in:
Fetching cluster endpoint and auth data.
kubeconfig entry generated for resnet-serving-cluster.
Let's now push our image to the Google Container Registry so that we can run it on Google Cloud Platform.
First we tag the $USER/resnet_serving image using the Container Registry format and our project name:
docker tag $USER/resnet_serving gcr.io/tensorflow-serving/resnet
Next we push the image to the Registry:
gcloud docker -- push gcr.io/tensorflow-serving/resnet
The deployment consists of 3 replicas of the resnet_inference server controlled by a Kubernetes Deployment. The replicas are exposed externally by a Kubernetes Service along with an External Load Balancer.
We create them using the example Kubernetes config resnet_k8s.yaml.
kubectl create -f tensorflow_serving/example/resnet_k8s.yaml
With output:
deployment "resnet-deployment" created
service "resnet-service" created
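The contents of resnet_k8s.yaml are not reproduced in this tutorial. To give a feel for what such a config describes, here is a sketch that builds a comparable Deployment and Service as Python dictionaries and prints them as JSON, a format kubectl also accepts. The object names, replica count, port, Service type, and image come from the text above; the label `app: resnet-server` and the container name `resnet-container` are assumptions made for illustration, and the real file may differ.

```python
import json

IMAGE = "gcr.io/tensorflow-serving/resnet"  # image pushed in the previous step
LABELS = {"app": "resnet-server"}           # assumed label; must match in both objects


def build_manifests(replicas=3, port=8500):
    """Build a Kubernetes List holding a Deployment and a LoadBalancer Service."""
    deployment = {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": "resnet-deployment"},
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": LABELS},
            "template": {
                "metadata": {"labels": LABELS},
                "spec": {
                    "containers": [{
                        "name": "resnet-container",  # assumed container name
                        "image": IMAGE,
                        "ports": [{"containerPort": port}],
                    }]
                },
            },
        },
    }
    service = {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": "resnet-service"},
        "spec": {
            "type": "LoadBalancer",
            "selector": LABELS,  # routes traffic to pods carrying the same label
            "ports": [{"port": port, "targetPort": port}],
        },
    }
    return {"apiVersion": "v1", "kind": "List", "items": [deployment, service]}


if __name__ == "__main__":
    print(json.dumps(build_manifests(), indent=2))
```

The key design point is that the Service selector and the pod template labels must agree; otherwise the Service has no endpoints to forward requests to.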
To view status of the deployment and pods:
$ kubectl get deployments
NAME DESIRED CURRENT UP-TO-DATE AVAILABLE AGE
resnet-deployment 3 3 3 3 5s
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
resnet-deployment-bbcbc 1/1 Running 0 10s
resnet-deployment-cj6l2 1/1 Running 0 10s
resnet-deployment-t1uep 1/1 Running 0 10s
To view status of the service:
$ kubectl get services
NAME CLUSTER-IP EXTERNAL-IP PORT(S) AGE
resnet-service 10.239.240.227 104.155.184.157 8500/TCP 1m
It can take a while for everything to be up and running.
$ kubectl describe service resnet-service
Name: resnet-service
Namespace: default
Labels: run=resnet-service
Selector: run=resnet-service
Type: LoadBalancer
IP: 10.239.240.227
LoadBalancer Ingress: 104.155.184.157
Port: <unset> 8500/TCP
NodePort: <unset> 30334/TCP
Endpoints: <none>
Session Affinity: None
Events:
FirstSeen LastSeen Count From SubobjectPath Type Reason Message
--------- -------- ----- ---- ------------- -------- ------ -------
1m 1m 1 {service-controller } Normal CreatingLoadBalancer Creating load balancer
1m 1m 1 {service-controller } Normal CreatedLoadBalancer Created load balancer
The service external IP address is listed next to LoadBalancer Ingress.
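If you want to pick that address out of the describe output in a script, a small parser is enough. The sketch below assumes the text format shown above; `load_balancer_ingress` is a hypothetical helper, and it returns None when the line is absent, which may happen while the load balancer is still being provisioned.

```python
import re


def load_balancer_ingress(describe_output):
    """Return the address on the 'LoadBalancer Ingress:' line, or None if absent."""
    match = re.search(
        r"^LoadBalancer Ingress:\s*(\S+)", describe_output, re.MULTILINE
    )
    return match.group(1) if match else None
```

The returned string can then be combined with the port, for example `f"{address}:8500"`, to form the `--server` argument used by the client.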
We can now query the service at its external address from our local host.
$ tools/run_in_docker.sh python \
tensorflow_serving/example/resnet_client_grpc.py \
--server=104.155.184.157:8500
outputs {
key: "classes"
value {
dtype: DT_INT64
tensor_shape {
dim {
size: 1
}
}
int64_val: 286
}
}
outputs {
key: "probabilities"
value {
dtype: DT_FLOAT
tensor_shape {
dim {
size: 1
}
dim {
size: 1001
}
}
float_val: 2.41628322328e-06
float_val: 1.90121829746e-06
float_val: 2.72477100225e-05
float_val: 4.42638565801e-07
float_val: 8.98362372936e-07
float_val: 6.84421956976e-06
float_val: 1.66555237229e-05
...
float_val: 1.59407863976e-06
float_val: 1.2315689446e-06
float_val: 1.17812135159e-06
float_val: 1.46365800902e-05
float_val: 5.81210713335e-07
float_val: 6.59980651108e-05
float_val: 0.00129527016543
}
}
model_spec {
name: "resnet"
version {
value: 1538687457
}
signature_name: "serving_default"
}
You have successfully deployed the ResNet model as a service in Kubernetes!