From f99f92bc414e49d41fd4408634576a63c5758fbc Mon Sep 17 00:00:00 2001
From: aramaim
Date: Mon, 7 Mar 2022 01:49:37 +0400
Subject: [PATCH 1/3] docs for k8s_deployment_rt

---
 docs/source/using/k8s_deployment_rt.md | 191 +++++++++++++++++++++++++
 1 file changed, 191 insertions(+)

diff --git a/docs/source/using/k8s_deployment_rt.md b/docs/source/using/k8s_deployment_rt.md
index b2fcddba0c..dbe5229dd6 100644
--- a/docs/source/using/k8s_deployment_rt.md
+++ b/docs/source/using/k8s_deployment_rt.md
@@ -1,2 +1,193 @@
 ## Set up Aim remote tracking on Kubernetes (K8S)
 
+Aim introduced [Remote Tracking (RT)](../remote_tracking.md) starting from version `3.4.0`. It allows running experiments in a multi-host environment and collecting tracked data in a centralized location.
+Aim RT server as well as client script can be easily deployed to a K8S cluster! Hosting Aim RT on K8S comes with
+several advantages:
+
+* multiple users of your organization can access Aim in a single spot, which removes the need for ML practitioners to
+  run Aim themselves
+* Aim runs can be centralized on the Remote Tracking server, which provides additional support and encouragement for remote model
+  training and monitoring
+
+The following sections demonstrate how to deploy Aim RT server and client on K8S.
+Aim RT is based on the [gRPC](https://grpc.io/about/) protocol, so these sections also illustrate how to route traffic to a gRPC service through the Ingress-NGINX controller.
+
+The sections assume:
+
+* a repository that can host Docker images, such as Google Artifact Registry or Docker Hub
+* a running Kubernetes cluster, with
+  * an ingress-nginx-controller installed
+* a domain name such as `rt-example.aimstack.io` that is configured to route traffic to the Ingress-NGINX controller.
+* an SSL certificate for the ingress: a valid certificate deployed as a Kubernetes secret of type `kubernetes.io/tls`, in the same namespace as the gRPC application.
+TODO [AD] do we need an instruction for creating SSL certificates? I guess no
+
+
+### Dockerfile
+
+The following Dockerfile should suffice for getting the Aim RT server running in a container:
+
+```Dockerfile
+# python3.7 should be sufficient to run Aim
+FROM python:3.7
+# install the latest version of the `aim` package
+
+RUN pip install --upgrade aim
+
+TODO [AD] aim init?
+
+# We run aim listening on 0.0.0.0 so that it accepts connections on all network interfaces.
+# Port 53800 is the default port of `aim server` but explicit is better than implicit.
+CMD yes | aim server --host 0.0.0.0 --port 53800
+```
+
+TODO [AD] our docker images hosts on dockerhub?
+
+Assuming you store the above in your current directory, the container can be built
+using `docker build . -t my-docker-repository.dev/deployments/aim-server:1` and pushed to your repository
+with `docker push my-docker-repository.dev/deployments/aim-server:1`.
+
+### Deployment
+
+The main Aim deployment will have a single container that runs the Aim RT server.
+This deployment uses the image built from the Dockerfile defined previously.
+The K8S deployment is:
+
+```YAML
+# aim-server-deploy.yaml
+apiVersion: apps/v1
+kind: Deployment
+metadata:
+  labels:
+    app: aim-server
+  name: aim-server
+spec:
+  replicas: 1
+  selector:
+    matchLabels:
+      app: aim-server
+  template:
+    metadata:
+      labels:
+        app: aim-server
+    spec:
+      containers:
+      - image: my-docker-repository.dev/deployments/aim-server:1
+        name: aim-server
+        ports:
+        - containerPort: 53800
+```
+
+This K8S deployment:
+
+* defines a single-replica pod that runs the Aim remote tracking server built from the Dockerfile
+* starts up the Aim server on port 53800.
+
+You can save the above example manifest to a file named `aim-server-deploy.yaml`.
+You can then create the K8S deployment with a kubectl command like this:
+
+```shell
+$ kubectl apply -f aim-server-deploy.yaml
+```
+
+### Service
+
+Use the following manifest to create an Aim service of type ClusterIP.
+
+```YAML
+# aim-server-service.yaml
+apiVersion: v1
+kind: Service
+metadata:
+  labels:
+    app: aim-server
+  name: aim-server
+spec:
+  ports:
+  - port: 80
+    protocol: TCP
+    targetPort: 53800
+  selector:
+    app: aim-server
+  type: ClusterIP
+```
+
+The service definition can be applied via:
+
+```shell
+$ kubectl create -f aim-server-service.yaml
+```
+
+### Ingress
+
+We need to create a Kubernetes Ingress resource for the `aim server`, which is a gRPC application.
+Use the following example manifest to create an ingress for `aim server`.
+Make sure the required SSL certificate (`aim-server-tls`) exists in your Kubernetes cluster, in the same namespace as the `aim server`.
+The certificate must be available as a Kubernetes secret resource of type `kubernetes.io/tls`.
+We need the SSL certificate here because we are terminating TLS on the ingress.
+
+```YAML
+# aim-server-ingress.yaml
+apiVersion: networking.k8s.io/v1
+kind: Ingress
+metadata:
+  annotations:
+    nginx.ingress.kubernetes.io/ssl-redirect: "true"
+    nginx.ingress.kubernetes.io/backend-protocol: "GRPC"
+  name: fortune-ingress
+  namespace: default
+spec:
+  ingressClassName: nginx
+  rules:
+  - host: rt-example.aimstack.io
+    http:
+      paths:
+      - path: /
+        pathType: Prefix
+        backend:
+          service:
+            name: aim-server
+            port:
+              number: 80
+  tls:
+  - secretName: aim-server-tls
+    hosts:
+      - rt-example.aimstack.io
+```
+
+If you save the above example manifest as a file named `aim-server-ingress.yaml`, you can create the ingress like this:
+
+```shell
+$ kubectl create -f aim-server-ingress.yaml
+```
+* Note that we are not doing any TLS configuration on the server: because TLS is terminated at the ingress level, gRPC traffic travels unencrypted inside the cluster and arrives "insecure".
+* The ingress is tagged with the annotation `nginx.ingress.kubernetes.io/backend-protocol: "GRPC"`. This sets up NGINX to route HTTP/2 traffic to the `aim-server` service.
+* We are terminating TLS at the ingress and have configured the SSL certificate `aim-server-tls`.
+  * The ingress matches traffic arriving as https://rt-example.aimstack.io:443 and routes unencrypted messages to the Aim Kubernetes service.
+
+### Client
+
+Now you are ready to create an Aim client pod to track your experiment results.
+
+```YAML
+apiVersion: v1
+kind: Pod
+metadata:
+  name: my-super-aim-client
+spec:
+  containers:
+    - name: my-super-aim-client
+      image: my-super-aim-client
+      env:
+        - name: __AIM_CLIENT_SSL_CERTIFICATE__
+          valueFrom:
+            secretKeyRef:
+              name: aim-server-tls
+              key: cert
+```
+
+`aim.Run` uses the `__AIM_CLIENT_SSL_CERTIFICATE__` environment variable, a PEM-encoded root certificate, to establish a secure channel.
+
+# TODO [AD] double check with TJ if this client's setup work for him
+# and implement __AIM_CLIENT_SSL_CERTIFICATE__
+# (currently we have __AIM_CLIENT_SSL_CERTIFICATES_FILE__)
+

From fea52a4201776c84726a8e9d49af43270a094d58 Mon Sep 17 00:00:00 2001
From: aramaim
Date: Mon, 14 Mar 2022 08:50:27 +0400
Subject: [PATCH 2/3] fix TODOs

---
 docs/source/using/k8s_deployment_rt.md | 13 ++-----------
 1 file changed, 2 insertions(+), 11 deletions(-)

diff --git a/docs/source/using/k8s_deployment_rt.md b/docs/source/using/k8s_deployment_rt.md
index dbe5229dd6..5461b0640f 100644
--- a/docs/source/using/k8s_deployment_rt.md
+++ b/docs/source/using/k8s_deployment_rt.md
@@ -19,29 +19,20 @@ The sections assume:
   * an ingress-nginx-controller installed
 * a domain name such as `rt-example.aimstack.io` that is configured to route traffic to the Ingress-NGINX controller.
 * an SSL certificate for the ingress: a valid certificate deployed as a Kubernetes secret of type `kubernetes.io/tls`, in the same namespace as the gRPC application.
-TODO [AD] do we need an instruction for creating SSL certificates?
I guess no
-
 
 ### Dockerfile
 
 The following Dockerfile should suffice for getting the Aim RT server running in a container:
 
 ```Dockerfile
-# python3.7 should be sufficient to run Aim
-FROM python:3.7
-# install the latest version of the `aim` package
-
-RUN pip install --upgrade aim
-
-TODO [AD] aim init?
+# See the Aim Docker Hub documentation: https://hub.docker.com/r/aimstack/aim
+FROM aimstack/aim:latest
 
 # We run aim listening on 0.0.0.0 so that it accepts connections on all network interfaces.
 # Port 53800 is the default port of `aim server` but explicit is better than implicit.
 CMD yes | aim server --host 0.0.0.0 --port 53800
 ```
 
-TODO [AD] our docker images hosts on dockerhub?
-
 Assuming you store the above in your current directory, the container can be built
 using `docker build . -t my-docker-repository.dev/deployments/aim-server:1` and pushed to your repository
 with `docker push my-docker-repository.dev/deployments/aim-server:1`.

From 835c9e8961681db1c8536e165c312757915226b2 Mon Sep 17 00:00:00 2001
From: aramaim
Date: Tue, 19 Apr 2022 22:58:39 +0400
Subject: [PATCH 3/3] minor fixes

---
 docs/source/using/k8s_deployment_rt.md | 26 +++++++++++---------------
 1 file changed, 11 insertions(+), 15 deletions(-)

diff --git a/docs/source/using/k8s_deployment_rt.md b/docs/source/using/k8s_deployment_rt.md
index 5461b0640f..beda9a74da 100644
--- a/docs/source/using/k8s_deployment_rt.md
+++ b/docs/source/using/k8s_deployment_rt.md
@@ -4,12 +4,12 @@ Aim introduced [Remote Tracking (RT)](../remote_tracking.md) starting from versi
 Aim RT server as well as client script can be easily deployed to a K8S cluster!
Hosting Aim RT on K8S comes with
 several advantages:
 
-* multiple users of your organization can access Aim in a single spot, which removes the need for ML practitioners to
+* multiple users of your organization can access Aim in a single spot, which removes the need for ML practitioners to
   run Aim themselves
 * Aim runs can be centralized on the Remote Tracking server, which provides additional support and encouragement for remote model
   training and monitoring
 
-The following sections demonstrate how to deploy Aim RT server and client on K8S.
+The following sections demonstrate how to deploy Aim RT server and client on K8S.
 Aim RT is based on the [gRPC](https://grpc.io/about/) protocol, so these sections also illustrate how to route traffic to a gRPC service through the Ingress-NGINX controller.
 
 The sections assume:
@@ -18,7 +18,7 @@ The sections assume:
 * a running Kubernetes cluster, with
   * an ingress-nginx-controller installed
 * a domain name such as `rt-example.aimstack.io` that is configured to route traffic to the Ingress-NGINX controller.
-* an SSL certificate for the ingress: a valid certificate deployed as a Kubernetes secret of type `kubernetes.io/tls`, in the same namespace as the gRPC application.
+* an SSL certificate for the ingress: a valid certificate deployed as a Kubernetes secret of type `kubernetes.io/tls`, in the same namespace as the gRPC application.
 
 ### Dockerfile
 
@@ -39,8 +39,8 @@ with `docker push my-docker-repository.dev/deployments/aim-server:1`.
 
 ### Deployment
 
-The main Aim deployment will have a single container that runs the Aim RT server.
-This deployment uses the image built from the Dockerfile defined previously.
+The main Aim deployment will have a single container that runs the Aim RT server.
+This deployment uses the image built from the Dockerfile defined previously.
 The K8S deployment is:
 
 ```YAML
@@ -73,7 +73,7 @@ This K8S deployment:
 * defines a single-replica pod that runs the Aim remote tracking server built from the Dockerfile
 * starts up the Aim server on port 53800.
 
-You can save the above example manifest to a file named `aim-server-deploy.yaml`.
+You can save the above example manifest to a file named `aim-server-deploy.yaml`.
 You can then create the K8S deployment with a kubectl command like this:
 
 ```shell
@@ -111,9 +111,9 @@ $ kubectl create -f aim-server-service.yaml
 ### Ingress
 
 We need to create a Kubernetes Ingress resource for the `aim server`, which is a gRPC application.
-Use the following example manifest to create an ingress for `aim server`.
-Make sure the required SSL certificate (`aim-server-tls`) exists in your Kubernetes cluster, in the same namespace as the `aim server`.
-The certificate must be available as a Kubernetes secret resource of type `kubernetes.io/tls`.
+Use the following example manifest to create an ingress for `aim server`.
+Make sure the required SSL certificate (`aim-server-tls`) exists in your Kubernetes cluster, in the same namespace as the `aim server`.
+The certificate must be available as a Kubernetes secret resource of type `kubernetes.io/tls`.
 We need the SSL certificate here because we are terminating TLS on the ingress.
 
 ```YAML
@@ -151,8 +151,8 @@ If you save the above example manifest as a file named `aim-server-ingress.yaml`
 $ kubectl create -f aim-server-ingress.yaml
 ```
 * Note that we are not doing any TLS configuration on the server: because TLS is terminated at the ingress level, gRPC traffic travels unencrypted inside the cluster and arrives "insecure".
-* The ingress is tagged with the annotation `nginx.ingress.kubernetes.io/backend-protocol: "GRPC"`. This sets up NGINX to route HTTP/2 traffic to the `aim-server` service.
+* The ingress is tagged with the annotation `nginx.ingress.kubernetes.io/backend-protocol: "GRPC"`. This sets up NGINX to route HTTP/2 traffic to the `aim-server` service.
-* We are terminating TLS at the ingress and have configured the SSL certificate `aim-server-tls`.
+* We are terminating TLS at the ingress and have configured the SSL certificate `aim-server-tls`.
   * The ingress matches traffic arriving as https://rt-example.aimstack.io:443 and routes unencrypted messages to the Aim Kubernetes service.
 
 ### Client
@@ -178,7 +178,3 @@ spec:
 
 `aim.Run` uses the `__AIM_CLIENT_SSL_CERTIFICATE__` environment variable, a PEM-encoded root certificate, to establish a secure channel.
 
-# TODO [AD] double check with TJ if this client's setup work for him
-# and implement __AIM_CLIENT_SSL_CERTIFICATE__
-# (currently we have __AIM_CLIENT_SSL_CERTIFICATES_FILE__)
-
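
The Client section says the pod receives a PEM-encoded root certificate through `__AIM_CLIENT_SSL_CERTIFICATE__`. As a rough illustration only, and not Aim's actual implementation, the sketch below shows how a client process might read that variable and sanity-check its PEM framing before using it for a secure channel. The helper name `load_root_certificate` is hypothetical, and only the Python standard library is used.

```python
import os
import ssl
from typing import Optional

# Environment variable name taken from the Pod manifest in the patch.
CERT_ENV_VAR = "__AIM_CLIENT_SSL_CERTIFICATE__"


def load_root_certificate(env_var: str = CERT_ENV_VAR) -> Optional[bytes]:
    """Return the PEM root certificate as bytes, or None if the variable is unset.

    Hypothetical helper for illustration; it is not part of the Aim API.
    """
    pem = os.environ.get(env_var)
    if not pem:
        return None
    pem = pem.strip()
    # PEM_cert_to_DER_cert raises ValueError when the BEGIN/END CERTIFICATE
    # markers are missing, which catches a truncated or wrongly encoded
    # secret value early.
    ssl.PEM_cert_to_DER_cert(pem)
    return pem.encode("ascii")


if __name__ == "__main__":
    cert = load_root_certificate()
    if cert is None:
        print("no certificate configured")
    else:
        print(f"loaded {len(cert)} bytes of PEM data")
```

The returned bytes are in the form that `grpc.ssl_channel_credentials(root_certificates=...)` accepts, should the client build its channel that way. Failing fast on a malformed value tends to be easier to debug than a TLS handshake error inside the cluster.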