From f62970200469915a98066e2c232d31d0de55e479 Mon Sep 17 00:00:00 2001
From: Dina Suehiro Jones
Date: Fri, 23 Aug 2024 01:04:57 -0700
Subject: [PATCH] Minor fixes for CodeGen Xeon and Gaudi Kubernetes
 codegen.yaml and doc updates (#613)

* Minor fixes for CodeGen Xeon and Gaudi Kubernetes codegen.yaml and doc updates

Signed-off-by: dmsuehir

(cherry picked from commit c25063f4bb6fc1fd21460366fbdc56106cb179c1)
---
 CodeGen/kubernetes/manifests/README.md          | 8 ++++++--
 CodeGen/kubernetes/manifests/gaudi/codegen.yaml | 2 ++
 2 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/CodeGen/kubernetes/manifests/README.md b/CodeGen/kubernetes/manifests/README.md
index f6a0763726..4e0a0e0b69 100644
--- a/CodeGen/kubernetes/manifests/README.md
+++ b/CodeGen/kubernetes/manifests/README.md
@@ -6,7 +6,8 @@
 
 > You can also customize the "MODEL_ID" if needed.
 
-> You need to make sure you have created the directory `/mnt/opea-models` to save the cached model on the node where the CodeGEn workload is running. Otherwise, you need to modify the `codegen.yaml` file to change the `model-volume` to a directory that exists on the node.
+> You need to make sure you have created the directory `/mnt/opea-models` to save the cached model on the node where the CodeGen workload is running. Otherwise, you need to modify the `codegen.yaml` file to change the `model-volume` to a directory that exists on the node.
+> Alternatively, you can change the `codegen.yaml` to use a different type of volume, such as a persistent volume claim.
 
 ## Deploy On Xeon
 
@@ -30,10 +31,13 @@ kubectl apply -f codegen.yaml
 
 To verify the installation, run the command `kubectl get pod` to make sure all pods are running.
 
-Then run the command `kubectl port-forward svc/codegen 7778:7778` to expose the CodeGEn service for access.
+Then run the command `kubectl port-forward svc/codegen 7778:7778` to expose the CodeGen service for access.
 Open another terminal and run the following command to verify the service if working:
 
+> Note that it may take a couple of minutes for the service to be ready. If the `curl` command below fails, you
+> can check the logs of the codegen-tgi pod to see its status or check for errors.
+
 ```
 kubectl get pods
 curl http://localhost:7778/v1/codegen -H "Content-Type: application/json" -d '{
diff --git a/CodeGen/kubernetes/manifests/gaudi/codegen.yaml b/CodeGen/kubernetes/manifests/gaudi/codegen.yaml
index 58f70192ea..b671594caf 100644
--- a/CodeGen/kubernetes/manifests/gaudi/codegen.yaml
+++ b/CodeGen/kubernetes/manifests/gaudi/codegen.yaml
@@ -271,6 +271,8 @@ spec:
       resources:
         limits:
           habana.ai/gaudi: 1
+          memory: 64Gi
+          hugepages-2Mi: 500Mi
     volumes:
       - name: model-volume
         hostPath:
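For context on the Gaudi hunk above, which adds `memory` and `hugepages-2Mi` limits next to the existing `habana.ai/gaudi: 1`, here is a minimal sketch of how the resulting `resources` block sits inside a pod spec. The resource values are taken from the patch; the container name and image are illustrative placeholders, not part of the actual codegen.yaml.

```yaml
# Sketch only: container name/image are hypothetical; resource values are from the patch.
spec:
  containers:
    - name: codegen-tgi            # illustrative name
      image: example/tgi-gaudi     # illustrative image
      resources:
        limits:
          habana.ai/gaudi: 1       # one Gaudi accelerator
          memory: 64Gi             # cap container memory
          hugepages-2Mi: 500Mi     # 250 pages of 2Mi huge pages
```

In Kubernetes, hugepage resources cannot be overcommitted (requests must equal limits), and when only `limits` is set the request defaults to the limit, so specifying `limits` alone, as this patch does, is sufficient.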