Commit

Minor fixes for CodeGen Xeon and Gaudi Kubernetes codegen.yaml and doc updates (#613)

* Minor fixes for CodeGen Xeon and Gaudi Kubernetes codegen.yaml and doc updates

Signed-off-by: dmsuehir <dina.s.jones@intel.com>
dmsuehir committed Aug 23, 2024
1 parent 4f3be23 commit c25063f
Showing 2 changed files with 8 additions and 2 deletions.
8 changes: 6 additions & 2 deletions CodeGen/kubernetes/manifests/README.md
@@ -6,7 +6,8 @@
> You can also customize the "MODEL_ID" if needed.
> You need to make sure you have created the directory `/mnt/opea-models` to save the cached model on the node where the CodeGEn workload is running. Otherwise, you need to modify the `codegen.yaml` file to change the `model-volume` to a directory that exists on the node.
> You need to make sure you have created the directory `/mnt/opea-models` to save the cached model on the node where the CodeGen workload is running. Otherwise, you need to modify the `codegen.yaml` file to change the `model-volume` to a directory that exists on the node.
> Alternatively, you can change the `codegen.yaml` to use a different type of volume, such as a persistent volume claim.
## Deploy On Xeon

@@ -30,10 +31,13 @@ kubectl apply -f codegen.yaml

To verify the installation, run the command `kubectl get pod` to make sure all pods are running.

Then run the command `kubectl port-forward svc/codegen 7778:7778` to expose the CodeGEn service for access.
Then run the command `kubectl port-forward svc/codegen 7778:7778` to expose the CodeGen service for access.

Open another terminal and run the following command to verify the service if working:

> Note that it may take a couple of minutes for the service to be ready. If the `curl` command below fails, you
> can check the logs of the codegen-tgi pod to see its status or check for errors.
```
kubectl get pods
curl http://localhost:7778/v1/codegen -H "Content-Type: application/json" -d '{
2 changes: 2 additions & 0 deletions CodeGen/kubernetes/manifests/gaudi/codegen.yaml
@@ -271,6 +271,8 @@ spec:
resources:
limits:
habana.ai/gaudi: 1
memory: 64Gi
hugepages-2Mi: 500Mi
volumes:
- name: model-volume
hostPath:
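
The README change in this commit notes that `model-volume` can use a different volume type, such as a persistent volume claim, instead of a `hostPath` directory. A minimal sketch of that substitution follows. It assumes a PersistentVolumeClaim already exists in the same namespace; the claim name `opea-models-pvc` is hypothetical and is not part of this commit.

```yaml
# Hypothetical replacement for the hostPath-backed model-volume in codegen.yaml.
# Assumes a PersistentVolumeClaim named opea-models-pvc (illustrative name)
# has already been created in the namespace where CodeGen is deployed.
volumes:
  - name: model-volume
    persistentVolumeClaim:
      claimName: opea-models-pvc
```

With a claim-backed volume, the cached model no longer depends on `/mnt/opea-models` existing on whichever node the pod is scheduled to.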
