Minor fixes for CodeGen Xeon and Gaudi Kubernetes codegen.yaml and doc updates #613

Merged
8 changes: 6 additions & 2 deletions CodeGen/kubernetes/manifests/README.md
@@ -6,7 +6,8 @@

> You can also customize the "MODEL_ID" if needed.

-> You need to make sure you have created the directory `/mnt/opea-models` to save the cached model on the node where the CodeGEn workload is running. Otherwise, you need to modify the `codegen.yaml` file to change the `model-volume` to a directory that exists on the node.
+> You need to make sure you have created the directory `/mnt/opea-models` to save the cached model on the node where the CodeGen workload is running. Otherwise, you need to modify the `codegen.yaml` file to change the `model-volume` to a directory that exists on the node.
+> Alternatively, you can change the `codegen.yaml` to use a different type of volume, such as a persistent volume claim.
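
As a rough illustration of the persistent volume claim alternative mentioned above, the sketch below shows one way the `model-volume` entry might be rewired. The claim name `opea-models-pvc`, the access mode, and the `50Gi` size are assumptions made for this example and do not come from the PR; only the `model-volume` name is taken from `codegen.yaml`.

```yaml
# Hypothetical claim: the name, access mode, and size are illustrative assumptions.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: opea-models-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 50Gi
---
# Fragment of a pod spec (not a complete manifest): the volume that would
# replace the hostPath-backed `model-volume` entry.
volumes:
  - name: model-volume
    persistentVolumeClaim:
      claimName: opea-models-pvc
```

A claim keeps the cached model independent of any single node's filesystem, at the cost of needing a storage class or a pre-provisioned volume in the cluster.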

## Deploy On Xeon

@@ -30,10 +31,13 @@ kubectl apply -f codegen.yaml

To verify the installation, run the command `kubectl get pod` to make sure all pods are running.

-Then run the command `kubectl port-forward svc/codegen 7778:7778` to expose the CodeGEn service for access.
+Then run the command `kubectl port-forward svc/codegen 7778:7778` to expose the CodeGen service for access.

Open another terminal and run the following command to verify the service if working:

+> Note that it may take a couple of minutes for the service to be ready. If the `curl` command below fails, you
+> can check the logs of the codegen-tgi pod to see its status or check for errors.

```
kubectl get pods
curl http://localhost:7778/v1/codegen -H "Content-Type: application/json" -d '{
2 changes: 2 additions & 0 deletions CodeGen/kubernetes/manifests/gaudi/codegen.yaml
@@ -271,6 +271,8 @@ spec:
resources:
limits:
habana.ai/gaudi: 1
+memory: 64Gi
+hugepages-2Mi: 500Mi
volumes:
- name: model-volume
hostPath: