volatilemolotov

This PR adds an AI starter kit Helm chart that aims to provide an out-of-the-box development solution for AI workloads. It uses RayServe, Ollama, or Ramalama to run the LLMs and JupyterHub for development.

@k8s-ci-robot added the do-not-merge/work-in-progress (indicates that a PR should not merge because it is a work in progress) and cncf-cla: yes (indicates the PR's author has signed the CNCF CLA) labels on Sep 24, 2025
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: volatilemolotov
Once this PR has been reviewed and has the lgtm label, please assign soltysh for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot added the size/XXL label (denotes a PR that changes 1000+ lines, ignoring generated files) on Sep 24, 2025
@volatilemolotov
Author

Here is the initial PR, currently in draft state. I think we should be able to send it for review.

@janetkuo @gongmax @fcabrera23

Member

@janetkuo left a comment

I'm not sure if we want Cloud Build and Terraform as prerequisites. Suggest making this example more generic, like other AI examples. I'd like to focus on the Kubernetes manifests and make it customizable for different platforms.

@volatilemolotov
Author

> I'm not sure if we want Cloud Build and Terraform as prerequisites. Suggest making this example more generic, like other AI examples. I'd like to focus on the Kubernetes manifests and make it customizable for different platforms.

Removed the example values and the ci folder. I hope the Makefile can stay; it can be useful.

--mount --mount-string="/tmp/models-cache:/tmp/models-cache"
```

2. **Install the chart:**

It would be better to clarify which directory this command should be run from.


2. **Install the chart:**
```bash
helm install ai-starter-kit . \

It looks like `helm dependency build` needs to be run before `helm install`.


1. **Start minikube with persistent storage:**
```bash
minikube start --cpus 4 --memory 15000 \

Is a step to create the models-cache folder missing? For example: `mkdir -p /tmp/models-cache`

@@ -0,0 +1,104 @@
{

What is the purpose of this notebook?

@@ -0,0 +1,798 @@
{

Please remove all the "outputs" blobs in this file.

-f values.yaml
```

3. **Access JupyterHub:**

Which notebooks are runnable in the minikube environment? I assume the only one that works is chat_bot? Can we add a description to each notebook stating which environments it can run in?

Comment on lines +85 to +88
helm install ai-starter-kit . \
--set huggingface.token="YOUR_HF_TOKEN" \
-f values.yaml \
-f values-gke.yaml

@janetkuo do you have concerns about including the GKE-specific setup in the example? Do you think we should remove all of it?
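
One possible middle ground, sketched under assumptions: keep the base install generic and append a platform overlay such as values-gke.yaml only when it is requested and present. The function name `build_values_flags` and its platform argument are hypothetical and not part of the chart; the `helm` commands appear only as comments because they need a cluster.

```shell
#!/bin/sh
# Hypothetical helper (not part of the chart): print the -f flags for
# `helm install`, adding values-<platform>.yaml only when that file exists.
build_values_flags() {
  chart_dir="$1"
  platform="${2:-}"
  flags="-f ${chart_dir}/values.yaml"
  if [ -n "$platform" ] && [ -f "${chart_dir}/values-${platform}.yaml" ]; then
    flags="${flags} -f ${chart_dir}/values-${platform}.yaml"
  fi
  printf '%s\n' "$flags"
}

# Usage (run from the chart directory; requires helm and a cluster):
#   helm dependency build .
#   helm install ai-starter-kit . --set huggingface.token="YOUR_HF_TOKEN" $(build_values_flags . gke)
```

With this shape, a platform without an overlay file falls back to values.yaml alone, so the same instructions would apply to minikube and to managed clusters.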

helm install ai-starter-kit . \
--set huggingface.token="YOUR_HF_TOKEN" \
-f values.yaml \
-f values-gke.yaml

For all the `helm install` instructions for GKE Autopilot here, I got this error: admission webhook "warden-validating.common-webhooks.networking.gke.io" denied the request: GKE Warden rejected the request because it violates one or more constraints. Violations details: {"[denied by autopilot-persistent-volume-limitation]":["Persistent Volume sources can only be of types [csi nfs gcePersistentDisk] within Autopilot."]}
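
To narrow down which manifest triggers this rejection, one option might be to render the chart locally and search for volume sources outside the allowed list. The sketch below is a crude text search, not a YAML parser. The function name `find_unsupported_pv_sources` is hypothetical, and `hostPath` and `local` are only assumed candidates, since the error lists the allowed types but not the offending one.

```shell
#!/bin/sh
# Hypothetical helper: read rendered manifests on stdin and print, with line
# numbers, any lines that declare a hostPath or local volume source.
# Assumption: these are likely offenders; the error message only names the
# allowed types (csi, nfs, gcePersistentDisk).
find_unsupported_pv_sources() {
  # `|| true` keeps the exit status at 0 when nothing matches.
  grep -nE '^[[:space:]]*(hostPath|local):' || true
}

# Usage (run from the chart directory; requires helm):
#   helm template ai-starter-kit . -f values.yaml -f values-gke.yaml | find_unsupported_pv_sources
```

Matches can also come from Pod volumes or unrelated keys named `local`, so each hit would still need to be checked against the PersistentVolume definitions by hand.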
