docs: dynamo cloud 0.4.1 k8s deployment on minikube #2990
Signed-off-by: DFatadeNVIDIA <dfatade@nvidia.com>
athreesh
left a comment
LGTM, request a double check on Istio/NGINX re: slack message
would be sick if you were able to create a Brev Launchable for this 👀
@julienmancuso @nealvaidya mind taking a look at this PR as well?
Walkthrough
Adds a new README detailing end-to-end steps to deploy Dynamo Cloud on Minikube via Helm: prerequisites, optional GPU setup, Ingress/Istio, CRD and platform chart installs from NGC, secrets creation, deploying a DynamoGraphDeployment, exposing via Ingress, sample request, and cleanup.
Sequence Diagram(s)
sequenceDiagram
autonumber
actor Dev as Developer
participant MK as Minikube Cluster
participant Helm as Helm (NGC repo)
participant K8s as Kubernetes API
participant Dyn as Dynamo Platform
participant GW as Ingress/Istio GW
participant Client as Client (curl)
Dev->>MK: Start Minikube (+ optional GPU config)
Dev->>K8s: Configure StorageClass, verify
Dev->>K8s: Install Ingress / Istio components (optional)
Dev->>Helm: Add/fetch CRD & platform charts (NGC)
Dev->>K8s: Create secrets (NGC pull, HF token)
Dev->>K8s: helm install dynamo-crds
Dev->>K8s: helm install dynamo-platform
Note over K8s,Dyn: Dynamo controllers/operators and services become Ready
Dev->>K8s: kubectl apply DynamoGraphDeployment (model args)
K8s->>Dyn: Reconcile graph deployment
Dyn->>K8s: Create Pods/Services for inference
Dev->>K8s: Apply Ingress for frontend
K8s->>GW: Route external traffic
Client->>GW: HTTP request (/v1/chat/completions)
GW->>Dyn: Forward to inference service
Dyn-->>Client: Response
rect rgba(230,245,255,0.6)
Note over Dev,K8s: Cleanup: delete graph, remove ingress, uninstall charts
end
Estimated code review effort
🎯 1 (Trivial) | ⏱️ ~3 minutes
Possibly related PRs
Pre-merge checks (2 passed, 1 warning)
❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
Actionable comments posted: 5
🧹 Nitpick comments (12)
examples/deployments/minikube/README.md (12)
5-8: Fix grammar and remove duplicated bullet.
Use “its” (not “it’s”), pluralize “CRDs” (no apostrophe), capitalize consistently, and drop the repeated “managed deployment” bullet.
- - Contains the infrastructure components required for the Dynamo cloud platform
- - Leverage the Dynamo Operator and it's exposed CRD's to deploy Dynamo inference graphs
- - Provides a managed deployment experience
+ - Contains the infrastructure components required for the Dynamo Cloud platform.
+ - Leverages the Dynamo Operator and its exposed CRDs to deploy Dynamo inference graphs.
19-24: Link to the “general prerequisites”.
Please add a concrete link to the canonical prerequisites doc so readers don’t guess.
55-60: Caution: unmounting /proc/driver/nvidia is risky and unexplained.
Explain why this is needed, when to use it, and how to revert. Otherwise remove to avoid breaking host GPU visibility.
151-166: Typos and clarity.
- “buidling” → “building”.
- Tighten wording about image/tag parity with charts.
-Dynamo also supports buidling container runtimes from source and uploading them to a private registry.
+Dynamo also supports building container runtimes from source and uploading them to a private registry.
258-266: Sample output should reflect chosen CRDS_VERSION.
Update the example to avoid confusion (shows 0.4.0 right now).
270-296: Add --wait/--atomic to platform install; consider imagePullSecrets for all subcharts.
Add wait flags; if nats/etcd images are private, document how to pass imagePullSecrets via values.
 helm install dynamo-platform dynamo-platform-${RELEASE_VERSION}.tgz \
   --set "dynamo-operator.imagePullSecrets[0].name=nvcrimagepullsecret" \
-  --namespace ${NAMESPACE}
+  --namespace ${NAMESPACE} \
+  --wait --atomic
311-314: Typos and path clarity.
- “args commmand” → “args command”.
- Consider showing the exact YAML snippet to edit (extraPodSpec.mainContainer.args) to minimize user error.
350-371: Optional: add ingress annotations for larger bodies/timeouts.
If users send bigger prompts, consider adding NGINX annotations (proxy-body-size, proxy-read-timeout).
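As a rough illustration of the annotation suggestion above, the sketch below only prints a `kubectl annotate` command so it can be reviewed before being run. The helper name `nginx_annotate_cmd`, the Ingress name, and the sample values are assumptions made for this example and do not come from the README; the two annotation keys are the standard ingress-nginx ones.

```shell
# Hypothetical helper (not part of the README): print the kubectl command that
# would set the two ingress-nginx annotations on an existing Ingress.
#   $1 = Ingress name (assumed), $2 = max request body size (e.g. 8m),
#   $3 = proxy read timeout in seconds (e.g. 600)
nginx_annotate_cmd() {
  printf 'kubectl annotate ingress %s --overwrite %s=%s %s=%s\n' \
    "$1" \
    "nginx.ingress.kubernetes.io/proxy-body-size" "$2" \
    "nginx.ingress.kubernetes.io/proxy-read-timeout" "$3"
}

# Example (prints the command only; nothing is applied to the cluster):
#   nginx_annotate_cmd my-frontend-ingress 8m 600
```

Printing rather than executing keeps the sketch safe to paste; the same two keys could equally be placed under `metadata.annotations` in the Ingress manifest.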
380-383: Grammar fix (“its” not “it’s”).
-Once the ingress resource has been created, make sure to add the entry along with it's address
+Once the ingress resource has been created, make sure to add the entry along with its address
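For readers unsure what "add the entry along with its address" looks like in practice, here is a minimal sketch that formats one hosts-file line. The helper name `hosts_entry` is hypothetical, and treating the output of `minikube ip` as the ingress address is an assumption that may not hold for every Minikube driver.

```shell
# Hypothetical helper (not part of the README): format a single hosts-file line.
#   $1 = ingress address, $2 = host name used in the Ingress rule
hosts_entry() {
  printf '%s %s\n' "$1" "$2"
}

# Usage on a machine with a running Minikube cluster (not executed here):
#   hosts_entry "$(minikube ip)" dynamo-vllm-agg-router.test | sudo tee -a /etc/hosts
```

Checking the ADDRESS column of `kubectl get ingress` first is a reasonable way to confirm which IP to use.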
387-397: Prefer a minimal curl example.
Long payload makes copy/paste unwieldy. Suggest a short prompt; also format JSON with stream: false (space for readability).
-curl http://dynamo-vllm-agg-router.test/v1/chat/completions -H "Content-Type: application/json" -d '{
-    "model": "Qwen/Qwen3-0.6B",
-    "messages": [
-    {
-        "role": "user",
-        "content": "In the heart of Eldoria, ...
-    }
-    ],
-    "stream":false,
-    "max_tokens": 100
-  }'
+curl http://dynamo-vllm-agg-router.test/v1/chat/completions \
+  -H "Content-Type: application/json" \
+  -d '{
+    "model": "Qwen/Qwen3-0.6B",
+    "messages": [{"role": "user", "content": "Say hello from Dynamo on Minikube."}],
+    "stream": false,
+    "max_tokens": 64
+  }'
404-426: Add cleanup for namespace and secrets (optional).
Many users will want to fully tear down the environment.
 # uninstall dynamo CRD's
 helm uninstall dynamo-crds -n default
+
+# (Optional) remove namespace and secrets created in this guide
+kubectl delete namespace ${NAMESPACE}
82-122: Potential ingress overlap: NGINX vs Istio IngressGateway.
You enable both NGINX Ingress and Istio. Clarify which gateway fronts traffic, or note potential port overlaps and how to disable one if needed.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
examples/deployments/minikube/README.md (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Build and Test - dynamo
🔇 Additional comments (2)
examples/deployments/minikube/README.md (2)
86-98: Ingress addon section looks good.
210-216: Chart versioning: CRDs and platform versions must be consistent and parameterized.
You mix RELEASE_VERSION=0.4.1 with a hard-coded CRDs 0.4.0 later. Introduce a CRDS_VERSION (in case CRDs are versioned independently) and use helm “pull” consistently.
-# set release version
-export RELEASE_VERSION=0.4.1
+# Set versions (CRDs may differ from platform)
+export RELEASE_VERSION=0.4.1
+export CRDS_VERSION=0.4.1
Likely an incorrect or invalid review comment.
I'm all in for a Brev Launchable personally 👀
NGINX would be required here just for exposing the service. I don't think we need both NGINX and Istio; I'll do a quick run and double-check on my end.
Signed-off-by: DFatadeNVIDIA <dfatade@nvidia.com>
Signed-off-by: DFatadeNVIDIA <dfatade@nvidia.com>
Thank you for the contribution @DFatadeNVIDIA. We already have: I'd prefer if you added any missing information/documentation there and/or linked out to the pre-existing documentation.
Hey @tmonty12 - I hope you're doing well and thanks for taking the time to review and share feedback! That's good to know. I looked through all the files you linked and don't think this will add much value given what's already present in the repo. I'll get this MR closed out but happy to revisit if there are any other example needs in the future.
Overview:
This document covers the process of deploying Dynamo Cloud and running inference in a vLLM distributed runtime within a Kubernetes environment. The Dynamo Cloud Platform provides a managed deployment experience.
Details:
This overview covers the setup process on a Minikube instance, including:
Where should the reviewer start?
examples/deployments/minikube/README.md to review instructions for setting up Dynamo Cloud platform and deploying a vLLM inference graph.
Summary by CodeRabbit