-
Notifications
You must be signed in to change notification settings - Fork 690
fix: bug 5501463 Dockerfile run permission problems #2881
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: bug 5501463 Dockerfile run permission problems #2881
Conversation
|
This can't be merged to |
WalkthroughRenames the dev build target from local-dev to dev across Dockerfile, devcontainer config, and scripts. Adds --user to run.sh. Updates vLLM handler to default prefill off when absent and adds a prefill path. Simplifies Helm deployment by removing scripts, bumping chart versions, and revising docs. Adds streaming/cancellation to hello_world example. Changes
Sequence Diagram(s)sequenceDiagram
autonumber
participant Client
participant DecodeHandler as DecodeWorkerHandler
participant PrefillClient as PrefillWorkerClient
participant Engine as DecodeEngine
Note over DecodeHandler: generate(request, sampling_params)
DecodeHandler->>DecodeHandler: _prefill_check_loop()
alt Prefill allowed
DecodeHandler->>PrefillClient: round_robin(prefill_request)
PrefillClient-->>DecodeHandler: prefill_response (kv_transfer_params)
DecodeHandler->>DecodeHandler: sampling_params.extra_args.kv_transfer_params = response.kv_transfer_params
else Prefill not allowed or client None
Note over DecodeHandler: Proceed without prefill
end
loop Stream tokens
DecodeHandler->>Engine: generate_tokens(...)
Engine-->>DecodeHandler: token/update
DecodeHandler-->>Client: stream token
end
rect rgba(255,230,230,0.5)
Note over DecodeHandler,Engine: On EngineDeadError → shutdown path (unchanged)
end
sequenceDiagram
autonumber
participant User
participant Endpoint as content_generator()
participant Ctx as Context
User->>Endpoint: request (string)
loop For each word
Endpoint->>Endpoint: await sleep(1s)
alt Cancelled?
Endpoint->>Ctx: is_stopped()/is_killed()
Ctx-->>Endpoint: true
Endpoint-->>User: return (stop streaming)
else Not cancelled
Endpoint-->>User: yield "Hello {word}!"
end
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
Poem
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. CodeRabbit Commands (Invoked using PR/Issue comments)Type Other keywords and placeholders
Status, Documentation and Community
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 8
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (4)
.devcontainer/README.md (1)
381-382: Fix stale troubleshooting command.Replace
--target local-devwith--target dev.- ./container/build.sh --target local-dev + ./container/build.sh --target devdocs/benchmarks/pre_deployment_profiling.md (1)
61-61: Fix path typo: stray closing brace in filename.The trailing
}makes the example path invalid.-* `${benchmark_result_dir}/selected_prefill_interpolation/raw_data.npz}` +* `${benchmark_result_dir}/selected_prefill_interpolation/raw_data.npz`benchmarks/profiler/README.md (1)
61-61: Same path typo here: remove the extra brace.Mirror the fix from the main doc.
-* `${benchmark_result_dir}/selected_prefill_interpolation/raw_data.npz}` +* `${benchmark_result_dir}/selected_prefill_interpolation/raw_data.npz`docs/guides/dynamo_deploy/installation_guide.md (1)
56-69: Use ‘helm pull’ instead of ‘helm fetch’.Align with Helm v3 terminology.
-helm fetch https://helm.ngc.nvidia.com/nvidia/ai-dynamo/charts/dynamo-crds-${RELEASE_VERSION}.tgz +helm pull https://helm.ngc.nvidia.com/nvidia/ai-dynamo/charts/dynamo-crds-${RELEASE_VERSION}.tgz helm install dynamo-crds dynamo-crds-${RELEASE_VERSION}.tgz --namespace default @@ -helm fetch https://helm.ngc.nvidia.com/nvidia/ai-dynamo/charts/dynamo-platform-${RELEASE_VERSION}.tgz +helm pull https://helm.ngc.nvidia.com/nvidia/ai-dynamo/charts/dynamo-platform-${RELEASE_VERSION}.tgz helm install dynamo-platform dynamo-platform-${RELEASE_VERSION}.tgz --namespace ${NAMESPACE}
🧹 Nitpick comments (28)
.devcontainer/devcontainer.json (1)
10-10: Fix comment to match new image naming (no longer "local dev").The comment says "latest VLLM local dev image" but the image is now "dynamo:latest-vllm". Update the comment for clarity.
- "image": "dynamo:latest-vllm", // Use the latest VLLM local dev image + "image": "dynamo:latest-vllm", // Use the latest VLLM dev image.devcontainer/README.md (1)
44-46: Update diagram to reflect new image tag.Change
dynamo:latest-vllm-local-dev→dynamo:latest-vllm.- IMAGE["Docker Image<br/>dynamo:latest-vllm-local-dev"] + IMAGE["Docker Image<br/>dynamo:latest-vllm"]container/run.sh (5)
46-46: Avoid shadowing shell’s $USER; use a distinct variable name.Using
USERcan conflict with the standard environment variable. Rename toCONTAINER_USER(orRUN_USER) to prevent surprises.-USER= +CONTAINER_USER=
128-135: Rename option storage and keep quoting to avoid word-splitting.Store into
CONTAINER_USERand preserve quoting.- --user) - if [ "$2" ]; then - USER="$2" + --user) + if [ "$2" ]; then + CONTAINER_USER="$2" shift else missing_requirement "$1" fi ;;
286-291: Build user flag safely and clearly.Use the renamed var and quote it.
- if [[ ${USER} == "" ]]; then + if [[ -z "${CONTAINER_USER}" ]]; then USER_STRING="" else - USER_STRING="--user ${USER}" + USER_STRING="--user ${CONTAINER_USER}" fi
328-329: Help text: clarify accepted formats.Mention UID:GID and username to match PR description.
- echo " [--user override the user for running the container]" + echo " [--user override the user (username or UID[:GID]) for running the container]"
239-265: HF cache mount may mismatch when running as non-root.When
--useris set to a non-root user, mounting to/root/.cache/huggingfacecan cause permission or path mismatches. Consider honoringHF_HOME(and setting it) or mounting to the effective user’s cache dir when--useris provided.- VOLUME_MOUNTS+=" -v $HF_CACHE:/root/.cache/huggingface" + # If a custom container user is requested, prefer HF_HOME under that user's home + if [ -n "${CONTAINER_USER}" ]; then + ENVIRONMENT_VARIABLES+=" -e HF_HOME=/home/ubuntu/.cache/huggingface" + VOLUME_MOUNTS+=" -v $HF_CACHE:/home/ubuntu/.cache/huggingface" + else + VOLUME_MOUNTS+=" -v $HF_CACHE:/root/.cache/huggingface" + fidocs/benchmarks/pre_deployment_profiling.md (1)
7-9: Good clarity on time expectations.Helpful note. Consider adding “duration varies by model size, GPU type, and TP search space” to set expectations.
> **Time Investment**: This profiling process is comprehensive and typically takes **a few hours** to complete. The script systematically tests multiple tensor parallelism configurations and load conditions to find optimal performance settings. This upfront investment ensures your deployment meets SLA requirements and operates efficiently. +> Actual duration varies by model size, GPU type, and the TP search space you enable.components/backends/vllm/src/dynamo/vllm/handlers.py (6)
152-164: Primecan_prefillat init and tighten the guard to avoid race on first requests.First requests within the initial 5s window may skip prefill even when workers exist. Also, be explicit about
prefill_worker_clientpresence.Apply:
@@ def __init__(...): - if self.prefill_worker_client is not None: - self._prefill_check_task = asyncio.create_task(self._prefill_check_loop()) + if self.prefill_worker_client is not None: + try: + self.can_prefill = len(self.prefill_worker_client.instance_ids()) + except Exception: + self.can_prefill = 0 + self._prefill_check_task = asyncio.create_task(self._prefill_check_loop()) @@ async def generate(...): - if self.can_prefill: + if self.prefill_worker_client is not None and self.can_prefill > 0: # Create a copy for prefill with specific modifications
171-176: Avoid double-await; add a timeout to prevent hangs.The current
await anext(await round_robin(...))is hard to read and lacks a timeout.- try: - prefill_response = await anext( - await self.prefill_worker_client.round_robin( - prefill_request, context=context - ) - ) + try: + async with asyncio.timeout(5): + rr_gen = await self.prefill_worker_client.round_robin( + prefill_request, context=context + ) + prefill_response = await anext(rr_gen)
177-184: Handle cancellation separately and avoid re-raising opaqueException.Treat
asyncio.CancelledErrorexplicitly, and consider mapping other exceptions to a structured error for upstreams.- except Exception as e: - # TODO: Cancellation does not propagate until the first token is received - if context.is_stopped() or context.is_killed(): - logger.debug(f"Aborted Remote Prefill Request ID: {request_id}") - # TODO: Raise asyncio.CancelledError into bindings - return - raise e + except asyncio.CancelledError: + logger.debug(f"Cancelled Remote Prefill Request ID: {request_id}") + raise + except Exception as e: + if context.is_stopped() or context.is_killed(): + logger.debug(f"Aborted Remote Prefill Request ID: {request_id}") + return + logger.error(f"Remote prefill failed: {e}") + # Fallback to local decode without KV handoff
185-195: Guard against missing/empty KV transfer params.If
kv_transfer_paramsis absent, decoding may later assume its presence.- prefill_response = MyRequestOutput.model_validate_json( - prefill_response.data() - ) + prefill_response = MyRequestOutput.model_validate_json(prefill_response.data()) @@ - sampling_params.extra_args[ - "kv_transfer_params" - ] = prefill_response.kv_transfer_params + kvp = getattr(prefill_response, "kv_transfer_params", None) or {} + if not kvp: + logger.warning(f"No kv_transfer_params from prefill; continuing without KV handoff. request_id={request_id}") + sampling_params.extra_args["kv_transfer_params"] = kvp
129-133: Cancel-and-drain background task to avoid task warnings on shutdown.If the loop raises after cancel, the task may log unhandled exceptions. Drain it when possible.
def cleanup(self): """Cancel background tasks.""" - if self._prefill_check_task is not None: - self._prefill_check_task.cancel() + if self._prefill_check_task is not None: + self._prefill_check_task.cancel() + # Best-effort: schedule a drain without making cleanup async + async def _drain(t): + try: + await t + except asyncio.CancelledError: + pass + except Exception as e: + logger.debug(f"Prefill check task closed with error: {e}") + try: + asyncio.get_running_loop().create_task(_drain(self._prefill_check_task)) + except RuntimeError: + # No running loop; ignore + pass super().cleanup()
206-211: Deduplicate EngineDeadError handling (minor).
generate_tokensalready catchesEngineDeadErrorand exits the process. Consider centralizing handling to one place.docs/guides/dynamo_deploy/metrics.md (1)
10-10: Link fix LGTM; minor typos in this doc.
- Line 23: “podmnonitors” → “PodMonitors”.
- Lines 75–76: sample text has typos (“ests”, “familt”); consider trimming or correcting to avoid distracting copy.
-# Values allow podmnonitors to be picked up that are outside of the kube-prometheus-stack helm release +# Values allow PodMonitors to be picked up outside the kube-prometheus-stack Helm releasedocs/guides/dynamo_deploy/dynamo_operator.md (1)
41-41: Tighten wording and fix minor grammar/link text in this page.Also clean up Line 55 phrasing.
-- A Kubernetes cluster with [Dynamo Cloud](installation_guide.md) installed +- A Kubernetes cluster with the [Dynamo Kubernetes Platform](installation_guide.md) installed-First, follow to [See Install Dynamo Cloud](README.md). +First, see [Install Dynamo Cloud](README.md).deploy/cloud/helm/platform/README.md.gotmpl (1)
60-60: Prefer absolute URL so this link works on Artifact Hub and other renderersRelative paths often break outside the repo context. Consider linking to GitHub (or docs site) directly.
- - [Dynamo Cloud Deployment Installation Guide](../../../../docs/guides/dynamo_deploy/installation_guide.md) + - [Dynamo Cloud Deployment Installation Guide](https://github.com/ai-dynamo/dynamo/blob/main/docs/guides/dynamo_deploy/installation_guide.md)docs/guides/dynamo_deploy/gke_setup.md (1)
148-152: Clarify chart path/working directory for the helm commandMake the path explicit to reduce confusion about where to run the command from.
-You can use it during helm installation: +From the repository root, you can use it during helm installation: @@ -helm upgrade --install ${RELEASE} platform/ -f values.yaml --namespace ${NAMESPACE} +helm upgrade --install ${RELEASE} deploy/cloud/helm/platform/ -f values.yaml --namespace ${NAMESPACE}deploy/cloud/helm/platform/values.yaml (1)
135-137: Pin image tag or document chart-managed tag source.You switched to repository: bitnamilegacy/etcd. If you intend to rely on the chart’s default tag, add a brief comment so operators know where the tag is controlled; otherwise explicitly pin to a version to prevent surprise upgrades.
etdc: image: - repository: bitnamilegacy/etcd + repository: bitnamilegacy/etcd + # tag: 3.5.15-debian-12-rXX # Optional: pin for reproducibilitydeploy/cloud/helm/crds/README.md (1)
18-20: Link/path LGTM; add trailing newline for linters.Content and relative link look correct. Add a newline at EOF to satisfy common markdown/style linters.
docs/guides/dynamo_deploy/README.md (2)
55-59: Namespace inconsistency: reuse the same NAMESPACE set above.Earlier you set NAMESPACE=dynamo-kubernetes. Here it switches to dynamo-cloud, which can confuse readers.
-# Set same namespace from platform install -export NAMESPACE=dynamo-cloud +# Set same namespace from platform install +export NAMESPACE=dynamo-kubernetes
107-134: Align dynamoNamespace values to a single namespace.Mixing my-llm and dynamo-dev can cause misplacement of resources. Use one namespace placeholder consistently.
services: Frontend: - dynamoNamespace: my-llm + dynamoNamespace: my-llm @@ VllmDecodeWorker: # or SGLangDecodeWorker, TrtllmDecodeWorker - dynamoNamespace: dynamo-dev + dynamoNamespace: my-llmdeploy/cloud/helm/README.md (1)
18-23: README looks good; add newline at EOF.Minor formatting nit to appease linters.
docs/guides/dynamo_deploy/installation_guide.md (1)
24-25: Minor grammar tweak.“Platform is installed using … helm chart” → “The platform is installed using the … Helm chart.”
-Platform is installed using Dynamo Kubernetes Platform [helm chart](../../../deploy/cloud/helm/platform/README.md). +The platform is installed using the Dynamo Kubernetes Platform [Helm chart](../../../deploy/cloud/helm/platform/README.md).docs/guides/dynamo_deploy/api_reference.md (1)
1-20: De-duplicate the “auto-generated” warning (header.md already adds it)Since deploy/cloud/operator/docs/header.md is prepended during the Makefile step, this file ends up with two consecutive auto-generation warnings. Keep it in header.md only to avoid redundancy.
Apply this diff to drop the duplicate notice here (the header will still render it):
-> -> **⚠️ Important**: This documentation is automatically generated from source code. -> -> Do not edit this file directly. -deploy/cloud/operator/Makefile (1)
288-294: Harden pathing: ensure target docs directory exists and avoid duplicate warnings
- The relative path ../../../docs/guides/dynamo_deploy/ looks correct from deploy/cloud/operator, but mkdir -p the target dir for robustness on clean clones.
- Given header.md already provides an auto-gen warning, consider removing the duplicate lines from the final api_reference.md (see separate comment). The Makefile step is fine; just avoid committing the duplicate text in the generated file.
Apply this diff to make the target path creation explicit:
@echo "✅ Generated API reference at ./docs/api_reference.md" # concatenate header.md and api_reference.md + @mkdir -p ../../../docs/guides/dynamo_deploy cat docs/header.md ./docs/api_reference.md > ../../../docs/guides/dynamo_deploy/api_reference.md rm ./docs/api_reference.md @echo "✅ Concatenated header.md and api_reference.md"examples/runtime/hello_world/hello_world.py (2)
31-36: Reduce cancel latency; avoidCheck cancel before and after sleep, use logger, and skip empty/whitespace tokens.
Apply:
- for word in request.split(","): - await asyncio.sleep(1) - if context.is_stopped() or context.is_killed(): - print("request got cancelled.") - return - yield f"Hello {word}!" + for raw in request.split(","): + # Early check to avoid ~1s latency on cancel + if context.is_stopped() or context.is_killed(): + logger.info("Request cancelled: id=%s", context.id()) + return + word = raw.strip() + if not word: + continue + await asyncio.sleep(1) + # Late check in case cancel occurred during sleep + if context.is_stopped() or context.is_killed(): + logger.info("Request cancelled after sleep: id=%s", context.id()) + return + yield f"Hello {word}!"
44-45: Harden against missing etcd client (static workers).Avoid potential
AttributeErrorifetcd_client()returnsNone.- lease_id = runtime.etcd_client().primary_lease_id() + etcd = runtime.etcd_client() + lease_id = etcd.primary_lease_id() if etcd else "N/A"
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (35)
.devcontainer/README.md(1 hunks).devcontainer/devcontainer.json(1 hunks)benchmarks/profiler/README.md(1 hunks)components/backends/vllm/src/dynamo/vllm/handlers.py(2 hunks)container/Dockerfile.vllm(1 hunks)container/build.sh(1 hunks)container/run.sh(5 hunks)deploy/README.md(0 hunks)deploy/README.md(1 hunks)deploy/cloud/helm/README.md(1 hunks)deploy/cloud/helm/crds/Chart.yaml(1 hunks)deploy/cloud/helm/crds/README.md(1 hunks)deploy/cloud/helm/deploy.sh(0 hunks)deploy/cloud/helm/dynamo-platform-values.yaml(0 hunks)deploy/cloud/helm/network-config-wizard.sh(0 hunks)deploy/cloud/helm/platform/README.md(1 hunks)deploy/cloud/helm/platform/README.md.gotmpl(1 hunks)deploy/cloud/helm/platform/values.yaml(1 hunks)deploy/cloud/operator/Makefile(1 hunks)deploy/cloud/operator/README.md(1 hunks)deploy/cloud/operator/docs/header.md(1 hunks)deploy/helm/chart/Chart.yaml(1 hunks)docs/architecture/sla_planner.md(1 hunks)docs/benchmarks/pre_deployment_profiling.md(1 hunks)docs/guides/dynamo_deploy/README.md(3 hunks)docs/guides/dynamo_deploy/api_reference.md(1 hunks)docs/guides/dynamo_deploy/dynamo_operator.md(2 hunks)docs/guides/dynamo_deploy/gke_setup.md(1 hunks)docs/guides/dynamo_deploy/grove.md(1 hunks)docs/guides/dynamo_deploy/installation_guide.md(4 hunks)docs/guides/dynamo_deploy/metrics.md(1 hunks)docs/guides/dynamo_deploy/minikube.md(1 hunks)docs/guides/dynamo_deploy/sla_planner_deployment.md(1 hunks)docs/index.rst(1 hunks)examples/runtime/hello_world/hello_world.py(1 hunks)
💤 Files with no reviewable changes (3)
- deploy/cloud/helm/deploy.sh
- deploy/cloud/helm/dynamo-platform-values.yaml
- deploy/cloud/helm/network-config-wizard.sh
🧰 Additional context used
🧠 Learnings (8)
📓 Common learnings
Learnt from: keivenchang
PR: ai-dynamo/dynamo#2822
File: container/Dockerfile.vllm:343-352
Timestamp: 2025-09-03T01:10:12.599Z
Learning: In the dynamo project's local-dev Docker targets, USER_UID and USER_GID build args are intentionally left without default values to force explicit UID/GID mapping during build time, preventing file permission issues in local development environments where container users need to match host user permissions for mounted volumes.
Learnt from: julienmancuso
PR: ai-dynamo/dynamo#1474
File: deploy/cloud/operator/internal/controller/dynamocomponent_controller.go:1302-1306
Timestamp: 2025-06-11T21:18:00.425Z
Learning: In the Dynamo operator, the project’s preferred security posture is to set a Pod-level `PodSecurityContext` with `runAsUser`, `runAsGroup`, and `fsGroup` all set to `1000`, and then selectively override the user at the individual container level (e.g., `RunAsUser: 0` for Kaniko) when root is required.
📚 Learning: 2025-08-30T20:43:10.091Z
Learnt from: keivenchang
PR: ai-dynamo/dynamo#2797
File: .devcontainer/devcontainer.json:12-12
Timestamp: 2025-08-30T20:43:10.091Z
Learning: In the dynamo project, devcontainer.json files use templated container names (like "dynamo-vllm-devcontainer") that are automatically processed by the copy_devcontainer.sh script to generate framework-specific configurations with unique names, preventing container name collisions.
Applied to files:
.devcontainer/devcontainer.json.devcontainer/README.md
📚 Learning: 2025-08-30T20:43:10.091Z
Learnt from: keivenchang
PR: ai-dynamo/dynamo#2797
File: .devcontainer/devcontainer.json:12-12
Timestamp: 2025-08-30T20:43:10.091Z
Learning: In the dynamo project's devcontainer setup, hard-coded container names in devcontainer.json files serve as templates that are automatically processed by the copy_devcontainer.sh script to generate framework-specific configurations with unique names, preventing container name collisions.
Applied to files:
.devcontainer/devcontainer.json.devcontainer/README.md
📚 Learning: 2025-08-30T20:43:49.632Z
Learnt from: keivenchang
PR: ai-dynamo/dynamo#2797
File: container/Dockerfile:437-449
Timestamp: 2025-08-30T20:43:49.632Z
Learning: In the dynamo project's devcontainer setup, the team prioritizes consistency across framework-specific Dockerfiles (like container/Dockerfile, container/Dockerfile.vllm, etc.) by mirroring their structure, even when individual optimizations might be possible, to maintain uniformity in the development environment setup.
Applied to files:
.devcontainer/devcontainer.json.devcontainer/README.mdcontainer/Dockerfile.vllm
📚 Learning: 2025-09-03T01:10:12.599Z
Learnt from: keivenchang
PR: ai-dynamo/dynamo#2822
File: container/Dockerfile.vllm:343-352
Timestamp: 2025-09-03T01:10:12.599Z
Learning: In the dynamo project's local-dev Docker targets, USER_UID and USER_GID build args are intentionally left without default values to force explicit UID/GID mapping during build time, preventing file permission issues in local development environments where container users need to match host user permissions for mounted volumes.
Applied to files:
.devcontainer/devcontainer.jsoncontainer/build.sh.devcontainer/README.md
📚 Learning: 2025-07-18T16:04:31.771Z
Learnt from: julienmancuso
PR: ai-dynamo/dynamo#2012
File: deploy/cloud/helm/crds/templates/nvidia.com_dynamocomponentdeployments.yaml:92-98
Timestamp: 2025-07-18T16:04:31.771Z
Learning: CRD schemas in files like deploy/cloud/helm/crds/templates/*.yaml are auto-generated from Kubernetes library upgrades and should not be manually modified as changes would be overwritten during regeneration.
Applied to files:
deploy/cloud/helm/crds/README.mddeploy/cloud/helm/crds/Chart.yaml
📚 Learning: 2025-09-03T21:02:02.170Z
Learnt from: julienmancuso
PR: ai-dynamo/dynamo#2843
File: deploy/cloud/helm/platform/values.yaml:0-0
Timestamp: 2025-09-03T21:02:02.170Z
Learning: With etcd chart version 12.0.18, the global.security.allowInsecureImages setting is not required when using the bitnamilegacy repository, unlike older chart versions.
Applied to files:
deploy/cloud/helm/platform/values.yaml
📚 Learning: 2025-09-04T19:03:06.595Z
Learnt from: biswapanda
PR: ai-dynamo/dynamo#2872
File: examples/multimodal/deploy/agg_qwen.yaml:53-60
Timestamp: 2025-09-04T19:03:06.595Z
Learning: In the dynamo repository, Kubernetes Custom Resources use `gpu: "1"` format for GPU resource limits and requests, not the standard Kubernetes `nvidia.com/gpu: 1` format. This applies to DynamoGraphDeployment resources and other dynamo CRs.
Applied to files:
docs/guides/dynamo_deploy/README.md
🧬 Code graph analysis (2)
examples/runtime/hello_world/hello_world.py (2)
lib/bindings/python/src/dynamo/_core.pyi (1)
DistributedRuntime(31-54)lib/bindings/python/src/dynamo/runtime/__init__.py (2)
dynamo_endpoint(65-97)dynamo_worker(36-62)
container/run.sh (1)
container/build.sh (1)
missing_requirement(435-437)
🪛 LanguageTool
docs/guides/dynamo_deploy/dynamo_operator.md
[grammar] ~41-~41: There might be a mistake here.
Context: ... Cloud](installation_guide.md) installed - [FluxCD](https://fluxcd.io/flux/installat...
(QB_NEW_EN)
deploy/cloud/helm/platform/README.md
[grammar] ~104-~104: There might be a mistake here.
Context: ...namo Cloud Deployment Installation Guide](../../../../docs/guides/dynamo_deploy/installation_guide.md) - [NATS Documentation](https://docs.nats.io...
(QB_NEW_EN)
docs/guides/dynamo_deploy/README.md
[grammar] ~45-~45: There might be a mistake here.
Context: ... | Backend | Available Configurations | |---------|--------------------------| |...
(QB_NEW_EN)
[grammar] ~46-~46: There might be a mistake here.
Context: ...| |---------|--------------------------| | **[vLLM](/components/backends/vllm/dep...
(QB_NEW_EN)
[grammar] ~47-~47: There might be a mistake here.
Context: ...ated + Router, Disaggregated + Planner | | **[SGLang](/components/backends/sglang...
(QB_NEW_EN)
[grammar] ~48-~48: There might be a mistake here.
Context: ...ed + Planner, Disaggregated Multi-node | | **[TensorRT-LLM](/components/backends/...
(QB_NEW_EN)
[grammar] ~82-~82: There might be a mistake here.
Context: ...cations for DynamoGraphDeployment and DynamoComponentDeployment - **[Operator Guide](/docs/guides/dynamo_depl...
(QB_NEW_EN)
[grammar] ~83-~83: There might be a mistake here.
Context: ...mo operator configuration and management - **[Create Deployment](/docs/guides/dynamo_d...
(QB_NEW_EN)
docs/guides/dynamo_deploy/installation_guide.md
[grammar] ~24-~24: There might be a mistake here.
Context: ...uick Start Paths Platform is installed using Dynamo Kubernetes Platform [helm chart]...
(QB_NEW_EN)
🔇 Additional comments (19)
container/Dockerfile.vllm (1)
297-297: Stage rename LGTM.Switching
local-dev→devaligns with the rest of the PR and keeps target naming consistent..devcontainer/README.md (1)
94-98: Docs update LGTM, but ensure consistency across the page.Step 1 now references
--target dev, which is correct.container/run.sh (1)
366-366: Docker run integration LGTM.
${USER_STRING}is correctly injected and remains empty when not set.benchmarks/profiler/README.md (1)
1-1: Documentation link verified
Confirmed thatdocs/benchmarks/pre_deployment_profiling.mdexists in-tree.docs/architecture/sla_planner.md (1)
31-34: Clear prerequisite statement and cross-link.LGTM. This aligns with operational reality of SLA planner depending on profiled artifacts.
docs/guides/dynamo_deploy/sla_planner_deployment.md (1)
26-26: Prereq phrasing is precise.Good to specify that results live in
dynamo-pvc.components/backends/vllm/src/dynamo/vllm/handlers.py (2)
119-121: LGTM: explicit reset when no prefill client.Setting
can_prefill = 0in theelsebranch avoids stale values. No further action.
1-10: Remove scope-mismatch comment for handlers.py
File components/backends/vllm/src/dynamo/vllm/handlers.py isn’t modified in this PR; only container/run.sh was changed.Likely an incorrect or invalid review comment.
deploy/cloud/helm/crds/Chart.yaml (1)
19-19: CRDs chart version bump looks good.No issues spotted. Ensure release notes reference 0.5.0 for CRD changes.
deploy/helm/chart/Chart.yaml (1)
20-21: Update all lingering 0.4.1 references in documentation to 0.5.0
- docs/support_matrix.md (lines 63, 71): bump NIXL entries from 0.4.1 to 0.5.0
- docs/benchmarks/pre_deployment_profiling.md (line 109): update DOCKER_IMAGE tag to vllm-runtime:0.5.0
- docs/_includes/quick_start_local.rst (lines 13, 20): change
ai-dynamo[sglang]==0.4.1and GitHub release URL from release/0.4.1 to 0.5.0- docs/_includes/install.rst (lines 13, 44): update pip install extras and container image tags from 0.4.1 to 0.5.0
⛔ Skipped due to learnings
Learnt from: dmitry-tokarev-nv PR: ai-dynamo/dynamo#2179 File: docs/support_matrix.md:61-63 Timestamp: 2025-07-30T00:34:35.810Z Learning: In docs/support_matrix.md, the NIXL version difference between runtime dependencies (0.5.0) and build dependencies (0.4.0) is intentional and expected, not an error that needs to be corrected.docs/guides/dynamo_deploy/dynamo_operator.md (1)
32-32: Updated installation link reads well.No issues.
deploy/cloud/helm/platform/README.md (1)
104-104: Readme sync with template looks goodGenerated README matches the gotmpl change. No action needed.
docs/guides/dynamo_deploy/installation_guide.md (1)
107-114: Use NVCR as the registry server
Replace the existing--docker-server=${DOCKER_SERVER}line in the “create secret” block with:--docker-server=nvcr.io \(or ensure you
export DOCKER_SERVER=nvcr.iobefore running) so the pull secret explicitly targets NVCR.Likely an incorrect or invalid review comment.
docs/guides/dynamo_deploy/grove.md (1)
100-100: LGTM: link retargeted correctly to installation_guide.mdThe relative path resolves within the same directory; anchor text is clear.
docs/guides/dynamo_deploy/minikube.md (1)
61-61: LGTM: Next Steps link now points to the installation guideGood consistency with other doc updates.
deploy/cloud/operator/docs/header.md (1)
18-19: LGTM: concise auto-generation noticeAppropriate place to host the warning that will prefix the generated docs.
deploy/cloud/operator/Makefile (1)
280-294: Core PR objective appears missing: container/run.sh (--user) not in diffThe PR description/reviewer guidance call out container/run.sh changes (lines 128–134, 366), but the provided changes only cover docs and Makefile. Please include the run.sh modifications so we can review argument parsing and docker run integration.
examples/runtime/hello_world/hello_world.py (2)
1-60: Confirm scope: example changes unrelated tocontainer/run.sh --user.The hello_world streaming updates seem orthogonal to the Docker run permission fix. Confirm they’re intended in this PR; otherwise consider a follow-up PR to keep scope tight.
21-21: ConfirmContextbindings. The_core.pyistub doesn’t list aContextclass or its methods—please verify that the Pythondynamo.runtime.Context(fromdynamo._core) actually implementsid(),is_stopped(), andis_killed().
| export NAMESPACE=dynamo-cloud | ||
| export DOCKER_SERVER=nvcr.io/nvidia/ai-dynamo/ # or your registry | ||
| export DOCKER_USERNAME='$oauthtoken' | ||
| export DOCKER_PASSWORD=<YOUR_NGC_CLI_API_KEY> | ||
| export IMAGE_TAG=0.5.0 | ||
| export IMAGE_TAG=${RELEASE_VERSION} | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fix registry variables to avoid auth mismatch and double slashes.
Docker registry secrets expect the server host (e.g., nvcr.io), not a path. Also, a trailing slash in DOCKER_SERVER causes a double slash in the image repository.
- export DOCKER_SERVER=nvcr.io/nvidia/ai-dynamo/ # or your registry
+ export DOCKER_SERVER=nvcr.io # registry server host only
+ export DOCKER_NAMESPACE=nvidia/ai-dynamo # image namespace/org
@@
-export IMAGE_TAG=${RELEASE_VERSION}
+export IMAGE_TAG=${RELEASE_VERSION}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| export NAMESPACE=dynamo-cloud | |
| export DOCKER_SERVER=nvcr.io/nvidia/ai-dynamo/ # or your registry | |
| export DOCKER_USERNAME='$oauthtoken' | |
| export DOCKER_PASSWORD=<YOUR_NGC_CLI_API_KEY> | |
| export IMAGE_TAG=0.5.0 | |
| export IMAGE_TAG=${RELEASE_VERSION} | |
| export NAMESPACE=dynamo-cloud | |
| export DOCKER_SERVER=nvcr.io # registry server host only | |
| export DOCKER_NAMESPACE=nvidia/ai-dynamo # image namespace/org | |
| export DOCKER_USERNAME='$oauthtoken' | |
| export DOCKER_PASSWORD=<YOUR_NGC_CLI_API_KEY> | |
| export IMAGE_TAG=${RELEASE_VERSION} |
🤖 Prompt for AI Agents
In docs/guides/dynamo_deploy/installation_guide.md around lines 96-101, the
DOCKER_SERVER variable is set to a path with a trailing slash which is incorrect
for Docker registry auth and can produce double slashes in image names; update
DOCKER_SERVER to the registry host only (e.g., nvcr.io) and move the repository
path into a separate variable (e.g., DOCKER_REPOSITORY=nvcr.io/nvidia/ai-dynamo
or DOCKER_PATH=nvidia/ai-dynamo) or ensure image references are constructed
without adding an extra slash so image push/auth use the host for credentials
and the full repository path for image names.
fac71cc to
020b891
Compare
That was a mistake, I've set it to merge to release/0.5.0, is that ok? |
|
@saturley-hall Can you please clarify your comment from above? I don't see helm charts updated in this PR |
This was originally targeted at |
Out of a desire for expedience yes. Please ensure that is also in |
Signed-off-by: Harrison King Saturley-Hall <hsaturleyhal@nvidia.com>
Signed-off-by: Keiven Chang <keivenchang@users.noreply.github.com>
020b891 to
ae5e0be
Compare
|
Messy PR, removing. |
Overview:
Add --user parameter to container run.sh script to allow overriding the default user context when running Docker containers.
Details:
Where should the reviewer start?
container/run.sh - specifically the parameter handling section (lines 128-134) and Docker run command integration (line 366)
Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)
Relates to BUG-5501463 https://nvbugspro.nvidia.com/bug/5501463
Summary by CodeRabbit
New Features
Bug Fixes
Documentation
Chores