Commit e176a29

docs: update feature matrix with local-dev + run.sh usage (#2977)
Signed-off-by: Keiven Chang <keivenchang@users.noreply.github.com>
1 parent 6a089b1 commit e176a29

2 files changed, 221 additions and 40 deletions

container/Dockerfile.vllm

Lines changed: 0 additions & 40 deletions
@@ -289,46 +289,6 @@ RUN --mount=type=bind,source=./container/launch_message.txt,target=/workspace/la
 ENTRYPOINT ["/opt/nvidia/nvidia_entrypoint.sh"]
 CMD []
 
-#######################################################################
-########## DEVELOPMENT TARGETS FEATURE MATRIX #########################
-#######################################################################
-# Feature              │ local-dev Target    │ dev Target
-# ─────────────────────┼─────────────────────┼─────────────────────
-# Purpose              │ Dev Container       │ Command-line with
-#                      │ plugin use only     │ run.sh script
-# ─────────────────────┼─────────────────────┼─────────────────────
-# Default User         │ ubuntu user         │ root user
-# ─────────────────────┼─────────────────────┼─────────────────────
-# User Setup           │ Full ubuntu user    │ No user setup
-#                      │ with UID/GID        │
-#                      │ mapping             │
-# ─────────────────────┼─────────────────────┼─────────────────────
-# Permissions          │ ubuntu user with    │ Root-level
-#                      │ sudo privileges     │ permissions
-# ─────────────────────┼─────────────────────┼─────────────────────
-# Home Directory       │ /home/ubuntu        │ /root
-# ─────────────────────┼─────────────────────┼─────────────────────
-# Working Directory    │ /home/ubuntu/dynamo │ /workspace
-# ─────────────────────┼─────────────────────┼─────────────────────
-# Rust Toolchain       │ User's home         │ System locations
-#                      │ (~/.rustup,         │ (/usr/local/rustup,
-#                      │ ~/.cargo)           │ /usr/local/cargo)
-# ─────────────────────┼─────────────────────┼─────────────────────
-# Python Environment   │ User-owned venv     │ System location
-#                      │                     │ (/opt/dynamo/venv)
-# ─────────────────────┼─────────────────────┼─────────────────────
-# File Permissions     │ User-level with     │ Root-level
-#                      │ proper ownership    │ permissions
-# ─────────────────────┼─────────────────────┼─────────────────────
-# Compatibility        │ MS Plug-in: Dev     │ Backward compatibility
-#                      │ Container workflow  │ with existing
-#                      │                     │ workflows
-#
-# USAGE GUIDELINES:
-# • Use local-dev: VS Code/Cursor Dev Container plugin only
-# • Use dev: run.sh script for command-line development
-
-
 #######################################################################
 ########## Development (Dev Container only) ###########################
 #######################################################################

container/README.md

Lines changed: 221 additions & 0 deletions
@@ -0,0 +1,221 @@
# Container Development Guide

## Overview

The NVIDIA Dynamo project uses containerized development and deployment to maintain consistent environments across different AI inference frameworks and deployment scenarios. This directory contains the tools for building and running Dynamo containers:

### Core Components

- **`build.sh`** - A Docker image builder that creates containers for different AI inference frameworks (vLLM, TensorRT-LLM, SGLang). It handles framework-specific dependencies, multi-stage builds, and development vs production configurations.

- **`run.sh`** - A container runtime manager that launches Docker containers with proper GPU access, volume mounts, and environment configurations. It supports development workflows ranging from legacy root-based setups to user-based development environments.

- **Multiple Dockerfiles** - Framework-specific Dockerfiles that define the container images:
  - `Dockerfile.vllm` - For vLLM inference backend
  - `Dockerfile.trtllm` - For TensorRT-LLM inference backend
  - `Dockerfile.sglang` - For SGLang inference backend
  - `Dockerfile` - Base/standalone configuration

### Why Containerization?

Each inference framework (vLLM, TensorRT-LLM, SGLang) requires specific CUDA versions, Python dependencies, and system libraries. Containers provide consistent environments, framework isolation, and proper GPU configurations across development and production.

The scripts in this directory abstract away the complexity of Docker commands while providing fine-grained control over build and runtime configurations.

### Convenience Scripts vs Direct Docker Commands

The `build.sh` and `run.sh` scripts are convenience wrappers that simplify common Docker operations. They automatically handle:
- Framework-specific image selection and tagging
- GPU access configuration and runtime selection
- Volume mount setup for development workflows
- Environment variable management
- Build argument construction for multi-stage builds

**You can always use Docker commands directly** if you prefer more control or want to customize beyond what the scripts provide. Both scripts accept a `--dry-run` flag that prints the exact Docker commands they would execute, making it easy to understand and modify the underlying operations.
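
For readers unfamiliar with the idea, a dry-run switch can be sketched in a few lines of shell. This is a generic illustration of the pattern, not the code inside `build.sh` or `run.sh`; the `DRY_RUN` variable, the `run_cmd` helper, and the sample `docker build` arguments are all hypothetical.

```bash
# Generic dry-run pattern (illustrative; not taken from build.sh or run.sh).
DRY_RUN=1   # hypothetical switch: 1 = print the command, 0 = execute it

run_cmd() {
    if [ "$DRY_RUN" = "1" ]; then
        # Print the command instead of executing it.
        echo "+ $*"
    else
        "$@"
    fi
}

# With DRY_RUN=1 this only prints the command line; nothing is built.
run_cmd docker build --target local-dev .
```

The printed line can be copied, edited, and run by hand, which is the same workflow the `--dry-run` flag is meant to support.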

## Development Targets Feature Matrix

Targets are selected with `build.sh --target <target>` and correspond to the Docker multi-stage build stages defined in the Dockerfiles (e.g., `FROM somebase AS <target>`). Commonly used targets include:

- `runtime` - For running pre-built containers without development tools (minimal size)
- `dev` - For development with full toolchain (git, vim, build tools, etc.)
- `local-dev` - For development with user-based permissions matching host UID/GID

Additional targets are available in the Dockerfiles for specific build stages and use cases.
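
To see which stage names a given Dockerfile declares, the `FROM ... AS <name>` lines can be extracted with `awk`. The sketch below is a hypothetical helper (`list_targets` is not part of this repository), and the demo Dockerfile with its stage names is made up so that the example runs anywhere.

```bash
# Print every stage name declared as "FROM <image> AS <name>" in a Dockerfile.
list_targets() {
    # FROM and AS are case-insensitive; print the word that follows AS.
    awk 'toupper($1) == "FROM" {
        for (i = 2; i < NF; i++)
            if (toupper($i) == "AS") print $(i + 1)
    }' "$1"
}

# Demo on a throwaway Dockerfile (stage names are invented for illustration).
demo="$(mktemp)"
cat > "$demo" <<'EOF'
FROM ubuntu:24.04 AS base
FROM base AS dev
FROM dev AS local-dev
EOF
list_targets "$demo"   # prints: base, dev, local-dev (one per line)
rm -f "$demo"
```

Pointing the same function at a real file, for example `list_targets container/Dockerfile.vllm`, would list the values accepted by `--target` for that framework.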

```
Feature           │ 1. dev + `run.sh`     │ 2. local-dev + `run.sh`  │ 3. local-dev + Dev Container
──────────────────┼───────────────────────┼──────────────────────────┼────────────────────────────
Default User      │ root                  │ ubuntu                   │ ubuntu
──────────────────┼───────────────────────┼──────────────────────────┼────────────────────────────
User Setup        │ None                  │ Matches UID/GID of       │ Matches UID/GID of
                  │                       │ `build.sh` user          │ `build.sh` user
──────────────────┼───────────────────────┼──────────────────────────┼────────────────────────────
Permissions       │ root                  │ ubuntu with sudo         │ ubuntu with sudo
──────────────────┼───────────────────────┼──────────────────────────┼────────────────────────────
Home Directory    │ /root                 │ /home/ubuntu             │ /home/ubuntu
──────────────────┼───────────────────────┼──────────────────────────┼────────────────────────────
Working Directory │ /workspace            │ /workspace               │ /home/ubuntu/dynamo
──────────────────┼───────────────────────┼──────────────────────────┼────────────────────────────
Rust Toolchain    │ System install        │ User install (~/.rustup, │ User install (~/.rustup,
                  │ (/usr/local/rustup,   │ ~/.cargo)                │ ~/.cargo)
                  │ /usr/local/cargo)     │                          │
──────────────────┼───────────────────────┼──────────────────────────┼────────────────────────────
Python Env        │ root owned            │ User owned venv          │ User owned venv
──────────────────┼───────────────────────┼──────────────────────────┼────────────────────────────
File Permissions  │ root-level            │ user-level, safe         │ user-level, safe
──────────────────┼───────────────────────┼──────────────────────────┼────────────────────────────
Compatibility     │ Legacy workflows,     │ workspace writable on NFS│ workspace writable on NFS
                  │ workspace not         │                          │
                  │ writable on NFS       │                          │
──────────────────┼───────────────────────┼──────────────────────────┼────────────────────────────
```

## Usage Guidelines

- **Use dev + `run.sh`**: command-line development with the `run.sh` script, running as the root user
- **Use local-dev + `run.sh`**: command-line development with the `run.sh` script, running under your local user ID
- **Use local-dev + Dev Container**: development through the VS Code/Cursor Dev Container plugin, running under your local user ID
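
One way to confirm which setup you are in is to compare the current user's numeric IDs with the owner of the working directory. The sketch below uses only `id`, `ls`, and `awk`; the `owner_ids` helper is hypothetical and not part of this repository.

```bash
# Print the numeric owner UID and GID of a path.
owner_ids() {
    # ls -ldn columns: mode, link count, UID, GID, size, ...
    ls -ldn "$1" | awk '{ print $3, $4 }'
}

# Run from the mounted workspace inside the container.
echo "current user:      $(id -u) $(id -g)"
echo "working directory: $(owner_ids "$PWD")"
```

With a `local-dev` image and a mounted workspace, the two lines would be expected to match the host user's IDs, whereas the `dev` target reports `0 0` for the current user because it runs as root.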

## Example Commands

### 1. dev + `run.sh`:
```bash
run.sh --mount-workspace ...
```

### 2. local-dev + `run.sh`:
```bash
run.sh --mount-workspace --image dynamo:latest-vllm-local-dev ...
```

### 3. local-dev + Dev Container:
Use the VS Code/Cursor Dev Container plugin with the `devcontainer.json` configuration.

## Build and Run Scripts Overview

### build.sh - Docker Image Builder

The `build.sh` script builds Docker images for the different AI inference frameworks. It supports multiple frameworks and configurations:

**Purpose:**
- Builds Docker images for NVIDIA Dynamo with support for vLLM, TensorRT-LLM, SGLang, or standalone configurations
- Handles framework-specific dependencies and optimizations
- Manages build contexts, caching, and multi-stage builds
- Configures development vs production targets

**Key Features:**
- **Framework Support**: vLLM (the default when `--framework` is not specified), TensorRT-LLM, SGLang, or NONE
- **Multi-stage Builds**: Multi-stage build process starting from base images
- **Development Targets**: Supports `dev` and `local-dev` targets
- **Build Caching**: Docker layer caching and sccache support
- **GPU Optimization**: CUDA, EFA, and NIXL support

**Common Usage Examples:**

```bash
# Build vLLM image (default)
./build.sh

# Build with specific framework
./build.sh --framework trtllm

# Build local development image
./build.sh --framework vllm --target local-dev

# Build with custom tag
./build.sh --framework sglang --tag my-custom-tag

# Dry run to see commands
./build.sh --dry-run

# Build with no cache
./build.sh --no-cache

# Build with build arguments
./build.sh --build-arg CUSTOM_ARG=value
```

### run.sh - Container Runtime Manager

The `run.sh` script launches Docker containers with the appropriate configuration for development and inference workloads.

**Purpose:**
- Runs pre-built Dynamo Docker images with proper GPU access
- Configures volume mounts, networking, and environment variables
- Supports different development workflows (root vs user-based)
- Manages container lifecycle and resource allocation

**Key Features:**
- **GPU Management**: Automatic GPU detection and allocation
- **Volume Mounting**: Workspace and HuggingFace cache mounting
- **User Management**: Root or user-based container execution
- **Network Configuration**: Host networking for service communication
- **Resource Limits**: Memory, file descriptors, and IPC configuration

**Common Usage Examples:**

```bash
# Basic container launch
./run.sh

# Mount workspace for development
./run.sh --mount-workspace

# Use specific image and framework
./run.sh --image dynamo:latest-vllm --framework vllm

# Interactive shell with workspace mounted
./run.sh --mount-workspace -it -- bash

# Run with custom environment variables
./run.sh -e CUDA_VISIBLE_DEVICES=0,1 --mount-workspace

# Run without GPU access
./run.sh --gpus none

# Dry run to see docker command
./run.sh --dry-run

# Run with custom volume mounts
./run.sh -v /host/path:/container/path --mount-workspace

# Launch with specific container name
./run.sh --name my-dynamo-container --mount-workspace
```

## Workflow Examples

### Development Workflow
```bash
# 1. Build development image
./build.sh --framework vllm --target local-dev

# 2. Run development container
./run.sh --image dynamo:latest-vllm-local-dev --mount-workspace -it

# 3. Inside container, run inference (requires both frontend and backend)
# Start frontend
python -m dynamo.frontend &

# Start backend (vLLM example)
python -m dynamo.vllm --model Qwen/Qwen3-0.6B --gpu-memory-utilization 0.50 &
```
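
Step 3 leaves both processes running in the background of the shell. If you want them stopped together when the session ends, one option is to record their PIDs and clean up in an `EXIT` trap. The sketch below is a generic shell pattern, not something the Dynamo scripts provide; `start_bg` and `stop_all` are hypothetical helpers, and `sleep` stands in for the two `python -m dynamo...` commands so the example runs anywhere.

```bash
pids=""

# Start a command in the background and remember its PID.
start_bg() {
    "$@" &
    pids="$pids $!"
}

# Terminate every recorded process, then reap it.
stop_all() {
    for pid in $pids; do
        kill "$pid" 2>/dev/null || true
        wait "$pid" 2>/dev/null || true
    done
    pids=""
}
trap stop_all EXIT

# Stand-ins for the frontend and backend commands shown above.
start_bg sleep 30
start_bg sleep 30
echo "background PIDs:$pids"
stop_all
```

Inside the container, each `start_bg sleep 30` line would be replaced by `start_bg` followed by the corresponding frontend or backend command from the workflow above.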

### Production Workflow
```bash
# 1. Build production image
./build.sh --framework vllm --release-build

# 2. Run production container
./run.sh --image dynamo:latest-vllm --gpus all
```

### Testing Workflow
```bash
# 1. Build with no cache for clean build
./build.sh --framework vllm --no-cache

# 2. Test container functionality
./run.sh --mount-workspace -it -- python -m pytest tests/
```
