Commit 8f19a4d

feat: update in-cluster benchmark job and yaml
Signed-off-by: Hannah Zhang <hannahz@nvidia.com>
1 parent 5a09233 commit 8f19a4d

2 files changed: 46 additions, 49 deletions

benchmarks/incluster/README.md

Lines changed: 43 additions & 46 deletions
@@ -29,70 +29,64 @@ The in-cluster benchmarking solution:
2929
## Prerequisites
3030

3131
1. **Kubernetes cluster** with NVIDIA GPUs and Dynamo namespace setup (see [Dynamo Cloud/Platform docs](../../docs/guides/dynamo_deploy/README.md))
32-
2. **dynamo-pvc** PersistentVolumeClaim configured (see [deploy/utils README](../../deploy/utils/README.md))
33-
3. **Service account** (`dynamo-sa`) with appropriate permissions (see [deploy/utils README](../../deploy/utils/README.md))
34-
4. **Docker image** containing the Dynamo benchmarking tools
32+
2. **Storage and service account**: a PersistentVolumeClaim (`dynamo-pvc`) and a service account (`dynamo-sa`) configured with appropriate permissions (see [deploy/utils README](../../deploy/utils/README.md))
33+
3. **Docker image** containing the Dynamo benchmarking tools
3534

3635
## Quick Start
3736

3837
### Step 1: Deploy Your DynamoGraphDeployment
3938
Deploy your DynamoGraphDeployment using the [deployment documentation](../../components/backends/). Ensure it has a frontend service exposed.
4039

4140
### Step 2: Deploy and Run Benchmark Job
41+
42+
**Option A: Set environment variables (recommended for multiple commands)**
4243
```bash
43-
# Deploy the benchmark job with your namespace
44-
NAMESPACE=your-namespace envsubst < benchmark_job.yaml | kubectl apply -f -
44+
# Set environment variables for your deployment
45+
export NAMESPACE=benchmarking
46+
export MODEL_NAME=Qwen/Qwen3-0.6B
47+
export INPUT_NAME=qwen-vllm-agg
48+
export SERVICE_URL=vllm-agg-frontend:8000
49+
export DOCKER_IMAGE=nvcr.io/nvidian/dynamo-dev/vllm-runtime:dyn-973.0
50+
51+
# Deploy the benchmark job
52+
envsubst < benchmark_job.yaml | kubectl apply -f -
4553

4654
# Monitor the job
47-
kubectl logs -f job/dynamo-benchmark -n your-namespace
55+
kubectl logs -f job/dynamo-benchmark -n $NAMESPACE
4856

4957
# Check job status
50-
kubectl get jobs -n your-namespace
58+
kubectl get jobs -n $NAMESPACE
59+
```
60+
61+
**Option B: One-liner deployment**
62+
```bash
63+
NAMESPACE=benchmarking MODEL_NAME=Qwen/Qwen3-0.6B INPUT_NAME=qwen-vllm-agg SERVICE_URL=vllm-agg-frontend:8000 DOCKER_IMAGE=nvcr.io/nvidian/dynamo-dev/vllm-runtime:dyn-973.0 envsubst < benchmark_job.yaml | kubectl apply -f -
5164
```
5265

5366
### Step 3: Retrieve Results
5467
```bash
5568
# Download results from PVC (recommended)
5669
python3 -m deploy.utils.download_pvc_results \
57-
--namespace your-namespace \
70+
--namespace $NAMESPACE \
5871
--output-dir ./benchmark_results \
5972
--folder /data/results \
6073
--no-config
6174

6275
# Alternative: Copy results directly (requires pod name)
63-
kubectl cp <pod-name>:/data/results ./benchmark_results -n your-namespace
76+
kubectl cp <pod-name>:/data/results ./benchmark_results -n $NAMESPACE
6477
```
6578

6679
## Configuration
6780

68-
The job manifest uses these default parameters:
69-
- **Model**: `Qwen/Qwen3-0.6B`
70-
- **Input sequence length**: 2000 tokens
71-
- **Output sequence length**: 256 tokens
72-
- **Input**: `dsr1=${NAMESPACE}-dsr1-frontend:8000` (internal service URL)
73-
74-
### Customizing the Job Manifest
75-
76-
Edit `benchmark_job.yaml` to modify:
77-
78-
```yaml
79-
# Change model
80-
args:
81-
- --model
82-
- "meta-llama/Meta-Llama-3-8B"
83-
84-
# Change sequence lengths
85-
args:
86-
- --isl
87-
- "1500"
88-
- --osl
89-
- "200"
90-
91-
# Change input service
92-
args:
93-
- --input
94-
- my-service=${NAMESPACE}-my-service:8000
95-
```
81+
The benchmark job is fully configurable through environment variables:
82+
83+
### Required Environment Variables
84+
85+
- **NAMESPACE**: Kubernetes namespace where the benchmark will run
86+
- **MODEL_NAME**: Hugging Face model identifier (e.g., `Qwen/Qwen3-0.6B`)
87+
- **INPUT_NAME**: Name identifier for the benchmark input (e.g., `qwen-agg`)
88+
- **SERVICE_URL**: Internal service URL for the DynamoGraphDeployment frontend
89+
- **DOCKER_IMAGE**: Docker image containing the Dynamo benchmarking tools
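
Because `envsubst` replaces an unset variable with an empty string, a missing value can produce a manifest that applies but runs with a blank image or model. A minimal sketch of a pre-flight check for the five variables listed above follows. The helper name `check_required_vars` is hypothetical and is not part of the repository.

```bash
# Hypothetical helper (not part of the repository): report every required
# variable that is unset or empty, and return non-zero if any is missing.
check_required_vars() {
    missing=0
    for name in NAMESPACE MODEL_NAME INPUT_NAME SERVICE_URL DOCKER_IMAGE; do
        # Indirect lookup of the variable whose name is stored in $name.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "Missing required variable: $name" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Usage: render and apply the manifest only when every variable is set.
# check_required_vars && envsubst < benchmark_job.yaml | kubectl apply -f -
```

Running the check before `envsubst` surfaces all missing names at once instead of failing later inside the job.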
9690

9791
## Understanding Your Results
9892

@@ -118,26 +112,26 @@ Results are stored in `/data/results` and follow the same structure as local ben
118112

119113
### Check Job Status
120114
```bash
121-
kubectl get jobs -n <namespace>
122-
kubectl describe job dynamo-benchmark -n <namespace>
115+
kubectl get jobs -n $NAMESPACE
116+
kubectl describe job dynamo-benchmark -n $NAMESPACE
123117
```
124118

125119
### View Logs
126120
```bash
127121
# Follow logs in real-time
128-
kubectl logs -f job/dynamo-benchmark -n <namespace>
122+
kubectl logs -f job/dynamo-benchmark -n $NAMESPACE
129123

130124
# Get logs from specific container
131-
kubectl logs job/dynamo-benchmark -c benchmark-runner -n <namespace>
125+
kubectl logs job/dynamo-benchmark -c benchmark-runner -n $NAMESPACE
132126
```
133127

134128
### Debug Failed Jobs
135129
```bash
136130
# Check pod status
137-
kubectl get pods -n <namespace> -l job-name=dynamo-benchmark
131+
kubectl get pods -n $NAMESPACE -l job-name=dynamo-benchmark
138132

139133
# Describe failed pod
140-
kubectl describe pod <pod-name> -n <namespace>
134+
kubectl describe pod <pod-name> -n $NAMESPACE
141135
```
142136

143137
## Comparison with Local Benchmarking
@@ -171,11 +165,14 @@ The in-cluster approach is recommended for:
171165

172166
```bash
173167
# Check PVC status
174-
kubectl get pvc dynamo-pvc -n <namespace>
168+
kubectl get pvc dynamo-pvc -n $NAMESPACE
175169

176170
# Verify service account
177-
kubectl get sa dynamo-sa -n <namespace>
171+
kubectl get sa dynamo-sa -n $NAMESPACE
178172

179173
# Check service endpoints
180-
kubectl get svc -n <namespace>
174+
kubectl get svc -n $NAMESPACE
175+
176+
# Verify your service URL is accessible
177+
kubectl get svc $SERVICE_URL -n $NAMESPACE
181178
```
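
Note that `kubectl get svc` expects a Service name, while the Quick Start sets `SERVICE_URL` to a `name:port` value such as `vllm-agg-frontend:8000`, so passing the variable unchanged is likely to return a "not found" error. A minimal sketch of stripping the port first is shown here. The helper name `service_name` is hypothetical, and it assumes `SERVICE_URL` has the form `name:port` with no scheme prefix.

```bash
# Hypothetical helper (not part of the repository): drop everything from the
# first ":" onward so the result can be used as a Service name.
# Assumes the input looks like "name:port" (no "http://" prefix).
service_name() {
    printf '%s\n' "${1%%:*}"
}

# Usage:
# kubectl get svc "$(service_name "$SERVICE_URL")" -n "$NAMESPACE"
```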

benchmarks/incluster/benchmark_job.yaml

Lines changed: 3 additions & 3 deletions
@@ -14,7 +14,7 @@ spec:
1414
- name: docker-imagepullsecret
1515
containers:
1616
- name: benchmark-runner
17-
image: nvcr.io/nvidia/ai-dynamo/vllm-runtime:0.5.0
17+
image: ${DOCKER_IMAGE}
1818
resources:
1919
requests:
2020
cpu: "4"
@@ -35,7 +35,7 @@ spec:
3535
command: ["python3", "-m", "benchmarks.utils.benchmark"]
3636
args:
3737
- --model
38-
- deepseek-ai/DeepSeek-R1
38+
- ${MODEL_NAME}
3939
- --isl
4040
- "2000"
4141
- --std
@@ -45,7 +45,7 @@ spec:
4545
- --output-dir
4646
- /data/results
4747
- --input
48-
- dsr1=${NAMESPACE}-sgl-dsr1-8gpu-frontend:8000
48+
- ${INPUT_NAME}=${SERVICE_URL}
4949
volumeMounts:
5050
- name: data-volume
5151
mountPath: /data
