Merged
19 commits
b5daf41
feat: comprehensive benchmarking framework and deployment utilities (…
hhzhang16 Aug 27, 2025
e27dfe6
feat: fixing some mypy and addressing more CodeRabbig comments
hhzhang16 Aug 28, 2025
c5a4307
feat: addressing more CodeRabbit comments
hhzhang16 Aug 28, 2025
3fa7e88
feat: addressing more CodeRabbit comments
hhzhang16 Aug 28, 2025
f1286aa
feat: update images in vllm deploy files
hhzhang16 Aug 28, 2025
1c9600f
docs: update docs following feedback about hardware and model support
hhzhang16 Aug 28, 2025
8063142
docs: centralize on one model for benchmarking
hhzhang16 Aug 28, 2025
dbba94b
docs: update prerequisites in benchmarking guide
hhzhang16 Aug 28, 2025
6dfb335
docs: remove graphs/numbers from benchmarking.md
hhzhang16 Aug 28, 2025
d7413ba
docs: update default model throughout benchmarking
hhzhang16 Aug 28, 2025
bd90141
feat: update input structure to support --input; update documentation…
hhzhang16 Aug 29, 2025
47e7bee
docs: add diagram for benchmarking
hhzhang16 Aug 29, 2025
6ee97f8
feat: addressing MR comments
hhzhang16 Aug 29, 2025
8801fea
docs: cleaned up documentation guide, removing duplicate code
hhzhang16 Aug 29, 2025
d5b8aef
docs: remove diagram for now
hhzhang16 Aug 29, 2025
0b50543
Merge branch 'main' of github.com:ai-dynamo/dynamo into hannahz/dyn-8…
hhzhang16 Aug 29, 2025
0840ce6
docs: cleaned up documentation guide
hhzhang16 Aug 29, 2025
14045b0
fix dynamographdeployment name when deploying for benchmarking
hhzhang16 Aug 29, 2025
58a0e7c
docs: showcase other backends
hhzhang16 Aug 29, 2025
7 changes: 7 additions & 0 deletions README.md
@@ -151,6 +151,13 @@ Rerun with `curl -N` and change `stream` in the request to `true` to get the res
- Check out [Backends](components/backends) to deploy various workflow configurations (e.g. SGLang with router, vLLM with disaggregated serving, etc.)
- Run some [Examples](examples) to learn about building components in Dynamo and exploring various integrations.

### Benchmarking Dynamo

Dynamo provides comprehensive benchmarking tools to evaluate and optimize your deployments:

* **[Benchmarking Guide](docs/benchmarks/benchmarking.md)** – Compare deployment topologies (aggregated vs. disaggregated vs. vanilla vLLM) using GenAI-Perf
* **[Pre-Deployment Profiling](docs/benchmarks/pre_deployment_profiling.md)** – Optimize configurations before deployment to meet SLA requirements

# Engines

Dynamo is designed to be inference engine agnostic. To use any engine with Dynamo, NATS and etcd need to be installed, along with a Dynamo frontend (`python -m dynamo.frontend [--interactive]`).
63 changes: 58 additions & 5 deletions benchmarks/README.md
@@ -15,19 +15,72 @@

# Benchmarks

This directory contains benchmarking scripts and tools for performance evaluation.
This directory contains benchmarking scripts and tools for performance evaluation of Dynamo deployments. The benchmarking framework is a wrapper around genai-perf that makes it easy to benchmark DynamoGraphDeployments and compare them with external endpoints.

## Quick Start

### Benchmark an Existing Endpoint
```bash
./benchmark.sh --namespace my-namespace --input my-endpoint=http://your-endpoint:8000
```

### Benchmark Dynamo Deployments
```bash
# Benchmark disaggregated vLLM with custom label
./benchmark.sh --namespace my-namespace --input vllm-disagg=components/backends/vllm/deploy/disagg.yaml

# Benchmark TensorRT-LLM disaggregated deployment
./benchmark.sh --namespace my-namespace --input trtllm-disagg=components/backends/trtllm/deploy/disagg.yaml

# Compare multiple Dynamo deployments
./benchmark.sh --namespace my-namespace \
--input agg=components/backends/vllm/deploy/agg.yaml \
--input disagg=components/backends/vllm/deploy/disagg.yaml

# Compare Dynamo vs external endpoint
./benchmark.sh --namespace my-namespace \
--input dynamo=components/backends/vllm/deploy/disagg.yaml \
--input external=http://localhost:8000
```

**Note**:
- The sample manifests may reference private registry images. Update the `image:` fields to use accessible images from [Dynamo NGC](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/collections/ai-dynamo/artifacts) or your own registry before running.
- Only DynamoGraphDeployment manifests are supported for automatic deployment. To benchmark non-Dynamo backends (vLLM, TensorRT-LLM, SGLang, etc.), deploy them manually using their Kubernetes guides and use the endpoint option.
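The Quick Start commands pass each target as `--input label=value`, where the value is either an HTTP endpoint or a path to a manifest. The sketch below shows one way such an argument could be split and classified. `classify_input` is a hypothetical helper written for illustration, and the rule that a value starting with `http://` or `https://` is an endpoint is an assumption; neither is taken from `benchmark.sh`.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of benchmark.sh): split a `label=value`
# argument and guess whether the value is an endpoint URL or a manifest path.
classify_input() {
  local arg="$1"

  # The argument must contain at least one '='.
  if [[ "$arg" != *=* ]]; then
    echo "error: expected label=value, got '$arg'" >&2
    return 1
  fi

  local label="${arg%%=*}"  # everything before the first '='
  local value="${arg#*=}"   # everything after the first '='

  if [[ -z "$label" || -z "$value" ]]; then
    echo "error: label and value must both be non-empty in '$arg'" >&2
    return 1
  fi

  # Assumption: URLs are endpoints, anything else is a manifest path.
  case "$value" in
    http://*|https://*) echo "$label endpoint $value" ;;
    *)                  echo "$label manifest $value" ;;
  esac
}
```

Calling `classify_input external=http://localhost:8000` prints the label, the guessed kind, and the value on one line. Splitting on the first `=` only keeps any later `=` characters (for example in a URL query string) inside the value.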

## Features

The benchmarking framework supports:

**Two Benchmarking Modes:**
- **Endpoint Benchmarking**: Test existing HTTP endpoints without deployment overhead
- **Deployment Benchmarking**: Deploy, test, and cleanup DynamoGraphDeployments automatically

**Flexible Configuration:**
- User-defined labels for each input using `--input label=value` format
- Support for multiple inputs to enable comparisons
- Customizable concurrency levels (configurable via CONCURRENCIES env var), sequence lengths, and models
- Automated performance plot generation with custom labels
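The list above says concurrency levels are configurable through the `CONCURRENCIES` environment variable but does not state its format. As a minimal sketch, assuming the value is a comma-separated list of positive integers (an assumption, not confirmed here), a wrapper could validate it before launching a run. `parse_concurrencies` is a hypothetical function and not part of the benchmarking scripts.

```bash
#!/usr/bin/env bash
# Hypothetical validator: assumes CONCURRENCIES looks like "1,5,10".
# The real format accepted by benchmark.sh may differ.
parse_concurrencies() {
  local raw="${1:-}"
  local -a levels=()

  # Split the string on commas into an array.
  IFS=',' read -r -a levels <<< "$raw"

  if (( ${#levels[@]} == 0 )); then
    echo "error: no concurrency levels given" >&2
    return 1
  fi

  local level
  for level in "${levels[@]}"; do
    # Each entry must be a positive integer without leading zeros.
    if [[ ! "$level" =~ ^[1-9][0-9]*$ ]]; then
      echo "error: invalid concurrency level '$level'" >&2
      return 1
    fi
  done

  # Print the validated levels separated by spaces.
  echo "${levels[@]}"
}
```

Under that assumption, a run might be prefixed with something like `CONCURRENCIES="1,5,10"` after first checking the value with `parse_concurrencies "$CONCURRENCIES"`.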

**Supported Backends:**
- DynamoGraphDeployments
- External HTTP endpoints (for comparison with non-Dynamo backends)

## Installation

This is already included as part of the dynamo vllm image. To install locally or standalone, run:
This is already included as part of the Dynamo container images. To install locally or standalone:

```bash
pip install -e .
```

Currently, this will install lightweight tools for:
## Data Generation Tools

This directory also includes lightweight tools for:
- Analyzing prefix-structured data (`datagen analyze`)
- Synthesizing structured data customizable for testing purposes (`datagen synthesize`)
Detailed information are provided in the `prefix_data_generator` directory.

The benchmarking scripts for the core dynamo components are to come soon (e.g. routing, disagg, Planner).
Detailed information is provided in the `prefix_data_generator` directory.

## Comprehensive Guide

For detailed documentation, configuration options, and advanced usage, see the [complete benchmarking guide](../docs/benchmarks/benchmarking.md).