
Conversation

@hhzhang16 hhzhang16 commented Aug 21, 2025

Overview:

This MR adds a comprehensive benchmarking framework for Dynamo deployments, with automated performance comparison across aggregated, disaggregated, and vanilla backend configurations. It includes both endpoint testing and full deployment lifecycle management.

Details:

  • New benchmarking framework (benchmarks/utils/) with GenAI-Perf integration, plotting, and deployment automation
  • Automated benchmark script (benchmarks/benchmark.sh) supporting both endpoint and deployment testing modes
  • Comprehensive documentation (docs/benchmarks/benchmarking.md)
  • Updated Kubernetes deployment utilities (deploy/utils/) for namespace setup, PVC management, and manifest injection
  • Refactored profiling tools - moved common utilities to shared deploy/utils/ package
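To make the GenAI-Perf integration concrete, here is a hedged sketch of how per-concurrency benchmark commands might be assembled. The function and flag names mirror the genai-perf CLI but are assumptions for illustration, not the actual `benchmarks/utils/genai.py` API.

```python
# Hypothetical sketch of assembling genai-perf invocations for a concurrency
# sweep; flag names are modeled on the genai-perf CLI but are assumptions here.
from typing import List

def build_genai_perf_cmd(model: str, url: str, concurrency: int,
                         isl: int = 200, osl: int = 200) -> List[str]:
    """Build one genai-perf invocation for a single concurrency level."""
    return [
        "genai-perf", "profile",
        "--model", model,
        "--url", url,
        "--concurrency", str(concurrency),
        "--synthetic-input-tokens-mean", str(isl),
        "--output-tokens-mean", str(osl),
    ]

def concurrency_sweep(model: str, url: str, levels: List[int]) -> List[List[str]]:
    """One command per concurrency level, e.g. [1, 2, 4, 8, ...]."""
    return [build_genai_perf_cmd(model, url, c) for c in levels]
```

Each command would then be run against the target endpoint, with results written to a per-concurrency subdirectory for later plotting.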

Where should the reviewer start?

  • docs/benchmarks/benchmarking.md: complete benchmarking guide
  • benchmarks/benchmark.sh: main benchmarking script
  • deploy/utils/README.md: Kubernetes setup utilities that support benchmarking and other Kubernetes workflows

Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)

  • closes GitHub issue: #xxx

Summary by CodeRabbit

  • New Features

    • End-to-end benchmarking tooling for deployments and existing endpoints, including a CLI runner, async workflow, concurrency sweeps, and automatic plot generation and summaries.
    • Vanilla backend support with a ready-to-use template and Kubernetes-managed deployment lifecycle with port-forwarding.
    • Namespace setup script and utilities to inject manifests and download PVC results.
  • Documentation

    • New Benchmarking and Pre-Deployment Profiling guides; expanded benchmarking README and links updated across docs.
  • Chores

    • Standardized Kubernetes resources (ServiceAccount, Role/RoleBinding, PVC) and profiling job updates.
    • Added dependencies manifest; package markers and licensing headers.
    • Removed deprecated profiling helper scripts.
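As a rough illustration of the "automatic plot generation and summaries" item above, the sketch below shows how per-concurrency results might be folded into a text summary. The record layout and metric names are illustrative assumptions, not the actual genai-perf export schema.

```python
# Hedged sketch of summarizing per-concurrency benchmark results, as the
# plotting/summary step might do. Field names ("concurrency", "throughput",
# "p99_ms") are illustrative assumptions, not the real export schema.
def summarize(results):
    """results: list of {"concurrency": int, "throughput": float, "p99_ms": float}"""
    lines = ["concurrency  tokens/s  p99(ms)"]
    for r in sorted(results, key=lambda r: r["concurrency"]):
        lines.append(f'{r["concurrency"]:>11}  {r["throughput"]:>8.1f}  {r["p99_ms"]:>7.1f}')
    return "\n".join(lines)
```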

copy-pr-bot bot commented Aug 21, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@github-actions github-actions bot added the feat label Aug 21, 2025
@hhzhang16 hhzhang16 self-assigned this Aug 21, 2025
@grahamking
Contributor

You're adding 26,000 lines of mostly YAML to Dynamo @hhzhang16 ? Am I reading that MR right? Maybe we don't need the .bak files. I know it's Draft.


coderabbitai bot commented Aug 26, 2025

Walkthrough

Adds end-to-end benchmarking tooling: a Bash orchestrator, Python workflow/CLI, GenAI-Perf integration, plotting, a vanilla backend client/template, and Kubernetes utilities for manifest injection and PVC downloads. Updates profiler job, refactors imports, standardizes RBAC/PVC names, and expands documentation with a comprehensive benchmarking guide and link fixes.

Changes

Cohort / File(s) — Summary of changes

  • Benchmark docs (README.md, benchmarks/README.md, docs/benchmarks/*, docs/architecture/*, components/backends/vllm/deploy/README.md, docs/guides/dynamo_deploy/sla_planner_deployment.md, tests/planner/README.md): Added benchmarking guide; updated links to pre-deployment profiling; clarified benchmarking modes and quick starts; adjusted references (PVC name to dynamo-pvc).
  • Benchmark orchestration (benchmarks/benchmark.sh, benchmarks/utils/benchmark.py, benchmarks/utils/workflow.py): Introduced Bash runner and Python CLI/workflow to validate inputs, deploy/tear down backends, run benchmarks, and summarize results.
  • GenAI-Perf and plotting (benchmarks/utils/genai.py, benchmarks/utils/plot.py): Added GenAI-Perf wrapper and concurrency sweep; implemented parsing and generation of multiple plots and a summary.
  • Vanilla backend support (benchmarks/utils/vanilla_client.py, benchmarks/utils/templates/vanilla-vllm.yaml): Added Kubernetes client to deploy/manage a vanilla vLLM service; provided a deployment/service manifest template.
  • Profiler updates/removals (benchmarks/profiler/deploy/profile_sla_job.yaml, benchmarks/profiler/profile_sla.py; deleted: benchmarks/profiler/download_pvc_results.py, benchmarks/profiler/inject_disagg_config.py, benchmarks/profiler/utils/utils.py): Switched service account/PVC names, module invocation, and config path; updated imports; removed legacy PVC tools and profiler utilities.
  • Deploy utils and K8s setup (deploy/utils/... — README, __init__.py, kubernetes.py, dynamo_deployment.py, inject_manifest.py, download_pvc_results.py, manifests/*, requirements.txt, setup_k8s_namespace.sh — plus deploy/__init__.py): Added utilities for PVC access, manifest injection, and results download; async/UX updates to DynamoDeploymentClient, including port-forwarding; standardized PVC (dynamo-pvc) and RBAC names; added setup script and requirements.
  • Package markers (benchmarks/utils/__init__.py): Added SPDX headers and package marker; no logic changes.
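The manifest-injection utility mentioned above can be pictured with the sketch below: overriding the model and the standardized PVC claim name in a deployment manifest. The manifest structure and key names here are illustrative assumptions modeled on typical Kubernetes manifests, not the real deploy/utils/inject_manifest.py behavior.

```python
# Hedged sketch of inject_manifest.py-style injection: point PVC-backed
# volumes at the standardized claim name and pass the model to the serving
# container. Key names are assumptions, not the actual file's schema.
import copy

def inject(manifest: dict, model: str, pvc_name: str = "dynamo-pvc") -> dict:
    out = copy.deepcopy(manifest)  # leave the caller's manifest untouched
    spec = out["spec"]["template"]["spec"]
    # Rewrite every PVC-backed volume to use the standardized claim name.
    for vol in spec.get("volumes", []):
        if "persistentVolumeClaim" in vol:
            vol["persistentVolumeClaim"]["claimName"] = pvc_name
    # Hand the model to the first (serving) container via args.
    spec["containers"][0]["args"] = ["--model", model]
    return out
```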

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor U as User
  participant SH as benchmark.sh
  participant PY as benchmarks.utils.benchmark (CLI)
  participant WF as workflow
  participant K as Kubernetes API
  participant C as Client (Dynamo/Vanilla)
  participant G as GenAI-Perf
  participant P as Plotter
  participant FS as Results Dir

  U->>SH: Run ./benchmarks/benchmark.sh [flags]
  SH->>PY: python -m benchmarks.utils.benchmark [...args...]
  PY->>WF: run_benchmark_workflow(...)
  alt Endpoint benchmarking
    WF->>G: run_concurrency_sweep(endpoint, model, ISL/OSL/STD)
    G-->>FS: Write per-concurrency results
  else Deployment benchmarking
    loop For each provided manifest (agg/disagg/vanilla)
      WF->>C: create client (Dynamo/Vanilla)
      C->>K: create Deployment/Service
      C->>K: wait_for_deployment_ready()
      C->>C: port_forward_frontend()
      WF->>G: run_concurrency_sweep(local URL, model, ISL/OSL/STD)
      G-->>FS: Write per-concurrency results (subdir)
      C->>C: stop_port_forward()
      C->>K: delete Deployment/Service
    end
  end
  WF->>P: generate_plots(base_output_dir)
  P-->>FS: plots/, SUMMARY.txt
  PY-->>U: Exit code/status and result paths
  SH-->>U: Final summary and plot locations
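The deployment branch of the diagram above can be sketched as a deploy → wait → port-forward → sweep → teardown loop. The client class and method names below are assumptions modeled on the diagram, not the actual benchmarks/utils API; the point is that teardown always runs, even if the sweep fails.

```python
# Minimal skeleton of the deployment-benchmarking loop from the diagram.
# StubClient stands in for the Dynamo/vanilla Kubernetes clients; its method
# names are hypothetical, chosen to match the diagram's steps.
class StubClient:
    def __init__(self, name: str, log: list):
        self.name, self.log = name, log
    def deploy(self):            self.log.append(f"deploy:{self.name}")
    def wait_until_ready(self):  self.log.append(f"ready:{self.name}")
    def port_forward(self) -> str:
        self.log.append(f"forward:{self.name}")
        return "http://localhost:8000"
    def stop_port_forward(self): self.log.append(f"stop:{self.name}")
    def delete(self):            self.log.append(f"delete:{self.name}")

def run_deployment_benchmarks(manifests, make_client, sweep):
    """Deploy each backend, benchmark it, and always tear it down."""
    for name in manifests:
        client = make_client(name)
        client.deploy()
        client.wait_until_ready()
        try:
            url = client.port_forward()
            sweep(url, name)  # run the concurrency sweep against the local URL
        finally:
            client.stop_port_forward()
            client.delete()
```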

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Poem

A bunny with charts and a stopwatch in paw,
Spins up pods, then listens in awe.
Concurrency hops, plots sprout like clover,
Tokens per second—leaps over and over.
PVC trails, RBAC tails tied neat—
Benchmark burrow complete. 🥕📈


@hhzhang16 hhzhang16 marked this pull request as ready for review August 27, 2025 01:03
Signed-off-by: Hannah Zhang <hannahz@nvidia.com>
@whoisj
Collaborator

whoisj commented Aug 28, 2025

How is this a "guide" with no documentation and very few comments?

Signed-off-by: Hannah Zhang <hannahz@nvidia.com>
@biswapanda biswapanda left a comment


lgtm - let's address ux concerns Itay raised

… accordingly; remove vanilla stuff

Signed-off-by: Hannah Zhang <hannahz@nvidia.com>
…-creating-benchmark-guide-building-on-existing-profiling-work
Signed-off-by: Hannah Zhang <hannahz@nvidia.com>
@hhzhang16 hhzhang16 merged commit 699996e into main Aug 30, 2025
10 checks passed
@hhzhang16 hhzhang16 deleted the hannahz/dyn-830-creating-benchmark-guide-building-on-existing-profiling-work branch August 30, 2025 00:14
jasonqinzhou pushed a commit that referenced this pull request Aug 30, 2025
Signed-off-by: Hannah Zhang <hannahz@nvidia.com>
Signed-off-by: Jason Zhou <jasonzho@jasonzho-mlt.client.nvidia.com>
KrishnanPrash pushed a commit that referenced this pull request Sep 2, 2025
Signed-off-by: Hannah Zhang <hannahz@nvidia.com>
Signed-off-by: Krishnan Prashanth <kprashanth@nvidia.com>
dillon-cullinan pushed a commit that referenced this pull request Sep 5, 2025
Signed-off-by: Hannah Zhang <hannahz@nvidia.com>
nnshah1 pushed a commit that referenced this pull request Sep 8, 2025
Signed-off-by: Hannah Zhang <hannahz@nvidia.com>
Signed-off-by: nnshah1 <neelays@nvidia.com>