feat: add benchmarking guide #2620
Conversation
You're adding 26,000 lines of mostly YAML to Dynamo @hhzhang16? Am I reading that MR right? Maybe we don't need the
Walkthrough

Adds end-to-end benchmarking tooling: a Bash orchestrator, Python workflow/CLI, GenAI-Perf integration, plotting, a vanilla backend client/template, and Kubernetes utilities for manifest injection and PVC downloads. Updates the profiler job, refactors imports, standardizes RBAC/PVC names, and expands documentation with a comprehensive benchmarking guide and link fixes.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    autonumber
    actor U as User
    participant SH as benchmark.sh
    participant PY as benchmarks.utils.benchmark (CLI)
    participant WF as workflow
    participant K as Kubernetes API
    participant C as Client (Dynamo/Vanilla)
    participant G as GenAI-Perf
    participant P as Plotter
    participant FS as Results Dir
    U->>SH: Run ./benchmarks/benchmark.sh [flags]
    SH->>PY: python -m benchmarks.utils.benchmark [...args...]
    PY->>WF: run_benchmark_workflow(...)
    alt Endpoint benchmarking
        WF->>G: run_concurrency_sweep(endpoint, model, ISL/OSL/STD)
        G-->>FS: Write per-concurrency results
    else Deployment benchmarking
        loop For each provided manifest (agg/disagg/vanilla)
            WF->>C: create client (Dynamo/Vanilla)
            C->>K: create Deployment/Service
            C->>K: wait_for_deployment_ready()
            C->>C: port_forward_frontend()
            WF->>G: run_concurrency_sweep(local URL, model, ISL/OSL/STD)
            G-->>FS: Write per-concurrency results (subdir)
            C->>C: stop_port_forward()
            C->>K: delete Deployment/Service
        end
    end
    WF->>P: generate_plots(base_output_dir)
    P-->>FS: plots/, SUMMARY.txt
    PY-->>U: Exit code/status and result paths
    SH-->>U: Final summary and plot locations
```
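The deployment branch of the diagram (deploy, wait, port-forward, sweep, tear down) can be sketched as plain Python. This is an illustrative reconstruction, not the PR's actual code: `run_benchmark_workflow` matches the name in the diagram, but `DeploymentClient`, its methods, and the `sweep` callback are hypothetical stand-ins for the real Kubernetes client and GenAI-Perf runner.

```python
class DeploymentClient:
    """Hypothetical stand-in for the PR's Kubernetes client.

    The real client would create the Deployment/Service via the
    Kubernetes API; here every step is a no-op so the control flow
    can be read (and run) in isolation.
    """

    def __init__(self, manifest):
        self.manifest = manifest

    def deploy(self):
        pass  # real code: create Deployment/Service from the manifest

    def wait_for_ready(self):
        pass  # real code: poll until the deployment reports ready

    def port_forward_frontend(self):
        return "http://localhost:8000"  # real code: open a port-forward

    def stop_port_forward(self):
        pass

    def delete(self):
        pass  # real code: delete Deployment/Service


def run_benchmark_workflow(endpoint=None, manifests=(), model="", sweep=None):
    """Return (label, sweep result) pairs, one per benchmarked target."""
    results = []
    if endpoint:
        # Endpoint mode: sweep an already-running service directly.
        results.append(("endpoint", sweep(endpoint, model)))
    else:
        # Deployment mode: stand up each manifest, sweep, then tear down.
        for manifest in manifests:
            client = DeploymentClient(manifest)
            client.deploy()
            client.wait_for_ready()
            url = client.port_forward_frontend()
            try:
                results.append((manifest, sweep(url, model)))
            finally:
                # Cleanup runs even if the sweep raises.
                client.stop_port_forward()
                client.delete()
    return results
```

The `try`/`finally` mirrors the diagram's guarantee that port-forwards are stopped and deployments deleted after each manifest, whether or not the sweep succeeds.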
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
Signed-off-by: Hannah Zhang <hannahz@nvidia.com>
How is this a "guide" with no documentation and very few comments?
Signed-off-by: Hannah Zhang <hannahz@nvidia.com>
@whoisj here's the main documentation: https://github.com/ai-dynamo/dynamo/blob/hannahz/dyn-830-creating-benchmark-guide-building-on-existing-profiling-work/docs/benchmarks/benchmarking.md. It's 340 lines. There is some additional documentation added in this MR as well, e.g. https://github.com/ai-dynamo/dynamo/blob/hannahz/dyn-830-creating-benchmark-guide-building-on-existing-profiling-work/deploy/utils/README.md
biswapanda left a comment
lgtm - let's address ux concerns Itay raised
… accordingly; remove vanilla stuff Signed-off-by: Hannah Zhang <hannahz@nvidia.com>
Signed-off-by: Hannah Zhang <hannahz@nvidia.com>
…-creating-benchmark-guide-building-on-existing-profiling-work
Signed-off-by: Hannah Zhang <hannahz@nvidia.com>
Signed-off-by: Hannah Zhang <hannahz@nvidia.com> Signed-off-by: Jason Zhou <jasonzho@jasonzho-mlt.client.nvidia.com>
Signed-off-by: Hannah Zhang <hannahz@nvidia.com> Signed-off-by: Krishnan Prashanth <kprashanth@nvidia.com>
Signed-off-by: Hannah Zhang <hannahz@nvidia.com>
Signed-off-by: Hannah Zhang <hannahz@nvidia.com> Signed-off-by: nnshah1 <neelays@nvidia.com>
Overview:
This MR adds a comprehensive benchmarking framework for Dynamo deployments, with automated performance comparison across aggregated, disaggregated, and vanilla backend configurations. It includes both endpoint testing and full deployment lifecycle management.
Details:
Where should the reviewer start?
- docs/benchmarks/benchmarking.md: complete benchmarking guide
- benchmarks/benchmark.sh: main benchmarking script
- deploy/utils/README.md: Kubernetes setup utilities that support benchmarking and other Kubernetes workflows

Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)
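The per-concurrency sweep that benchmark.sh ultimately drives could look like the sketch below. The function name matches `run_concurrency_sweep` from the workflow diagram, but the signature, the `perf_runner` callback, and the `concurrency_<n>` directory naming are assumptions for illustration; the real runner shells out to GenAI-Perf.

```python
from pathlib import Path


def run_concurrency_sweep(url, model, concurrencies, base_dir, perf_runner):
    """Invoke the perf tool once per concurrency level.

    Each level gets its own result subdirectory (hypothetical naming),
    matching the diagram's "per-concurrency results (subdir)" step.
    Returns the list of subdirectories for the plotting stage.
    """
    subdirs = []
    for c in concurrencies:
        out = Path(base_dir) / f"concurrency_{c}"
        out.mkdir(parents=True, exist_ok=True)
        # perf_runner stands in for a GenAI-Perf invocation.
        perf_runner(url=url, model=model, concurrency=c, output_dir=out)
        subdirs.append(out)
    return subdirs
```

Keeping one directory per concurrency level lets the plotter aggregate results without re-parsing which run produced which file.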