fix: prevent crash looping hello world #2625 #2670
@@ -34,26 +34,25 @@ git checkout $(git describe --tags $(git rev-list --tags --max-count=1))

 | Feature | SGLang | Notes |
 |---------|--------|-------|
-| [**Disaggregated Serving**](../../docs/architecture/disagg_serving.md) | ✅ | |
-| [**Conditional Disaggregation**](../../docs/architecture/disagg_serving.md#conditional-disaggregation) | 🚧 | WIP [PR](https://github.com/sgl-project/sglang/pull/7730) |
-| [**KV-Aware Routing**](../../docs/architecture/kv_cache_routing.md) | ✅ | |
-| [**SLA-Based Planner**](../../docs/architecture/sla_planner.md) | ❌ | Planned |
-| [**Load Based Planner**](../../docs/architecture/load_planner.md) | ❌ | Planned |
-| [**KVBM**](../../docs/architecture/kvbm_architecture.md) | ❌ | Planned |
+| [**Disaggregated Serving**](../../../docs/architecture/disagg_serving.md) | ✅ | |
+| [**Conditional Disaggregation**](../../../docs/architecture/disagg_serving.md#conditional-disaggregation) | 🚧 | WIP [PR](https://github.com/sgl-project/sglang/pull/7730) |
+| [**KV-Aware Routing**](../../../docs/architecture/kv_cache_routing.md) | ✅ | |
+| [**SLA-Based Planner**](../../../docs/architecture/sla_planner.md) | ❌ | Planned |
+| [**Load Based Planner**](../../../docs/architecture/load_planner.md) | ❌ | Planned |
+| [**KVBM**](../../../docs/architecture/kvbm_architecture.md) | ❌ | Planned |

 ### Large Scale P/D and WideEP Features

-| Feature | SGLang | Notes |
-|--------------------|--------|-----------------------------------------------------------------------|
-| **WideEP** | ✅/🚧 | Full support on H100s/GB200 WIP [PR](https://github.com/sgl-project/sglang/pull/7556) |
-| **DP Rank Routing**| 🚧 | Direct routing supported. Process per DP rank is not supported |
-| **GB200 Support** | 🚧 | WIP [PR](https://github.com/sgl-project/sglang/pull/7556) |
+| Feature | SGLang | Notes |
+|---------------------|--------|--------------------------------------------------------------|
+| **WideEP** | ✅ | Full support on H100s/GB200 |
+| **DP Rank Routing** | 🚧 | Direct routing supported. Dynamo KV router does not router to DP worker |
+| **GB200 Support** | ✅ | |

Comment on lines +46 to 51 (Contributor):

Fix grammar: “router” → “route” to DP worker. Small but user-visible in the Feature Matrix.

-| **DP Rank Routing** | 🚧 | Direct routing supported. Dynamo KV router does not router to DP worker |
+| **DP Rank Routing** | 🚧 | Direct routing supported. Dynamo KV router does not route to DP worker |

 ## Quick Start

-Below we provide a guide that lets you run all of our the common deployment patterns on a single node. See our different [architectures](../llm/README.md#deployment-architectures) for a high level overview of each pattern and the architecture diagram for each.
-
+Below we provide a guide that lets you run all of our the common deployment patterns on a single node.
 ### Start NATS and ETCD in the background
Comment on lines +55 to 56 (Contributor):

Fix wording: extra “the” in Quick Start intro.

-Below we provide a guide that lets you run all of our the common deployment patterns on a single node.
+Below we provide a guide that lets you run all of our common deployment patterns on a single node.

 Start using [Docker Compose](../../../deploy/docker-compose.yml)
@@ -141,7 +140,7 @@ cd $DYNAMO_ROOT/components/backends/sglang

 ## Request Migration

-In a [Distributed System](#distributed-system), a request may fail due to connectivity issues between the Frontend and the Backend.
+In a Distributed System, a request may fail due to connectivity issues between the Frontend and the Backend.

 The Frontend will automatically track which Backends are having connectivity issues with it and avoid routing new requests to the Backends with known connectivity issues.

@@ -164,7 +163,6 @@ Below we provide a selected list of advanced examples. Please open up an issue i
|
| ### Large scale P/D disaggregation with WideEP
| - **[Run DeepSeek-R1 on 104+ H100s](docs/dsr1-wideep-h100.md)**
| - **[Run DeepSeek-R1 on GB200s](docs/dsr1-wideep-gb200.md)**
|
| ### Speculative Decoding
| - **[Deploying DeepSeek-R1 with MTP - coming soon!](.)**
|
Workspace version bump may desync local crate versions

Bumping `[workspace.package].version` to `0.4.0+post0` while `workspace.dependencies` pin local crates at `0.4.0` can cause version mismatches if member crates inherit the workspace version or if any member's `package.version` is updated without updating these dependency constraints.

Choose one consistent approach:
- Option A: keep the workspace version at `0.4.0` for now.
- Option B: pin the local crates in `workspace.dependencies` at `0.4.0+post0`.

If going with Option B, update the local workspace dependency versions to match. I can also generate a script to scan member crates for `version.workspace = true` and align everything if helpful.

Action Required: Align local crate dependency versions with the bumped workspace version

The root `Cargo.toml` now sets the workspace version to `0.4.0+post0`, but under `[workspace.dependencies]` the local crates are still pinned at `0.4.0`. Since each member crate uses `version.workspace = true`, their actual version becomes `0.4.0+post0`, which will conflict with the pinned `0.4.0` in `workspace.dependencies`. You have two choices:
- Option A: keep the workspace version at `"0.4.0"`.
- Option B: update `[workspace.dependencies]` to match `0.4.0+post0`.

If you proceed with Option B, apply the change in the root `Cargo.toml`:
• Root manifest: `Cargo.toml`
• Lines: update the three `dynamo-*` entries under `[workspace.dependencies]`