add lb default #1465
Pull request overview
This PR adds two new command-line arguments to configure load balancing thresholds for the SGLang router: --sglang-router-balance-abs-threshold (integer, default 10) and --sglang-router-balance-rel-threshold (float, default 1.2).
Changes:
- Added two new router configuration arguments that control balance thresholds for the SGLang router
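As a rough illustration of the two flags described above, the sketch below registers them with `argparse` using the types and defaults stated in the overview. The `build_parser` helper and the shortened help strings are placeholders for this example only; the surrounding parser setup in the PR is not shown here.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical helper: the real arguments are registered on an
    # existing parser in the PR, which is not reproduced here.
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--sglang-router-balance-abs-threshold",
        type=int,
        default=10,
        help="Absolute threshold for balance in the SGLang router",
    )
    parser.add_argument(
        "--sglang-router-balance-rel-threshold",
        type=float,
        default=1.2,
        help="Relative threshold for balance in the SGLang router",
    )
    return parser


# With no flags given, the defaults from the PR apply.
defaults = build_parser().parse_args([])

# Both thresholds can be overridden on the command line.
overridden = build_parser().parse_args(
    [
        "--sglang-router-balance-abs-threshold", "32",
        "--sglang-router-balance-rel-threshold", "1.5",
    ]
)
```

argparse converts the dashes in each flag name to underscores, so the values are read as `args.sglang_router_balance_abs_threshold` and `args.sglang_router_balance_rel_threshold`.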
    help="Absolute threshold for balance in the SGLang router",
)
parser.add_argument(
    "--sglang-router-balance-rel-threshold",
    type=float,
    default=1.2,
    help="Relative threshold for balance in the SGLang router",
Copilot AI, Jan 20, 2026
The help text for these arguments is vague and doesn't explain what "balance" means in the context of the SGLang router, or how the absolute and relative thresholds are used. Consider providing more descriptive help text that explains:
- What aspect of the router is being balanced (e.g., load balancing across workers)
- How the absolute threshold is applied and what units it uses
- What the relative threshold represents (e.g., a ratio or percentage)
This will make it easier for users to understand how to configure these parameters appropriately.
Suggested change
-     help="Absolute threshold for balance in the SGLang router",
- )
- parser.add_argument(
-     "--sglang-router-balance-rel-threshold",
-     type=float,
-     default=1.2,
-     help="Relative threshold for balance in the SGLang router",
+     help=(
+         "Maximum allowed absolute difference in number of in-flight or queued "
+         "requests between the most and least loaded SGLang workers before the "
+         "router starts actively preferring less-loaded workers. "
+         "Expressed as a count of requests."
+     ),
+ )
+ parser.add_argument(
+     "--sglang-router-balance-rel-threshold",
+     type=float,
+     default=1.2,
+     help=(
+         "Maximum allowed relative load imbalance between SGLang workers before "
+         "the router starts actively preferring less-loaded workers. "
+         "Interpreted as a ratio of highest load to lowest load "
+         "(e.g., 1.2 means the busiest worker may have up to 20% more requests "
+         "than the least busy one)."
+     ),
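To make the interplay between an absolute and a relative threshold concrete, here is a minimal sketch. It assumes the workers count as imbalanced only when both conditions hold: the gap between the busiest and least busy worker exceeds the absolute threshold, and the busiest worker's load exceeds the relative threshold times the least busy worker's load. That combination rule and the `is_imbalanced` helper are assumptions for illustration, not something this PR confirms; the SGLang router documentation should be checked for the actual behavior.

```python
from typing import Sequence


def is_imbalanced(
    loads: Sequence[int],
    abs_threshold: int = 10,
    rel_threshold: float = 1.2,
) -> bool:
    """Hypothetical check: True when per-worker request counts look imbalanced.

    Assumes both thresholds must be exceeded at the same time.
    """
    if not loads:
        return False
    max_load = max(loads)
    min_load = min(loads)
    exceeds_abs = (max_load - min_load) > abs_threshold
    exceeds_rel = max_load > rel_threshold * min_load
    return exceeds_abs and exceeds_rel


# Gap of 20 requests and a 3x ratio: both thresholds exceeded.
print(is_imbalanced([30, 10]))
# Gap of 11 requests, but only an 11% ratio: relative threshold not exceeded.
print(is_imbalanced([111, 100]))
```

Under this assumed rule, the absolute threshold keeps small differences at low load from counting as imbalance, while the relative threshold does the same for large but proportionally minor gaps at high load.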
451f6c5 to af70d71
No description provided.