fix: Adjust frontEnd thresholds #2288 #2290
Caution: Review failed. Failed to post review comments.
Walkthrough
This update introduces substantial documentation restructuring, new deployment guides, and configuration improvements across multiple backend components. Notably, it adds Kubernetes deployment examples and guides for TRTLLM and vLLM, reorganizes and corrects support matrix and installation instructions, modularizes port allocation logic for the vLLM backend, and migrates engine configuration files to new formats. Several Dockerfiles and build scripts are updated for version consistency and improved health check tooling. The multimodal example documentation is removed, and Grove feature configuration in the operator is refactored to use a termination delay parameter with runtime detection.
Sequence Diagram(s)
sequenceDiagram
participant User
participant EtcdClient
participant PortsModule
participant vLLMBackend
User->>vLLMBackend: Start backend with CLI args (--dynamo-port-min/max)
vLLMBackend->>PortsModule: Request port allocation block (tp_size)
PortsModule->>PortsModule: Bind and check port block availability
PortsModule->>EtcdClient: Reserve port block in ETCD with metadata
PortsModule-->>vLLMBackend: Return allocated port block
vLLMBackend->>vLLMBackend: Configure side channel and KV ports
vLLMBackend-->>User: Backend ready with reserved ports
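The "bind and check port block availability" step in the diagram above could look roughly like the following. This is a minimal sketch, not the code from the PR: the function name `find_free_port_block` and its parameters are hypothetical, and the ETCD reservation step is left out entirely. It only probes whether a contiguous block of ports can be bound locally.

```python
import socket
from contextlib import ExitStack


def find_free_port_block(port_min, port_max, block_size, host="127.0.0.1"):
    """Return the first contiguous block of `block_size` ports in
    [port_min, port_max] that can all be bound, or None if there is none.

    Hypothetical helper for illustration; not the PR's implementation.
    """
    for start in range(port_min, port_max - block_size + 2):
        ports = list(range(start, start + block_size))
        # ExitStack closes every probe socket when the block exits,
        # whether the attempt succeeded or failed.
        with ExitStack() as stack:
            try:
                for port in ports:
                    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
                    stack.enter_context(sock)
                    sock.bind((host, port))
            except OSError:
                # At least one port in this block is taken; try the next start.
                continue
            return ports
    return None
```

Because the probe sockets are closed before the function returns, another process could still grab the ports afterwards. That gap is presumably why the diagram follows the bind check with a reservation in ETCD.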
sequenceDiagram
participant Operator
participant K8sAPI
participant GroveAPI
Operator->>K8sAPI: Discover API groups
K8sAPI-->>Operator: Return API groups
Operator->>Operator: Detect Grove availability
Operator->>Operator: Set Grove.Enabled and TerminationDelay
Operator->>K8sAPI: Deploy resources with Grove config (if available)
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
Warning: There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.
🔧 golangci-lint (2.2.2)
Error: can't load config: unsupported version of the configuration: "". See https://golangci-lint.run/product/migration-guide for migration instructions.
@biswapanda @atchernych I noticed that this was already merged to main, but will this work? These values don't seem reasonable to me, as they give the worker only 30 seconds to fully bootstrap.
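For context on the 30-second figure, here is a rough sketch of how such a budget is usually estimated. It assumes the thresholds in question are Kubernetes-style probe settings and uses the rule of thumb from the Kubernetes documentation (`failureThreshold * periodSeconds`, plus any initial delay). The concrete values are hypothetical and are not taken from this PR.

```python
def probe_budget_seconds(initial_delay_seconds, period_seconds, failure_threshold):
    """Approximate time a container has to become healthy before the
    probe gives up (rule-of-thumb estimate, ignoring probe timeouts)."""
    return initial_delay_seconds + period_seconds * failure_threshold


# Hypothetical values, chosen only to reproduce a 30-second budget.
budget = probe_budget_seconds(
    initial_delay_seconds=0, period_seconds=10, failure_threshold=3
)
print(f"worker has roughly {budget}s to bootstrap")
```

Under these assumptions, raising either the period or the failure threshold is what widens the bootstrap window.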
This is not a release blocker for 04.0; closing the PR.
Pull request was closed
Overview:
Cherry-pick: #2288 (merged to main)
More reasonable values to avoid customer confusion (https://nvbugspro.nvidia.com/bug/5425651)
closes: https://nvbugspro.nvidia.com/bug/5425651