[serve][llm] Unify and Extend Builder Configuration for LLM Deployments #57724
- Removed the `cu121` build argument from `rayllm.depsets.yaml`.
- Updated the `nixl` package version to `0.6.1` in multiple lock files to ensure compatibility.
- Adjusted the Dockerfile to comment out unused build arguments related to `ROOT_DIR`, `GDR_HOME`, `UCX_HOME`, and `NIXL_HOME`.
- Cleaned up several lock files by deleting outdated versions for `cu121` and ensuring consistency across `cu128` and `cpu` configurations.

These changes aim to streamline the dependency management and improve the build process for the rayllm project.

Signed-off-by: Kourosh Hakhamaneshi <kourosh@anyscale.com>
/gemini review
Code Review
This pull request introduces a significant and well-executed revamp of the Prefill/Decode (P/D) and standard LLM deployment APIs. The move to a unified, Pydantic-based configuration model is a major improvement, making the APIs more intuitive, type-safe, and extensible. The ability to use pluggable components for ingress and proxy via configuration is a fantastic feature that will greatly simplify customizations for users. The code is cleaner and more maintainable as a result of this refactoring.
My review includes a couple of points for consideration. One is a behavioral change regarding the kv_transfer_config in P/D deployments, which is now mandatory. The other is a minor regression in parsing LLM configurations from raw YAML strings.
Overall, this is an excellent contribution that significantly enhances the usability and flexibility of Ray Serve's LLM capabilities.
python/ray/llm/_internal/serve/deployments/prefill_decode_disagg/builder_pd.py
ruisearch42
left a comment
LGTM, just some nitpicks
python/ray/serve/llm/__init__.py
Outdated
    Args:
        pd_serving_args: The dictionary containing prefill and decode configs.
update this as well?
pd_serving_args: A dict that conforms to the PDServingArgs pydantic model.
class ProxyClsConfig(BaseModelExtended):
    proxy_cls: Union[str, type[PDProxyServer]] = Field(
        default=PDProxyServer,
        description="The class name of the proxy class.",
nit: The proxy class or class name
    elif isinstance(value, LLMConfig):
        return value
    else:
        raise ValueError(f"Invalid LLMConfig: {value}")
nit: TypeError?
    @field_validator("prefill_config")
    @classmethod
    def _validate_prefill_config(cls, value: Any) -> LLMConfig:
        return cls._validate_llm_config(value)

    @field_validator("decode_config")
    @classmethod
    def _validate_decode_config(cls, value: Any) -> LLMConfig:
        return cls._validate_llm_config(value)
can do this?
    @field_validator("prefill_config", "decode_config")
    @classmethod
    def _validate_llm_configs(cls, value: Any) -> LLMConfig:
        return cls._validate_llm_config(value)
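For readers unfamiliar with the pattern suggested above: pydantic v2's `field_validator` accepts several field names, so one method can validate both fields. The following is a minimal, self-contained sketch, not the PR's code. `LLMConfigStub` and `PDArgsSketch` are hypothetical stand-ins for `LLMConfig` and the P/D args model, and `mode="before"` is an assumption made so the validator sees the raw input.

```python
from typing import Any

from pydantic import BaseModel, field_validator


class LLMConfigStub(BaseModel):
    """Hypothetical stand-in for LLMConfig; not the Ray class."""

    model_id: str


class PDArgsSketch(BaseModel):
    """Hypothetical stand-in for the P/D serving args model."""

    prefill_config: LLMConfigStub
    decode_config: LLMConfigStub

    # A single validator registered for both fields. mode="before" is an
    # assumption of this sketch: it lets the method see the raw dict or object.
    @field_validator("prefill_config", "decode_config", mode="before")
    @classmethod
    def _validate_llm_configs(cls, value: Any) -> LLMConfigStub:
        if isinstance(value, dict):
            return LLMConfigStub(**value)
        if isinstance(value, LLMConfigStub):
            return value
        raise ValueError(f"Invalid LLMConfig: {value!r}")


# Both a dict and an already-built object are accepted by the same validator.
args = PDArgsSketch(
    prefill_config={"model_id": "demo-model"},
    decode_config=LLMConfigStub(model_id="demo-model"),
)
```

The behavior matches two separate validators that delegate to the same helper; the only difference is that the registration lives in one place.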
    elif isinstance(config, LLMConfig):
        llm_configs.append(config)
    else:
        raise ValueError(f"Invalid LLMConfig: {config}")
TypeError?
release tests failing here: https://buildkite.com/ray-project/release/builds/63841 ❌
…ts (ray-project#57724) Signed-off-by: Kourosh Hakhamaneshi <kourosh@anyscale.com> Signed-off-by: Edward Oakes <ed.nmi.oakes@gmail.com>
…ts (ray-project#57724) Signed-off-by: Kourosh Hakhamaneshi <kourosh@anyscale.com>
…ts (ray-project#57724) Signed-off-by: Kourosh Hakhamaneshi <kourosh@anyscale.com> Signed-off-by: xgui <xgui@anyscale.com>
Original PR #57724 by kouroshHakha Original: ray-project/ray#57724
…ation for LLM Deployments Merged from original PR #57724 Original: ray-project/ray#57724
…ts (#57724) Signed-off-by: Kourosh Hakhamaneshi <kourosh@anyscale.com> Signed-off-by: elliot-barn <elliot.barnwell@anyscale.com>
…ts (ray-project#57724) Signed-off-by: Kourosh Hakhamaneshi <kourosh@anyscale.com>
…ts (ray-project#57724) Signed-off-by: Kourosh Hakhamaneshi <kourosh@anyscale.com> Signed-off-by: Aydin Abiar <aydin@anyscale.com>
Unify and Extend Builder Configuration for LLM Deployments
Overview
This PR transforms Ray Serve LLM builders by providing:
Previously, users who needed to customize ingress behavior or P/D proxy logic had to fork and maintain their own builder functions. Now, the builder configuration supports pluggable components out of the box.
Problem Statement
Inconsistent Configuration Patterns
Standard LLM and P/D builders used different configuration approaches, forcing users to learn multiple patterns.
Limited Extensibility
Users couldn't easily customize core components:
Solution
1. Unified Configuration Pattern
Both builders now share the same intuitive structure:
2. Pluggable Component System
Inject custom components without custom builders:
Key Features
Component Extensibility
Users can now customize any component:
No custom builder needed! Just implement your component class with the expected signature.
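The idea of swapping a component through configuration can be illustrated with a small standard-library sketch. It does not use the Ray Serve API: `DefaultProxy`, `UppercaseProxy`, `ProxyConfigSketch`, `proxy_kwargs`, and `build_proxy` are all hypothetical names chosen for illustration. Only the `proxy_cls` field name comes from the review snippet above.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


class DefaultProxy:
    """Hypothetical default component used when nothing is overridden."""

    def __init__(self, prefix: str = "default"):
        self.prefix = prefix

    def route(self, request: str) -> str:
        return f"{self.prefix}:{request}"


class UppercaseProxy(DefaultProxy):
    """Hypothetical user component; keeps the same constructor signature."""

    def route(self, request: str) -> str:
        return super().route(request).upper()


@dataclass
class ProxyConfigSketch:
    """Hypothetical config: which class to build and with which kwargs."""

    proxy_cls: type = DefaultProxy
    proxy_kwargs: Dict[str, Any] = field(default_factory=dict)


def build_proxy(config: Optional[ProxyConfigSketch] = None) -> DefaultProxy:
    # The builder never hard-codes a class; it instantiates whatever the
    # config names, falling back to the defaults when no config is given.
    config = config or ProxyConfigSketch()
    return config.proxy_cls(**config.proxy_kwargs)


default_proxy = build_proxy()
custom_proxy = build_proxy(
    ProxyConfigSketch(proxy_cls=UppercaseProxy, proxy_kwargs={"prefix": "custom"})
)
```

In this sketch the builder stays unchanged when a new component is introduced, which is the property the PR description attributes to the new configuration.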
Flexible Input Types
Accept dicts, objects, or file paths everywhere:
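A rough sketch of this kind of input normalization, using only the standard library, might look as follows. `ConfigStub` and `coerce_config` are hypothetical names, and JSON is used as a stand-in for the file format so that the example needs no third-party parser. The actual builders may parse files differently.

```python
import json
import tempfile
from dataclasses import dataclass
from pathlib import Path
from typing import Union


@dataclass
class ConfigStub:
    """Hypothetical stand-in for a config object such as LLMConfig."""

    model_id: str
    max_replicas: int = 1


def coerce_config(value: Union[dict, str, Path, ConfigStub]) -> ConfigStub:
    """Normalize a dict, an object, or a file path into a ConfigStub."""
    if isinstance(value, ConfigStub):
        return value
    if isinstance(value, dict):
        return ConfigStub(**value)
    if isinstance(value, (str, Path)):
        # Assumption of this sketch: the file holds JSON, not YAML.
        with open(value, "r", encoding="utf-8") as f:
            return ConfigStub(**json.load(f))
    raise TypeError(f"Unsupported config input: {type(value).__name__}")


# The same settings supplied three ways yield equal config objects.
settings = {"model_id": "demo-model", "max_replicas": 2}
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "config.json"
    path.write_text(json.dumps(settings), encoding="utf-8")
    from_file = coerce_config(str(path))

from_dict = coerce_config(settings)
from_obj = coerce_config(ConfigStub(model_id="demo-model", max_replicas=2))
```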
Sensible Defaults with Override Ability
Type-Safe with String Paths
Load custom classes from strings for YAML/config file compatibility:
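One common way to resolve a class from a string is with `importlib`, sketched below. The helper name `resolve_class` and the accepted formats (`"pkg.module:ClassName"` and `"pkg.module.ClassName"`) are assumptions for illustration; the exact string format the builders expect is not shown in this PR page.

```python
import importlib
from collections import OrderedDict
from typing import Union


def resolve_class(ref: Union[str, type]) -> type:
    """Return `ref` if it is already a class, else import it from a string."""
    if isinstance(ref, type):
        return ref
    # Assumed formats: "pkg.module:ClassName" or "pkg.module.ClassName".
    if ":" in ref:
        module_path, _, attr = ref.partition(":")
    else:
        module_path, _, attr = ref.rpartition(".")
    if not module_path or not attr:
        raise ValueError(f"Not an importable class path: {ref!r}")
    obj = getattr(importlib.import_module(module_path), attr)
    if not isinstance(obj, type):
        raise TypeError(f"{ref!r} does not refer to a class")
    return obj


# A class object and both string spellings resolve to the same class.
from_object = resolve_class(OrderedDict)
from_colon = resolve_class("collections:OrderedDict")
from_dotted = resolve_class("collections.OrderedDict")
```

Accepting either a class or a string keeps Python callers type-checked while letting YAML or other config files, which can only carry strings, name the same class.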
Use Cases Enabled
Before: Custom Builder Required
After: Configuration Only
Benefits
Testing
Release tests: https://buildkite.com/ray-project/release/builds/63841 ✅
Future work