[Misc] Use envs.VLLM_USE_RAY_COMPILED_DAG_CHANNEL_TYPE #15831
Signed-off-by: Rui Qiao <ruisearch42@gmail.com>
# - "auto": use the default channel type
# - "nccl": use NCCL for communication
# - "shm": use shared memory and gRPC for communication
# This flag is ignored if VLLM_USE_RAY_COMPILED_DAG is not set.
IIRC we enable CG in v1 by default? If so we should make it clear here.
Yeah, this statement is technically correct :) In V1 with the Ray backend, we set VLLM_USE_RAY_COMPILED_DAG to 1.
I added a comment at the definition of VLLM_USE_RAY_COMPILED_DAG.
Co-authored-by: Cody Yu <hao.yu.cody@gmail.com> Signed-off-by: Rui Qiao <ruisearch42@gmail.com>
6e12e19 to 89c563e
…15831) Signed-off-by: Rui Qiao <ruisearch42@gmail.com> Co-authored-by: Cody Yu <hao.yu.cody@gmail.com> Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk> Signed-off-by: xinyuxiao <xinyuxiao2024@gmail.com>
…15831) Signed-off-by: Rui Qiao <ruisearch42@gmail.com> Co-authored-by: Cody Yu <hao.yu.cody@gmail.com> Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk> Signed-off-by: Louis Ulmer <ulmerlouis@gmail.com>
…15831) Signed-off-by: Rui Qiao <ruisearch42@gmail.com> Co-authored-by: Cody Yu <hao.yu.cody@gmail.com> Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>
…15831) Signed-off-by: Rui Qiao <ruisearch42@gmail.com> Co-authored-by: Cody Yu <hao.yu.cody@gmail.com> Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk> Signed-off-by: Mu Huai <tianbowen.tbw@antgroup.com>
Use envs.VLLM_USE_RAY_COMPILED_DAG_CHANNEL_TYPE instead of envs.VLLM_USE_RAY_COMPILED_DAG_NCCL_CHANNEL.

Since Ray 2.42, with_tensor_transport(transport='auto') makes Compiled Graph use NCCL whenever both the sender and the receiver are on GPU (rather than CPU), which is almost always the case for vLLM. This PR allows overriding that choice through the env var to use shared memory instead, which is useful for debugging. It also opens up opportunities to support additional channel types in the future, for example for different hardware.

The original VLLM_USE_RAY_COMPILED_DAG_NCCL_CHANNEL is rarely changed, so backward compatibility should not be a concern.
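
For reference, here is a minimal sketch of how the new env var could be resolved. It is not the actual vLLM implementation: the helper name resolve_channel_type, the "auto" default, and treating "1" as the enabled value for VLLM_USE_RAY_COMPILED_DAG are assumptions made for illustration. Only the three channel types and the "ignored if VLLM_USE_RAY_COMPILED_DAG is not set" rule come from the diff comment above.

```python
import os
from typing import Mapping, Optional

# Channel types documented in the env var comment of this PR.
VALID_CHANNEL_TYPES = ("auto", "nccl", "shm")


def resolve_channel_type(environ: Mapping[str, str] = os.environ) -> Optional[str]:
    """Hypothetical helper: return the requested channel type, or None.

    Assumption: compiled DAG counts as enabled only when
    VLLM_USE_RAY_COMPILED_DAG is exactly "1".
    """
    # The channel type is ignored unless compiled DAG is enabled.
    if environ.get("VLLM_USE_RAY_COMPILED_DAG", "0") != "1":
        return None

    # Assumption: fall back to "auto" when the variable is unset.
    channel_type = environ.get("VLLM_USE_RAY_COMPILED_DAG_CHANNEL_TYPE", "auto")
    if channel_type not in VALID_CHANNEL_TYPES:
        raise ValueError(
            f"Invalid channel type {channel_type!r}; "
            f"expected one of {VALID_CHANNEL_TYPES}"
        )
    return channel_type
```

With a helper like this, running with VLLM_USE_RAY_COMPILED_DAG=1 and VLLM_USE_RAY_COMPILED_DAG_CHANNEL_TYPE=shm would select shared memory for a debugging session, while leaving the second variable unset keeps the "auto" behavior described above.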