[core][experimental] Allow custom NCCL group declaration in accelerated DAG #45593

Closed
stephanie-wang opened this issue May 28, 2024 · 3 comments · Fixed by #47141
Assignees
Labels
compiled-graphs, core (issues that should be addressed in Ray Core), enhancement (request for new feature and/or capability), P0 (issues that should be fixed in short order)

Comments

@stephanie-wang
Contributor

Description

Currently, accelerated DAGs automatically create an NCCL group of actors based on the annotated DAG nodes. However, user code sometimes already creates an NCCL group, or users want more explicit control over which actors belong to the NCCL group, especially when there are multiple groups. To support this, we should allow the user to pass in custom NCCL groups when creating the accelerated DAG.
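
To make the request concrete, here is a minimal sketch of the intended behavior in plain Python. It does not use Ray and is not Ray's API: `CommGroup` and `resolve_group` are hypothetical names invented for illustration, and actors are represented by plain strings. The assumption is that a user-declared group, when given, takes precedence over the automatically derived one, provided it contains every actor on the annotated nodes.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Optional


@dataclass(frozen=True)
class CommGroup:
    """Hypothetical stand-in for an NCCL group: a name plus its member actors."""
    name: str
    actors: FrozenSet[str]


def resolve_group(
    annotated_actors: Iterable[str],
    custom_group: Optional[CommGroup] = None,
) -> CommGroup:
    """Choose the group for the actors found on NCCL-annotated DAG nodes.

    With no custom group, fall back to the automatic behavior described
    above: build one group from the annotated actors. With a custom group,
    reuse it, but only if it covers every annotated actor.
    """
    needed = frozenset(annotated_actors)
    if custom_group is None:
        # Automatic path: the group is derived from the annotated nodes.
        return CommGroup(name="auto", actors=needed)
    missing = needed - custom_group.actors
    if missing:
        raise ValueError(
            f"custom group {custom_group.name!r} is missing actors: {sorted(missing)}"
        )
    # Custom path: the user's existing group is reused as-is.
    return custom_group


# Automatic grouping, as accelerated DAGs do today.
auto = resolve_group(["worker_0", "worker_1"])

# A user-declared group that may contain more actors than this DAG uses.
user_group = CommGroup("tp_group", frozenset({"worker_0", "worker_1", "worker_2"}))
chosen = resolve_group(["worker_0", "worker_1"], custom_group=user_group)
```

Validating membership up front is one possible design choice; it surfaces a mismatched group at DAG creation time rather than at the first tensor transfer.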

Use case

No response

@stephanie-wang added the enhancement (request for new feature and/or capability), P1 (issue that should be fixed within a few weeks), and compiled-graphs labels, and added then removed the triage label (needs triage, e.g. priority, bug/not-bug, and owning component), on May 28, 2024
@kevin85421 kevin85421 self-assigned this Jun 24, 2024
@anyscalesam
Contributor

@stephanie-wang how important is this? Is it needed to enable pipeline-parallel training, or is it just a general ADAG devex improvement?

@stephanie-wang
Contributor Author

No, it should not be necessary. Just a UX improvement.

@rkooo567
Contributor

rkooo567 commented Aug 8, 2024

This is an important feature for vLLM integration (and has been explicitly requested).

@rkooo567 added the P0 label (issues that should be fixed in short order) and removed the P1 label (issue that should be fixed within a few weeks) on Aug 12, 2024
@anyscalesam added the core label (issues that should be addressed in Ray Core) on Aug 26, 2024
5 participants