Eagerly create common nccl communicator(s) during init #17108

Open
skye opened this issue Sep 12, 2024 · 0 comments
Labels
enhancement (New feature or request), NVIDIA-GPU (XLA on Nvidia GPU)

Comments

@skye
Contributor

skye commented Sep 12, 2024

Right now all NCCL communicators are created on demand, which can increase startup time. We should create the common ones in the background during init (e.g. the one spanning all GPUs; maybe that's the only one?).

Startup time has been cited by users as an expensive problem when running on large GPU clusters.
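
To make the proposal concrete, here is a rough sketch of the idea in Python. It is not how XLA manages communicators today and not a proposed implementation. `CommunicatorCache`, `eager_cliques`, and `fake_create` are hypothetical names made up for illustration, and `fake_create` stands in for the real (expensive) NCCL setup such as `ncclCommInitRank`.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class CommunicatorCache:
    """Toy cache keyed by the tuple of participating device ids (hypothetical)."""

    def __init__(self, create_fn, eager_cliques=()):
        self._create_fn = create_fn
        self._lock = threading.Lock()
        self._futures = {}
        # One background worker, so creations run off the caller's thread.
        self._pool = ThreadPoolExecutor(max_workers=1)
        # Eager path: start creating the common communicators during init.
        for clique in eager_cliques:
            self._submit(tuple(clique))

    def _submit(self, key):
        # Schedule creation at most once per clique and remember the future.
        with self._lock:
            fut = self._futures.get(key)
            if fut is None:
                fut = self._pool.submit(self._create_fn, key)
                self._futures[key] = fut
            return fut

    def get(self, devices):
        # On-demand path: reuse an in-flight or finished creation, else start one.
        return self._submit(tuple(devices)).result()

    def shutdown(self):
        self._pool.shutdown(wait=True)


created = []  # records each simulated communicator creation


def fake_create(devices):
    # Stand-in for the real, slow communicator setup (assumption).
    created.append(devices)
    return {"devices": devices}


all_gpus = tuple(range(8))
cache = CommunicatorCache(fake_create, eager_cliques=[all_gpus])
comm = cache.get(all_gpus)  # waits only if the eager creation is still running
pair = cache.get((0, 1))    # uncommon clique, still created on demand
cache.shutdown()
```

The point of the sketch is that the first collective over all GPUs no longer pays the full creation cost on its critical path; less common cliques keep the existing on-demand behavior.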

cc @ezhulenev

nouiz added the NVIDIA-GPU (XLA on Nvidia GPU) and enhancement (New feature or request) labels on Sep 25, 2024