TL/CUDA: fix pipelining in linear rs #770

Merged
Sergei-Lebedev merged 1 commit into openucx:master from topic/fix_split_rail
May 8, 2023

Sergei-Lebedev
Contributor

What

Fix CL HIER Split Rail Allreduce with TL CUDA

How?

Fixes two bugs:

  1. The CL HIER split rail algorithm doesn't provide the correct buffer for the in-place reduce scatter: it passes the original rbuf with a stairway-pattern rank offset.
  2. The TL CUDA linear reduce scatter doesn't update the rbuf pointer after fragment setup of the pipelined schedule.
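
The second item can be pictured with a minimal sketch. This is not the UCC code: `frag_args_t`, `frag_setup`, and `pipelined_sum` are hypothetical names, and a plain two-input elementwise sum on the host stands in for the CUDA reduce scatter. It only illustrates the general point that, in a pipelined schedule, each fragment's destination pointer has to be advanced along with its source pointers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-fragment arguments; not UCC's actual task structure. */
typedef struct {
    const float *src0;  /* first input for this fragment  */
    const float *src1;  /* second input for this fragment */
    float       *rbuf;  /* destination for this fragment  */
    size_t       count; /* number of elements in this fragment */
} frag_args_t;

/* Fill in the arguments for fragment frag_num out of n_frags.
 * The remainder elements are spread over the first fragments. */
static void frag_setup(frag_args_t *frag, const float *src0,
                       const float *src1, float *rbuf, size_t total,
                       size_t n_frags, size_t frag_num)
{
    size_t base   = total / n_frags;
    size_t rem    = total % n_frags;
    size_t offset = frag_num * base + (frag_num < rem ? frag_num : rem);

    frag->count = base + (frag_num < rem ? 1 : 0);
    frag->src0  = src0 + offset;
    frag->src1  = src1 + offset;
    /* Advance the destination too. If this line is left out, every
     * fragment writes to the start of rbuf and overwrites the previous one. */
    frag->rbuf  = rbuf + offset;
}

/* Process the buffers fragment by fragment (sequentially here; a real
 * pipeline would overlap fragments). */
static void pipelined_sum(const float *src0, const float *src1, float *rbuf,
                          size_t total, size_t n_frags)
{
    assert(n_frags > 0);
    for (size_t f = 0; f < n_frags; f++) {
        frag_args_t frag;
        frag_setup(&frag, src0, src1, rbuf, total, n_frags, f);
        for (size_t i = 0; i < frag.count; i++) {
            frag.rbuf[i] = frag.src0[i] + frag.src1[i];
        }
    }
}
```

With five elements split into two fragments, the first fragment covers elements 0 to 2 and the second covers elements 3 and 4, so the second fragment's result only lands in the right place if its `rbuf` was offset during setup.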

Sergei-Lebedev force-pushed the topic/fix_split_rail branch from 0f09708 to 61abc41 on May 8, 2023 07:07
Sergei-Lebedev merged commit 9a2f2c8 into openucx:master on May 8, 2023
Sergei-Lebedev deleted the topic/fix_split_rail branch on May 8, 2023 09:05
janjust pushed a commit to janjust/ucc that referenced this pull request Jan 31, 2024