Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TL/UCP: reorder ranks for SRA #834

Merged
merged 1 commit into from
Oct 2, 2023

Conversation

shimmybalsam
Copy link
Collaborator

@shimmybalsam shimmybalsam commented Aug 31, 2023

What

Add reorder ranks in for Allreduce SRA

Why ?

Performance benfit, especialy when n_extra ranks > 0, in which case we do a complete allreduce between "proxy" and "extra" ranks and if they are on the same node this significantly improves perf.

How ?

By reordring ranks in each step: reduce_scatter_knomial, allgather_knomial.

Note: allgather_knomial is used also for small message allgather and large message bcast (bcast SAG). In these cases reorder ranks will not be used, it is currently only implemented for the usage within allreduce SRA.

@shimmybalsam shimmybalsam force-pushed the reorder_ranks_SRA branch 2 times, most recently from aaea7fe to bd5f180 Compare August 31, 2023 12:37
@shimmybalsam
Copy link
Collaborator Author

shimmybalsam commented Aug 31, 2023

Without extra ranks:

image

With extra ranks:

image

@shimmybalsam
Copy link
Collaborator Author

bot:retest

Copy link
Contributor

@Sergei-Lebedev Sergei-Lebedev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we instead make map part of knomial pattern. That way we don't need to repeat ucc_ep_map_eval every time we get peer.
In future we can also add ring patter to do the same for ring algorithms

@shimmybalsam
Copy link
Collaborator Author

can we instead make map part of knomial pattern. That way we don't need to repeat ucc_ep_map_eval every time we get peer. In future we can also add ring patter to do the same for ring algorithms

As discussed on call, in my opinion we should merge this PR as is as first step, since it substantially improves performance of SRA. In future we can rewrite knomial pattern functions (which will effect all knomial algorithms in TL/UCP).

Meanwhile here is some more supportive data:

image

@shimmybalsam
Copy link
Collaborator Author

bot:retest

@shimmybalsam shimmybalsam force-pushed the reorder_ranks_SRA branch 2 times, most recently from d7857f5 to 137de91 Compare September 27, 2023 07:04
@Sergei-Lebedev Sergei-Lebedev merged commit fc60143 into openucx:master Oct 2, 2023
nsarka pushed a commit to nsarka/ucc that referenced this pull request Oct 24, 2023
nsarka pushed a commit to nsarka/ucc that referenced this pull request Oct 24, 2023
janjust pushed a commit to janjust/ucc that referenced this pull request Jan 31, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants