btl/sm: rewrite of fast box (per-peer receive buffers) #13037

@hjelmn hjelmn commented Jan 14, 2025

I was investigating possibly lost btl/sm messages and realized that the code is difficult to follow and it is not always clear what I was attempting to do. It is not clear if there is a problem but the rewrite is worth committing.

This change does the following:

  • Separate the fast box metadata out from the fast box receive data. These parts are logically separate so there was no need to keep adjusting the offset based on the metadata (start of buffer was offset 64, now 0).

  • Use modulo math instead of toggling an extra bit to determine full vs empty. To keep this fast the modulo is done as a bitwise AND with a mask, and the fast box size has been limited to a power of two. This change simplifies the math and leaves only one special case to cover (end overflow, i.e. end less than start).

  • General cleanup of the code overall to improve readability.
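
For readers unfamiliar with the masking trick, here is a minimal single-threaded sketch of a power-of-two ring buffer with metadata kept apart from the payload. It is not the btl/sm code: every name (`ring_t`, `ring_put`, `RING_SIZE`, ...) is hypothetical, and the choice to leave one byte unused to tell full from empty is an assumption made for this illustration, not something stated in this PR. A real shared-memory fast box would also need atomics and memory barriers, which are omitted here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical names; tiny size chosen only to make wrap-around easy to see. */
#define RING_SIZE 16u
#define RING_MASK (RING_SIZE - 1u)
static_assert((RING_SIZE & RING_MASK) == 0, "RING_SIZE must be a power of two");

/* Metadata is separate from the payload, so payload offsets start at 0. */
typedef struct {
    uint32_t start; /* offset of the next byte to read  */
    uint32_t end;   /* offset of the next byte to write */
} ring_meta_t;

typedef struct {
    ring_meta_t meta;
    unsigned char data[RING_SIZE];
} ring_t;

/* Bytes currently stored. Unsigned subtraction wraps, so the mask also
 * gives the right answer when end has wrapped and is less than start. */
static inline uint32_t ring_used(const ring_meta_t *m)
{
    return (m->end - m->start) & RING_MASK;
}

/* Assumption of this sketch: one byte is never filled, so used == 0
 * always means empty and used == RING_MASK means full. */
static inline uint32_t ring_free(const ring_meta_t *m)
{
    return RING_MASK - ring_used(m);
}

static bool ring_put(ring_t *r, const void *src, uint32_t len)
{
    if (len > ring_free(&r->meta)) {
        return false;
    }
    uint32_t off = r->meta.end;
    uint32_t first = RING_SIZE - off; /* contiguous bytes before the wrap */
    if (first > len) {
        first = len;
    }
    memcpy(r->data + off, src, first);
    memcpy(r->data, (const unsigned char *) src + first, len - first);
    r->meta.end = (off + len) & RING_MASK;
    return true;
}

static bool ring_get(ring_t *r, void *dst, uint32_t len)
{
    if (len > ring_used(&r->meta)) {
        return false;
    }
    uint32_t off = r->meta.start;
    uint32_t first = RING_SIZE - off;
    if (first > len) {
        first = len;
    }
    memcpy(dst, r->data + off, first);
    memcpy((unsigned char *) dst + first, r->data, len - first);
    r->meta.start = (off + len) & RING_MASK;
    return true;
}
```

With `start` at 12 and a 10-byte write, `end` becomes `(12 + 10) & 15 = 6`, which is less than `start`; `ring_used` still reports 10 because the masked unsigned difference absorbs the wrap.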

@hjelmn hjelmn requested a review from bosilca January 14, 2025 07:22

@bosilca bosilca left a comment

I don't see the need for moving code around like this, but I also don't see any issues with the reorder. I guess the only missing point is a performance comparison for a basic test (aka ping/pong) and then a fan-in (gather) and fan-out (bcast or scatter) in a heavily loaded setup (64 or 128 processes).

lrbison commented Jan 21, 2025

I ran this PR through some internal CI we have, and while it didn't collect performance data, it at least passed a smoke test for correctness issues on Graviton3 (64-core neoverse-v1).

@hjelmn hjelmn force-pushed the clean_up_btl_sm_fbox_code_and_fix_edge_condition_that_can_cause_lost_messages branch from 4b352f2 to 3c512e6 Compare January 31, 2025 19:57
opal/mca/btl/sm/btl_sm_fbox.h (review thread outdated and resolved)
I was investigating possibly lost btl/sm messages and realized that the code is
difficult to follow and it is not always clear what I was attempting to do. It
is not clear if there is a problem but the rewrite is worth committing.

This change does the following:

 - Separate the fast box metadata out from the fast box receive data. These
   parts are logically separate so there was no need to keep adjusting the
   offset based on the metadata (start of buffer was offset 64, now 0).

 - Use modulo math instead of toggling an extra bit to determine full vs
   empty. To keep this fast the modulo is done as a bitwise AND with a
   mask, and the fast box size has been limited to a power of two. This
   change simplifies the math and leaves only one special case to cover
   (end overflow, i.e. end less than start).

 - General cleanup of the code overall to improve readability.

Signed-off-by: Nathan Hjelm <hjelmn@google.com>
@hjelmn hjelmn force-pushed the clean_up_btl_sm_fbox_code_and_fix_edge_condition_that_can_cause_lost_messages branch from 3c512e6 to 95f7141 Compare February 4, 2025 00:11
@hjelmn hjelmn merged commit 2514b6e into open-mpi:main Feb 4, 2025
15 checks passed