RUN: performance hit from ucx 1.11 to 1.12 with OMPI & OSU benchmarks #7947
Comments
@Artemy-Mellanox can you pls look? seems related to RNDV_THRESH on Fujitsu ARM platform
@yosefe that worked. Setting THRESH=4m reproduces the original behavior
And RNDV_THRESH=16k with 1.12 was still bad perf?
Sorry, wasn't clear. The 16k setting was back where I expected it to be (same as 1.11)
Just as sanity check: with 1.12.x, on ThunderX2,
Issue still present in 1.12.1-rc3
@tonycurtis can you pls confirm if adding #7967 on top of v1.12.x fixes the issue?
Hunk #11 rejected, but getting correct default b/w now
@Artemy-Mellanox can you pls port the fix to v1.12.x? please refer to this issue in PR description.
Describe the bug
Running v5.8 of the OSU benchmarks: here's the MPI pt2pt bidirectional b/w (2 nodes, 1 rank per node)
With Open-MPI 4.1.2 + UCX 1.11.2
With Open-MPI 4.1.2 + UCX 1.12 (.0 and .1-rc2)
UCX_LOG_LEVEL=info shows rc_mlx5 is being used inter-node
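For anyone trying to reproduce this, here is a minimal sketch of how the runs compared in this thread could be assembled. It only builds and prints the mpirun command lines; it does not launch anything. Assumptions not stated in the report: the threshold discussed in the comments is taken to be the UCX_RNDV_THRESH environment variable, the benchmark is taken to be osu_bibw at a hypothetical path ./osu_bibw, and the host names node1 and node2 are placeholders.

```python
import shlex


def build_osu_command(rndv_thresh=None, log_level="info",
                      benchmark="./osu_bibw", hosts=("node1", "node2")):
    """Build an mpirun command for one rank on each listed host.

    benchmark and hosts are hypothetical placeholders; adjust to the cluster.
    """
    cmd = [
        "mpirun",
        "-np", str(len(hosts)),           # one rank per listed host
        "--host", ",".join(hosts),
        "--mca", "pml", "ucx",            # select the UCX PML in Open MPI
        "-x", f"UCX_LOG_LEVEL={log_level}",
    ]
    if rndv_thresh is not None:
        # Assumed to be the setting referred to as RNDV_THRESH in the comments.
        cmd += ["-x", f"UCX_RNDV_THRESH={rndv_thresh}"]
    cmd.append(benchmark)
    return cmd


# Default threshold, then the two values mentioned in the comments.
for thresh in (None, "16k", "4m"):
    print(shlex.join(build_osu_command(rndv_thresh=thresh)))
```

Running each printed command against both UCX builds should make it easier to tell whether the bandwidth difference follows the default threshold or something else.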
Setup and versions
cat /etc/issue or cat /etc/redhat-release + uname -a
(aarch64 == a64fx)
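The setup details asked for above can also be gathered in one go. The following is a small sketch, not part of the original report, that reads whichever of the two release files exists and adds the kernel and architecture fields that uname -a would show. The function name collect_setup_info is made up for illustration.

```python
import platform
from pathlib import Path


def collect_setup_info(release_files=("/etc/issue", "/etc/redhat-release")):
    """Return a dict with basic OS, kernel and architecture details."""
    info = {
        "system": platform.system(),
        "release": platform.release(),   # kernel release string
        "machine": platform.machine(),   # CPU architecture as reported by the OS
    }
    for name in release_files:
        path = Path(name)
        if path.is_file():
            # Only files that exist on this host are included.
            info[name] = path.read_text(errors="replace").strip()
    return info


for key, value in collect_setup_info().items():
    print(f"{key}: {value}")
```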