Fix performance regression caused by enabling opal thread support #1287
This commit adds opal_using_threads() protection around the atomic operation in OBJ_RETAIN/OBJ_RELEASE. This resolves the performance issues seen when running psm with MPI_THREAD_SINGLE.

To avoid issues with header dependencies opal_using_threads() has been moved to a new header (thread_usage.h). The OPAL_THREAD_ADD* and OPAL_THREAD_CMPSET* macros have also been relocated to this header.

This was cherry-picked off a fix applied to v1.8 and not master. See open-mpi/ompi#1902.

(cherry picked from commit ce91307)
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
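As a rough illustration of the idea in the commit message above, the sketch below guards a reference-count update with a "threads in use" check so that the atomic read-modify-write is only paid for when threads are actually active. It is not the Open MPI implementation: `sketch_using_threads`, `sketch_object_t`, `sketch_ref_add`, `sketch_retain` and `sketch_release` are hypothetical names, and C11 `<stdatomic.h>` stands in for the OPAL atomics.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for opal_using_threads(): set to true only when
 * the runtime was initialized with thread support. */
static bool sketch_using_threads = false;

/* Hypothetical object with a reference count (not opal_object_t). */
typedef struct {
    _Atomic int32_t ref_count;
} sketch_object_t;

/* Add delta to the reference count and return the new value.
 * The atomic read-modify-write is used only when threads are in use;
 * otherwise a relaxed load/store pair avoids the locked instruction. */
static inline int32_t sketch_ref_add(sketch_object_t *obj, int32_t delta)
{
    assert(obj != NULL);
    if (sketch_using_threads) {
        return atomic_fetch_add(&obj->ref_count, delta) + delta;
    }
    int32_t value =
        atomic_load_explicit(&obj->ref_count, memory_order_relaxed) + delta;
    atomic_store_explicit(&obj->ref_count, value, memory_order_relaxed);
    return value;
}

/* Take one more reference. */
static inline void sketch_retain(sketch_object_t *obj)
{
    (void) sketch_ref_add(obj, 1);
}

/* Drop one reference; returns true when the last reference is gone. */
static inline bool sketch_release(sketch_object_t *obj)
{
    return 0 == sketch_ref_add(obj, -1);
}
```

In a single-threaded run the flag stays false and every retain/release takes the non-atomic branch, which is where the savings described for MPI_THREAD_SINGLE would come from under these assumptions.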
Build Failed with XL compiler! Please review the log, and get in touch if you have questions. Gist: https://gist.github.com/ibm-ompi/7f110ab6c94da788f0fed01d83c12215
Build Failed with GNU compiler! Please review the log, and get in touch if you have questions. Gist: https://gist.github.com/ibm-ompi/514bb6fe123a40de30f56e64a600d85b
Test PASSed.
:bot:milestone:v2.0.1 This fix was lost last year before the 2.x branch.
Something has gone wrong (error 422). @jsquyres Please have a look at it!
@nysal. Please review.
:bot:milestone:v2.0.1
:bot:assign: @nysal
}
#else
static inline int32_t
OPAL_THREAD_ADD32(volatile int32_t *addr, int delta)
@hjelmn do we need two versions of OPAL_THREAD_ADD32, OPAL_THREAD_ADD64 and OPAL_THREAD_ADD_SIZE_T? Wouldn't the compiler remove the dead code if opal_using_threads() returns false?
I added the cleanup commit to this PR due to a bug in OPAL_THREAD_ADD64. The argument was of type int when it should have been int64_t.
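To make the argument-type bug concrete, here is a minimal sketch of what goes wrong when a 64-bit add takes its delta as `int`. The two functions are hypothetical stand-ins written with C11 atomics, not the actual OPAL_THREAD_ADD64 code; the only detail taken from the discussion is that the delta parameter was `int` where it should have been `int64_t`.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative only: delta is declared as int, as in the bug described
 * above. A 64-bit argument is converted to int at the call site, so any
 * bits that do not fit in an int are lost before the add happens. */
static inline int64_t sketch_add64_int_delta(_Atomic int64_t *addr, int delta)
{
    assert(addr != NULL);
    return atomic_fetch_add(addr, delta) + delta;
}

/* Same operation with delta typed to match the target: no narrowing. */
static inline int64_t sketch_add64(_Atomic int64_t *addr, int64_t delta)
{
    assert(addr != NULL);
    return atomic_fetch_add(addr, delta) + delta;
}
```

With a delta such as `((int64_t) 1 << 32) + 5`, the second function adds the full value, while the first adds whatever the implementation-defined conversion to `int` produces (on common platforms with 32-bit `int`, the low 32 bits). Callers that only ever pass small deltas would never notice, which may be why the mismatch went undetected for so long.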
Test FAILed.
Test FAILed.
Build Failed with GNU compiler! Please review the log, and get in touch if you have questions. Gist: https://gist.github.com/431e65fab860a6cb7d4a3438cc46aaeb
Test FAILed.
Test FAILed.
This commit expands the OPAL_THREAD macros to include 32- and 64-bit atomic swap. Additionally, macro declarations have been updated to include both OPAL_THREAD_* and OPAL_ATOMIC_*. Before this commit the former was used with add and the latter with cmpset.

This commit removes the OMPI_THREAD_MULTIPLE macros around the OPAL_THREAD_* functions. This should not make a difference in performance as opal_uses_threads is #defined to false when OMPI_THREAD_MULTIPLE is 0. The compiler should optimize out the check on opal_using_threads.

Signed-off-by: Nathan Hjelm <hjelmn@me.com>
(cherry picked from commit aac6112)
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
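The dead-code argument in this commit message can be sketched as follows, assuming a build-time switch that turns the thread check into a constant. `SKETCH_THREAD_MULTIPLE`, `sketch_using_threads()`, `sketch_uses_threads` and the two swap helpers are hypothetical names, and C11 `atomic_exchange` stands in for the OPAL atomic swap; the real macros may be structured differently.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical build-time switch standing in for OMPI_THREAD_MULTIPLE. */
#ifndef SKETCH_THREAD_MULTIPLE
#define SKETCH_THREAD_MULTIPLE 0
#endif

#if SKETCH_THREAD_MULTIPLE
/* Runtime flag consulted when thread support is compiled in. */
static bool sketch_uses_threads = true;
#define sketch_using_threads() sketch_uses_threads
#else
/* Constant false: the compiler can drop the atomic branch entirely. */
#define sketch_using_threads() false
#endif

/* Store newvalue and return the previous value (32-bit). */
static inline int32_t sketch_thread_swap_32(_Atomic int32_t *addr,
                                            int32_t newvalue)
{
    assert(addr != NULL);
    if (sketch_using_threads()) {
        return atomic_exchange(addr, newvalue);
    }
    int32_t old = atomic_load_explicit(addr, memory_order_relaxed);
    atomic_store_explicit(addr, newvalue, memory_order_relaxed);
    return old;
}

/* Store newvalue and return the previous value (64-bit). */
static inline int64_t sketch_thread_swap_64(_Atomic int64_t *addr,
                                            int64_t newvalue)
{
    assert(addr != NULL);
    if (sketch_using_threads()) {
        return atomic_exchange(addr, newvalue);
    }
    int64_t old = atomic_load_explicit(addr, memory_order_relaxed);
    atomic_store_explicit(addr, newvalue, memory_order_relaxed);
    return old;
}
```

Because the condition is a compile-time constant when the switch is 0, a single definition of each helper is enough in this sketch; separate threaded and non-threaded copies behind `#if` blocks would only be needed if the compiler could not be trusted to remove the unreachable branch.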
Build Failed with XL compiler! Please review the log, and get in touch if you have questions. Gist: https://gist.github.com/cbddb0dcc3fb86f444f37d60861b46c4
The XL compiler build failed with:
The opal/prefetch.h header is probably not included
@nysal Yeah. Fixed that. I brought over one more commit because OPAL_THREAD_ADD64 and OPAL_THREAD_ADD_SIZE_T have the wrong argument type for delta. Been wrong for years. Figured we should go ahead and fix it now.
Test PASSed.
#include "opal/sys/atomic.h"
#include "opal/prefetch.h"
#include "opal_config.h"
Should we move #include "opal_config.h" to the top, even though "opal/sys/atomic.h" already includes it?
opal_config.h should always be the first header file included in any file.
Figured it was pointless to include since it is included through atomic.h. Will update.
Should you add a #if !defined(...) kind of check at the top to guarantee that this file is only included via atomic.h?
With the previous comment about "opal_config.h" addressed, it looks good to me 👍
@hppritcha Good to go |