Skip to content
This repository was archived by the owner on Sep 30, 2022. It is now read-only.

Fix performance regression caused by enabling opal thread support #1287

Merged
merged 2 commits into from
Aug 3, 2016

Conversation

hjelmn
Copy link
Member

@hjelmn hjelmn commented Jul 28, 2016

This commit adds opal_using_threads() protection around the atomic
operation in OBJ_RETAIN/OBJ_RELEASE. This resolves the performance
issues seen when running psm with MPI_THREAD_SINGLE.

To avoid issues with header dependencies opal_using_threads() has been
moved to a new header (thread_usage.h). The OPAL_THREAD_ADD* and
OPAL_THREAD_CMPSET* macros have also been relocated to this header.

This was cherry-picked off a fix applied to v1.8 and not master. See
open-mpi/ompi#1902.

(cherry picked from commit ce91307)

Signed-off-by: Nathan Hjelm hjelmn@lanl.gov

This commit adds opal_using_threads() protection around the atomic
operation in OBJ_RETAIN/OBJ_RELEASE. This resolves the performance
issues seen when running psm with MPI_THREAD_SINGLE.

To avoid issues with header dependencies opal_using_threads() has been
moved to a new header (thread_usage.h). The OPAL_THREAD_ADD* and
OPAL_THREAD_CMPSET* macros have also been relocated to this header.

This was cherry-picked off a fix applied to v1.8 and not master. See
open-mpi/ompi#1902.

(cherry picked from commit ce91307)

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
@ibm-ompi
Copy link

Build Failed with XL compiler! Please review the log, and get in touch if you have questions.

Gist: https://gist.github.com/ibm-ompi/7f110ab6c94da788f0fed01d83c12215

@ibm-ompi
Copy link

Build Failed with GNU compiler! Please review the log, and get in touch if you have questions.

Gist: https://gist.github.com/ibm-ompi/514bb6fe123a40de30f56e64a600d85b

@mellanox-github
Copy link

Test PASSed.
See http://bgate.mellanox.com/jenkins/job/gh-ompi-release-pr/1989/ for details.

@hjelmn
Copy link
Member Author

hjelmn commented Jul 28, 2016

:bot:milestone:v2.0.1
:bot🏷️bug
:bot:assign: @nysal

This fix was lost last year before the 2.x branch.

@ompiteam-bot
Copy link

Something has gone wrong (error 422).

@jsquyres Please have a look at it!

@hjelmn
Copy link
Member Author

hjelmn commented Jul 28, 2016

@nysal. Please review.

@hjelmn
Copy link
Member Author

hjelmn commented Jul 28, 2016

:bot:milestone:v2.0.1
:bot🏷️bug

@ompiteam-bot ompiteam-bot added this to the v2.0.1 milestone Jul 28, 2016
@jsquyres jsquyres assigned jsquyres and unassigned jsquyres Jul 28, 2016
@hjelmn
Copy link
Member Author

hjelmn commented Jul 28, 2016

:bot:assign: @nysal

}
#else
static inline int32_t
OPAL_THREAD_ADD32(volatile int32_t *addr, int delta)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hjelmn do we need two versions of OPAL_THREAD_ADD32, OPAL_THREAD_ADD64 and OPAL_THREAD_ADD_SIZE_T. Wouldn't the compiler remove the dead code if opal_using_threads() returns false?

@hjelmn
Copy link
Member Author

hjelmn commented Jul 29, 2016

I added the cleanup commit to this PR due to a bug in OPAL_THREAD_ADD64. The argument was of type int when it should have been int64_t.

@lanl-ompi
Copy link
Contributor

Test FAILed.

1 similar comment
@lanl-ompi
Copy link
Contributor

Test FAILed.

@ibm-ompi
Copy link

Build Failed with GNU compiler! Please review the log, and get in touch if you have questions.

Gist: https://gist.github.com/431e65fab860a6cb7d4a3438cc46aaeb

@mellanox-github
Copy link

Test FAILed.
See http://bgate.mellanox.com/jenkins/job/gh-ompi-release-pr/1995/ for details.

@lanl-ompi
Copy link
Contributor

Test FAILed.

This commit expands the OPAL_THREAD macros to include 32- and 64-bit
atomic swap. Additionally, macro declararations have been updated to
include both OPAL_THREAD_* and OPAL_ATOMIC_*. Before this commit the
former was used with add and the later with cmpset.

This commit remove the OMPI_THREAD_MULTIPLE macros around the
OPAL_THREAD_* function. This should not make a difference in
performance as opal_uses_threads is #defined to false when
OMPI_THREAD_MULTIPLE is 0. The compiler should optimize out the check on
opal_using_threads.

Signed-off-by: Nathan Hjelm <hjelmn@me.com>

(cherry picked from commit aac6112)

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
@ibm-ompi
Copy link

Build Failed with XL compiler! Please review the log, and get in touch if you have questions.

Gist: https://gist.github.com/cbddb0dcc3fb86f444f37d60861b46c4

@nysal
Copy link
Member

nysal commented Jul 29, 2016

The XL compiler build failed with:

In file included from dss/dss_copy.c:24:
In file included from ../opal/util/output.h:70:
In file included from ../opal/class/opal_object.h:128:
../opal/threads/thread_usage.h:144:1: warning: implicit declaration of function 'OPAL_UNLIKELY' is invalid in C99 [-Wimplicit-function-declaration]
OPAL_THREAD_DEFINE_ATOMIC_ADD(int32_t, 32)
^
../opal/threads/thread_usage.h:107:9: note: expanded from macro 'OPAL_THREAD_DEFINE_ATOMIC_ADD'
    if (OPAL_UNLIKELY(opal_using_threads())) {                          \
        ^
1 warning generated.
1 warning generated.

The opal/prefetch.h header is probably not included

@hjelmn
Copy link
Member Author

hjelmn commented Jul 29, 2016

@nysal Yeah. Fixed that. I brought over one more commit because OPAL_THREAD_ADD64 and OPAL_THREAD_ADD_SIZE_T have the wrong argument type for delta. Been wrong for years. Figured we should go ahead and fix it now.

@mellanox-github
Copy link

Test PASSed.
See http://bgate.mellanox.com/jenkins/job/gh-ompi-release-pr/1996/ for details.


#include "opal/sys/atomic.h"
#include "opal/prefetch.h"
#include "opal_config.h"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we move "#include "opal_config.h"" to the top, although "opal/sys/atomic.h" does include it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

opal_config.h should always be the first header file included in any file.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Figured it was pointless to include since it is included through atomic.h. Will update.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should you add a #if !defined(...) kind of check at the top to guarantee that this file is only included via atomic.h?

@nysal
Copy link
Member

nysal commented Aug 3, 2016

With the previous comment about "opal_config.h" addressed, it looks good to me 👍

@jsquyres
Copy link
Member

jsquyres commented Aug 3, 2016

@hppritcha Good to go

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants