forked from open-mpi/ompi
Master tls refactor Part 2, rework address table as hash, and other minor changes #6
Open: janjust wants to merge 239 commits into master-tls-refactor_v4 from master-tls-refactor_v5
Signed-off-by: Christoph Niethammer <niethammer@hlrs.de>
Signed-off-by: Christoph Niethammer <niethammer@hlrs.de>
We completely disable C11 atomic op support for _Atomic for all Intel compilers prior to 20200310 (currently the latest release) by switching to our pre-C11 atomic operations. Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
`--enable-mem-debug` `#define`s `realloc`/`free` as macros, and these macros also match when the same names appear in references to struct members. Rename the members to avoid this matching. See open-mpi#6995 Signed-off-by: Bert Wesarg <bert.wesarg@tu-dresden.de>
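The macro clash described above can be reproduced in a few lines. This is a minimal sketch, assuming the debug build installs a function-like `free(p)` macro; `debug_free`, `struct allocator`, `release`, and `demo_member_call` are hypothetical names for illustration and are not taken from the Open MPI sources.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical debug wrapper, defined before the macro so that the
 * call inside it still reaches the real free(). */
int debug_free_calls = 0;

void debug_free(void *ptr, const char *file, int line)
{
    (void)file;
    (void)line;
    debug_free_calls++;
    free(ptr);
}

/* Assumed shape of the macro a memory-debug build might install. */
#define free(p) debug_free((p), __FILE__, __LINE__)

/* With a member named `free`, a call such as  alloc.free(buf)  would be
 * rewritten by the preprocessor into  alloc.debug_free((buf), ...),
 * which does not compile. Renaming the member sidesteps the macro. */
struct allocator {
    void (*release)(void *ptr);   /* renamed from `free` */
};

void release_buffer(void *ptr)
{
    free(ptr);                    /* expands to debug_free(...) */
}

/* Frees one buffer through the renamed member and returns the number
 * of frees recorded so far by the debug wrapper. */
int demo_member_call(void)
{
    struct allocator alloc = { .release = release_buffer };
    void *buf = malloc(16);
    assert(NULL != buf);
    alloc.release(buf);
    return debug_free_calls;
}

#undef free   /* keep the macro from leaking past this sketch */
```

Because a function-like macro only expands when its name is followed by `(`, the clash shows up at call sites through the member, not at the member declaration itself.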
Signed-off-by: Aurelien Bouteiller <bouteill@icl.utk.edu>
EFA incorrectly implements FI_DELIVERY_COMPLETE in earlier libfabric versions. While FI_DELIVERY_COMPLETE would be advertised by the provider, completions would return too early by not accounting for bounce buffers on the receive side. This would cause the BTL to receive early completions that lead to correctness issues. This is not an issue in the mtl/ofi as it does not require FI_DELIVERY_COMPLETE. Signed-off-by: William Zhang <wilzhang@amazon.com>
The btl/ofi does not currently utilize the common ofi include/exclude list. Added verification code similar to the mtl/ofi that will check if the info object is in the include or exclude list. If it isn't in the include list or is in the exclude list, validate_info will return OPAL_ERROR. The btl/ofi will no longer pass a provider name as a hint when calling getinfo, instead filtering the provider during validate_info. This patch also moves the is_in_list MTL function into common code and adds additional debugging output to the BTL to match the MTL standard. Signed-off-by: William Zhang <wilzhang@amazon.com>
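The include/exclude filtering described above might look roughly like the following. This is a sketch under stated assumptions: the lists are treated as plain comma-separated provider names, `name_in_list` and `validate_provider` are hypothetical stand-ins for `is_in_list` and `validate_info`, and `-1` stands in for `OPAL_ERROR`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Return true if `name` exactly matches one entry of the
 * comma-separated `list`. Partial matches do not count. */
bool name_in_list(const char *list, const char *name)
{
    if (NULL == list || NULL == name) {
        return false;
    }
    size_t name_len = strlen(name);
    const char *cur = list;
    while ('\0' != *cur) {
        const char *end = strchr(cur, ',');
        size_t len = (NULL == end) ? strlen(cur) : (size_t)(end - cur);
        if (len == name_len && 0 == strncmp(cur, name, len)) {
            return true;
        }
        if (NULL == end) {
            break;
        }
        cur = end + 1;
    }
    return false;
}

/* Return 0 if the provider passes the filters, -1 otherwise.
 * A NULL list means "no filter of that kind was given". */
int validate_provider(const char *prov_name,
                      const char *include, const char *exclude)
{
    assert(NULL != prov_name);
    if (NULL != include && !name_in_list(include, prov_name)) {
        return -1;   /* not in the include list */
    }
    if (NULL != exclude && name_in_list(exclude, prov_name)) {
        return -1;   /* explicitly excluded */
    }
    return 0;
}
```

Filtering after the query, instead of passing a provider name as a hint, lets one routine serve both lists and makes it easy to log why a provider was rejected.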
Minor fix in cmd line parser help
…take2 Second take on fixing the Intel _Atomic atomic operation warning
btl/ofi: Use common provider include/exclude list
(`prte_hwloc_base_get_locality_string` never returns locality string with L0). Signed-off-by: Mikhail Kurnosov <mkurnosov@gmail.com>
janjust changed the title from "Master tls refactor Part 2, address table is hash, and other minor changes" to "Master tls refactor Part 2, rework address table as hash, and other minor changes" on Aug 11, 2020
opal/hwloc: fix a typo in parsing locality string
bug fix: des->tag = hdr->frag, should be hdr->tag
The ofi_rxm provider is dependent upon the underlying hardware for its implementation of FI_DELIVERY_COMPLETE. Since this can lead to early completions, we disable the provider to avoid correctness issues. This is not an issue in the mtl/ofi as it does not require FI_DELIVERY_COMPLETE. Signed-off-by: William Zhang <wilzhang@amazon.com>
btl/ofi: Disable EFA provider in versions earlier than libfabric 1.12.0
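A version gate of the kind this commit title describes could be sketched as follows. The packing of major and minor numbers into one integer mirrors how libfabric encodes versions, but `SKETCH_VERSION` and `provider_allowed` are hypothetical names defined here so the example stays self-contained; the actual check in btl/ofi may be structured differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical local helper: pack major.minor into one comparable value. */
#define SKETCH_VERSION(major, minor) \
    (((uint32_t)(major) << 16) | (uint32_t)(minor))

/* Return false for the EFA provider when the library version is older
 * than 1.12, where FI_DELIVERY_COMPLETE completions may arrive early. */
bool provider_allowed(const char *prov_name, uint32_t lib_version)
{
    assert(NULL != prov_name);
    if (0 == strcmp(prov_name, "efa") &&
        lib_version < SKETCH_VERSION(1, 12)) {
        return false;
    }
    return true;
}
```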
The C++ bindings were removed a while ago; MPI::ERRORS_THROW_EXCEPTIONS and MPI_ERRORS_THROW_EXCEPTIONS no longer exist. Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
MPI-4 is finally cleaning up its language: an MPI "exception" does not actually exist. The only thing that exists is an MPI "error" (and associated handlers). This commit replaces all relevant uses of the word "exception" with "error". Note that this is still applicable in versions of the MPI standard less than MPI-4.0 (indeed, nearly all the cases fixed in this commit are just changes to comments, anyway). One exception to this is the Java bindings, where there's an MPIException class. In hindsight, it probably should have been named MPIError, but changing it now would break anyone who is using the Java bindings. Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
…rs-and-exceptions Cleanup of MPI errors and exceptions
The ofi mtl mrecv was not properly setting the message in/out argument of MPI_MRECV to MPI_MESSAGE_NULL. Signed-off-by: Howard Pritchard <hppritcha@gmail.com>
…_fix ofi mtl: fix problem with mrecv
Improve configury to check whether icc accepts `-Wno-long-double`. This prevents hundreds of messages like this: icc: command line warning #10148: option '-Wno-long-double' not supported. A similar patch will be needed for pmix. Signed-off-by: Howard Pritchard <hppritcha@gmail.com>
Add comments in the ADAPT module Signed-off-by: Xi Luo <xluo12@vols.utk.edu> Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
* piggybacking Bull functionalities
* coll/adapt: Fix naming conventions and C11 atomic use
  This commit fixes some naming convention issues, such as function names which should follow the naming ompi_coll_adapt instead of mca_coll_adapt, reserved for component and module naming (cf. tuned collective component). It also fixes the use of the _Atomic construct, which is only valid in C11. OPAL constructs have already been adapted to that use, so use opal_atomic_* types instead.
* coll/adapt: Remove unused component field in module
  This commit removes an unneeded field referencing the component in the module of adapt, as it is already available through the mca_coll_adapt_component global variable.
Signed-off-by: Marc Sergent <marc.sergent@atos.net>
Co-authored-by: Lemarinier, Pierre <pierre.lemarinier@atos.net>
Co-authored-by: pierrele <31764860+pierrele@users.noreply.github.com>
API consistent with other collective modules. Add comments. Other minor cleanups. Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
As it is possible to have multiple outstanding non-blocking collectives provided by different collective modules, we need a consistent mechanism to allow them to select unique tags for each instance of a collective. Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
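One way to hand out unique tags per collective instance is a shared per-communicator counter that every module draws from. The sketch below assumes a reserved negative tag range and uses C11 `<stdatomic.h>` for brevity; `struct comm_tags`, `reserve_tags`, and the range constants are hypothetical, and the mechanism actually added by this commit may differ.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define TAG_NBC_BASE (-1000)   /* hypothetical first tag of the range */
#define TAG_NBC_END  (-2000)   /* hypothetical lower bound (exclusive) */

/* Hypothetical per-communicator state shared by all collective modules. */
struct comm_tags {
    _Atomic int32_t next_tag;
};

void comm_tags_init(struct comm_tags *c)
{
    atomic_init(&c->next_tag, TAG_NBC_BASE);
}

/* Reserve `count` consecutive tags and return the first one. Tags run
 * downward: first, first - 1, ..., first - count + 1. When the range
 * would be exhausted, the counter wraps back to TAG_NBC_BASE. */
int32_t reserve_tags(struct comm_tags *c, int32_t count)
{
    assert(count > 0 && count <= TAG_NBC_BASE - TAG_NBC_END);
    int32_t cur = atomic_load(&c->next_tag);
    int32_t first, next;
    do {
        first = (cur - count < TAG_NBC_END) ? TAG_NBC_BASE : cur;
        next = first - count;
        /* On failure, `cur` is refreshed with the current value. */
    } while (!atomic_compare_exchange_weak(&c->next_tag, &cur, next));
    return first;
}
```

Since every module reserves from the same counter, two outstanding collectives on one communicator get disjoint tags. Wrapping assumes that collectives holding the oldest tags have completed by the time the range is reused.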
Set request in ibcast.c to empty when the count is 0. Signed-off-by: Xi Luo <xluo12@vols.utk.edu> Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
The reduce scatter block and reduce scatter algorithms were hitting correctness issues for non-commutative strided tests. We will revert to the original default algorithms for those two collectives (basic linear and non-overlapping, respectively) in the non-commutative op case. See open-mpi#8010 Signed-off-by: William Zhang <wilzhang@amazon.com>
…iling param.c Signed-off-by: Pak Lui <pak.lui@amd.com>
- Fixed a potential leak in error handling. Signed-off-by: Sergey Oblomov <sergeyo@nvidia.com>
COLL TUNED: Use per-rank data size instead of total size for decision in allgatherv
oshmem/tools/oshmem_info: fix fortran keyword issue when compiling param.c
Signed-off-by: Ralph Castain <rhc@pmix.org>
Do not pass --enable-debug to internal hwloc
Seems like a copy/pasted typo in ob1 comments Signed-off-by: Julien EMMANUEL <julien.emmanuel@inria.fr>
The selectable list is sorted from lowest to highest priority, so the user-defined preferences should be appended to the list. The preference treatment should also maintain the order provided by the user (first item has highest priority), so switch the loop order. Signed-off-by: Joseph Schuchart <schuchart@icl.utk.edu>
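The ordering argument above can be checked with a small array-based sketch. It assumes the selectable list is a plain array of component names sorted from lowest to highest priority; `move_to_end` and `apply_preferences` are hypothetical helpers, not the coll/base functions.

```c
#include <assert.h>
#include <string.h>

/* Move the entry at `idx` to the end of the list, keeping the relative
 * order of all other entries. */
static void move_to_end(const char **list, int n, int idx)
{
    const char *item = list[idx];
    for (int i = idx; i < n - 1; i++) {
        list[i] = list[i + 1];
    }
    list[n - 1] = item;
}

/* `selectable` is sorted lowest to highest priority. `prefs` is the
 * user's list with the most preferred name first. Walking `prefs`
 * backwards and appending each match leaves prefs[0] in the last
 * (highest priority) slot. */
void apply_preferences(const char **selectable, int n,
                       const char **prefs, int nprefs)
{
    assert(NULL != selectable && NULL != prefs);
    for (int p = nprefs - 1; p >= 0; p--) {
        for (int i = 0; i < n; i++) {
            if (0 == strcmp(selectable[i], prefs[p])) {
                move_to_end(selectable, n, i);
                break;
            }
        }
    }
}
```

With a selectable list of `basic, libnbc, tuned, han` and user preferences `tuned, basic`, the result is `libnbc, han, basic, tuned`: `tuned` ends up with the highest priority and `basic` right below it. Iterating the preferences front to back would have reversed that order.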
Signed-off-by: Joseph Schuchart <schuchart@icl.utk.edu>
…e components Also make coll/tuned the default for shared memory communication as coll/sm has shown performance issues that need investigation. Signed-off-by: Joseph Schuchart <schuchart@icl.utk.edu>
This has been shown to be more effective in achieving overlap of inter- and intra-node communication and reduces the initial delay before hitting the network. Signed-off-by: Joseph Schuchart <schuchart@icl.utk.edu>
In ob1 we have four similar conditions, but they are not written in a uniform way. Signed-off-by: Julien EMMANUEL <julien.emmanuel@inria.fr>
Typo in ob1 comments, and uniform conditions
Fix preference treatment in coll/base
…arning-wpool PML/UCX/WPOOL: fixed coverity issue
- fix path to getdate.sh
- do not prepend "date" to the revision
- support git worktree
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
Thanks FX Coudert for reporting this issue and pointing to a solution. Refs. open-mpi#8218 Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp> Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
autogen.pl: patch libtool.m4 for OSX Big Sur
Fix many compiler warnings
…RSION configury: fix OPAL_GET_VERSION
Resolves the PRRTE launch scale limitation Signed-off-by: Ralph Castain <rhc@pmix.org>
Update PMIx/PRRTE pointers
Exclude HAN, don't include it. Signed-off-by: Joseph Schuchart <schuchart@icl.utk.edu>
coll/han: fix coll preference selection in mca_coll_han_comm_create_new
Signed-off-by: Leonid Genkin <lgenkin@nvidia.com>
Replace usage of the deprecated NB API of UCX with NBX
…ry functionality to support world_comm rank translation Co-authored-by: Artem Polyakov <artpol84@gmail.com> Signed-off-by: Tomislav Janjusic <tomislavj@mellanox.com>
Update opal/mca/common/ucx/common_ucx_wpool.c Update opal/mca/common/ucx/common_ucx_wpool.h Update opal/mca/common/ucx/common_ucx_wpool_int.h Co-authored-by: Artem Polyakov <artpol84@gmail.com> Signed-off-by: Tomislav Janjusic <tomislavj@nvidia.com>
Co-authored-by: Artem Polyakov <artpol84@gmail.com> Signed-off-by: Tomislav Janjusic <tomislavj@mellanox.com>
janjust force-pushed the master-tls-refactor_v5 branch from a24893e to b7683a4 on November 30, 2020 16:44
No description provided.