-
Notifications
You must be signed in to change notification settings - Fork 868
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
btl/base: add subsystem to support for active-message based RDMA/atomic emulation #8493
Conversation
Needed to provide working RMA when no RDMA-capable device is present. |
bot:aws:retest |
Is this series of commits intended to fix #7830 ? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me, except for a few nit-picks.
opal/mca/btl/base/btl_base_am_rdma.c
Outdated
if (NULL == operation) { | ||
/* couldn't even allocate a small amount of memory. not much else can be done. */ | ||
BTL_ERROR(("could not allocate memory to queue active-message RDMA operation")); | ||
abort(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a more graceful way then to call abort here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, please do not call abort()
-- that can lead to terrabytes of corefiles.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is either abort or SEGV. IMHO abort is much more graceful. If we get here things are bad and we will abort or SEGV somewhere if we can't even allocate 64 bytes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
abort()
drops a corefile -- that's kinda anti-social.
If you need to exit, please call the btl error function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Already called. In that case abort is not even reached.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wait. BTL_ERROR doesn't exit so we still want it to exit here. It will SEGV which will dump core files.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, I can remove the aborts() but the semantics of this error will not change. It will print the error message then crash when operation is dereferenced.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I agree with @jsquyres on the abort thing; I think this is precisely when we want a core file (and, I believe, we call abort elsewhere). This is a should "never, ever happen" type thing, not a "we got our configuration wrong" type thing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we should have an option for dropping corefiles. It is very anti-social to have an MTT run fill up filesystems with terrabytes of corefiles.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't happen ever in MTT until you are expecting to completely run out of memory. But, again, in that case we really can't control whether you core or not. Up to your shell to set ulimit -c 0
.
1dbe39d
to
1fa152d
Compare
I resolved the conflict in the copyright. Let's move to get this in/verified. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the fixes, LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we should rebase the branch and fix up the conflicts, not create another merge commit. @awlauria please undo this.
@bwbarrett I agree. However this pr has been open for weeks, I did it for expediency. If we want this rebased, please ping @hjelmn to do so in a timely manner. |
6010c00
to
59de860
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
approving, although the header issue makes me really nervous.
f0100b1
to
d318916
Compare
This commit adds a new subsystem to the btl base: active-message RDMA. The subsystem can be activated on any non-RDMA or partial- RDMA btl (for example tcp) by calling: ```c mca_btl_base_am_rdma_init(mca_btl_base_module_t *btl). ``` Once called the btl module supplied with be augmented to have btl_get, btl_put, btl_fop, and btl_cswap functions. Flags will be set to indicate that the BTL is using active-message versions of these functions. Calling this function on a btl will register a progress function for handling retries of intiator or target-side operations. There may be some overhead once the progress function is registered. Active-message "RDMA" support is not activated on any BTL module by default. Signed-off-by: Nathan Hjelm <hjelmn@google.com>
Signed-off-by: Nathan Hjelm <hjelmn@google.com>
This commit updates osc/rdma to support using alternate BTLs when a primary BTL is not available. There may be at most two alternate BTLs in use at any time. The default is selected to cover shared memory (sm) and off-node (tcp). The priority of osc/rdma is a bit lower when using a set of alternate btls. This will allow another osc component to win if there is an alternative. Signed-off-by: Nathan Hjelm <hjelmn@google.com>
This removes the emulated RDMA code provided as a fallback so btl/sm could be used for osc/rdma. This functionality is now provided by ```btl_base_am_rdma_init ()``` so the dedicated version in btl/sm is no longer needed. Signed-off-by: Nathan Hjelm <hjelmn@google.com>
d318916
to
214f6a6
Compare
bot:aws:retest |
ubuntu executor died with a java exception. |
bot:aws:retest |
I still see 6 uses of Outside the scope of this PR, I wonder if we should make I'm really sensitive to this whole corefile issue because there have been many times over the years where MTT caused the shared filesystem I use for Cisco MTT to fill up with terrabytes/petabytes/whateverbytes of corefiles. My colleagues get (rightfully) pissed at me when this happens. It could be helpful if one could optionally turn on corefiles from |
Nathan's right; there's not a great way for OMPI to decide whether to drop a core file or not. That's really a system setting (via ulimit). We use abort() or assert() or similar so liberally in the OMPI code base, that this seems like a really weird place to hold the line. |
Given that Cisco isn't running MTT and the folks who are running MTT approve this PR, let's separate the question of what to do with abort (dealing with it in #8557) and get this in so MTT quits failing for those who are running it. |
Point of fact: Cisco is running MTT, but it's currently in "trial" mode (i.e., results are not visible by default) because I had some issues with networking and equipment being powered off. Also, I objected to the use of |
No description provided.