v3.1.x: Do not use CMA in user namespaces #6999
FYI @adrianreber @bwbarrett Thoughts on bringing this back to v3.1.x?
The IBM CI (GNU Compiler) build failed! Please review the log, linked below. Gist: https://gist.github.com/a4e9cc27cd5fc186815ed2c64bf90977
The IBM CI (XL Compiler) build failed! Please review the log, linked below. Gist: https://gist.github.com/4d887b7ab041b69e82db35ee4d05a924
Force-pushed: 623d761 to c4fe1d6
@adrianreber It turns out that Can you please test that the functionality still works correctly for you?
On top of this PR, the following one-line change is needed:
diff --git a/opal/mca/btl/vader/btl_vader_module.c b/opal/mca/btl/vader/btl_vader_module.c
index 8f704c8fca..15071f968e 100644
--- a/opal/mca/btl/vader/btl_vader_module.c
+++ b/opal/mca/btl/vader/btl_vader_module.c
@@ -252,6 +252,7 @@ static int init_vader_endpoint (struct mca_btl_base_endpoint_t *ep, struct opal_
 opal_show_help("help-btl-vader.txt", "cma-different-user-namespace-warning",
 true, opal_process_info.nodename);
 mca_btl_vader_component.single_copy_mechanism = MCA_BTL_VADER_NONE;
+mca_btl_vader.super.btl_flags &= ~MCA_BTL_FLAGS_RDMA;
 mca_btl_vader.super.btl_get = NULL;
 mca_btl_vader.super.btl_put = NULL;
 mca_btl_vader.super.btl_put_limit = 0;
Can you amend the existing commit?
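For readers following along, here is a minimal sketch of why the flag needs to be cleared together with the function pointers. It assumes that upper layers decide whether to issue put/get operations by testing the RDMA bits in the module flags. Every `demo_`/`DEMO_` name and every flag value below is hypothetical and only stands in for the real definitions in Open MPI's BTL headers.

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative flag values only; the real MCA_BTL_FLAGS_* constants are
 * defined by Open MPI and are not reproduced here. */
#define DEMO_FLAG_SEND  0x1u
#define DEMO_FLAG_PUT   0x2u
#define DEMO_FLAG_GET   0x4u
#define DEMO_FLAGS_RDMA (DEMO_FLAG_PUT | DEMO_FLAG_GET)

typedef int (*demo_rdma_fn)(void *local, const void *remote, size_t len);

/* Hypothetical, heavily reduced stand-in for a BTL module. */
struct demo_btl_module {
    uint32_t     flags;
    demo_rdma_fn get;
    demo_rdma_fn put;
    size_t       put_limit;
};

/* Placeholder RDMA operation so the pointers have something to refer to. */
static int demo_rdma_stub(void *local, const void *remote, size_t len)
{
    (void) local; (void) remote; (void) len;
    return 0;
}

/* What a caller might check before choosing an RDMA protocol. */
static int demo_can_rdma(const struct demo_btl_module *m)
{
    return (m->flags & DEMO_FLAGS_RDMA) != 0;
}

/* Mirrors the shape of the change in the diff: clear the RDMA bits and
 * the function pointers together, so a caller that only checks the flags
 * never ends up dereferencing a NULL get/put pointer. */
static void demo_disable_single_copy(struct demo_btl_module *m)
{
    m->flags &= ~DEMO_FLAGS_RDMA;
    m->get = NULL;
    m->put = NULL;
    m->put_limit = 0;
}
```

If only the pointers were set to NULL while the flag stayed set, a caller relying on `demo_can_rdma()` alone would still try the RDMA path.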
Trying to run processes via mpirun in Podman containers has shown
that the CMA btl_vader_single_copy_mechanism does not work when user
namespaces are involved.
Creating containers with Podman requires at least user namespaces to be
able to do unprivileged mounts in a container.
Even if running the container with user namespace user ID mappings which
result in the same user ID on the inside and outside of all involved
containers, the check in the kernel to allow ptrace (and thus
process_vm_{read,write}v()) fails if the same IDs are not in the same
user namespace.
One workaround is to specify '--mca btl_vader_single_copy_mechanism none'
and this commit adds code to automatically skip CMA if user namespaces
are detected and fall back to MCA_BTL_VADER_NONE (as opposed to
MCA_BTL_VADER_EMUL on master as of 2019-09-21 and the v4.0.x branch).
Signed-off-by: Adrian Reber <areber@redhat.com>
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
(cherry picked from commit fc68d8a)
Force-pushed: c4fe1d6 to e19e210
@adrianreber Done!
@jsquyres I tested the PR once more with the latest updates and it works as it should. 👍 from my side
Thanks @adrianreber. I just updated #6998 (the v3.0.x version of this PR) with the same feedback from this PR.
@hjelmn @bwbarrett This PR is now ready for review.
Trying to run processes via mpirun in Podman containers has shown
that the CMA btl_vader_single_copy_mechanism does not work when user
namespaces are involved.
Creating containers with Podman requires at least user namespaces to be
able to do unprivileged mounts in a container.
Even if running the container with user namespace user ID mappings which
result in the same user ID on the inside and outside of all involved
containers, the check in the kernel to allow ptrace (and thus
process_vm_{read,write}v()) fails if the same IDs are not in the same
user namespace.
One workaround is to specify '--mca btl_vader_single_copy_mechanism none'
and this commit adds code to automatically skip CMA if user namespaces
are detected and fall back to MCA_BTL_VADER_EMUL.
Signed-off-by: Adrian Reber <areber@redhat.com>
(cherry picked from commit fc68d8a)
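As an illustration of the detection described above, here is a minimal sketch. It assumes a process's user namespace can be identified by the inode of its `/proc/self/ns/user` link, and that each peer publishes this value so it can be compared when an endpoint is set up. The helper names `demo_user_ns_id` and `demo_cma_allowed` are hypothetical and are not taken from the Open MPI source.

```c
#include <stdbool.h>
#include <stdint.h>
#include <sys/stat.h>

/* Hypothetical helper: returns the inode number behind a namespace link
 * such as /proc/self/ns/user, or 0 if the path cannot be stat()ed
 * (for example on a kernel without user namespace support). */
static uint64_t demo_user_ns_id(const char *ns_path)
{
    struct stat st;

    if (stat(ns_path, &st) != 0) {
        return 0;
    }
    return (uint64_t) st.st_ino;
}

/* Hypothetical check: CMA (process_vm_readv/process_vm_writev) is only
 * attempted when both sides report the same user namespace ID. Identical
 * UID mappings in two different namespaces are not enough, because the
 * kernel's ptrace access check compares the namespaces themselves. */
static bool demo_cma_allowed(uint64_t local_ns_id, uint64_t peer_ns_id)
{
    return local_ns_id == peer_ns_id;
}
```

In this sketch the local side would call `demo_user_ns_id("/proc/self/ns/user")`, receive the peer's value through whatever exchange mechanism is available, and disable the single-copy mechanism when `demo_cma_allowed()` returns false.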