-
Notifications
You must be signed in to change notification settings - Fork 868
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add MCA param for multithread opal_progress(). #5241
Conversation
The oversubscription case will not necessarily be a problem if we handle the progress according to the core where it happen and prevent threads on the same core for going into progress. This might add one additional atomic operation on the critical path, but it will give us the opportunity to keep things clean in the BTL progress. |
So we all agreed this should happen right? I would love some input from other people. |
The basic idea of the PR to allow multi-threaded progress would be beneficial. Playing with Scalable Endpoints in OFI MTL, I see significant performance improvement with this approach. |
Should help with btl/uct as well and will eventually help with btl/ugni (once I enable multi-device support for two-sided). |
I see performance bump on btl/ofi as well. I can start working on this as soon as I get off my internship. Assuming there is no one opposing this. I also want to know if it's possible to make multithreaded progress a default. Should I send out an email on devel list to ask if anyone's component is still not thread safe? |
I discussed with @bosilca on this.
We might want to make the multithreaded opal_progress() a default. At least we should bring this up in our weekly meeting or maybe a discussion/decision on the upcoming face to face meeting? |
@thananon - Were you planning on merging this patch for now and leave the discussion to make multi-threaded opal_progress a default for later? (Not sure if this was already discussed in the weekly meeting?) |
@aravindksg There is no plan to merge this now. We have plan to bring this up in the next face to face meeting in October. Right now I'm running test on applications and the effect that of this change to different btls. I know that this will be off by default but we would like to have community member weight in on this. |
Per face to face discussion, we agreed on this...
|
1002fc5
to
3d8cec3
Compare
I fixed the patch. This one creates an MCA I will send out the component guidelines out soon. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please make num_thread_in_progress static.
opal/runtime/opal_params.c
Outdated
@@ -121,7 +122,7 @@ int opal_register_params(void) | |||
opal_signal_string = string; | |||
ret = mca_base_var_register ("opal", "opal", NULL, "signal", | |||
"Comma-delimited list of integer signal numbers to Open MPI to attempt to intercept. Upon receipt of the intercepted signal, Open MPI will display a stack trace and abort. Open MPI will *not* replace signals if handlers are already installed by the time MPI_INIT is invoked. Optionally append \":complain\" to any signal number in the comma-delimited list to make Open MPI complain if it detects another signal handler (and therefore does not insert its own).", | |||
MCA_BASE_VAR_TYPE_STRING, NULL, 0, MCA_BASE_VAR_FLAG_SETTABLE, | |||
MCA_BASE_VAR_TYPE_STRING, NULL, 0, MCA_BASE_VAR_FLAG_SETTABLE, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm pretty sure I fixed this.
This commit added MCA param `opal_max_thread_in_progress` to set the number of threads allowed to do opal_progress concurrently. The default value is 1. Component with multithreaded design can benefit from this change to parallelize their component progress function. Signed-off-by: Thananon Patinyasakdikul <tpatinya@utk.edu>
3d8cec3
to
4e23fed
Compare
PR open-mpi#5241 provided an MCA variable to allow multi-threaded opal_progress. However, it allowed to update the linked list even when multiple threads was allowed to call opal_progress. This caused a scenario when a more recent thread could complete it's progress and fail the assert(sync == wait_sync_list). Allowing to update the linked list only for the case when the number of threads exceeds the threshold fixes the problem. It also seems to improve performance quite a bit. Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@intel.com>
PR open-mpi#5241 provided an MCA variable to allow multi-threaded opal_progress. However, it allowed to update the linked list even when multiple threads was allowed to call opal_progress. This caused a scenario when a more recent thread could complete it's progress and fail the assert(sync == wait_sync_list). Allowing to update the linked list only for the case when the number of threads exceeds the threshold fixes the problem. Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@intel.com>
This is more of a discussion than a PR. After some work into multi-thread support, all of us found the bottleneck at
opal_progress()
since we serialize the threads here. (The request refactoring a few years back was to handle the serialization better for oversubscribed case.)Now let's discuss here. @hjelmn , @bosilca and I agreed that we should allow threads in opal_progress to maximize the use of multiple resources and most of the components are now thread-safe. We should push this forward. However, this will be a problem with oversubscribed case. So I think the first good step should be an MCA param.
So I drafted a PR. This is just a simple way to do it and I'm open to discussion. This does not have to be in 4.0. release.