-
Notifications
You must be signed in to change notification settings - Fork 901
osc/ucx: Make wpctx global #8212
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good overall, some minor comments/questions. Most importantly, please make sure to use correct types for proc_vpid
(from what I see consistently using unsigned int
should be fine, but don't mix uint64_t
and size_t
). Also there may be a conflict with what was merged in #7632.
@@ -1096,6 +1096,7 @@ int ompi_osc_ucx_rput(const void *origin_addr, int origin_count, | |||
return ret; | |||
} | |||
|
|||
opal_atomic_wmb(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do you need a write memory barrier at the end of ompi_osc_ucx_rput
?
@@ -1149,6 +1150,7 @@ int ompi_osc_ucx_rget(void *origin_addr, int origin_count, | |||
return ret; | |||
} | |||
|
|||
opal_atomic_wmb(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here: why a memory barrier at the end of rget
? (the operation is still pending after all)
ompi_proc_t *proc = ompi_group_get_proc_ptr(group, rank, true); | ||
|
||
if( ompi_proc_is_sentinel(proc) ) { | ||
tmp = ompi_proc_sentinel_to_name((uintptr_t)proc); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indentation (function and this block)
char *worker_addrs; | ||
int worker_displs; | ||
int worker_lens; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
char *worker_addrs; | |
int worker_displs; | |
int worker_lens; | |
char *worker_addr; | |
int worker_displ; | |
int worker_len; |
char *worker_addrs; | ||
int worker_displs; | ||
int worker_lens; | ||
} wpool_worker_addr_tbl_t; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this will be included in other code, I suggest renaming it to include the opal_common_ucx
namespace
size_t proc_vpid; | ||
int ret = OPAL_ERROR; | ||
|
||
ret = opal_hash_table_get_first_key_uint64(&winfo->ep_iops, &proc_vpid, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Having proc_vpid
as size_t
is unsafe on 32bit systems, please use uint64_t
explicitly (here and in other places)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is it even uint64_t
in the first place? get_proc_vpid
returns unsinged int
as far as I can see
&module->ctx); | ||
if (OMPI_SUCCESS != ret) { | ||
/* Populate addr table */ | ||
ret = opal_common_ucx_wpool_update_addr(mca_osc_ucx_component.wpool, comm_size, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a way to check whether all addresses have been exchanged on that particular communicator and avoid the exchange after the first window has been allocated?
opal_mutex_unlock(&mem->mutex); | ||
mem_rec->wpmem = mem; | ||
mem_rec->ctx_rec = ctx_rec; | ||
mem_rec->winfo = ctx_rec->winfo; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we OBJ_RETAIN
the winfo here?
@@ -197,23 +250,24 @@ opal_common_ucx_wpool_init(opal_common_ucx_wpool_t *wpool, | |||
wpool->dflt_winfo = winfo; | |||
OBJ_RETAIN(wpool->dflt_winfo); | |||
|
|||
status = ucp_worker_get_address(wpool->dflt_winfo->worker, | |||
&wpool->recv_waddr, &wpool->recv_waddr_len); | |||
status = ucp_worker_get_address(wpool->dflt_worker, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this a merge conflict with #7632? (I had changed dflt_worker
to dflt_winfo
so that the winfo
can be locked in opal_common_ucx_wpool_progress
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right - I have this fixed in the upcoming push, but we're working on a couple other issues before hand. Thanks Joseph
winfo->comm_size = 0; | ||
winfo->inflight_ops = NULL; | ||
OBJ_CONSTRUCT(&winfo->ep_iops, opal_hash_table_t); | ||
opal_hash_table_init(&winfo->ep_iops, 256); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we limit the size of the hash table for communicators smaller than 256 ranks?
@devreal, thank you for the comments |
…ry functionality to support world_comm rank translation Co-authored-by: Artem Polyakov <artpol84@gmail.com> Signed-off-by: Tomislav Janjusic <tomislavj@mellanox.com>
Update opal/mca/common/ucx/common_ucx_wpool.c Update opal/mca/common/ucx/common_ucx_wpool.h Update opal/mca/common/ucx/common_ucx_wpool_int.h Co-authored-by: Artem Polyakov <artpol84@gmail.com> Signed-off-by: Tomislav Janjusic <tomislavj@nvidia.com>
Co-authored-by: Artem Polyakov <artpol84@gmail.com> Signed-off-by: Tomislav Janjusic <tomislavj@mellanox.com>
a24893e
to
b7683a4
Compare
fixes #6987 |
Fixes #6987 |
Further resource optimizations:
wpctx struct and ucx worker address table is global.
Add necessary functionality to support world_comm rank translation:
Fixes #6987
Co-authored-by: Artem Y. Polyakov artemp@nvidia.com
Signed-off-by: Tomislav Janjusic tomislavj@nvidia.com