Skip to content

osc/ucx: Make wpctx global #8212

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 3 commits into from

Conversation

janjust
Copy link
Contributor

@janjust janjust commented Nov 13, 2020

Further resource optimizations:
wpctx struct and ucx worker address table is global.
Add necessary functionality to support world_comm rank translation:

Fixes #6987

Co-authored-by: Artem Y. Polyakov artemp@nvidia.com

Signed-off-by: Tomislav Janjusic tomislavj@nvidia.com

Copy link
Contributor

@devreal devreal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good overall, some minor comments/questions. Most importantly, please make sure to use correct types for proc_vpid (from what I see consistently using unsigned int should be fine, but don't mix uint64_t and size_t). Also there may be a conflict with what was merged in #7632.

@@ -1096,6 +1096,7 @@ int ompi_osc_ucx_rput(const void *origin_addr, int origin_count,
return ret;
}

opal_atomic_wmb();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you need a write memory barrier at the end of ompi_osc_ucx_rput?

@@ -1149,6 +1150,7 @@ int ompi_osc_ucx_rget(void *origin_addr, int origin_count,
return ret;
}

opal_atomic_wmb();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here: why a memory barrier at the end of rget? (the operation is still pending after all)

ompi_proc_t *proc = ompi_group_get_proc_ptr(group, rank, true);

if( ompi_proc_is_sentinel(proc) ) {
tmp = ompi_proc_sentinel_to_name((uintptr_t)proc);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indentation (function and this block)

Comment on lines +51 to +53
char *worker_addrs;
int worker_displs;
int worker_lens;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
char *worker_addrs;
int worker_displs;
int worker_lens;
char *worker_addr;
int worker_displ;
int worker_len;

char *worker_addrs;
int worker_displs;
int worker_lens;
} wpool_worker_addr_tbl_t;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this will be included in other code, I suggest renaming it to include the opal_common_ucx namespace

size_t proc_vpid;
int ret = OPAL_ERROR;

ret = opal_hash_table_get_first_key_uint64(&winfo->ep_iops, &proc_vpid,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having proc_vpid as size_t is unsafe on 32bit systems, please use uint64_t explicitly (here and in other places)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is it even uint64_t in the first place? get_proc_vpid returns unsinged int as far as I can see

&module->ctx);
if (OMPI_SUCCESS != ret) {
/* Populate addr table */
ret = opal_common_ucx_wpool_update_addr(mca_osc_ucx_component.wpool, comm_size,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a way to check whether all addresses have been exchanged on that particular communicator and avoid the exchange after the first window has been allocated?

opal_mutex_unlock(&mem->mutex);
mem_rec->wpmem = mem;
mem_rec->ctx_rec = ctx_rec;
mem_rec->winfo = ctx_rec->winfo;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we OBJ_RETAIN the winfo here?

@@ -197,23 +250,24 @@ opal_common_ucx_wpool_init(opal_common_ucx_wpool_t *wpool,
wpool->dflt_winfo = winfo;
OBJ_RETAIN(wpool->dflt_winfo);

status = ucp_worker_get_address(wpool->dflt_winfo->worker,
&wpool->recv_waddr, &wpool->recv_waddr_len);
status = ucp_worker_get_address(wpool->dflt_worker,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this a merge conflict with #7632? (I had changed dflt_worker to dflt_winfo so that the winfo can be locked in opal_common_ucx_wpool_progress)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right - I have this fixed in the upcoming push, but we're working on a couple other issues before hand. Thanks Joseph

winfo->comm_size = 0;
winfo->inflight_ops = NULL;
OBJ_CONSTRUCT(&winfo->ep_iops, opal_hash_table_t);
opal_hash_table_init(&winfo->ep_iops, 256);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we limit the size of the hash table for communicators smaller than 256 ranks?

@artpol84
Copy link
Contributor

@devreal, thank you for the comments

Tomislav Janjusic and others added 3 commits November 30, 2020 18:09
…ry functionality to

support world_comm rank translation

Co-authored-by: Artem Polyakov <artpol84@gmail.com>

Signed-off-by: Tomislav Janjusic <tomislavj@mellanox.com>
Update opal/mca/common/ucx/common_ucx_wpool.c
Update opal/mca/common/ucx/common_ucx_wpool.h
Update opal/mca/common/ucx/common_ucx_wpool_int.h

Co-authored-by: Artem Polyakov <artpol84@gmail.com>

Signed-off-by: Tomislav Janjusic <tomislavj@nvidia.com>
Co-authored-by: Artem Polyakov <artpol84@gmail.com>

Signed-off-by: Tomislav Janjusic <tomislavj@mellanox.com>
@gpaulsen
Copy link
Member

gpaulsen commented Aug 5, 2021

fixes #6987

@awlauria
Copy link
Contributor

awlauria commented Aug 5, 2021

Fixes #6987

@janjust janjust closed this Sep 17, 2024
@janjust janjust deleted the master-tls-refactor_v5 branch March 27, 2025 02:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Regression: UCX osc: dynamic windows broken on master
5 participants