osc/rdma: Fix some bugs running with btl/tcp. #8719

Closed
wants to merge 1 commit into from

awlauria
Contributor

  • Make sure peer->state_endpoint is set correctly.
  • Fix double free of pending_op in ompi_osc_rdma_btl_fop() and ompi_osc_rdma_btl_op().

Cleanup/leaks:

  • Don't parse ompi_osc_rdma_btl_alternate_names twice.
  • Free temp in allocate_state_shared().
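
For readers unfamiliar with the double-free pattern named above, here is a minimal sketch of the ownership rule involved. It is not the Open MPI code: `pending_op_t`, `op_new()`, `op_release()`, `op_complete()`, `fake_atomic()` and `do_fetch_op()` are hypothetical stand-ins, and the asynchronous case (where a transport invokes the callback later) is left out. The assumption is that the completion callback performs the final release, so the caller releases the object itself only on the path where the callback never runs.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a reference-counted pending operation. */
typedef struct pending_op {
    int  refcount;
    long result;
} pending_op_t;

static int live_objects = 0;  /* allocated and not yet freed */
static int free_calls   = 0;  /* number of times an object was actually freed */

static pending_op_t *op_new(void)
{
    pending_op_t *op = calloc(1, sizeof(*op));
    if (NULL == op) {
        abort();
    }
    op->refcount = 1;
    ++live_objects;
    return op;
}

static void op_release(pending_op_t *op)
{
    assert(op->refcount > 0);  /* a second release of a freed object is the bug */
    if (0 == --op->refcount) {
        free(op);
        --live_objects;
        ++free_calls;
    }
}

/* Completion callback (assumption): it owns the final release. */
static void op_complete(pending_op_t *op, long value)
{
    op->result = value;
    op_release(op);
}

/* Hypothetical transport call: 1 = completed inline, negative = error. */
static int fake_atomic(int mode)
{
    return mode;
}

static int do_fetch_op(int mode, long *result)
{
    pending_op_t *op = op_new();
    int ret = fake_atomic(mode);

    if (1 == ret) {
        /* Completed inline: run the callback by hand. It releases op,
         * so there must be no further release on this path. */
        *result = 42;
        op_complete(op, 42);
        return 0;
    }

    /* Error path: the callback was never called, so release here, once. */
    op_release(op);
    return ret;
}
```

Calling `do_fetch_op(1, &r)` and then `do_fetch_op(-1, &r)` should leave `live_objects` at 0 and `free_calls` at 2, that is, exactly one free per operation on either path.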

@awlauria awlauria requested review from bosilca, hjelmn and devreal March 26, 2021 17:02
@awlauria awlauria force-pushed the fix_osc_rdma_tcp_bugs branch 2 times, most recently from 6443523 to dad72d3 Compare March 26, 2021 17:04
@@ -529,6 +529,7 @@ static int allocate_state_single (ompi_osc_rdma_module_t *module, void **base, s
my_peer->state_handle = module->state_handle;
my_peer->state_btl_index = my_peer->data_btl_index;
my_peer->state_endpoint = my_peer->data_endpoint;
assert(my_peer -> state_endpoint != NULL);
Member

What's with all these spaces?

Contributor Author

Right - removed these asserts(). Thanks.

ompi/mca/osc/rdma/osc_rdma_component.c (outdated)
pending_op->op_frag->handle, (void *) pending_op, NULL, OPAL_SUCCESS);
} else {
/* need to release here because ompi_osc_rdma_atomic_complete was not called */
OBJ_RELEASE(pending_op);
Member

good catch.

@awlauria awlauria force-pushed the fix_osc_rdma_tcp_bugs branch 2 times, most recently from 23fd874 to a0d4106 Compare March 26, 2021 18:50
- Make sure peer->state_endpoint is set correctly.
- Fix double free of pending_op in ompi_osc_rdma_btl_fop() and ompi_osc_rdma_btl_op().

Cleanup/leaks:
- Don't parse ompi_osc_rdma_btl_alternate_names twice.
- free temp in allocate_state_shared().

Signed-off-by: Austen Lauria <awlauria@us.ibm.com>
@awlauria awlauria force-pushed the fix_osc_rdma_tcp_bugs branch from a0d4106 to 928ac83 Compare March 26, 2021 21:59
@awlauria
Contributor Author

Found another double-free.

Should be good to go.

@awlauria awlauria requested a review from bosilca March 29, 2021 12:57
@awlauria awlauria dismissed bosilca’s stale review March 30, 2021 12:17

@bosilca's comments have been addressed. Ready for re-review.

@awlauria
Contributor Author

awlauria commented Apr 1, 2021

I cherry-picked this commit into #8756. Closing.

@awlauria awlauria closed this Apr 1, 2021
@awlauria awlauria deleted the fix_osc_rdma_tcp_bugs branch March 17, 2022 17:27