prov/efa: Fix the ibv cq error handling. #9652
case IBV_WC_SEND:
#if ENABLE_DEBUG
if (opcode == IBV_WC_SEND)
ep->failed_send_comps++;
else
#endif
efa_rdm_pke_handle_tx_error(pkt_entry, FI_EIO, prov_errno);
break;
case IBV_WC_RDMA_WRITE:
#if ENABLE_DEBUG
ep->failed_write_comps++;
#endif
efa_rdm_pke_handle_tx_error(pkt_entry, FI_EIO, prov_errno);
} else {
assert(opcode == IBV_WC_RECV);
break;
case IBV_WC_RDMA_READ:
#if ENABLE_DEBUG
ep->failed_read_comps++;
#endif
efa_rdm_pke_handle_tx_error(pkt_entry, FI_EIO, prov_errno);
break;
nit: we could also use fallthrough, e.g.
case IBV_WC_SEND:
#if ENABLE_DEBUG
ep->failed_send_comps++;
#endif
case IBV_WC_RDMA_WRITE:
#if ENABLE_DEBUG
ep->failed_write_comps++;
#endif
case IBV_WC_RDMA_READ:
#if ENABLE_DEBUG
ep->failed_read_comps++;
#endif
efa_rdm_pke_handle_tx_error(pkt_entry, FI_EIO, prov_errno);
break;
Doesn't that make failed_write_comps (and failed_read_comps) get incremented for send as well?
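To make the concern concrete, here is a minimal standalone sketch of the fall-through layout. The `demo_` enum and struct are hypothetical stand-ins, not the real `ibv_wc_opcode` values or the EFA endpoint fields; only the control flow mirrors the suggestion.

```c
#include <assert.h>

/* Hypothetical stand-ins for the work-completion opcodes and the
 * debug counters; these are NOT the real libibverbs or EFA types. */
enum demo_opcode {
	DEMO_WC_SEND,
	DEMO_WC_RDMA_WRITE,
	DEMO_WC_RDMA_READ
};

struct demo_counters {
	int failed_send_comps;
	int failed_write_comps;
	int failed_read_comps;
};

/* Same shape as the suggested switch: no break between the cases,
 * so each case runs the increments of every case below it. */
static void demo_count_with_fallthrough(struct demo_counters *c,
					enum demo_opcode op)
{
	switch (op) {
	case DEMO_WC_SEND:
		c->failed_send_comps++;
		/* fall through */
	case DEMO_WC_RDMA_WRITE:
		c->failed_write_comps++;
		/* fall through */
	case DEMO_WC_RDMA_READ:
		c->failed_read_comps++;
		break;
	}
}
```

With this layout a single failed send bumps all three counters, and a failed write also bumps the read counter, so the per-opcode counts stop being meaningful.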
I see. The counters do add extra code. We could find a more concise approach, but I don't have a strong preference.
Yes, ideally we should declare failed_comps as an array and keep a map from the ibv wc opcode to the ofi op code, so we can increment the counter at the right index.
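A rough sketch of that idea, assuming a small lookup table indexed by opcode. Every name below (`demo_wc_opcode`, `demo_op`, `demo_ep`, `demo_count_failed_comp`) is hypothetical and only illustrates the shape; the real code would use `enum ibv_wc_opcode` and the provider's own op codes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the ibv work-completion opcodes. */
enum demo_wc_opcode {
	DEMO_WC_SEND,
	DEMO_WC_RDMA_WRITE,
	DEMO_WC_RDMA_READ,
	DEMO_WC_RECV,
	DEMO_WC_OPCODE_MAX
};

/* Hypothetical stand-in for the ofi-side op codes used as counter indices. */
enum demo_op {
	DEMO_OP_SEND,
	DEMO_OP_WRITE,
	DEMO_OP_READ,
	DEMO_OP_RECV,
	DEMO_OP_MAX
};

/* Map from wc opcode to counter index (designated initializers, C99). */
static const enum demo_op demo_wc_to_op[DEMO_WC_OPCODE_MAX] = {
	[DEMO_WC_SEND]       = DEMO_OP_SEND,
	[DEMO_WC_RDMA_WRITE] = DEMO_OP_WRITE,
	[DEMO_WC_RDMA_READ]  = DEMO_OP_READ,
	[DEMO_WC_RECV]       = DEMO_OP_RECV,
};

/* Hypothetical endpoint holding one failure counter per op code. */
struct demo_ep {
	size_t failed_comps[DEMO_OP_MAX];
};

/* One increment for every opcode, with no per-case #if ENABLE_DEBUG blocks. */
static void demo_count_failed_comp(struct demo_ep *ep,
				   enum demo_wc_opcode opcode)
{
	assert((unsigned int)opcode < DEMO_WC_OPCODE_MAX);
	ep->failed_comps[demo_wc_to_op[opcode]]++;
}
```

The error path would then call the counting helper once before the switch, and each case would only need its handler call.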
Talked offline with @wenduwan; the new revision removes the unused failed_send/write_comps counters to reduce code duplication.
Currently, efa_rdm_ep_poll_ibv_cq cannot handle errors for IBV_WC_RECV_RDMA_WITH_IMM and IBV_WC_RDMA_READ. This patch fixes that. It also removes the failed_send/write/read_comps counters from the debug build, because these symbols are never used.
Signed-off-by: Shi Jin <sjina@amazon.com>