Describe the bug
Hi, I want to use the verbs;ofi_rxm provider with iov_limit set to 16 for both rx and tx. Unfortunately, when I set the hints to 16, no providers are returned. It seems the rxm provider has a hard limit of 4, but if I query the RDMA device using libibverbs, I see that it supports up to 30 scatter-gather entries.
To Reproduce
#include <stdio.h>
#include <string.h>
#include <rdma/fabric.h>
#include <rdma/fi_errno.h>

int main() {
    struct fi_info *hints = NULL;
    struct fi_info *info = NULL;

    hints = fi_allocinfo();
    /* Check the allocation before dereferencing hints. */
    if (hints == NULL) {
        printf("failed to allocate memory for hints\n");
        return 1;
    }

    hints->ep_attr->type = FI_EP_RDM;
    /* fi_freeinfo() frees prov_name, so it must be heap-allocated. */
    hints->fabric_attr->prov_name = strdup("verbs;ofi_rxm");
    hints->tx_attr->iov_limit = 16;
    hints->rx_attr->iov_limit = 16;

    int ret = fi_getinfo(FI_VERSION(2, 0), "0.0.0.0", "8080", FI_SOURCE,
                         hints, &info);
    fi_freeinfo(hints);
    if (ret != FI_SUCCESS) {
        printf("failed to get any provider reason=%s\n", fi_strerror(ret));
        return 1;
    }

    printf("%s\n", fi_tostr(info, FI_TYPE_INFO));
    fi_freeinfo(info);
    return 0;
}
Output
If I run with FI_LOG_LEVEL=info I get the following errors:
libfabric:1180674:1734402138::ofi_rxm:core:ofi_check_rx_attr():914<info> iov_limit too large
libfabric:1180674:1734402138::ofi_rxm:core:ofi_check_rx_attr():915<info> Supported: 4
libfabric:1180674:1734402138::ofi_rxm:core:ofi_check_rx_attr():915<info> Requested: 16
Expected behavior
I would expect that a user can set the hints for the iov limits up to what the RDMA device can support (in my case, at least up to 30).
Environment:
Linux ubuntu 20.04
libfabric version 2.0.0
Thank you.
@mlefebvre1 Hi there and welcome to the libfabric community!
Many providers hard-code their limits, and 4 is a reasonable one. Since rxm is layered on top of verbs and often has to copy the iovs over from the user, making this limit dynamic is difficult and could waste a lot of space. So this isn't a bug, but rather an optimization. We can look into the possibility of supporting more depending on the device limit, but it's probably not going to be high on the list, if I'm being honest.
I would recommend trying to get your application to need fewer iovs (i.e., sending 4 at a time) if possible. Alternatively, you could try increasing the hardcoded limit to 16, but obviously you would need your own libfabric build, so it wouldn't be a long-term solution.
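For the "sending 4 at a time" suggestion, here is a minimal sketch of walking a larger iovec array in batches that fit under the provider's limit. The names `for_each_iov_batch`, `batch_fn`, `batch_stats`, and `count_batch` are hypothetical helpers made up for illustration, not part of libfabric. In a real application the callback would presumably wrap a call such as fi_sendv(), and the limit would come from the iov_limit reported in the fi_info returned by fi_getinfo(). Note that each batch then goes out as a separate message, so the receiving side would need to post matching receives.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/uio.h>

/* Hypothetical callback type: submits one batch of at most `limit` iovecs.
 * An application would wrap its actual send call here (assumption: something
 * like fi_sendv()). Returns 0 on success, non-zero on failure. */
typedef int (*batch_fn)(const struct iovec *iov, size_t count, void *ctx);

/* Walk `iov` in batches of at most `limit` entries, calling `submit` on each.
 * Stops at the first failing batch and returns its error code. */
static int for_each_iov_batch(const struct iovec *iov, size_t count,
                              size_t limit, batch_fn submit, void *ctx)
{
    assert(limit > 0);
    for (size_t off = 0; off < count; off += limit) {
        size_t remaining = count - off;
        size_t n = remaining < limit ? remaining : limit;
        int ret = submit(iov + off, n, ctx);
        if (ret != 0)
            return ret;
    }
    return 0;
}

/* Stand-in callback for the sketch: records what it was given instead of
 * sending anything. */
struct batch_stats {
    size_t batches;   /* number of batches submitted */
    size_t entries;   /* total iovec entries seen */
    size_t max_batch; /* largest batch size seen */
};

static int count_batch(const struct iovec *iov, size_t count, void *ctx)
{
    struct batch_stats *st = ctx;
    (void)iov; /* a real callback would pass this to the send call */
    st->batches++;
    st->entries += count;
    if (count > st->max_batch)
        st->max_batch = count;
    return 0;
}
```

With 10 iovecs and a limit of 4, the helper would submit batches of 4, 4, and 2 entries, so no single call exceeds the limit that rxm reports.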