prov/socket segfaults in sock_ep_connect() when it tries to dereference dest_addr #2676
Comments
tonyzinger
changed the title
prov/socket segfaults in sock_ep_connect() when ti tries to dereference dest_addr
prov/socket segfaults in sock_ep_connect() when it tries to dereference dest_addr
Jan 26, 2017
under investigation by @gladkovdmitry17
dmitrygx
added a commit
to dmitrygx/fabtests
that referenced
this issue
Feb 9, 2017
…it tries to dereference dest_addr" issue ofiwg/libfabric#2676 fi_msg_sockets: Add new test case that covers Issue #2676 Invokes fi_send when no connect is established and no destination addres:port pair is passed to fi_info Change-Id: I1a64131eafa882b9f60a725d055ede039ad2250b Signed-off-by: Gladkov, Dmitry <dmitry.gladkov@intel.com>
dmitrygx
added a commit
to dmitrygx/libfabric
that referenced
this issue
Feb 9, 2017
…nce dest_addr ofiwg#2676 The validation of destination address on NULL pointer has been added in sock_ep_connect. The destination address is NULL in case of if no destination address is passed as a fi_info and no connect is established. but fi_send is called. Change-Id: I0e526b286360756ed4dd5732b36f29ca08ed18d4 Signed-off-by: Dmitry Gladkov <dmitry.gladkov@intel.com>
Should work in #2714
dmitrygx
added a commit
to dmitrygx/libfabric
that referenced
this issue
Feb 13, 2017
…nce dest_addr ofiwg#2676 The validation of destination address on NULL pointer has been added in sock_ep_connect. The destination address is NULL in case of if no destination address is passed as a fi_info and no connect is established. but fi_send is called. Change-Id: I0e526b286360756ed4dd5732b36f29ca08ed18d4 Signed-off-by: Dmitry Gladkov <dmitry.gladkov@intel.com>
sayantansur
pushed a commit
to sayantansur/libfabric
that referenced
this issue
Feb 15, 2017
…nce dest_addr ofiwg#2676 The validation of destination address on NULL pointer has been added in sock_ep_connect. The destination address is NULL in case of if no destination address is passed as a fi_info and no connect is established. but fi_send is called. Change-Id: I0e526b286360756ed4dd5732b36f29ca08ed18d4 Signed-off-by: Dmitry Gladkov <dmitry.gladkov@intel.com>
The socket provider segfaults in sock_ep_connect() when it tries to dereference the dest_addr field of the sock_ep_attr structure.
The endpoint type is FI_EP_MSG.
The dest_addr field was NULL.
The gdb backtrace from the core files is:
Program terminated with signal 11, Segmentation fault.
#0 0x00007ffff78923e8 in sock_ep_connect (ep_attr=0x64fdc0, index=0) at prov/sockets/src/sock_conn.c:414
414 addr = *ep_attr->dest_addr;
(gdb) bt
#0 0x00007ffff78923e8 in sock_ep_connect (ep_attr=0x64fdc0, index=0) at prov/sockets/src/sock_conn.c:414
#1 0x00007ffff7880b43 in sock_ep_get_conn (attr=0x64fdc0, tx_ctx=0x650610, index=2, pconn=0x7fffffff5b88) at prov/sockets/src/sock_ep.c:1794
#2 0x00007ffff7895cd9 in sock_ep_tx_atomic (ep=0x64fcd0, msg=0x7fffffff5cd0, comparev=0x0, compare_desc=0x0, compare_count=0, resultv=0x0, result_desc=0x0,
result_count=0, flags=2305843009213693952) at prov/sockets/src/sock_atomic.c:101
#3 0x00007ffff7896627 in sock_ep_atomic_writemsg (ep=0x64fcd0, msg=0x7fffffff5cd0, flags=2305843009213693952) at prov/sockets/src/sock_atomic.c:272
#4 0x00007ffff7896708 in sock_ep_atomic_write (ep=0x64fcd0, buf=0x69a340, count=1, desc=0x69a3f0, dest_addr=2, addr=6923712, key=2, datatype=FI_INT32, op=FI_SUM,
context=0x7fffffff5e88) at prov/sockets/src/sock_atomic.c:304
(gdb) print *ep_attr
$1 = {fclass = 3, tx_shared = 0, rx_shared = 0, buffered_len = 0, min_multi_recv = 64, ref = {val = 0, is_initialized = 1}, eq = 0x0, av = 0x699fd0, domain = 0x62be80,
rx_ctx = 0x650990, tx_ctx = 0x650610, rx_array = 0x62c160, tx_array = 0x62ad60, num_rx_ctx = {val = 0, is_initialized = 1}, num_tx_ctx = {val = 0,
is_initialized = 1}, rx_ctx_entry = {next = 0x650ad8, prev = 0x650ad8}, tx_ctx_entry = {next = 0x650760, prev = 0x650760}, info = {next = 0x62aff0,
caps = 216172782117008144, mode = 0, addr_format = 2, src_addrlen = 16, dest_addrlen = 0, src_addr = 0x62a790, dest_addr = 0x0, handle = 0x0, tx_attr = 0x62ae10,
rx_attr = 0x62ae60, ep_attr = 0x62aeb0, domain_attr = 0x62af20, fabric_attr = 0x62afc0}, ep_attr = {type = FI_EP_UNSPEC, protocol = 0, protocol_version = 0,
max_msg_size = 0, msg_prefix_size = 0, max_order_raw_size = 0, max_order_war_size = 0, max_order_waw_size = 0, mem_tag_format = 0, tx_ctx_cnt = 1, rx_ctx_cnt = 1,
auth_keylen = 0, auth_key = 0x0}, ep_type = FI_EP_MSG, src_addr = 0x62ad40, dest_addr = 0x0, msg_src_port = 0, msg_dest_port = 0, peer_fid = 0, key = 0,
is_enabled = 1, cm = {sock = 0, do_listen = 0, signal_fds = {11, 12}, next_msg_id = 0, lock = {impl = 1, is_initialized = 1}, is_connected = 0, listener_thread = 0,
msg_list = {next = 0x64ff80, prev = 0x64ff80}}, listener = {sock = 0, do_listen = 0, is_ready = 0, signal_fds = {0, 0}, listener_thread = 0,
service = '\000' <repeats 31 times>}, lock = {impl = 1, is_initialized = 1}, conn_idm = {array = {0x0 <repeats 64 times>}, count = {0 <repeats 64 times>}},
av_idm = {array = {0x69a660, 0x0 <repeats 63 times>}, count = {1, 0 <repeats 63 times>}}, cmap = {table = 0x650b50, epoll_set = {fd = 13, size = 1024, used = 0,
events = 0x664b60}, used = 0, size = 1024, lock = {impl = 1, is_initialized = 1}}}
The sock_ep_connect() source code that I was testing against is:
403 struct sock_conn *sock_ep_connect(struct sock_ep_attr *ep_attr, fi_addr_t index)
404 {
405     int conn_fd = -1, ret;
406     int do_retry = sock_conn_retry;
407     struct sock_conn *conn, *new_conn;
408     struct sockaddr_in addr;
409     socklen_t lon;
410     int valopt = 0;
411     struct pollfd poll_fd;
412
413     if (ep_attr->ep_type == FI_EP_MSG) {
414         addr = *ep_attr->dest_addr;
415         addr.sin_port = htons(ep_attr->msg_dest_port);
416     } else {
417         addr = *((struct sockaddr_in *)&ep_attr->av->table[index].addr);
418     }
419
420 do_connect:
421     fastlock_acquire(&ep_attr->cmap.lock);
422     conn = sock_ep_lookup_conn(ep_attr, index, &addr);
423     fastlock_release(&ep_attr->cmap.lock);
424
425     if (conn != SOCK_CM_CONN_IN_PROGRESS)
426         return conn;
427
428     conn_fd = socket(AF_INET, SOCK_STREAM, 0);
429     if (conn_fd == -1) {
430         SOCK_LOG_ERROR("failed to create conn_fd, errno: %d\n", errno);
431         errno = FI_EOTHER;
432         return NULL;
433     }
434
435     ret = fd_set_nonblock(conn_fd);
The test case that I was using was invalid, but the socket provider should not segfault just because a user is using libfabric incorrectly. The test case was invalid because it did not call fi_connect() on an endpoint of type FI_EP_MSG before it tried to send its data.
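For illustration, here is a minimal sketch of the kind of NULL check that would turn this crash into an error return. It is not the actual libfabric patch: the types `demo_ep_type` and `demo_ep_attr`, the function `demo_resolve_msg_dest()`, and the choice of `-ENOTCONN` as the error code are all assumptions made up for this example, and the struct only mirrors the three `sock_ep_attr` fields used at lines 413-415 above.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <arpa/inet.h>
#include <netinet/in.h>

/* Hypothetical stand-ins for the provider's types; NOT the real sock_ep_attr. */
enum demo_ep_type { DEMO_EP_MSG, DEMO_EP_RDM };

struct demo_ep_attr {
    enum demo_ep_type ep_type;
    struct sockaddr_in *dest_addr; /* NULL until a destination is known */
    uint16_t msg_dest_port;        /* host byte order */
};

/*
 * Copy the destination of a connection-oriented endpoint into *out.
 * Returns 0 on success, or -ENOTCONN (an assumed error code) when no
 * destination has been set, instead of dereferencing a NULL pointer.
 */
int demo_resolve_msg_dest(const struct demo_ep_attr *ep_attr,
                          struct sockaddr_in *out)
{
    if (ep_attr->ep_type != DEMO_EP_MSG)
        return -EINVAL; /* this sketch only covers the FI_EP_MSG-like branch */

    if (ep_attr->dest_addr == NULL)
        return -ENOTCONN; /* the case that segfaulted at sock_conn.c:414 */

    *out = *ep_attr->dest_addr;
    out->sin_port = htons(ep_attr->msg_dest_port);
    return 0;
}
```

With a guard like this, a caller that skipped fi_connect() would get an error code back from the send or atomic call rather than taking down the process.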