Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Segmentation fault #9

Closed
tim-nordell opened this issue Aug 3, 2016 · 5 comments
Closed

Segmentation fault #9

tim-nordell opened this issue Aug 3, 2016 · 5 comments

Comments

@tim-nordell
Copy link

tim-nordell commented Aug 3, 2016

I haven't taken the time to debug this, but it is possible for rx_buf to overflow. This was observed with the Ubuntu folder variant on a 64-bit machine. iaap_recv_dec_process received a rx_buf at offset of 1032, with a len of 16149. These two combined exceed the 16384 buffer DEFBUF and overflows the buffer. (Something somewhere isn't paying enough attention to the boundaries as such!)

Glancing at the codebase, it looks like it processes multiple chunks of data from the remote at once moving through the buffer each time it grabs a segment. It should possibly move data back to the start of buf for each sub-packet if it detects that the remaining length exceeds the receive buffer.

Here's the full backtrace from gdb:

(gdb) bt full
#0  0x00007ffff699f16c in BIO_write () from /lib/x86_64-linux-gnu/libcrypto.so.1.0.0
No symbol table info available.
#1  0x00000000004054bd in iaap_recv_dec_process (chan=2, flags=9, buf=0x716368 <rx_buf+1032> "\027\003\003?\020\204\365C\036", len=16149) at ../jni/hu_aap.c:1256
        bytes_written = 1020
        ctr = 0
        max_tries = 2
        bytes_read = 991
        prot_func_ret = 797
#2  0x0000000000405799 in hu_aap_recv_process () at ../jni/hu_aap.c:1411
        chan = 2
        flags = 9
        enc_len = 16149
        msg_type = 0
        buf = 0x716368 <rx_buf+1032> "\027\003\003?\020\204\365C\036"
        ret = 0
        min_size_hdr = 6
        rx_len = 16384
        have_len = 16149
#3  0x0000000000408ec0 in read_data (app=0x71a6e0 <gst_app>) at main.c:63
        buffer = 0x7ffff71abda0
        ret = -143674261
        iret = 32767
        vbuf = 0x1073e00 ""
        abuf = 0x7fffdc001100 " \022"
        res_len = 0
#4  0x00007ffff71af05a in g_main_context_dispatch () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
No symbol table info available.
#5  0x00007ffff71af400 in ?? () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
No symbol table info available.
#6  0x00007ffff71af722 in g_main_loop_run () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
No symbol table info available.
#7  0x000000000040ab93 in gst_loop (app=0x71a6e0 <gst_app>) at main.c:806
        ret = 32767
        state_ret = GST_STATE_CHANGE_FAILURE
#8  0x000000000040b019 in main (argc=1, argv=0x7fffffffe5f8) at main.c:974
        app = 0x71a6e0 <gst_app>
        ret = 0
        ep_in_addr = 254 '\376'
        ep_out_addr = 254 '\376'
        cursor = 0x135b840
        info = {version = {major = 1 '\001', minor = 2 '\002', patch = 15 '\017'}, subsystem = SDL_SYSWM_X11, info = {x11 = {display = 0x1330cc0, window = 71303181, 
              lock_func = 0x7ffff6630ce0, unlock_func = 0x7ffff6630cf0, fswindow = 71303170, wmwindow = 71303171, gfxdisplay = 0x133ced0}}}
        screen = 0x135b750
        wmInfo = {version = {major = 1 '\001', minor = 2 '\002', patch = 15 '\017'}, subsystem = SDL_SYSWM_X11, info = {x11 = {display = 0x1330cc0, window = 71303181, 
              lock_func = 0x7ffff6630ce0, unlock_func = 0x7ffff6630cf0, fswindow = 71303170, wmwindow = 71303171, gfxdisplay = 0x133ced0}}}

Note: This goes away if I increase the DEFBUF constant. I can observe this just by interacting with the Android Auto window for a few minutes.

@gartnera
Copy link
Owner

gartnera commented Aug 3, 2016

Ok so first the Ubuntu code is super old. It hasn't been synced with the rest of the changes I/we have made for awhile.

I think I might have broken something with d848813. Try reverting that and see what happens.

@tim-nordell
Copy link
Author

Note: I did compile it myself so it was using the latest code in commit 9c0fd2b.

I'll poke around at that later; I've saved a core dump of when the seg fault was hit. It seems like the buffer that overflowed was allocated inside of hu_aap.c though, so it would be weird if the Ubuntu specific code would affect overflowing that buffer especially since the Ubuntu specific code only handles the h264 output side of things (it crashed in the decryption portion of the data stream).

The hu_aap_recv_process() function increments buf from the it's starting place as it processes data and it trusts the output in the encoded stream as to not overflowing its own buffer (enc_len) as it decrypts the data stream. E.g. it decoded a length of "16149" when it was already at offset 1032 in the incoming data stream, and as such iaap_recv_dec_process() went beyond the bounds of the 16384 buffer.

@tim-nordell
Copy link
Author

Okay, I added in the debug output from the program (undefining NDEBUG). It looks like I got a LIBUSB error which trickled down into the decryption code. Maybe this won't happen on the head unit? Although, I'd suspect that if it can happen on a desktop with a theoretically mature USB stack...

Here's the end of the log before a seg fault:

...
14:28:51 2016 V: hu_aap:: First fragment total_size: 22096  max_assy_size: 150075
14:28:51 2016 D: hu_aap::         OK iaap_recv_dec_process() ret: 0  have_len: 16149  enc_len: 16149  buf: 0x71df68  chan: 2 VID  flags: 0x9  msg_type: 0
14:28:51 2016 D: hu_aap::         OK iaap_recv_dec_process() ret: 0  have_len: 6005  enc_len: 6005  buf: 0x71df64  chan: 2 VID  flags: 0xa  msg_type: 5891
14:28:51 2016 V: hu_aap:: First fragment total_size: 17712  max_assy_size: 150075
14:28:51 2016 D: hu_aap::         OK iaap_recv_dec_process() ret: 0  have_len: 16149  enc_len: 16149  buf: 0x71df68  chan: 2 VID  flags: 0x9  msg_type: 0
14:28:51 2016 D: hu_aap::         OK iaap_recv_dec_process() ret: 0  have_len: 1621  enc_len: 1621  buf: 0x71df64  chan: 2 VID  flags: 0xa  msg_type: 5891
14:28:51 2016 V: hu_aap:: First fragment total_size: 17528  max_assy_size: 150075
14:28:51 2016 D: hu_aap::         OK iaap_recv_dec_process() ret: 0  have_len: 16149  enc_len: 16149  buf: 0x71df68  chan: 2 VID  flags: 0x9  msg_type: 0
14:28:51 2016 D: hu_aap::         OK iaap_recv_dec_process() ret: 0  have_len: 1437  enc_len: 1437  buf: 0x71df64  chan: 2 VID  flags: 0xa  msg_type: 5891
14:28:51 2016 V: hu_aap:: First fragment total_size: 16292  max_assy_size: 150075
14:28:51 2016 D: hu_aap::         OK iaap_recv_dec_process() ret: 0  have_len: 16149  enc_len: 16149  buf: 0x71df68  chan: 2 VID  flags: 0x9  msg_type: 0
14:28:51 2016 D: hu_aap::         OK iaap_recv_dec_process() ret: 0  have_len: 201  enc_len: 201  buf: 0x71df64  chan: 2 VID  flags: 0xa  msg_type: 5891
14:28:51 2016 D: hu_aap::         OK iaap_recv_dec_process() ret: 0  have_len: 16131  enc_len: 16131  buf: 0x71df64  chan: 2 VID  flags: 0xb  msg_type: 5891
14:28:51 2016 V: hu_aap:: First fragment total_size: 17546  max_assy_size: 150075
14:28:51 2016 D: hu_aap::         OK iaap_recv_dec_process() ret: 0  have_len: 16149  enc_len: 16149  buf: 0x71df68  chan: 2 VID  flags: 0x9  msg_type: 0
14:28:51 2016 D: hu_aap::         OK iaap_recv_dec_process() ret: 0  have_len: 1455  enc_len: 1455  buf: 0x71df64  chan: 2 VID  flags: 0xa  msg_type: 5891
14:28:51 2016 D: hu_aap::         OK iaap_recv_dec_process() ret: 0  have_len: 14003  enc_len: 14003  buf: 0x71df64  chan: 2 VID  flags: 0xb  msg_type: 5891
14:28:51 2016 D: hu_aap::         OK iaap_recv_dec_process() ret: 0  have_len: 13778  enc_len: 13778  buf: 0x71df64  chan: 2 VID  flags: 0xb  msg_type: 5891
14:28:51 2016 D: hu_aap::         OK iaap_recv_dec_process() ret: 0  have_len: 13459  enc_len: 13459  buf: 0x71df64  chan: 2 VID  flags: 0xb  msg_type: 5891
14:28:51 2016 D: hu_aap::         OK iaap_recv_dec_process() ret: 0  have_len: 16380  enc_len: 12796  buf: 0x71df64  chan: 2 VID  flags: 0xb  msg_type: 5891
14:28:51 2016 D: hu_aap:: iaap_recv_dec_process() more than one message   have_len: 3584  enc_len: 12796
14:28:51 2016 D: hu_aap:: have_len: 3580 < enc_len: 13767  need_len: 10187
14:28:51 2016 D: hu_usb:: Done dir: recv  len: 10187  bytes_xfrd: 10187  total_bytes_xfrd: 10187  usb_err: 0 (LIBUSB_SUCCESS)  errno: 11 (Resource temporarily unavailable)
14:28:51 2016 D: hu_uti:: UR:  00000000 77 39 6c d4 1b 7b d2 ae e6 59 ae df 64 83 80 99 
14:28:51 2016 D: hu_uti:: UR:      0010 fa 48 ed 6a 1d ee 71 93 4e f1 30 4a e7 23 4c 17 
14:28:51 2016 D: hu_uti:: UR:      0020 ee 4a 58 68 d2 76 21 0b 44 63 31 ae 45 58 49 50 
14:28:51 2016 D: hu_uti:: UR:      0030 f4 ca 90 3e b2 d4 5b 2e 9b d5 ee 69 bf b0 a1 02 

@tim-nordell
Copy link
Author

I added the following in there:

diff --git a/jni/hu_aap.c b/jni/hu_aap.c
index 00cab99..3773712 100644
--- a/jni/hu_aap.c
+++ b/jni/hu_aap.c
@@ -1392,6 +1392,9 @@ http://www.cisco.com/c/en/us/support/docs/security-vpn/secure-socket-layer-ssl/1
         if (transport_type != 2 || rx_len != min_size_hdr)              // If NOT wifi...
           logd ("have_len: %d < enc_len: %d  need_len: %d", have_len, enc_len, need_len);

+        logd ("Buf = rx_buf[%i]; hu_aap_tra_recv(&buf[%i], %i, -1)",
+               buf - rx_buf, have_len, need_len);
+
         int need_ret = hu_aap_tra_recv (& buf [have_len], need_len, -1);// Get Rx packet from Transport. Use -1 instead of iaap_tra_recv_tmo to indicate need to get need_len bytes
                                                                         // Length remaining for all sub-packets plus 4/8 byte headers
         if (need_ret != need_len) {                                     // If we didn't get precisely the number of bytes we need...

I get the following in the routine:

14:37:03 2016 D: hu_aap:: iaap_recv_dec_process() more than one message   have_len: 1536  enc_len: 14844
14:37:03 2016 V: hu_aap:: First fragment total_size: 27980  max_assy_size: 85104
14:37:03 2016 D: hu_aap:: have_len: 1528 < enc_len: 16149  need_len: 14621
14:37:03 2016 D: hu_aap:: Buf = rx_buf[14856]; hu_aap_tra_recv(&buf[1528], 14621, -1)

Looks like the "more than one message" code needs to be revised. ;) Here's a quick hack that "fixes" the issue on my PC:

diff --git a/jni/hu_aap.c b/jni/hu_aap.c
index 00cab99..e8bc403 100644
--- a/jni/hu_aap.c
+++ b/jni/hu_aap.c
@@ -1392,6 +1392,10 @@ http://www.cisco.com/c/en/us/support/docs/security-vpn/secure-socket-layer-ssl/1
         if (transport_type != 2 || rx_len != min_size_hdr)              // If NOT wifi...
           logd ("have_len: %d < enc_len: %d  need_len: %d", have_len, enc_len, need_len);

+       // Move the buffer back to the start
+       memmove(rx_buf, buf, have_len);
+       buf = rx_buf;
+
         int need_ret = hu_aap_tra_recv (& buf [have_len], need_len, -1);// Get Rx packet from Transport. Use -1 instead of iaap_tra_recv_tmo to indicate need to get need_len bytes
                                                                         // Length remaining for all sub-packets plus 4/8 byte headers
         if (need_ret != need_len) {                                     // If we didn't get precisely the number of bytes we need...

However, that being said, it could be improved by detecting if it would overflow with the hu_aap_tra_recv and deciding only then to do the memory move.

Also, the code shouldn't trust the length within the transport stream as being within bounds of the buffer in the program.

@borconi
Copy link

borconi commented Aug 16, 2016

Hmmm... interesting....
I tried to do a bit of poking around with the Desktop Head Unit binary (provided by google), and I can see that all the recv and send calls are made with a len of 131080 rather than 16384, so they use a buffer of (128Kb data + 8 byte header - I guess) rather than 16Kb.

On the Android version, I have increased the DEFBUF size to match the value used in the DHU and the number of buffer overruns have significantly decreased, to be noted I'm using Wifi and TCP protocol instead of USB which can also add some overheads.

Like your fix through... will try it on Android version, hoping to fix the annoying random disconnection...

@gartnera gartnera closed this as completed Oct 1, 2016
Trevelopment added a commit that referenced this issue Jan 14, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants