Tcp connection worker crashes #258

Closed
artushin opened this issue Jan 8, 2019 · 18 comments

Comments

@artushin

artushin commented Jan 8, 2019

Bug Report

The worker is crashing with the backtrace below (sorry, not in debug; this is happening in production).

Your environment

  • Operating system:
    Debian GNU/Linux 8 (jessie)
  • Node version:
    v8.12.0
  • npm version:
    6.4.1
  • gcc/clang version:
    gcc 4.9.2
  • mediasoup version:
    2.6.3
  • mediasoup-client version:
    Can't tell, sorry

Issue description

(gdb) backtrace full
#0  0x000000000069e7c5 in uv__io_stop (loop=0x3cac7513bf4e417c, w=0xd8c4a44, events=1) at ../deps/libuv/src/unix/core.c:867
        __PRETTY_FUNCTION__ = <error reading variable __PRETTY_FUNCTION__ (Cannot access memory at address 0x73f2fa)>
#1  0x00000000006a7fd7 in uv_read_stop (stream=0xd8c49bc) at ../deps/libuv/src/unix/stream.c:1613
No locals.
#2  0x000000000066b74a in TcpConnection::Close (this=0x12cd680) at ../src/handles/TcpConnection.cpp:86
        err = 0
        __FUNCTION__ = <error reading variable __FUNCTION__ (Cannot access memory at address 0x73a190)>
#3  0x000000000066c4e9 in TcpConnection::OnUvWriteError (this=0x12cd680, error=-110) at ../src/handles/TcpConnection.cpp:420
No locals.
#4  0x000000000066b4f8 in onWrite (req=0x2ca78f8, status=-110) at ../src/handles/TcpConnection.cpp:34
        writeData = 0x2ca78f0
        connection = 0x12cd680
#5  0x00000000006a6aa0 in uv__write_callbacks (stream=0x8415400) at ../deps/libuv/src/unix/stream.c:976
        req = 0x2ca78f8
        q = 0x2ca7950
        pq = {0x7ffc24e88d30, 0x7ffc24e88d30}
        __PRETTY_FUNCTION__ = <error reading variable __PRETTY_FUNCTION__ (Cannot access memory at address 0x73fe60)>
#6  0x00000000006a76b6 in uv__stream_io (loop=0xe3abc0, w=0x8415488, events=28) at ../deps/libuv/src/unix/stream.c:1348
        stream = 0x8415400
        __PRETTY_FUNCTION__ = <error reading variable __PRETTY_FUNCTION__ (Cannot access memory at address 0x73feaa)>
#7  0x00000000006ac046 in uv__io_poll (loop=0xe3abc0, timeout=18) at ../deps/libuv/src/unix/linux-core.c:378
        max_safe_timeout = <error reading variable max_safe_timeout (Cannot access memory at address 0x7404d8)>
        events = {{events = 1, data = {ptr = 0x85c, fd = 2140, u32 = 2140, u64 = 2140}}, {events = 1, data = {ptr = 0x75a, fd = 1882, u32 = 1882, u64 = 1882}}, {events = 28, data = {ptr = 0x81b,
              fd = 2075, u32 = 2075, u64 = 2075}}, {events = 1, data = {ptr = 0x6f3, fd = 1779, u32 = 1779, u64 = 1779}}, {events = 1, data = {ptr = 0x6f3, fd = 1779, u32 = 1779, u64 = 1779}}, {
            events = 1, data = {ptr = 0x3, fd = 3, u32 = 3, u64 = 3}}, {events = 1, data = {ptr = 0x742, fd = 1858, u32 = 1858, u64 = 1858}}, {events = 1, data = {ptr = 0x742, fd = 1858, u32 = 1858,
              u64 = 1858}}, {events = 1, data = {ptr = 0x742, fd = 1858, u32 = 1858, u64 = 1858}}, {events = 1, data = {ptr = 0x779, fd = 1913, u32 = 1913, u64 = 1913}}, {events = 1, data = {
              ptr = 0x775, fd = 1909, u32 = 1909, u64 = 1909}}, {events = 1, data = {ptr = 0x3, fd = 3, u32 = 3, u64 = 3}}, {events = 1, data = {ptr = 0x6bd, fd = 1725, u32 = 1725, u64 = 1725}}, {
            events = 1, data = {ptr = 0x6bb, fd = 1723, u32 = 1723, u64 = 1723}}, {events = 1, data = {ptr = 0x527, fd = 1319, u32 = 1319, u64 = 1319}}, {events = 1, data = {ptr = 0x53b, fd = 1339,
              u32 = 1339, u64 = 1339}}, {events = 1, data = {ptr = 0x52e, fd = 1326, u32 = 1326, u64 = 1326}}, {events = 1, data = {ptr = 0x4cc, fd = 1228, u32 = 1228, u64 = 1228}}, {events = 1,
            data = {ptr = 0x4b3, fd = 1203, u32 = 1203, u64 = 1203}}, {events = 1, data = {ptr = 0x12f, fd = 303, u32 = 303, u64 = 303}}, {events = 1, data = {ptr = 0x52a, fd = 1322, u32 = 1322,
              u64 = 1322}}, {events = 1, data = {ptr = 0x426, fd = 1062, u32 = 1062, u64 = 1062}}, {events = 1, data = {ptr = 0x20d, fd = 525, u32 = 525, u64 = 525}}, {events = 0, data = {ptr = 0x0,
              fd = 0, u32 = 0, u64 = 0}} <repeats 129 times>, {events = 5, data = {ptr = 0xffffffff00000000, fd = 0, u32 = 0, u64 = 18446744069414584320}}, {events = 4294967295, data = {ptr = 0x0,
              fd = 0, u32 = 0, u64 = 0}}, {events = 64, data = {ptr = 0x24e899b000000000, fd = 0, u32 = 0, u64 = 2659544561155571712}}, {events = 32764, data = {ptr = 0x0, fd = 0, u32 = 0,
              u64 = 0}}, {events = 24, data = {ptr = 0x2000000000, fd = 0, u32 = 0, u64 = 137438953472}}, {events = 0, data = {ptr = 0x7ffc24e899c0, fd = 619223488, u32 = 619223488,
              u64 = 140720927709632}}, {events = 3856216174, data = {ptr = 0x3933353100007f27, fd = 32551, u32 = 32551, u64 = 4121696568543837991}}, {events = 875835961, data = {ptr = 0x0, fd = 0,
              u32 = 0, u64 = 0}}, {events = 0, data = {ptr = 0x0, fd = 0, u32 = 0, u64 = 0}} <repeats 20 times>, {events = 100, data = {ptr = 0xe5d92e5e00000000, fd = 0, u32 = 0,
              u64 = 16562320085893513216}}, {events = 32551, data = {ptr = 0x0, fd = 0, u32 = 0, u64 = 0}}, {events = 0, data = {ptr = 0x0, fd = 0, u32 = 0, u64 = 0}}, {events = 0, data = {
              ptr = 0x0, fd = 0, u32 = 0, u64 = 0}}, {events = 0, data = {ptr = 0x0, fd = 0, u32 = 0, u64 = 0}}, {events = 0, data = {ptr = 0x0, fd = 0, u32 = 0, u64 = 0}}, {events = 619223403,
            data = {ptr = 0xe60ded4000007ffc, fd = 32764, u32 = 32764, u64 = 16577166662554386428}}, {events = 32551, data = {ptr = 0x100000000, fd = 0, u32 = 0, u64 = 4294967296}}, {events = 0,
            data = {ptr = 0x0, fd = 0, u32 = 0, u64 = 0}}, {events = 0, data = {ptr = 0x0, fd = 0, u32 = 0, u64 = 0}}, {events = 0, data = {ptr = 0x0, fd = 0, u32 = 0, u64 = 0}}, {events = 0,
            data = {ptr = 0x7f27e5ea0fea, fd = -437645334, u32 = 3857321962, u64 = 139809337774058}}, {events = 0, data = {ptr = 0x24e89ae800000000, fd = 0, u32 = 0, u64 = 2659545901185368064}}, {
            events = 32764, data = {ptr = 0x0, fd = 0, u32 = 0, u64 = 0}}, {events = 0, data = {ptr = 0x0, fd = 0, u32 = 0, u64 = 0}}, {events = 0, data = {ptr = 0x0, fd = 0, u32 = 0, u64 = 0}}, {
            events = 0, data = {ptr = 0x4100000000, fd = 0, u32 = 0, u64 = 279172874240}}, {events = 0, data = {ptr = 0x7f27e60e1f20, fd = -435282144, u32 = 3859685152, u64 = 139809340137248}}, {
            events = 256, data = {ptr = 0xe5ea125d00000000, fd = 0, u32 = 0, u64 = 16567074369877049344}}, {events = 32551, data = {ptr = 0x7ffc24e895d0, fd = 619222480, u32 = 619222480,
              u64 = 140720927708624}}, {events = 1, data = {ptr = 0x800000000, fd = 0, u32 = 0, u64 = 34359738368}}, {events = 0, data = {ptr = 0x0, fd = 0, u32 = 0, u64 = 0}}, {events = 0, data = {
              ptr = 0xa00000000, fd = 0, u32 = 0, u64 = 42949672960}}, {events = 0, data = {ptr = 0x7ffc24e898b0, fd = 619223216, u32 = 619223216, u64 = 140720927709360}}, {events = 0, data = {
              ptr = 0x0, fd = 0, u32 = 0, u64 = 0}}, {events = 0, data = {ptr = 0x0, fd = 0, u32 = 0, u64 = 0}}, {events = 619222488, data = {ptr = 0x7ffc, fd = 32764, u32 = 32764, u64 = 32764}}, {
            events = 0, data = {ptr = 0x0, fd = 0, u32 = 0, u64 = 0}}, {events = 0, data = {ptr = 0x1800000000, fd = 0, u32 = 0, u64 = 103079215104}}, {events = 48, data = {ptr = 0x7ffc24e89bc0,
              fd = 619224000, u32 = 619224000, u64 = 140720927710144}}, {events = 619223808, data = {ptr = 0xe60f684000007ffc, fd = 32764, u32 = 32764, u64 = 16577583377461313532}}, {events = 32551,
            data = {ptr = 0x7ffc24e899c0, fd = 619223488, u32 = 619223488, u64 = 140720927709632}}, {events = 0, data = {ptr = 0x0, fd = 0, u32 = 0, u64 = 0}} <repeats 24 times>, {events = 0,
            data = {ptr = 0x24e899c000000000, fd = 0, u32 = 0, u64 = 2659544629875048448}}, {events = 32764, data = {ptr = 0x7ffc24e89bd0, fd = 619224016, u32 = 619224016, u64 = 140720927710160}}, {
            events = 619224016, data = {ptr = 0x24e89ae800007ffc, fd = 32764, u32 = 32764, u64 = 2659545901185400828}}, {events = 32764, data = {ptr = 0x7ffc24e89bd0, fd = 619224016,
              u32 = 619224016, u64 = 140720927710160}}, {events = 10612800, data = {ptr = 0x414e5000000000, fd = 0, u32 = 0, u64 = 18381978990542848}}, {events = 0, data = {ptr = 0x7f27e5ea124d,
              fd = -437644723, u32 = 3857322573, u64 = 139809337774669}}, {events = 3856303255, data = {ptr = 0xfbad800100007f27, fd = 32551, u32 = 32551, u64 = 18135292016274210599}}, {events = 0,
            data = {ptr = 0x7ffc24e89beb, fd = 619224043, u32 = 619224043, u64 = 140720927710187}}, {events = 619224044, data = {ptr = 0x24e89bd000007ffc, fd = 32764, u32 = 32764,
              u64 = 2659546897617813500}}, {events = 32764, data = {ptr = 0x7ffc24e89bd0, fd = 619224016, u32 = 619224016, u64 = 140720927710160}}, {events = 619224016, data = {
              ptr = 0x24e89bd000007ffc, fd = 32764, u32 = 32764, u64 = 2659546897617813500}}, {events = 32764, data = {ptr = 0x7ffc24e89bd0, fd = 619224016, u32 = 619224016, u64 = 140720927710160}},
          {events = 619224044, data = {ptr = 0x7ffc, fd = 32764, u32 = 32764, u64 = 32764}}, {events = 0, data = {ptr = 0x0, fd = 0, u32 = 0, u64 = 0}}, {events = 0, data = {ptr = 0x0, fd = 0,
              u32 = 0, u64 = 0}}, {events = 0, data = {ptr = 0x0, fd = 0, u32 = 0, u64 = 0}}, {events = 0, data = {ptr = 0xebbaf000000000, fd = 0, u32 = 0, u64 = 66352159481921536}}, {events = 0,
            data = {ptr = 0x0, fd = 0, u32 = 0, u64 = 0}}, {events = 0, data = {ptr = 0x8900000000, fd = 0, u32 = 0, u64 = 588410519552}}, {events = 0, data = {ptr = 0xf00000b7, fd = -268435273,
              u32 = 4026532023, u64 = 4026532023}}, {events = 4294967295, data = {ptr = 0xffffffff, fd = -1, u32 = 4294967295, u64 = 4294967295}}, {events = 0, data = {ptr = 0x0, fd = 0, u32 = 0,
              u64 = 0}}, {events = 0, data = {ptr = 0xffffffff00000000, fd = 0, u32 = 0, u64 = 18446744069414584320}}, {events = 0, data = {ptr = 0x400, fd = 1024, u32 = 1024, u64 = 1024}}, {
            events = 0, data = {ptr = 0xe60e076000000000, fd = 0, u32 = 0, u64 = 16577195387295629312}}, {events = 32551, data = {ptr = 0x0, fd = 0, u32 = 0, u64 = 0}}, {events = 1546902266, data = {
              ptr = 0x294d9db000000000, fd = 0, u32 = 0, u64 = 2976208308001570816}}, {events = 0, data = {ptr = 0x7ffc24e89bd0, fd = 619224016, u32 = 619224016, u64 = 140720927710160}}, {
            events = 15448816, data = {ptr = 0xe5ea124d00000000, fd = 0, u32 = 0, u64 = 16567074301157572608}}, {events = 32551, data = {ptr = 0x7ffc24e89bc8, fd = 619224008, u32 = 619224008,
              u64 = 140720927710152}}, {events = 3856280951, data = {ptr = 0xebbaf000007f27, fd = 32551, u32 = 32551, u64 = 66352159481954087}}, {events = 0, data = {ptr = 0x3000000010, fd = 16,
              u32 = 16, u64 = 206158430224}}, {events = 619224000, data = {ptr = 0x24e89b0000007ffc, fd = 32764, u32 = 32764, u64 = 2659546004264615932}}, {events = 32764, data = {
              ptr = 0x7f27e60e06a0, fd = -435288416, u32 = 3859678880, u64 = 139809340130976}}, {events = 3856339872, data = {ptr = 0x24e89bc800007f27, fd = 32551, u32 = 32551,
              u64 = 2659546863258074919}}, {events = 32764, data = {ptr = 0x3933353120202020, fd = 538976288, u32 = 538976288, u64 = 4121696569082781728}}, {events = 3876909084, data = {
              ptr = 0x7f27, fd = 32551, u32 = 32551, u64 = 32551}}, {events = 0, data = {ptr = 0x100000000, fd = 0, u32 = 0, u64 = 4294967296}}, {events = 3856297972, data = {ptr = 0x7f27,
              fd = 32551, u32 = 32551, u64 = 32551}}, {events = 0, data = {ptr = 0x7f27e5db214a, fd = -438623926, u32 = 3856343370, u64 = 139809336795466}}, {events = 0, data = {
              ptr = 0xe5db194b00000000, fd = 0, u32 = 0, u64 = 16562859864498372608}}, {events = 0, data = {ptr = 0x7f27e5db1710, fd = -438626544, u32 = 3856340752, u64 = 139809336792848}}, {
            events = 15448816, data = {ptr = 0x0, fd = 0, u32 = 0, u64 = 0}}, {events = 0, data = {ptr = 0x7f27e5ea124d, fd = -437644723, u32 = 3857322573, u64 = 139809337774669}}, {
            events = 619224008, data = {ptr = 0xa1f04000007ffc, fd = 32764, u32 = 32764, u64 = 45581628919021564}}, {events = 0, data = {ptr = 0x414e50 <cipher_compare>, fd = 4279888, u32 = 4279888,
              u64 = 4279888}}, {events = 3856291637, data = {ptr = 0xeaf94c00007f27, fd = 32551, u32 = 32551, u64 = 66139249363156775}}, {events = 0, data = {ptr = 0xebbaf0, fd = 15448816,
              u32 = 15448816, u64 = 15448816}}, {events = 3857322573, data = {ptr = 0xe5e22ec200007f27, fd = 32551, u32 = 32551, u64 = 16564853790180671271}}, {events = 32551, data = {ptr = 0x0,
              fd = 0, u32 = 0, u64 = 0}}, {events = 3849811, data = {ptr = 0x546d654d00000000, fd = 0, u32 = 0, u64 = 6083630053034295296}}, {events = 1818326127, data = {ptr = 0x202020202020203a,
              fd = 538976314, u32 = 538976314, u64 = 2314885530818453562}}, {events = 959657265, data = {ptr = 0xa426b2034343239, fd = 875835961, u32 = 875835961, u64 = 739271074901144121}}, {
            events = 0, data = {ptr = 0x0, fd = 0, u32 = 0, u64 = 0}} <repeats 183 times>, {events = 0, data = {ptr = 0xe6f389be00000000, fd = 0, u32 = 0, u64 = 16641796497200906240}}, {
            events = 32551, data = {ptr = 0x0, fd = 0, u32 = 0, u64 = 0}}, {events = 0, data = {ptr = 0x0, fd = 0, u32 = 0, u64 = 0}}, {events = 0, data = {ptr = 0x0, fd = 0, u32 = 0, u64 = 0}}, {
            events = 0, data = {ptr = 0x0, fd = 0, u32 = 0, u64 = 0}}, {events = 0, data = {ptr = 0x0, fd = 0, u32 = 0, u64 = 0}}, {events = 0, data = {ptr = 0x1a100000000000, fd = 0, u32 = 0,
              u64 = 7335941580521472}}, {events = 0, data = {ptr = 0x1a0edc, fd = 1707740, u32 = 1707740, u64 = 1707740}}, {events = 1707740, data = {ptr = 0x0, fd = 0, u32 = 0, u64 = 0}}, {
            events = 0, data = {ptr = 0x5, fd = 5, u32 = 5, u64 = 5}}, {events = 3805184, data = {ptr = 0x3a700000000000, fd = 0, u32 = 0, u64 = 16448693951528960}}, {events = 0, data = {
              ptr = 0x3a6738, fd = 3827512, u32 = 3827512, u64 = 3827512}}, {events = 3844640, data = {ptr = 0xe6f41c5400000000, fd = 0, u32 = 0, u64 = 16641957670143655936}}, {events = 32551,
            data = {ptr = 0x3, fd = 3, u32 = 3, u64 = 3}}, {events = 3855951164, data = {ptr = 0x7f27, fd = 32551, u32 = 32551, u64 = 32551}}, {events = 0, data = {ptr = 0x7f27e6f3a18a,
              fd = -420241014, u32 = 3874726282, u64 = 139809355178378}}, {events = 96848, data = {ptr = 0x0, fd = 0, u32 = 0, u64 = 0}}, {events = 0, data = {ptr = 0x5, fd = 5, u32 = 5, u64 = 5}}, {
            events = 2191360, data = {ptr = 0x100000000, fd = 0, u32 = 0, u64 = 4294967296}}, {events = 0, data = {ptr = 0x2182c0, fd = 2196160, u32 = 2196160, u64 = 2196160}}, {events = 2213008,
            data = {ptr = 0x1700000000000, fd = 0, u32 = 0, u64 = 404620279021568}}, {events = 0, data = {ptr = 0x3, fd = 3, u32 = 3, u64 = 3}}, {events = 0, data = {ptr = 0x10000000000000, fd = 0,
              u32 = 0, u64 = 4503599627370496}}, {events = 0, data = {ptr = 0xff78c, fd = 1046412, u32 = 1046412, u64 = 1046412}}, {events = 1046412, data = {ptr = 0xe6f384ec00000000, fd = 0,
              u32 = 0, u64 = 16641791197211262976}}, {events = 32551, data = {ptr = 0x5, fd = 5, u32 = 5, u64 = 5}}, {events = 3141632, data = {ptr = 0x30100000000000, fd = 0, u32 = 0,
              u64 = 13528391068155904}}, {events = 0, data = {ptr = 0x30010c, fd = 3145996, u32 = 3145996, u64 = 3145996}}, {events = 3146072, data = {ptr = 0xff00000000000, fd = 0, u32 = 0,
              u64 = 4486007441326080}}, {events = 0, data = {ptr = 0x3, fd = 3, u32 = 3, u64 = 3}}, {events = 0, data = {ptr = 0x1600000000000, fd = 0, u32 = 0, u64 = 387028092977152}}, {events = 0,
            data = {ptr = 0x20, fd = 32, u32 = 32, u64 = 32}}, {events = 47, data = {ptr = 0xe6f3ccc900000000, fd = 0, u32 = 0, u64 = 16641870211724607488}}, {events = 32551, data = {ptr = 0x5,
              fd = 5, u32 = 5, u64 = 5}}, {events = 3876873416, data = {ptr = 0x2000007f27, fd = 32551, u32 = 32551, u64 = 137438986023}}, {events = 0, data = {ptr = 0x403bdf, fd = 4209631,
              u32 = 4209631, u64 = 4209631}}, {events = 47, data = {ptr = 0xa00000001, fd = 1, u32 = 1, u64 = 42949672961}}, {events = 0, data = {ptr = 0x3, fd = 3, u32 = 3, u64 = 3}}, {events = 1,
            data = {ptr = 0x24e8a7e000000000, fd = 0, u32 = 0, u64 = 2659560160476790784}}, {events = 32764, data = {ptr = 0x7f27e6f3c959, fd = -420230823, u32 = 3874736473, u64 = 139809355188569}},
          {events = 0, data = {ptr = 0xe71464e800000000, fd = 0, u32 = 0, u64 = 16651044669890756608}}, {events = 32551, data = {ptr = 0x7ffc24e8a7e0, fd = 619227104, u32 = 619227104,
              u64 = 140720927713248}}, {events = 3874757716, data = {ptr = 0x24e8a81000007f27, fd = 32551, u32 = 32551, u64 = 2659560366635253543}}, {events = 32764, data = {ptr = 0x7f27e6d2e8cf,
              fd = -422385457, u32 = 3872581839, u64 = 139809353033935}}...}
        pe = 0x7ffc24e88e78
        e = {events = 0, data = {ptr = 0x0, fd = 0, u32 = 0, u64 = 0}}
        real_timeout = 18
        q = 0xa42e388
        w = 0x8415488
        sigset = {__val = {0 <repeats 16 times>}}
        psigset = 0x0
        base = 47199198
        have_signals = 0
        nevents = 2
        count = 48
        nfds = 3
        fd = 2075
        op = 619232960
        i = 2
        __PRETTY_FUNCTION__ = <error reading variable __PRETTY_FUNCTION__ (Cannot access memory at address 0x7404cb)>
#8  0x000000000069d8e2 in uv_run (loop=0xe3abc0, mode=UV_RUN_DEFAULT) at ../deps/libuv/src/unix/core.c:370
        timeout = 18
        r = 1
        ran_pending = 0
#9  0x0000000000583b3a in DepLibUV::RunLoop () at ../src/DepLibUV.cpp:53
        __FUNCTION__ = <error reading variable __FUNCTION__ (Cannot access memory at address 0x727f8a)>
#10 0x000000000058db87 in Worker::Worker (this=0x7ffc24e8bfa0, channel=0xe3afb0) at ../src/Worker.cpp:35
No locals.
#11 0x00000000006728e2 in main (argc=16, argv=0x7ffc24e8c1a8) at ../src/main.cpp:91
        worker = {<SignalsHandler::Listener> = {_vptr.Listener = 0x729510 <vtable for Worker+16>}, <Channel::UnixStreamSocket::Listener> = {
            _vptr.Listener = 0x729550 <vtable for Worker+80>}, <RTC::Router::Listener> = {_vptr.Listener = 0x729570 <vtable for Worker+112>}, channel = 0xe3afb0, notifier = 0xec65e0,
          signalsHandler = 0xec5920, closed = false, routers = {
            _M_h = {<std::__detail::_Hashtable_base<unsigned int, std::pair<unsigned int const, RTC::Router*>, std::__detail::_Select1st, std::equal_to<unsigned int>, std::hash<unsigned int>, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Hashtable_traits<false, false, true> >> = {<std::__detail::_Hash_code_base<unsigned int, std::pair<unsigned int const, RTC::Router*>, std::__detail::_Select1st, std::hash<unsigned int>, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, false>> = {<std::__detail::_Hashtable_ebo_helper<0, std::__detail::_Select1st, true>> = {<std::__detail::_Select1st> = {<No data fields>}, <No data fields>}, <std::__detail::_Hashtable_ebo_helper<1, std::hash<unsigned int>, true>> = {<std::hash<unsigned int>> = {<std::__hash_base<unsigned long, unsigned int>> = {<No data fields>}, <No data fields>}, <No data fields>}, <std::__detail::_Hashtable_ebo_helper<2, std::__detail::_Mod_range_hashing, true>> = {<std::__detail::_Mod_range_hashing> = {<No data fields>}, <No data fields>}, <No data fields>}, <std::__detail::_Hashtable_ebo_helper<0, std::equal_to<unsigned int>, true>> = {<std::equal_to<unsigned int>> = {<std::binary_function<unsigned int, unsigned int, bool>> = {<No data fields>}, <No data fields>}, <No data fields>}, <No data fields>}, <std::__detail::_Map_base<unsigned int, std::pair<unsigned int const, RTC::Router*>, std::allocator<std::pair<unsigned int const, RTC::Router*> >, std::__detail::_Select1st, std::equal_to<unsigned int>, std::hash<unsigned int>, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<false, false, true>, true>> = {<No data fields>}, <std::__detail::_Insert<unsigned int, st---Type <return> to continue, or q <return> to quit---
d::pair<unsigned int const, RTC::Router*>, std::allocator<std::pair<unsigned int const, RTC::Router*> >, std::__detail::_Select1st, std::equal_to<unsigned int>, std::hash<unsigned int>, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<false, false, true>, false, true>> = {<std::__detail::_Insert_base<unsigned int, std::pair<unsigned int const, RTC::Router*>, std::allocator<std::pair<unsigned int const, RTC::Router*> >, std::__detail::_Select1st, std::equal_to<unsigned int>, std::hash<unsigned int>, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<false, false, true> >> = {<No data fields>}, <No data fields>}, <std::__detail::_Rehash_base<unsigned int, std::pair<unsigned int const, RTC::Router*>, std::allocator<std::pair<unsigned int const, RTC::Router*> >, std::__detail::_Select1st, std::equal_to<unsigned int>, std::hash<unsigned int>, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<false, false, true> >> = {<No data fields>}, <std::__detail::_Equality<unsigned int, std::pair<unsigned int const, RTC::Router*>, std::allocator<std::pair<unsigned int const, RTC::Router*> >, std::__detail::_Select1st, std::equal_to<unsigned int>, std::hash<unsigned int>, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<false, false, true>, true>> = {<No data fields>}, <std::__detail::_Hashtable_alloc<std::allocator<std::__detail::_Hash_node<std::pair<unsigned int const, RTC::Router*>, false> > >> = {<std::__detail::_Hashtable_ebo_helper<0, std::allocator<std::__detail::_Hash_node<std::pair<unsigned int const, RTC::Router*>, false> >, true>> = {<std::allocator<std::__detail::_Hash_node<std::pair<unsigned int const, 
RTC::Router*>, false> >> = {<__gnu_cxx::new_allocator<std::__detail::_Hash_node<std::pair<unsigned int const, RTC::Router*>, false> >> = {<No data fields>}, <No data fields>}, <No data fields>}, <No data fields>}, _M_buckets = 0x11c5460, _M_bucket_count = 23, _M_before_begin = {_M_nxt = 0xafdca20}, _M_element_count = 7, _M_rehash_policy = {static _S_growth_factor = 2, _M_max_load_factor = 1,
                _M_next_resize = 23}, _M_single_bucket = 0x67efc0 <Json::Value::~Value()>}}}
        id = {static npos = <optimized out>, _M_dataplus = {<std::allocator<char>> = {<__gnu_cxx::new_allocator<char>> = {<No data fields>}, <No data fields>}, _M_p = 0xe3ab78 "uraqakfd#1"}}
        channelFd = 3
        channel = 0xe3afb0
        __FUNCTION__ = <error reading variable __FUNCTION__ (Cannot access memory at address 0x73b2b0)>
@ibc
Member

ibc commented Jan 8, 2019

Hummm, it crashes even with the refactor of the libuv C++ handles that we did recently...

OK, we'll try to fix it, although it's hard (I tried to reproduce it in the past and it was impossible). That said, I recommend not using the TCP transport. It's mostly useless in WebRTC. It's much better to just enable UDP in mediasoup and then use a TURN server that supports both UDP and TLS.

@ibc
Member

ibc commented Jan 22, 2019

@artushin have you seen this again with the latest version? (just wondering if recent changes have also fixed this strange issue).

@artushin
Author

Yep, still seeing it, adding some instrumentation to see if we can find the issue. I'll let you know if we come up with anything.

@ibc
Member

ibc commented Jan 22, 2019

Thanks. It must be something wrong in the TcpServer or TcpConnection class, but I've reviewed them and found nothing...

@mariat-atg

Hi @ibc, @artushin's coworker here. I added a little bit of instrumentation in TcpConnection to investigate this. I could not find the cause or figure out a 100% solid repro. Here is what I know though:

  1. The error happens in TcpConnection::OnUvWriteError() followed by TcpConnection::Close(), and it is caused by a somehow invalid TcpConnection::uvHandle. From there it is either a SIGSEGV from an invalid memory access, or some assertion in libuv's core.c fires and aborts the worker. So next I tried to narrow down where the handle becomes corrupted.

  2. In https://github.com/versatica/mediasoup/blob/master/worker/deps/libuv/src/unix/stream.c#L976 req->cb(req, req->error) is called with req->error != 0. I checked that the condition (req->handle == stream) stays true inside uv__write_callbacks().

  3. Next, I put a check into TcpConnection::onWrite() to compare the handle pointers from the incoming req and from the casted TcpConnection. Here I get a mismatch 100% of the time the bug occurs. The code looks like this, inserted before the std::free() call, i.e. right after https://github.com/versatica/mediasoup/blob/master/worker/src/handles/TcpConnection.cpp#L27:

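// Compare the handle stored in the write request with the one owned by the TcpConnection.
bool connectionClosed = connection->IsClosed();
auto* casted = reinterpret_cast<uv_stream_t*>(connection->GetUvHandle());
if (status != 0 && !connectionClosed && casted != req->handle) {
  // Pointers are printed with %p (PRIx64 expects a 64-bit integer, not a pointer).
  MS_ERROR("onWrite() uvHandle mismatch: err=%d req=%p connection=%p",
    status, static_cast<void*>(req->handle), static_cast<void*>(casted));
}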
  4. I cannot pinpoint this issue to any specific scenario, but the status value is always -32, -110 or -104, there was an active session with at least two participants, and most of the time a viewer had left the session in a normal fashion not long before the crash.

This is all the solid data I have for now, and unfortunately we still see quite a few of these crashes in production. If you can suggest how we can research this further, please do. Thanks!
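
For readers decoding the status values reported above: on Unix, libuv error codes are the negated errno values, so on Linux -32, -110 and -104 correspond to EPIPE, ETIMEDOUT and ECONNRESET. A minimal sketch of that mapping follows. `StatusName` is a hypothetical helper written for this note, not part of mediasoup or libuv; in real code libuv's own `uv_err_name()` does this job.

```cpp
#include <cerrno>
#include <string>

// Hypothetical helper: map a negative libuv-style status to the name of the
// errno value it negates. Only the three codes seen in this issue are covered.
std::string StatusName(int status) {
  switch (-status) {
    case EPIPE:      return "EPIPE";      // 32 on Linux: peer closed, write on broken pipe
    case ETIMEDOUT:  return "ETIMEDOUT";  // 110 on Linux: connection timed out
    case ECONNRESET: return "ECONNRESET"; // 104 on Linux: connection reset by peer
    default:         return "other";
  }
}
```

All three are errors you would expect on a TCP connection whose peer has gone away, which fits the observation that a viewer had usually left shortly before the crash.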

@ibc
Member

ibc commented Jan 25, 2019

Hi @mariat-atg, amazing check. So many thanks. It would be so nice if we had a solid way to reproduce the crash, although I understand it's not easy at all. I'll investigate it next week. Thanks again.

@ibc
Member

ibc commented Jan 25, 2019

mmm, just some ideas coming to my mind (must elaborate them better)

  • Imagine that there is pending TCP data in the libuv loop to be written (to be sent to the browser).
  • However, before it's sent we call delete transport.
  • The TcpConnection destructor will first call its Close() method, which will call uv_read_stop() followed by uv_shutdown() or uv_close() to destroy (gracefully or not) the UV handle.
  • However, the way libuv works is that the UV handle won't be destroyed at that moment, but later within the uv_close_cb callback (our onClose function).
  • So, even though the UV handle closure will complete later, we have already called delete transport.
  • And, since there was pending TCP data to be sent, it may happen that uv_write_cb (so our onWrite function) is called later, and when it tries to get the associated TcpConnection instance (in line 27) that TcpConnection has already been deleted!!! Hence the SIGSEGV.

Does it make any sense??
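
The sequence described above, together with the kind of guard named in the later commit message ("verify in libuv static callbacks that the associated C++ instance has not been deallocated"), can be sketched without libuv. This is an illustration under stated assumptions, not mediasoup's actual code: `FakeHandle`, `FakeLoop`, `Connection` and `DemoLateWriteCallback` are hypothetical names, and the fake loop only mimics the fact that libuv runs write and close callbacks later, from the loop.

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical stand-in for a libuv handle: only the user `data` pointer matters here.
struct FakeHandle {
  void* data{ nullptr };
};

// Hypothetical stand-in for the event loop: callbacks run later, not when queued.
struct FakeLoop {
  std::vector<std::function<void()>> pending;
  void Run() {
    while (!pending.empty()) {
      auto batch = std::move(pending);
      pending.clear();
      for (auto& cb : batch) cb();
    }
  }
};

class Connection {
public:
  explicit Connection(FakeLoop& loop) : loop(loop), handle(new FakeHandle) {
    handle->data = this;
  }
  ~Connection() {
    // Detach first, so callbacks that fire later can tell the instance is gone.
    handle->data = nullptr;
    FakeHandle* h = handle;
    // The handle outlives the instance and is freed later, in the "close callback".
    loop.pending.push_back([h] { delete h; });
  }
  // Queue a write whose completion callback will report `status` later.
  void Write(int status) {
    FakeHandle* h = handle;
    loop.pending.push_back([h, status] { OnWrite(h, status); });
  }
  static int writeErrorsSeen;

private:
  // Static callback, analogous to a uv_write_cb.
  static void OnWrite(FakeHandle* h, int status) {
    auto* connection = static_cast<Connection*>(h->data);
    if (connection == nullptr) return; // instance already deleted: do not touch it
    if (status != 0) ++writeErrorsSeen;
  }
  FakeLoop& loop;
  FakeHandle* handle;
};
int Connection::writeErrorsSeen = 0;

// Replays the scenario and returns how many write errors reached an instance.
int DemoLateWriteCallback() {
  Connection::writeErrorsSeen = 0;
  FakeLoop loop;
  auto* connection = new Connection(loop);
  connection->Write(-110); // a write that will "fail" later
  delete connection;       // deleted before the loop runs the write callback
  loop.Run();              // late OnWrite sees data == nullptr and returns
  return Connection::writeErrorsSeen;
}
```

Without the `nullptr` check in `OnWrite`, the callback would dereference a pointer to a deleted `Connection`, which is the use-after-free hypothesized above. The design choice in this sketch is that the handle is heap-allocated separately and freed only from the close callback, so its `data` field stays readable for any callback that fires in between.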

@ibc
Member

ibc commented Jan 25, 2019

I've asked in the libuv mailing list: https://groups.google.com/forum/#!topic/libuv/YdkcPY57sec

ibc added a commit that referenced this issue Jan 25, 2019
@ibc
Member

ibc commented Jan 25, 2019

I've created a branch fix-tcp-crash and added some changes and comments. Please check it (note that it's not finished, check the comments in the crash). If you could test it (by also removing the uv_shutdown() usage as explained in the comments) it would be great.

@mariat-atg

Oh yes, this does make sense! Just one thing: TcpConnection::closed of the deleted object must read as false for us to get into the repro's code path and call TcpConnection::Close() on it one more time. The first call to TcpConnection::Close() sets this flag to true, so we must get "lucky" and have it reset in the deleted object's memory. I'll try to add some tracing.

@mariat-atg

Thank you, we will give it a try.

ibc added a commit that referenced this issue Jan 25, 2019
@ibc ibc closed this as completed in 79b7c4b Jan 25, 2019
@ibc
Member

ibc commented Jan 25, 2019

Hi guys, this should have been fixed in 2.6.8. So many thanks for your help.

I'm pretty sure 79b7c4b fixes the problem (it makes sense as explained above) so I've tested it locally and released 2.6.8. Please upgrade your versions and tell me that it no longer crashes :)

@artushin
Author

Looking great so far @ibc! Not a single crash in 48 hours since the upgrade.

@ibc
Member

ibc commented Jan 28, 2019

I love those issues that have a proper explanation :)
So thanks a lot for reporting it and giving us the key to the bug.

@ibc
Member

ibc commented Jan 28, 2019

Hi @artushin, I've seen this commit in your fork (-fstack-protector-all):

LivelyVideo@7795ca5

I've read about it and it makes sense given that mediasoup does receive input from outside. Does it affect performance or anything? Is it supposed to work with both GCC and Clang on Linux and OSX? So do you recommend adding it?

And wouldn't it be better to use -fstack-protector-strong? https://pagure.io/fesco/issue/1128

BTW: GitHub should provide some way to make it possible for developers to communicate :)

@artushin
Author

That was added by @mariat-atg to a branch we were using for debugging. She might be able to tell you more, but my understanding is that it's implemented purely for security concerns and didn't end up being relevant to this issue. I can't really tell you how much it impacted performance as we didn't profile with and without it.

I think you might have my email address from the google group in case you want to reach out directly.

@ibc
Member

ibc commented Jan 28, 2019

Clear, thanks. Since we do extensive fuzz testing in mediasoup-worker, I think we do not have "stack overflow" issues (but who knows).

@mariat-atg

Yep, enabling the stack protector flag is not relevant to this problem, please ignore it. While testing there were no noticeable perf changes (I happened to run some profiling), but I didn't see any benefits either.

lavarsicious pushed a commit to lavarsicious/mediasoup that referenced this issue Feb 5, 2019
* C++: verify in libuv static callbacks that the associated C++ instance has not been deallocated (should fix versatica#258)