Tcp connection worker crashes #258
Comments
Hmm, it crashes even with the refactor of the libuv C++ handles that we did recently... OK, we'll try to fix it, but it's hard (I tried to reproduce it in the past and it was impossible). That said, I recommend not using the TCP transport. It's mostly useless in WebRTC. It's much better to just enable UDP in mediasoup and then use a TURN server that supports both UDP and TLS.
@artushin have you seen this again with the latest version? (Just wondering if recent changes have also fixed this strange issue.)
Yep, still seeing it. We're adding some instrumentation to see if we can find the issue. I'll let you know if we come up with anything.
Thanks. It must be something wrong in some TcpServer or TcpConnection class, but I've reviewed them and found nothing...
Hi @ibc, @artushin's coworker here. I added a little bit of instrumentation in TcpConnection to investigate this. I could not find the cause or figure out a 100% solid repro. Here is what I know though:
This is all the solid data I have for now, and unfortunately we still see quite a few of these in production. If you can suggest how we can research this further, please do. Thanks!
Hi @mariat-atg, amazing check. So many thanks. It would be so nice if we had a solid way to reproduce the crash, although I understand it's not easy at all. I'll investigate it next week. Thanks again.
mmm, just some ideas coming to my mind (must elaborate them better)
Does it make any sense??
I've asked in the libuv mailing list: https://groups.google.com/forum/#!topic/libuv/YdkcPY57sec
I've created a branch fix-tcp-crash and added some changes and comments. Please check it (note that it's not finished, check the comments in the crash). If you could test it (by also removing the
Oh yes, this does make sense! Just one thing is that
Thank you, we will give it a try.
…e has not been deallocated (should fix #258)
Hi guys, this should be fixed in 2.6.8. So many thanks for your help. I'm pretty sure 79b7c4b fixes the problem (it makes sense as explained above), so I've tested it locally and released 2.6.8. Please upgrade your versions and tell me that it no longer crashes :)
Looking great so far @ibc! Not a single crash in the 48 hours since the upgrade.
I love those issues that have a proper explanation :)
Hi @artushin, I've seen this commit in your fork ( I've read about it and it makes sense given that mediasoup does receive input from outside. Does it affect performance or anything? Is it supposed to work in both GCC and Clang in Linux and OSX? So you recommend adding it? And wouldn't it be better to use BTW: GitHub should provide some way to make it possible for developers to communicate :)
That was added by @mariat-atg to a branch we were using for debugging. She might be able to tell you more, but my understanding is that it's implemented purely for security concerns and didn't end up being relevant to this issue. I can't really tell you how much it impacted performance as we didn't profile with and without it. I think you might have my email address from the Google group in case you want to reach out directly.
Clear, thanks. Since we do extensive fuzz testing in mediasoup-worker, I think we do not have "stack overflow" issues (but who knows).
Yep, enabling the stack protector flag is not relevant to this problem, please ignore it. While testing there were no noticeable perf changes (I happened to run some profiling), but I didn't see any benefits either.
* C++: verify in libuv static callbacks that the associated C++ instance has not been deallocated (should fix versatica#258)
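To illustrate the idea named in that commit title, here is a minimal sketch of a static callback that checks whether its C++ instance still exists before touching it. It does not use libuv and is not mediasoup's actual code: `FakeHandle`, `Connection`, `OnReadCallback` and `RunDemo` are hypothetical names, and clearing the handle's `data` pointer in the destructor is just one assumed way to implement such a check.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a libuv handle: like uv_handle_t, it carries a
// user-owned `data` pointer and may outlive the C++ object that uses it.
struct FakeHandle
{
	void* data{ nullptr };
};

// Hypothetical connection class (not mediasoup's real TcpConnection).
class Connection
{
public:
	explicit Connection(FakeHandle* handle) : handle(handle)
	{
		// Let static callbacks find this instance through the handle.
		this->handle->data = this;
	}

	~Connection()
	{
		// Detach from the handle so late callbacks can tell the instance is gone.
		this->handle->data = nullptr;
	}

	Connection(const Connection&) = delete;
	Connection& operator=(const Connection&) = delete;

	void OnRead(const std::string& chunk)
	{
		this->received.push_back(chunk);
	}

	std::size_t ReceivedCount() const
	{
		return this->received.size();
	}

private:
	FakeHandle* handle;
	std::vector<std::string> received;
};

// Static callback in the style of a libuv read callback. Returns true if the
// event was delivered, false if it was dropped because the instance is gone.
bool OnReadCallback(FakeHandle* handle, const std::string& chunk)
{
	auto* connection = static_cast<Connection*>(handle->data);

	if (connection == nullptr)
		return false;

	connection->OnRead(chunk);

	return true;
}

// Simulates one read while the instance is alive and one after it has been
// destroyed. Returns the number of callbacks that were delivered.
int RunDemo()
{
	FakeHandle handle;
	int delivered = 0;

	{
		Connection connection(&handle);

		if (OnReadCallback(&handle, "hello"))
			delivered++;
	}

	// The handle outlives the connection, as a handle that is still closing could.
	assert(handle.data == nullptr);

	if (OnReadCallback(&handle, "late data"))
		delivered++;

	return delivered;
}
```

In this sketch the second callback is dropped instead of dereferencing a destroyed object, so `RunDemo()` counts only the first read as delivered.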
Bug Report
The worker is crashing in production. Seeing the following backtrace (sorry, not from a debug build)
Your environment
Operating system: Debian GNU/Linux 8 (jessie)
Node version: v8.12.0
npm version: 6.4.1
gcc/clang version: gcc 4.9.2
mediasoup version: 2.6.3
Can't tell, sorry
Issue description