-
Notifications
You must be signed in to change notification settings - Fork 844
sometimes ua_session comes back empty #4158
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
dmorilha
commented
Aug 23, 2018
``` Program terminated with signal SIGSEGV, Segmentation fault. #0 Http2ConnectionState::release_stream (this=this@entry=0x7f22381730e0, stream=stream@entry=0x7f22381a6b80) at /opt/src/trafficserver/proxy/http2/Http2ConnectionState.cc:1186 1186 ua_session->destroy(); [Current thread is 1 (Thread 0x7f224975c700 (LWP 20146))] (gdb) bt #0 Http2ConnectionState::release_stream (this=this@entry=0x7f22381730e0, stream=stream@entry=0x7f22381a6b80) at /opt/src/trafficserver/proxy/http2/Http2ConnectionState.cc:1186 #1 0x00000000006197a5 in Http2Stream::destroy (this=0x7f22381a6b80) at /opt/src/trafficserver/proxy/http2/Http2Stream.cc:719 #2 0x0000000000612976 in Http2Stream::terminate_if_possible (this=0x7f22381a6b80) at /opt/src/trafficserver/proxy/http2/Http2Stream.cc:384 #3 Http2Stream::transaction_done (this=0x7f22381a6b80) at /opt/src/trafficserver/proxy/http2/Http2Stream.cc:373 #4 0x00000000005bb027 in HttpSM::kill_this (this=this@entry=0x7f2228a21620) at /opt/src/trafficserver/proxy/http/HttpSM.cc:6895 #5 0x00000000005bc1ad in HttpSM::main_handler (this=0x7f2228a21620, event=2301, data=0x7f2228a229c0) at /opt/src/trafficserver/proxy/http/HttpSM.cc:2640 #6 0x0000000000606751 in Continuation::handleEvent (data=0x7f2228a229c0, event=2301, this=<optimized out>) at /opt/src/trafficserver/iocore/eventsystem/I_Continuation.h:152 #7 HttpTunnel::main_handler (this=0x7f2228a229c0, event=<optimized out>, data=<optimized out>) at /opt/src/trafficserver/proxy/http/HttpTunnel.cc:1651 #8 0x0000000000755a6e in Continuation::handleEvent (data=<optimized out>, event=<optimized out>, this=<optimized out>) at /opt/src/trafficserver/iocore/eventsystem/I_Continuation.h:152 #9 CacheVC::calluser (event=<optimized out>, this=<optimized out>) at /opt/src/trafficserver/iocore/cache/P_CacheInternal.h:646 #10 CacheVC::openReadMain (this=0x7f223ece8980) at /opt/src/trafficserver/iocore/cache/CacheRead.cc:712 #11 0x00000000006da481 in Continuation::handleEvent (data=0x0, event=1, this=0x7f223ece8980) at /opt/src/trafficserver/iocore/eventsystem/I_Continuation.h:152 #12 CacheVC::callcont (this=0x7f223ece8980, event=<optimized out>) at /opt/src/trafficserver/iocore/cache/P_CacheInternal.h:665 #13 0x000000000075c1a1 in CacheVC::openReadStartHead (this=0x7f223ece8980, event=<optimized out>, e=0x0) at /opt/src/trafficserver/iocore/cache/CacheRead.cc:1159 #14 0x0000000000731185 in Continuation::handleEvent (data=0x0, event=3900, this=0x7f223ece8980) at /opt/src/trafficserver/iocore/eventsystem/I_Continuation.h:152 #15 CacheVC::handleReadDone (this=0x7f223ece8980, event=<optimized out>, e=<optimized out>) at /opt/src/trafficserver/iocore/cache/Cache.cc:2463 #16 0x00000000006d90f3 in Continuation::handleEvent (data=<optimized out>, event=3900, this=<optimized out>) at /opt/src/trafficserver/iocore/eventsystem/I_Continuation.h:152 #17 AIOCallbackInternal::io_complete (this=<optimized out>, event=<optimized out>, data=<optimized out>) at /opt/src/trafficserver/iocore/aio/P_AIO.h:116 #18 0x0000000000806a55 in Continuation::handleEvent (data=0x7f21252a8dc0, event=1, this=<optimized out>) at /opt/src/trafficserver/iocore/eventsystem/I_Continuation.h:152 #19 EThread::process_event (this=this@entry=0x7f2249c67010, e=e@entry=0x7f21252a8dc0, calling_code=1) at /opt/src/trafficserver/iocore/eventsystem/UnixEThread.cc:140 #20 0x00000000008072b6 in EThread::process_queue (this=this@entry=0x7f2249c67010, NegativeQueue=NegativeQueue@entry=0x7f224975bea0) at /opt/src/trafficserver/iocore/eventsystem/UnixEThread.cc:175 #21 0x00000000008076db in EThread::execute_regular (this=0x7f2249c67010) at /opt/src/trafficserver/iocore/eventsystem/UnixEThread.cc:207 #22 0x00000000008064b5 in spawn_thread_internal (a=0x2c83d70) at /opt/src/trafficserver/iocore/eventsystem/Thread.cc:84 #23 0x00007f224f8ad6ba in start_thread (arg=0x7f224975c700) at pthread_create.c:333 #24 0x00007f224eb3241d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109 (gdb) f 0 #0 Http2ConnectionState::release_stream (this=this@entry=0x7f22381730e0, stream=stream@entry=0x7f22381a6b80) at /opt/src/trafficserver/proxy/http2/Http2ConnectionState.cc:1186 1186 ua_session->destroy(); (gdb) list 1181 1182 if (fini_received && total_client_streams_count == 0) { 1183 // We were shutting down, go ahead and terminate the session 1184 // this is a member of Http2ConnectionState and will be freed 1185 // when ua_session is destroyed 1186 ua_session->destroy(); 1187 1188 // Can't do this because we just destroyed right here ^, 1189 // or we can use a local variable to do it. 1190 // ua_session = nullptr; (gdb) print ua_session $1 = (Http2ClientSession *) 0x0 (gdb) info locals lock = {m = {m_ptr = 0x7f21a5f7c370}, locked_p = true} (gdb) print *lock.m.m_ptr $2 = {<RefCountObj> = {<ForceVFPTToTop> = {_vptr.ForceVFPTToTop = 0x80af70 <vtable for ProxyMutex+16>}, m_refcount = 3}, the_mutex = {__data = {__lock = 1, __count = 0, __owner = 20146, __nusers = 1, __kind = 0, __spins = 0, __elision = 0, __list = {__prev = 0x0, __next = 0x0}}, __size = "\001\000\000\000\000\000\000\000\262N\000\000\001", '\000' <repeats 26 times>, __align = 1}, thread_holding = 0x7f2249c67010, nthread_holding = 3} (gdb) print *this $3 = {<Continuation> = {<force_VFPT_to_top> = {_vptr.force_VFPT_to_top = 0x842b40 <vtable for Http2ConnectionState+16>}, handler = (int (Continuation::*)(Continuation * const, int, void *)) 0x62a000 <Http2ConnectionState::state_closed(int, void*)>, mutex = {m_ptr = 0x0}, link = {<SLink<Continuation>> = {next = 0x0}, prev = 0x0}, control_flags = {raw_flags = 0}}, ua_session = 0x0, local_hpack_handle = 0x7f20fcd0c930, remote_hpack_handle = 0x7f20fcb91260, dependency_tree = 0x7f20fcc73550, server_settings = {settings = {4096, 1, 150, 1048576, 16384, 4294967295}}, client_settings = {settings = {65536, 1, 1000, 6291456, 16384, 4294967295}}, client_rwnd = 15714466, server_rwnd = 1048576, stream_list = {<DLL<Http2Stream, Continuation::Link_link>> = {head = 0x0}, tail = 0x0}, latest_streamid_in = 1, latest_streamid_out = 0, stream_requests = 1, client_streams_in_count = 0, client_streams_out_count = 0, total_client_streams_count = 0, continued_stream_id = 0, continued_buffer = {iov_base = 0x0, iov_len = 0}, _scheduled = false, fini_received = true, recursion = 0, fini_event = 0x0} (gdb) ```
| ua_session->destroy(); | ||
|
|
||
| if (ua_session) { | ||
| ua_session->destroy(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are not seeing this issue in our 7.1.x or in 8.0.x. Makes sense to not destroy if ua_session is nullptr.
| // when ua_session is destroyed | ||
| ua_session->destroy(); | ||
|
|
||
| if (ua_session) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks weird because ua_session is already null-checked in above (line # 1169).
|
I have seem some similar things later on in the shutdown logic where the stack shows null, and it seems that client_vc has been nulled. #4147. In that case the client_vc had been closed without updating the reference to it in the Http2ClientSession object. Not the same thing here, but there are some odd races going on. I agree @masaori335's observation that just doing another null check is unlikely to help, since we have already done the null check in the current line of execution. I've spent this week chasing down a couple race conditions. After they have baked for the weekend, I will create PR's or update existing WIP PRs. |
|
Sounds promising @shinrich |
|
Is this only for 7.1.x ? Is there a corresponding commit on master? |
|
I think Susan ruled this as useless, I still face random crashes with 7.1.x related with closing sessions |
|
Should we close this then ? |
|
done :) |
|
Seems we have multiple threads calling into I think we need this merged unless we figured how to prevent multiple threads calling into this function. |
|
That scinario explains why |
|
@zizhong It looks like all
|
|
@masaori335 Thanks for pointing it out. So what really happened would be much more complicated than that. I know we have more locking preventing the above situation. But this use-after-free beast might happen and bypass these locking, too. |
|
@zizhong Thanks for details. Do you have ASan report of the use-after-free? or something similar? The situations is still bit unclear. I'd like to know the two threads have same stream or not. |
|
BTW, if this changes can save the situation, I'm fine with reopen and land this for workaround fix. |
|
@masaori335 no, I don't have ASAN report on this. The above hypothesis needs to be verified. |