tls_mgm: don't free anything in mod_destroy() #3269
@jes, although I agree with the issue, I do not agree with the solution. You cannot destroy infrastructure resources before giving the modules a chance to shut down properly. For example, a module may require a TCP connection for its shutdown. This is a theoretical example: I'm not sure whether such a module exists, or even whether the TCP layer can actually be used during the shutdown sequence. But let's check this first.
OK, good point. Then what about making If you think that might be OK then I'll update the PR.
Yeah, indeed, it is a "bit" broken for the tls_mgm module to trash its data while that data is still in use by ongoing connections. So a quick workaround here would be for the tls_mgm module NOT to free anything that may be used by the connections.
This fixes a possible double-free during shutdown. The issue was that the `tls_mgm` module was unloaded (freeing all of its "domains") before the connections were destroyed. This left connections with dangling pointers to the domains, and when the connections were cleaned up, if the reference count was 0, `tls_mgm` could then try to free the domain again, causing a crash.
Force-pushed: 5e78cb7 to f1838f2
@bogdan-iancu thanks, I've force-pushed a commit that does that, and I'll change the title of the pull request.
@jes, after revisiting this issue a bit and brainstorming with @liviuchircu and @razvancrainea, we came to the conclusion that the code, as it is right now, should work OK.
@jes, there was a report similar to yours, see #3338. So, trying to see how to investigate this further (as per my previous comment, the freeing should be controlled by refcnt, so it should be OK): can you somehow (even if randomly) reproduce this crash? I could create a quick patch for troubleshooting those refcnts.
The easiest way for me to reproduce it was to put a deliberate segfault in I could be wrong, but if you have TLS domains set up I don't think there is any way to get OpenSIPS to try to shut down without triggering the double free. But maybe it only gets detected if you are using the debug allocator?
Summary

We were observing a crash on shutdown caused by a double-free of the TLS "domains" in `tls_mgm`. Not massively important because it only happens on shutdown, but still worth fixing.

Details

The issue was that the `tls_mgm` module was unloaded (free'ing all of its "domains") before the connections were destroyed. This left connections with dangling pointers to the domains, and when the connections are cleaned up, if the reference count is 0, `tls_mgm` can then try to free the domain again, causing a crash.

Solution

The solution is to defer `destroy_modules()` until after `tcp_destroy()`.

Compatibility

It is possible that there are other related bugs that could be solved in the same way (i.e. that `destroy_modules()` should move even further down) but I've tried to be conservative.

Closing issues