Fix for dropped connections due to next_inactivity_timeout_at==0 and inactivity_timeout_in!=0 #1675
…inactivity_timeout_in!=0 Under certain conditions, read_disable or write_disable can set vc->next_inactivity_timeout_at to 0 while inactivity_timeout_in still has a value. In manage_active_queue this leads to the vc being closed, which interrupts service for the client. From reading the source code, the best fix seems to be to check for next_inactivity_timeout_at!=0 in manage_active_queue. The value 0 does not seem to be used to force an early close/cleanup; where that is needed, the current time is used instead. The active timeout does not have this problem: when it is set to 0, active_timeout_in is also set to 0. The same cannot be done for the inactivity timeout because the net_activity call needs inactivity_timeout_in.
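As a rough illustration of the guard described above, here is a minimal, self-contained sketch. It is not the ATS code: `MockVC`, `hrtime`, and both functions are hypothetical stand-ins, and the shape of the unguarded comparison is an assumption about how a zero deadline ends up looking expired.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for a high-resolution time type; plain integer ticks here.
using hrtime = std::int64_t;

// Hypothetical, stripped-down model of the two fields discussed above.
struct MockVC {
  hrtime inactivity_timeout_in      = 0; // configured interval, 0 = none
  hrtime next_inactivity_timeout_at = 0; // absolute deadline, 0 = not armed
};

// Assumed shape of the problematic check: a zero deadline is always <= now,
// so a connection with a configured interval looks expired.
bool expired_without_guard(const MockVC &vc, hrtime now)
{
  assert(now > 0);
  return vc.inactivity_timeout_in != 0 && vc.next_inactivity_timeout_at <= now;
}

// Proposed check: treat a zero deadline as "not armed" rather than "long past".
bool expired_with_guard(const MockVC &vc, hrtime now)
{
  assert(now > 0);
  return vc.inactivity_timeout_in != 0 &&
         vc.next_inactivity_timeout_at != 0 &&
         vc.next_inactivity_timeout_at <= now;
}
```

With `inactivity_timeout_in` set and the deadline zeroed by a disable call, only the unguarded version reports the connection as expired; a genuinely passed deadline is still caught by both.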
|
The commit message is missing the note about which commit on master this was cherry-picked from. Please use the |
|
How does this apply to master now? Is this an issue there? I saw your comment on my other Issue, maybe this is now a candidate for 7.1.x as well ? |
|
@zwoop this applies to 7.1.x as well. It can also be reliably applied to master. However, on master the 425b696 commit mitigates the behavior by calling vc->set_inactivity_timeout(0); instead of setting vc->next_inactivity_timeout_at = 0; 425b696 causes other problems though, as you've found out. Do I need to do anything to get it ready for master/7.1? |
|
In the master branch,
Therefore, the patch for 7.0.x and later is below, @keesspoelstra Thanks for helping us figure it out. @zwoop |
|
@oknet , seems reasonable, but I don't know if reverting back to default inactivity timeout is the way to go, in the old code To make it more explicit I would opt for an extra function pause_inactivity_timeout or something like that. |
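One possible reading of the "pause" idea, as a hedged sketch only: `PausableVC`, the `paused` flag, and all three functions are invented for illustration and do not exist in ATS. The point is that pausing disarms the deadline while leaving `inactivity_timeout_in` untouched, so resuming restores the original interval instead of a default.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for a high-resolution time type; plain integer ticks here.
using hrtime = std::int64_t;

// Hypothetical model; "paused" is an invented flag, not an ATS field.
struct PausableVC {
  hrtime inactivity_timeout_in      = 0; // configured interval
  hrtime next_inactivity_timeout_at = 0; // absolute deadline, 0 = not armed
  bool   paused                     = false;
};

// Disarm the deadline but keep the configured interval for later.
void pause_inactivity_timeout(PausableVC &vc)
{
  vc.paused                     = true;
  vc.next_inactivity_timeout_at = 0;
}

// Re-arm from the preserved interval, measured from the resume time.
void resume_inactivity_timeout(PausableVC &vc, hrtime now)
{
  assert(now > 0);
  vc.paused = false;
  if (vc.inactivity_timeout_in != 0) {
    vc.next_inactivity_timeout_at = now + vc.inactivity_timeout_in;
  }
}

// A paused or unarmed connection is never reported as expired.
bool inactivity_expired(const PausableVC &vc, hrtime now)
{
  return !vc.paused && vc.next_inactivity_timeout_at != 0 &&
         vc.next_inactivity_timeout_at <= now;
}
```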
|
@keesspoelstra in the old InactivityCop::check_inactivity() code, it will set next_inactivity_timeout_at to default if it is 0. |
|
"Set next_inactivity_timeout_at to default" is a safety mechanism that ensures the netvc is closed within an expected time. manage_active_queue() tries to close any netvc that has timed out. |
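The safety mechanism described here can be modelled roughly as follows. This is a simplified sketch, not the actual InactivityCop::check_inactivity() code: `CopVC`, `CopAction`, and the free function `check_inactivity` are hypothetical, and the `default_timeout` parameter stands in for whatever default the server is configured with.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for a high-resolution time type; plain integer ticks here.
using hrtime = std::int64_t;

// Hypothetical, stripped-down connection model.
struct CopVC {
  hrtime inactivity_timeout_in      = 0; // configured interval
  hrtime next_inactivity_timeout_at = 0; // absolute deadline, 0 = not armed
};

enum class CopAction { None, Rearmed, TimedOut };

// One pass of a simplified periodic inactivity check for one connection.
CopAction check_inactivity(CopVC &vc, hrtime now, hrtime default_timeout)
{
  assert(default_timeout > 0);
  if (vc.next_inactivity_timeout_at == 0) {
    // Safety net: an unarmed connection gets the default deadline, so it
    // cannot stay open forever.
    vc.next_inactivity_timeout_at = now + default_timeout;
    return CopAction::Rearmed;
  }
  if (vc.next_inactivity_timeout_at <= now) {
    return CopAction::TimedOut;
  }
  return CopAction::None;
}
```

In this model a deadline that was zeroed is re-armed on the next pass, which is also why zeroing it alone does not keep a timeout cancelled for long.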
|
Ok, now we're on to something. Because the VCs return from disable very quickly (we hope so), the inactivity cop does not see them often and does not set the default, so it does not lengthen or shorten the inactivity timeout. But by setting inactivity_timeout_in explicitly to the default when we're disabling, and in other places, we're making matters worse because we're introducing longer (or shorter) timeouts in some cases. I don't have access to the source code, but I can imagine SSL handshakes being vulnerable to this. |
|
What's the consensus here? Note that I had to revert @oknet's patch for inactivity timeout on 7.1.x, we likely need to do it on master too unless we can resolve the performance issues (they are severe). How does this PR apply to master? I'm confused with the fact that this is being developed on the 6.2.x branch. |
|
If you reverted, you will need this patch. @oknet, agree? Without it, roughly once every few thousand to 100,000 requests a connection can be dropped, giving partial content, depending on plugins and the timing of the system. It was developed against 6.2.x because at the time it was only applicable to 6.2.x and earlier. What is the procedure to land this on master or 7.1.x? |
|
@keesspoelstra I agree with you. @zwoop please apply this PR if you have reverted #771. ATS may close a netvc in the active queue if add_to_active_queue() is called. |
|
Without #771, cancel_inactivity_timeout() is useless because InactivityCop::check_inactivity() will reset next_inactivity_timeout_at to the default value within the next second. Code from the 7.1.x branch: Set next_inactivity_timeout_at by set_inactivity_timeout() if it is 0. |
|
[approve ci] |
|
Where are we with this PR? Do we need this on master and/or 7.1.x as well? Or is it already in either (or both) ? |
|
@zwoop I think this is only for 6.2.x. |
|
This PR seems a little confused. It seems like a legitimate bug, but this is not set up as a backport of an upstream fix. I'd like to see that fixed so I can merge, but I cannot merge this as is. I am going to close it, but please reopen it if it can be fixed, or open a new one as a proper backport of an upstream fix. |