-
Notifications
You must be signed in to change notification settings - Fork 5.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
server: fix show problem for kill tidb connection (#24031) #29212
Conversation
[REVIEW NOTIFICATION] This pull request has been approved by:
To complete the pull request process, please ask the reviewers in the list to review by filling The full list of commands accepted by this bot can be found here. Reviewer can indicate their review by submitting an approval review. |
I changed 2 fields in process info, one is 'Info' shown as 'this session is being killed', another is 'Command' shown like 'Kill', since 'Kill' in Command field has been deprecated(https://dev.mysql.com/doc/refman/5.7/en/mysql-nutshell.html), I am not sure whether we should change this filed. |
Feel free to give me your commends @morgo @tiancaiamao THX! |
server/server.go
Outdated
@@ -685,6 +685,8 @@ func (s *Server) getTLSConfig() *tls.Config { | |||
func killConn(conn *clientConn) { | |||
sessVars := conn.ctx.GetSessionVars() | |||
atomic.StoreUint32(&sessVars.Killed, 1) | |||
// 'Kill' status can be showed in Command/Info field when show processlist | |||
conn.ctx.SetProcessInfo("this session is being killed", time.Now(), mysql.ComProcessKill, 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ideally, for MySQL compatibility the Info
column is not overwritten. This helps you see what is being actively killed. The State should also say "Killed" instead of autocommit.
For #24031, there are really two issues:
- Update the status for connections which have been "killed" so that it is more accurate.
- Change the
wait_timeout
to be a loop, so the connection which is being killed can cleanup faster.
This fixes issue (1), but we should also fix (2) to consider #24031 resolved.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thx for your review, and do we need design doc for fix (2)?
/run-check_dev_2 |
1 similar comment
/run-check_dev_2 |
@bb7133 @tiancaiamao Please review, thx. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is some risk that "killed connections" that are not shown could still be consuming a fair amount of memory, and this makes it less transparent to users what the cause is. But it's also an existing problem, since background threads are not well instrumented. It is not typical that a user will have a very high count of killed connections, if we receive bug reports we can advise them to lower the wait_timeout
.
So there is some risk, but overall the change LGTM so I think we should proceed with it.
an integration test was added in https://github.com/pingcap/automated-tests/pull/881 |
/cc @bb7133 @tiancaiamao |
Well, this is a bug if it really happens. If we found such kind of bug or any user report a bug to us, we need to figure out why and fix the bug. Our promise to the user is, he killed a session, then he doesn't see the killed session, it's killed, right? Even the resource is not released immediately, or even the query continue for a while, it will (and should) be done eventually ... If a user observe that the kill doesn't take effect (the query still running for a long time...) then its our bug. Some common situations are: a query eat up the memory, or a query use up tikv coprocessor resource, or a query just takes too long ... the user want to kill it. So my point is, the user doesn't really care about the internal state or willing to know why, display "the killing is still in process" is not what he expected ... and we deliver what he wants and ... fix bugs |
I've discussed this with everyone: we will approve it for now because the goal is to cherry pick to 5.3. We will work on an additional fix for master to clean up killed connections faster. |
/merge |
This pull request has been accepted and is ready to merge. Commit hash: 7b0eecf
|
@yiwen92: Your PR was out of date, I have automatically updated it for you. At the same time I will also trigger all tests for you: /run-all-tests If the CI test fails, you just re-trigger the test that failed and the bot will merge the PR for you after the CI passes. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the ti-community-infra/tichi repository. |
) (pingcap#29212)" This reverts commit 52c6890.
What problem does this PR solve?
Issue Number: close #24031
Problem Summary:
Customer cannot be noticed with a killed session, which still stay in processlist info shown as normal sessions.
What is changed and how it works?
If a session has been killed by another session, it will not be showed in show processlist query.
Check List
Tests
https://github.com/pingcap/automated-tests/pull/881
Before:
After fix:
Side effects
Documentation
Release note