store/tikv: fix CheckStreamTimeoutLoop goroutine leak (#13812) #14227


SunRunAway
Contributor

@SunRunAway SunRunAway commented Dec 25, 2019

Automated cherry pick of #13812 on release-3.0.

Fixes #14207


What problem does this PR solve?

When a TiKV server shuts down, the following goroutine leaks:

4 @ 0x4136c 0x51f74 0xc6af1c 0x70444
#	0xc6af1b	github.com/pingcap/tidb/store/tikv/tikvrpc.CheckStreamTimeoutLoop+0x16b	/Users/zhou/gorepo/src/github.com/pingcap/tidb/store/tikv/tikvrpc/tikvrpc.go:812

In the current code, the CheckStreamTimeoutLoop goroutine exits only when the rpcClient is closed,
but the rpcClient is never closed before the TiDB process exits.

What is changed and how it works?

A CheckStreamTimeoutLoop goroutine is started for each connArray, but it is not stopped when the connArray is closed, which causes the leak.

This PR moves the done channel from rpcClient to connArray, so the goroutine exits when its connArray is closed.

The existing idle-recycle mechanism closes a connArray once its TiKV server is gone, so the goroutine for that store is reclaimed at the same time.
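The ownership change described above can be illustrated with a minimal, self-contained sketch. This is not the code from the PR: the `connArray` type here is a simplified stand-in, and the `newConnArray`, `checkStreamTimeoutLoop` and `Close` names, the `interval` parameter, and the `sync.WaitGroup`/`sync.Once` bookkeeping are assumptions made for illustration only. The point it shows is that when each pool owns its own done channel, closing the pool is sufficient to stop that pool's background loop.

```go
package main

import (
	"sync"
	"time"
)

// connArray is a hypothetical, simplified stand-in for a per-store
// connection pool. It owns the done channel for its own background loop.
type connArray struct {
	done chan struct{}  // closed by Close to tell the loop to exit
	wg   sync.WaitGroup // lets Close wait until the loop has returned
	once sync.Once      // makes Close safe to call more than once
}

// newConnArray starts one background loop per pool (assumed constructor).
func newConnArray(interval time.Duration) *connArray {
	a := &connArray{done: make(chan struct{})}
	a.wg.Add(1)
	go a.checkStreamTimeoutLoop(interval)
	return a
}

// checkStreamTimeoutLoop runs until the pool's own done channel is closed.
func (a *connArray) checkStreamTimeoutLoop(interval time.Duration) {
	defer a.wg.Done()
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-a.done:
			// The pool was closed, so the goroutine exits instead of leaking.
			return
		case <-ticker.C:
			// A real loop would check stream deadlines here (omitted).
		}
	}
}

// Close signals the loop to stop and waits for it to return.
func (a *connArray) Close() {
	a.once.Do(func() { close(a.done) })
	a.wg.Wait()
}
```

With this shape, whatever closes an idle pool (for example, a recycle routine) stops the loop as a side effect, without needing a client-wide channel that is only closed at process exit.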

Check List

A CheckStreamTimeoutLoop goroutine is started for each connArray,
but it is not stopped when the connArray is closed, which leads to the leak
@SunRunAway SunRunAway added the priority/release-blocker This issue blocks a release. Please solve it ASAP. label Dec 25, 2019
Contributor

@lysu lysu left a comment

LGTM

@lysu lysu added the status/LGT1 Indicates that a PR has LGTM 1. label Dec 25, 2019
Member

@wjhuang2016 wjhuang2016 left a comment

lgtm

@wjhuang2016 wjhuang2016 added status/LGT2 Indicates that a PR has LGTM 2. and removed status/LGT1 Indicates that a PR has LGTM 1. labels Dec 25, 2019
@jackysp
Member

jackysp commented Dec 25, 2019

/merge

@sre-bot sre-bot added the status/can-merge Indicates a PR has been approved by a committer. label Dec 25, 2019
@sre-bot
Contributor

sre-bot commented Dec 25, 2019

/run-all-tests

@sre-bot sre-bot merged commit da1427a into pingcap:release-3.0 Dec 25, 2019
@SunRunAway SunRunAway deleted the automated-cherry-pick-of-#13812-upstream-release-3.0 branch December 25, 2019 12:10
Labels
component/tikv priority/release-blocker This issue blocks a release. Please solve it ASAP. status/can-merge Indicates a PR has been approved by a committer. status/LGT2 Indicates that a PR has LGTM 2. type/bugfix This PR fixes a bug.