Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

schedule: let merge not easy to timeout #1495

Merged
merged 3 commits into from
Apr 8, 2019
Merged

Conversation

Connor1996
Copy link
Member

@Connor1996 Connor1996 commented Apr 4, 2019

What problem does this PR solve?

merge operator may timeout easily cause the merge operator timeout span is calculated as LeaderOperatorWaitTime.

func (o *Operator) IsTimeout() bool {
	var timeout bool
	if o.IsFinish() {
		return false
	}
	if o.kind&OpRegion != 0 {
		timeout = time.Since(o.createTime) > RegionOperatorWaitTime(10m)
	} else {
		timeout = time.Since(o.createTime) > LeaderOperatorWaitTime(10s)
	}
	if timeout {
		operatorCounter.WithLabelValues(o.Desc(), "timeout").Inc()
		return true
	}
	return false
}

What is changed and how it works?

For merge schedule, there will be two operators created. One for the source region (op1), the other for the target region (op2). And cause source region will be destroyed and there will be no heartbeat of it anymore, so op1 will never be finished, instead, it will be cleaned up in the PD background thread. So the op2 is the operator to detect when the merge process finished. But op2 has no steps like AddPeer, RemovePeer, so its timeout is always regarded as LeaderOperatorWaitTime.

This PR makes op2 have the same operator kinds as op1.

Check List

  • Need to cherry-pick to the release branch

Signed-off-by: Connor1996 <zbk602423539@gmail.com>
@nolouch
Copy link
Contributor

nolouch commented Apr 4, 2019

CI failed. Should cherry pick to 2.1?

@Connor1996
Copy link
Member Author

/rebuild

@rleungx
Copy link
Member

rleungx commented Apr 4, 2019

Do we need a test?

@rleungx rleungx added the component/schedule Scheduling logic. label Apr 4, 2019
@nolouch nolouch added the needs-cherry-pick-release-2.1 The PR needs to cherry pick to release-2.1 branch. label Apr 4, 2019
Signed-off-by: Connor1996 <zbk602423539@gmail.com>
@Connor1996
Copy link
Member Author

Connor1996 commented Apr 4, 2019

PTAL, replace the test and check operator timeout

Signed-off-by: Connor1996 <zbk602423539@gmail.com>
@codecov-io
Copy link

Codecov Report

❗ No coverage uploaded for pull request base (master@c0c07dc). Click here to learn what that means.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff            @@
##             master    #1495   +/-   ##
=========================================
  Coverage          ?   67.28%           
=========================================
  Files             ?      158           
  Lines             ?    15459           
  Branches          ?        0           
=========================================
  Hits              ?    10401           
  Misses            ?     4107           
  Partials          ?      951
Impacted Files Coverage Δ
server/schedule/operator.go 86.26% <100%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update c0c07dc...5b40f34. Read the comment docs.

@zhouqiang-cl
Copy link
Contributor

/rebuild

1 similar comment
@zhouqiang-cl
Copy link
Contributor

/rebuild

@disksing disksing merged commit 001486e into tikv:master Apr 8, 2019
Connor1996 added a commit to Connor1996/pd that referenced this pull request Apr 10, 2019
* fix merge easy to timeout

Signed-off-by: Connor1996 <zbk602423539@gmail.com>

* add test

Signed-off-by: Connor1996 <zbk602423539@gmail.com>

* change time

Signed-off-by: Connor1996 <zbk602423539@gmail.com>
Connor1996 added a commit to Connor1996/pd that referenced this pull request Apr 10, 2019
* fix merge easy to timeout

Signed-off-by: Connor1996 <zbk602423539@gmail.com>
nolouch pushed a commit that referenced this pull request Apr 11, 2019
* fix merge easy to timeout

Signed-off-by: Connor1996 <zbk602423539@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
component/schedule Scheduling logic. needs-cherry-pick-release-2.1 The PR needs to cherry pick to release-2.1 branch.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants