mpp task may get the same ID and lead to query failed #27952

Closed
wjhuang2016 opened this issue Sep 10, 2021 · 1 comment · Fixed by #28022
Labels
affects-5.0 This bug affects 5.0.x versions. affects-5.1 This bug affects 5.1.x versions. component/tiflash severity/major type/bug The issue is confirmed as a bug.

wjhuang2016 commented Sep 10, 2021

Bug Report

Please answer these questions before submitting your issue. Thanks!

1. Minimal reproduce step (Required)

I ran TPC-DS query 56.

mysql> source /Users/xxx/pc/tidb-bench/tpcds/queries/query_56.sql;
ERROR 1105 (HY000): DB::Exception: The task [427626637160349697,9] has been registered

I added some logging to TiDB and got the following:

[2021/09/10 17:29:23.280 +08:00] [WARN] [session.go:972] [AllocMPPTaskID] [id=2]
[2021/09/10 17:29:23.280 +08:00] [WARN] [session.go:972] [AllocMPPTaskID] [id=3]
[2021/09/10 17:29:23.280 +08:00] [WARN] [fragment.go:381] [constructMPPTasksForSinglePartitionTable] [id=3] [tableID=112]
[2021/09/10 17:29:23.280 +08:00] [WARN] [fragment.go:381] [constructMPPTasksForSinglePartitionTable] [id=3] [tableID=112]
...
[2021/09/10 17:29:23.280 +08:00] [INFO] [mpp_gather.go:76] ["Dispatch mpp task"] [timestamp=427626637160349697] [ID=3] [address=192.168.197.180:3930] [plan="Table(date_dim)->Sel([eq(tpcds.date_dim.d_year, 2000) eq(tpcds.date_dim.d_moy, 1)])->Send(4, )"] [pf=474]
...
[2021/09/10 17:29:23.280 +08:00] [INFO] [mpp_gather.go:76] ["Dispatch mpp task"] [timestamp=427626637160349697] [ID=3] [address=192.168.197.180:3930] [plan="Table(date_dim)->Sel([eq(tpcds.date_dim.d_year, 2000) eq(tpcds.date_dim.d_moy, 1)])->Send(6, )"] [pf=278]
...

After I added a mutex to protect taskID, the query succeeded.

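// AllocMPPTaskID returns the next MPP task ID for startTS.
// The mutex serializes concurrent callers so no two of them get the same ID.
func (s *SessionVars) AllocMPPTaskID(startTS uint64) int64 {
	s.mppTaskIDAllocator.mu.Lock()
	defer s.mppTaskIDAllocator.mu.Unlock()
	if s.mppTaskIDAllocator.lastTS == startTS {
		// Same startTS as the previous call: hand out the next ID.
		s.mppTaskIDAllocator.taskID++
		log.Warn("AllocMPPTaskID", zap.Int64("id", s.mppTaskIDAllocator.taskID)) // debug log added for this investigation
		return s.mppTaskIDAllocator.taskID
	}
	// New startTS: restart numbering at 1.
	s.mppTaskIDAllocator.lastTS = startTS
	s.mppTaskIDAllocator.taskID = 1
	return 1
}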
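
For anyone who wants to check the idea outside TiDB, here is a minimal standalone sketch of the same mutex-guarded allocator, called from many goroutines at once. The names taskIDAllocator, alloc and allocConcurrently are hypothetical and are not TiDB code; only the locking pattern is taken from the snippet above.

```go
package main

import (
	"fmt"
	"sync"
)

// taskIDAllocator is a hypothetical stand-in for the allocator state held in
// SessionVars; the field names mirror the snippet above.
type taskIDAllocator struct {
	mu     sync.Mutex
	lastTS uint64
	taskID int64
}

// alloc returns the next task ID for startTS and restarts at 1 when startTS
// changes. The lock makes the compare-and-increment a single critical section.
func (a *taskIDAllocator) alloc(startTS uint64) int64 {
	a.mu.Lock()
	defer a.mu.Unlock()
	if a.lastTS == startTS {
		a.taskID++
		return a.taskID
	}
	a.lastTS = startTS
	a.taskID = 1
	return 1
}

// allocConcurrently calls alloc from n goroutines and returns the set of
// distinct IDs that were handed out.
func allocConcurrently(a *taskIDAllocator, startTS uint64, n int) map[int64]struct{} {
	var wg sync.WaitGroup
	var seenMu sync.Mutex
	seen := make(map[int64]struct{}, n)
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			id := a.alloc(startTS)
			seenMu.Lock()
			seen[id] = struct{}{}
			seenMu.Unlock()
		}()
	}
	wg.Wait()
	return seen
}

func main() {
	ids := allocConcurrently(&taskIDAllocator{}, 427626637160349697, 100)
	// With the lock in place every goroutine should receive a distinct ID.
	fmt.Println(len(ids) == 100)
}
```

Removing the Lock/Unlock pair from alloc and running with `go run -race` should make the race detector report the unsynchronized access, which is the kind of access that can hand the same ID to two callers.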

#23747 tried to fix this, but I think it doesn't resolve the problem completely.

4. What is your TiDB version? (Required)

master

@github-actions

Please check whether the issue should be labeled with 'affects-x.y' or 'backport-x.y.z',
and then remove 'needs-more-info' label.
