performance of searchCachedRegion could be enhanced #532

Open
chrysan opened this issue Jun 22, 2022 · 3 comments

Comments


chrysan commented Jun 22, 2022

When many regions are cached in memory, searchCachedRegion becomes slower and holds the global RegionCache read lock for longer. Other queries that need to load new regions then have to wait for the write lock. As QPS grows, the mutex contention gets worse and query latency increases.

[image attached to the original issue]

findRegionByKey waits for write lock:

goroutine 8144205746 [semacquire]:
sync.runtime_SemacquireMutex(0xc0003ae014, 0x0, 0x1)
    /usr/local/go/src/runtime/sema.go:71 +0x47
sync.(*Mutex).lockSlow(0xc0003ae010)
    /usr/local/go/src/sync/mutex.go:138 +0xfc
sync.(*Mutex).Lock(...)
    /usr/local/go/src/sync/mutex.go:81
sync.(*RWMutex).Lock(0xc0003ae010)
    /usr/local/go/src/sync/rwmutex.go:98 +0x97
github.com/pingcap/tidb/store/tikv.(*RegionCache).findRegionByKey(0xc0003ae000, 0xc4040d51a8, 0xc0eb43fce0, 0x13, 0x13, 0xc031612e00, 0x7fcaad8be1f0, 0x0, 0x40)
    /home/jenkins/agent/workspace/optimization-build-tidb-linux-amd/go/src/github.com/pingcap/tidb/store/tikv/region_cache.go:582 +0x6c2

searchCachedRegion holds read lock:

goroutine 8056902450 [runnable]:
github.com/pingcap/tidb/store/tikv.(*RegionCache).searchCachedRegion.func1(0x3886ec0, 0xc263c1d180, 0xc1864d85c0)
    /home/jenkins/agent/workspace/optimization-build-tidb-linux-amd/go/src/github.com/pingcap/tidb/store/tikv/region_cache.go:914 +0x173
github.com/google/btree.(*node).iterate(0xc323d37c80, 0xffffffffffffffff, 0x3886ec0, 0xc1864d85c0, 0x0, 0x0, 0x101, 0xc0322f0088, 0xc44df40101)
    /home/jenkins/agent/workspace/optimization-build-tidb-linux-amd/go/pkg/mod/github.com/google/btree@v1.0.0/btree.go:557 +0x1cd
github.com/google/btree.(*node).iterate(0xc225ed4980, 0xffffffffffffffff, 0x3886ec0, 0xc1864d85c0, 0x0, 0x0, 0x101, 0xc0322f0088, 0xc44df40101)
    /home/jenkins/agent/workspace/optimization-build-tidb-linux-amd/go/pkg/mod/github.com/google/btree@v1.0.0/btree.go:549 +0x115
github.com/google/btree.(*node).iterate(0xc34e0e7640, 0xffffffffffffffff, 0x3886ec0, 0xc1864d85c0, 0x0, 0x0, 0x101, 0xc0322f0088, 0xc44df40101)
    /home/jenkins/agent/workspace/optimization-build-tidb-linux-amd/go/pkg/mod/github.com/google/btree@v1.0.0/btree.go:549 +0x115
github.com/google/btree.(*node).iterate(0xc0d1673e40, 0xffffffffffffffff, 0x3886ec0, 0xc1864d85c0, 0x0, 0x0, 0xc0322f0101, 0xc0322f0088, 0x20)
    /home/jenkins/agent/workspace/optimization-build-tidb-linux-amd/go/pkg/mod/github.com/google/btree@v1.0.0/btree.go:549 +0x115
github.com/google/btree.(*node).iterate(0xc3d65de240, 0xffffffffffffffff, 0x3886ec0, 0xc1864d85c0, 0x0, 0x0, 0x1, 0xc0322f0088, 0x32)
    /home/jenkins/agent/workspace/optimization-build-tidb-linux-amd/go/pkg/mod/github.com/google/btree@v1.0.0/btree.go:549 +0x115
github.com/google/btree.(*BTree).DescendLessOrEqual(...)
    /home/jenkins/agent/workspace/optimization-build-tidb-linux-amd/go/pkg/mod/github.com/google/btree@v1.0.0/btree.go:795
github.com/pingcap/tidb/store/tikv.(*RegionCache).searchCachedRegion(0xc0003ae000, 0xc44df41aa0, 0x1c, 0x30, 0x11f8800, 0xb)
    /home/jenkins/agent/workspace/optimization-build-tidb-linux-amd/go/src/github.com/pingcap/tidb/store/tikv/region_cache.go:914 +0x2ae
github.com/pingcap/tidb/store/tikv.(*RegionCache).findRegionByKey(0xc0003ae000, 0xc0322f07a8, 0xc44df41aa0, 0x1c, 0x30, 0x11bf700, 0xc0322f0318, 0x11f483c, 0x0)
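
For readers who want the locking pattern in one place, here is a minimal sketch of what the two traces suggest: a lookup that runs entirely under a shared read lock, and a miss path that needs the exclusive write lock to insert. It is not the TiDB code. The `cache`, `region`, `search`, `insert` and `find` names are hypothetical, and a sorted slice stands in for the btree.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// region is a hypothetical stand-in for a cached region covering [start, end).
// An empty end means the range is unbounded on the right.
type region struct {
	start, end string
	id         uint64
}

// cache is a hypothetical, simplified region cache guarded by one RWMutex.
type cache struct {
	mu     sync.RWMutex
	sorted []region // kept sorted by start key; stands in for the btree
}

// search holds the read lock for the whole lookup, so a slow lookup
// delays every goroutine that is waiting for the write lock.
func (c *cache) search(key string) (region, bool) {
	c.mu.RLock()
	defer c.mu.RUnlock()
	// i is the index of the first region whose start key is greater than key.
	i := sort.Search(len(c.sorted), func(i int) bool { return c.sorted[i].start > key })
	if i == 0 {
		return region{}, false
	}
	r := c.sorted[i-1]
	if r.end != "" && key >= r.end {
		return region{}, false
	}
	return r, true
}

// insert takes the write lock and therefore waits for all current readers.
func (c *cache) insert(r region) {
	c.mu.Lock()
	defer c.mu.Unlock()
	i := sort.Search(len(c.sorted), func(i int) bool { return c.sorted[i].start >= r.start })
	if i < len(c.sorted) && c.sorted[i].start == r.start {
		c.sorted[i] = r // same start key: overwrite
		return
	}
	c.sorted = append(c.sorted, region{})
	copy(c.sorted[i+1:], c.sorted[i:])
	c.sorted[i] = r
}

// find searches the cache and, on a miss, loads the region and inserts it.
func (c *cache) find(key string, load func(string) region) region {
	if r, ok := c.search(key); ok {
		return r
	}
	r := load(key)
	c.insert(r)
	return r
}

func main() {
	c := &cache{}
	load := func(string) region { return region{start: "a", end: "m", id: 1} }
	first := c.find("b", load)  // miss: loads and takes the write lock
	second := c.find("c", load) // hit: read lock only
	fmt.Println(first.id, second.id)
}
```

Note that Go's `sync.RWMutex` blocks new `RLock` callers once a `Lock` call is waiting, so a single slow reader can stall both the writer and the readers queued behind it.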

chrysan commented Jun 23, 2022

BTW, the memory usage of the region cache could be tracked to guard against the risk of OOM.
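
One possible shape for such tracking is a byte counter updated on every insert and eviction. This is only a sketch under assumptions: `memTracker`, `approxSize`, the `region` fields and the 64-byte per-entry overhead are all hypothetical, and none of them is an existing RegionCache API.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// region is a hypothetical stand-in for a cached region.
type region struct {
	startKey, endKey []byte
	id               uint64
}

// perEntryOverhead is an assumed fixed cost per cached entry, in bytes.
const perEntryOverhead = 64

// approxSize estimates the footprint of one entry: key bytes plus overhead.
func approxSize(r *region) int64 {
	return int64(len(r.startKey)+len(r.endKey)) + perEntryOverhead
}

// memTracker keeps a running estimate of the bytes held by the cache.
type memTracker struct {
	bytes int64 // accessed only through sync/atomic
}

// onInsert is meant to be called whenever a region is added to the cache.
func (m *memTracker) onInsert(r *region) { atomic.AddInt64(&m.bytes, approxSize(r)) }

// onEvict is meant to be called whenever a region is removed from the cache.
func (m *memTracker) onEvict(r *region) { atomic.AddInt64(&m.bytes, -approxSize(r)) }

// Bytes returns the current estimate.
func (m *memTracker) Bytes() int64 { return atomic.LoadInt64(&m.bytes) }

// overLimit reports whether the estimate exceeds limit.
func (m *memTracker) overLimit(limit int64) bool { return m.Bytes() > limit }

func main() {
	m := &memTracker{}
	r := &region{startKey: []byte("t_1_a"), endKey: []byte("t_1_z"), id: 1}
	m.onInsert(r)
	fmt.Println("tracked bytes:", m.Bytes(), "over 1 KiB:", m.overLimit(1024))
	m.onEvict(r)
	fmt.Println("tracked bytes:", m.Bytes())
}
```

The estimate could be exported as a metric or used to trigger eviction before the process gets close to its memory limit.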


chrysan commented Jun 28, 2022

Another finding: there are far more cached regions than real live regions:
[two images attached to the original issue]

This use case issues many "truncate table" statements. The eviction of cached regions could be enhanced.
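
As an illustration of one eviction strategy, the sketch below sweeps out entries that have not been accessed within a TTL. Regions of truncated tables are presumably never looked up again, so a sweep like this would eventually drop them. The `entry` type, the `sweep` function and the use of Unix seconds are assumptions for the example, not the existing RegionCache behavior.

```go
package main

import "fmt"

// entry is a hypothetical cached region with its last access time.
type entry struct {
	id         uint64
	lastAccess int64 // Unix seconds of the most recent lookup
}

// sweep deletes every entry whose last access is more than ttl seconds
// before now, and returns how many entries were evicted.
func sweep(cache map[string]*entry, now, ttl int64) int {
	evicted := 0
	for k, e := range cache {
		if now-e.lastAccess > ttl {
			delete(cache, k) // deleting during range is safe in Go
			evicted++
		}
	}
	return evicted
}

func main() {
	now := int64(1_000_000)
	cache := map[string]*entry{
		"t_1": {id: 1, lastAccess: now - 10},   // recently used
		"t_2": {id: 2, lastAccess: now - 3600}, // idle, e.g. a truncated table
	}
	n := sweep(cache, now, 600)
	fmt.Println("evicted:", n, "remaining:", len(cache))
}
```

In a real cache the sweep would also have to remove the entries from the ordered index, ideally in small batches so that it does not hold the write lock for long.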

disksing (Collaborator) commented

We may consider using a skiplist as a replacement. Compared to a btree, a skiplist allows finer-grained locking.
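
To make the idea concrete, here is a minimal sketch of one skiplist variant: writers are serialized by a mutex, while readers traverse with atomic loads and take no lock at all. This is one reading of "smaller granularity", not necessarily the design intended above. It is insert-only (no deletion or update), the `SkipList`, `Insert` and `Floor` names are hypothetical, and it needs Go 1.19 or later for `atomic.Pointer`.

```go
package main

import (
	"fmt"
	"math/rand"
	"sync"
	"sync/atomic"
)

const maxLevel = 12

// node is immutable after publication except for its next pointers.
type node struct {
	key  string
	val  uint64
	next [maxLevel]atomic.Pointer[node]
}

// SkipList is an insert-only ordered map with lock-free reads.
type SkipList struct {
	head   *node
	height atomic.Int32
	mu     sync.Mutex // serializes writers only; readers never take it
}

func New() *SkipList {
	s := &SkipList{head: &node{}}
	s.height.Store(1)
	return s
}

// Floor returns the entry with the greatest key <= key, without locking.
func (s *SkipList) Floor(key string) (string, uint64, bool) {
	x := s.head
	for lvl := int(s.height.Load()) - 1; lvl >= 0; lvl-- {
		for {
			nx := x.next[lvl].Load()
			if nx == nil || nx.key > key {
				break
			}
			x = nx
		}
	}
	if x == s.head {
		return "", 0, false
	}
	return x.key, x.val, true
}

// Insert adds key if it is absent and reports whether it was added.
func (s *SkipList) Insert(key string, val uint64) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	var prev [maxLevel]*node
	x := s.head
	for lvl := maxLevel - 1; lvl >= 0; lvl-- {
		for {
			nx := x.next[lvl].Load()
			if nx == nil || nx.key >= key {
				break
			}
			x = nx
		}
		prev[lvl] = x
	}
	if nx := prev[0].next[0].Load(); nx != nil && nx.key == key {
		return false // existing keys are left untouched in this sketch
	}
	h := 1
	for h < maxLevel && rand.Intn(4) == 0 {
		h++
	}
	n := &node{key: key, val: val}
	for lvl := 0; lvl < h; lvl++ {
		n.next[lvl].Store(prev[lvl].next[lvl].Load())
	}
	// Publish bottom-up so a reader that reaches n always finds it fully linked below.
	for lvl := 0; lvl < h; lvl++ {
		prev[lvl].next[lvl].Store(n)
	}
	if int32(h) > s.height.Load() {
		s.height.Store(int32(h))
	}
	return true
}

func main() {
	s := New()
	s.Insert("a", 1)
	s.Insert("m", 2)
	k, v, ok := s.Floor("n")
	fmt.Println(k, v, ok)
}
```

A region cache also needs deletion and replacement of overlapping ranges, which this sketch leaves out. Those operations are where most of the complexity of a concurrent skiplist lies.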
