*: fix stale read ops metric #878

Merged
merged 8 commits into from
Jul 13, 2023

@crazycs520 crazycs520 commented Jul 11, 2023

The stale read ops metric has been incorrect for some time. I found this issue while testing stale read during a reload of all TiKV nodes.

This PR is based on #877.

[image: stale read metrics before this fix]

After the fix in this PR, the stale read ops, stale read req ops, and stale read traffic metrics look like the following when all TiKV nodes are reloaded:

[image: stale read metrics after this fix]

Signed-off-by: crazycs520 <crazycs520@gmail.com>
Signed-off-by: crazycs520 <crazycs520@gmail.com>
Signed-off-by: crazycs520 <crazycs520@gmail.com>
Signed-off-by: crazycs520 <crazycs520@gmail.com>
@crazycs520 crazycs520 marked this pull request as ready for review July 11, 2023 09:38
Signed-off-by: crazycs520 <crazycs520@gmail.com>
Signed-off-by: crazycs520 <crazycs520@gmail.com>
@cfzjywxk cfzjywxk requested review from cfzjywxk, you06 and ekexium July 12, 2023 05:01
internal/locate/region_request.go (outdated, resolved)
internal/locate/region_request.go (outdated, resolved)
Signed-off-by: crazycs520 <crazycs520@gmail.com>
Signed-off-by: crazycs520 <crazycs520@gmail.com>
var isLocalTraffic bool
if staleReadCollector != nil && s.replicaSelector != nil {
	if target := s.replicaSelector.targetReplica(); target != nil {
		isLocalTraffic = target.store.IsLabelsMatch(s.replicaSelector.labels)
Contributor

If there's no label set, is the logic working as expected?

Contributor Author

If no label is set, isLocalTraffic will always be true, so all traffic and QPS will be counted as local. I think this is expected.
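
To illustrate the answer above, here is a minimal sketch of why an empty label set matches every store. The `label` type and the `labelsMatch` function are hypothetical stand-ins written for this example; they are not the client-go implementation of `IsLabelsMatch`, only an assumption about how an "every wanted label must be present" check behaves.

```go
package main

import "fmt"

// label is a hypothetical stand-in for a store label (key/value pair).
type label struct {
	Key, Value string
}

// labelsMatch reports whether every wanted label is present on the store.
// With no wanted labels the outer loop never runs, so the result is true.
func labelsMatch(storeLabels, wanted []label) bool {
	for _, w := range wanted {
		found := false
		for _, s := range storeLabels {
			if s == w {
				found = true
				break
			}
		}
		if !found {
			return false
		}
	}
	return true
}

func main() {
	store := []label{{"zone", "z1"}}
	fmt.Println(labelsMatch(store, nil))                     // true
	fmt.Println(labelsMatch(store, []label{{"zone", "z1"}})) // true
	fmt.Println(labelsMatch(store, []label{{"zone", "z2"}})) // false
}
```

Under this assumption, a client configured without labels would report all stale read traffic as local, which matches the behavior described in the reply.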

if staleReadCollector != nil && s.replicaSelector != nil {
	if target := s.replicaSelector.targetReplica(); target != nil {
		isLocalTraffic = target.store.IsLabelsMatch(s.replicaSelector.labels)
		staleReadCollector.onReq(req, isLocalTraffic)
Contributor

Should we move the statistic code after s.sendReqToRegion finishes with no rpc error?

Contributor Author

No, since the function staleReadCollector.onResp will be called after the RPC finishes.

Contributor Author

Also, request QPS and traffic always need to be recorded, regardless of whether the request succeeds.

@cfzjywxk cfzjywxk merged commit 1e3aeab into tikv:tidb-6.5 Jul 13, 2023
crazycs520 added a commit to crazycs520/client-go that referenced this pull request Jul 13, 2023
Signed-off-by: crazycs520 <crazycs520@gmail.com>
cfzjywxk pushed a commit that referenced this pull request Jul 13, 2023
Signed-off-by: crazycs520 <crazycs520@gmail.com>
crazycs520 added a commit to crazycs520/client-go that referenced this pull request Jul 13, 2023
Signed-off-by: crazycs520 <crazycs520@gmail.com>
disksing added a commit that referenced this pull request Jul 14, 2023
Signed-off-by: crazycs520 <crazycs520@gmail.com>
Co-authored-by: disksing <i@disksing.com>
iosmanthus added a commit that referenced this pull request Aug 11, 2023
* client-go: add some key range info to error when PD returned no region (#862)

Signed-off-by: Chao Wang <cclcwangchao@hotmail.com>

* *: refine non-global stale-read request retry logic (#863)

Signed-off-by: crazycs520 <crazycs520@gmail.com>

* Fix the issue that primary pessimistic lock may be left not cleared after GC (#866)

* Fix the issue that primary pessimistic lock may be left not cleared after GC

Signed-off-by: MyonKeminta <MyonKeminta@users.noreply.github.com>

* Fix mysteriously shown up thing that makes compilation failed

Signed-off-by: MyonKeminta <MyonKeminta@users.noreply.github.com>

* Fix test effectiveness (forgot to set txn2 to pessimistic txn); add more strict checks

Signed-off-by: MyonKeminta <MyonKeminta@users.noreply.github.com>

* Address comments

Signed-off-by: MyonKeminta <MyonKeminta@users.noreply.github.com>

---------

Signed-off-by: MyonKeminta <MyonKeminta@users.noreply.github.com>
Co-authored-by: MyonKeminta <MyonKeminta@users.noreply.github.com>

* add explicit request source type to label the external request like lightning/br (#868)

Signed-off-by: nolouch <nolouch@gmail.com>

* use '%d' instead of '%q' for some int values in error message (#875)

Signed-off-by: Chao Wang <cclcwangchao@hotmail.com>

* format key in error message in method `scanRegions` (#876)

Signed-off-by: Chao Wang <cclcwangchao@hotmail.com>

* make cop request timeout a config paramter (#865)

* update

Signed-off-by: Spade A <u6748471@anu.edu.au>

* update

Signed-off-by: Spade A <u6748471@anu.edu.au>

* update

Signed-off-by: Spade A <u6748471@anu.edu.au>

* update

Signed-off-by: Spade A <u6748471@anu.edu.au>

---------

Signed-off-by: Spade A <u6748471@anu.edu.au>

* region_cache: support check pending tiflash peer (#821)

Signed-off-by: guo-shaoge <shaoge1994@163.com>
Co-authored-by: disksing <i@disksing.com>

* *: add `SnapshotIterReverse` and make `iterReverse` supports `lowerBound` (#883)

Signed-off-by: Jason Mo <mohangjie1995@gmail.com>

* *: fix stale read ops metric (#878) (#889)

Signed-off-by: crazycs520 <crazycs520@gmail.com>
Co-authored-by: disksing <i@disksing.com>

* add gc options (#828)

Signed-off-by: weedge <weege007@gmail.com>
Co-authored-by: disksing <i@disksing.com>

* reload region cache when store is resolved from invalid status (#843)

Signed-off-by: you06 <you1474600@gmail.com>
Co-authored-by: disksing <i@disksing.com>

* ci: update setup-go action (#904)

Signed-off-by: disksing <i@disksing.com>

* fix unexpected slow query during GC running after stop 1 tikv-server (#899) (#909)

* fix unexpected slow query during GC running after stop 1 tikv-server

Signed-off-by: crazycs520 <crazycs520@gmail.com>

* fix test

Signed-off-by: crazycs520 <crazycs520@gmail.com>

---------

Signed-off-by: crazycs520 <crazycs520@gmail.com>

* resource_manager: ignore ru metrics for background request (#872)

Signed-off-by: husharp <jinhao.hu@pingcap.com>
Co-authored-by: disksing <i@disksing.com>

* add more log for diagnose (#915)

* add more log for diagnose

Signed-off-by: crazycs520 <crazycs520@gmail.com>

* fix

Signed-off-by: crazycs520 <crazycs520@gmail.com>

* add more log for diagnose

Signed-off-by: crazycs520 <crazycs520@gmail.com>

* add more log

Signed-off-by: crazycs520 <crazycs520@gmail.com>

* address comment

Signed-off-by: crazycs520 <crazycs520@gmail.com>

---------

Signed-off-by: crazycs520 <crazycs520@gmail.com>

* use context logger as much as possible (#908)

* use context logger as much as possible

Signed-off-by: crazycs520 <crazycs520@gmail.com>

* refine

Signed-off-by: crazycs520 <crazycs520@gmail.com>

---------

Signed-off-by: crazycs520 <crazycs520@gmail.com>

* Resume max retry time check for stale read retry with leader option(#903) (#911)

* Resume max retry time check for stale read retry with leader option

Signed-off-by: cfzjywxk <lsswxrxr@163.com>

* add cancel

Signed-off-by: cfzjywxk <lsswxrxr@163.com>

---------

Signed-off-by: cfzjywxk <lsswxrxr@163.com>

* request_source: remove default label (#890)

* request_source: remove default label

Signed-off-by: nolouch <nolouch@gmail.com>

* add a function to set request source task type (#925)

* add a function to set request source task type

Signed-off-by: glorv <glorvs@163.com>

* ci: update go version (#936)

* ci: update go version

Signed-off-by: crazycs520 <crazycs520@gmail.com>

* fix test

Signed-off-by: crazycs520 <crazycs520@gmail.com>

---------

Signed-off-by: crazycs520 <crazycs520@gmail.com>

* use tidb_kv_read_timeout as first kv request timeout (#919)

* support tidb_kv_read_timeout as first round kv request timeout

Signed-off-by: crazycs520 <crazycs520@gmail.com>

* fix ci

Signed-off-by: crazycs520 <crazycs520@gmail.com>

* fix ci

Signed-off-by: crazycs520 <crazycs520@gmail.com>

* fix ci

Signed-off-by: crazycs520 <crazycs520@gmail.com>

* fix ci

Signed-off-by: crazycs520 <crazycs520@gmail.com>

* fix ci

Signed-off-by: crazycs520 <crazycs520@gmail.com>

* update comment

Signed-off-by: crazycs520 <crazycs520@gmail.com>

* refine test

Signed-off-by: crazycs520 <crazycs520@gmail.com>

---------

Signed-off-by: crazycs520 <crazycs520@gmail.com>

* [pick] resource_control: bypass some internal urgent request (#938)

* resource_control: bypass some internal urgent request (#884)

Signed-off-by: nolouch <nolouch@gmail.com>

* resourcecontrol: fix nil pointer (#900)

Signed-off-by: nolouch <nolouch@gmail.com>

---------

Signed-off-by: nolouch <nolouch@gmail.com>

---------

Signed-off-by: Chao Wang <cclcwangchao@hotmail.com>
Signed-off-by: crazycs520 <crazycs520@gmail.com>
Signed-off-by: MyonKeminta <MyonKeminta@users.noreply.github.com>
Signed-off-by: nolouch <nolouch@gmail.com>
Signed-off-by: Spade A <u6748471@anu.edu.au>
Signed-off-by: guo-shaoge <shaoge1994@163.com>
Signed-off-by: Jason Mo <mohangjie1995@gmail.com>
Signed-off-by: weedge <weege007@gmail.com>
Signed-off-by: you06 <you1474600@gmail.com>
Signed-off-by: disksing <i@disksing.com>
Signed-off-by: husharp <jinhao.hu@pingcap.com>
Signed-off-by: cfzjywxk <lsswxrxr@163.com>
Signed-off-by: glorv <glorvs@163.com>
Signed-off-by: iosmanthus <myosmanthustree@gmail.com>
Co-authored-by: 王超 <cclcwangchao@hotmail.com>
Co-authored-by: crazycs <crazycs520@gmail.com>
Co-authored-by: MyonKeminta <9948422+MyonKeminta@users.noreply.github.com>
Co-authored-by: MyonKeminta <MyonKeminta@users.noreply.github.com>
Co-authored-by: ShuNing <nolouch@gmail.com>
Co-authored-by: Spade  A <71589810+SpadeA-Tang@users.noreply.github.com>
Co-authored-by: guo-shaoge <shaoge1994@163.com>
Co-authored-by: disksing <i@disksing.com>
Co-authored-by: Hangjie Mo <mohangjie1995@gmail.com>
Co-authored-by: weedge <weege007@gmail.com>
Co-authored-by: you06 <you1474600@gmail.com>
Co-authored-by: Hu# <jinhao.hu@pingcap.com>
Co-authored-by: cfzjywxk <lsswxrxr@163.com>
Co-authored-by: glorv <glorvs@163.com>