*: fix stale read ops metric #878
Signed-off-by: crazycs520 <crazycs520@gmail.com>
var isLocalTraffic bool
if staleReadCollector != nil && s.replicaSelector != nil {
	if target := s.replicaSelector.targetReplica(); target != nil {
		isLocalTraffic = target.store.IsLabelsMatch(s.replicaSelector.labels)
If there's no label set, is the logic working as expected?
If no label is set, `isLocalTraffic` will always be true, so all traffic and QPS will be counted as local. I think this is expected.
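The "always true when no label is set" behaviour follows from a subset-style match: an empty set of wanted labels is trivially contained in any store's labels. The sketch below illustrates that reasoning only. The `label` type and `labelsMatch` function are hypothetical stand-ins, not client-go's actual `IsLabelsMatch` implementation.

```go
package main

import "fmt"

// label is a hypothetical key/value pair used only for this illustration.
type label struct {
	Key, Value string
}

// labelsMatch reports whether every wanted label is present on the store.
// With no wanted labels the loop body never runs, so the result is true.
func labelsMatch(storeLabels, want []label) bool {
	for _, w := range want {
		found := false
		for _, s := range storeLabels {
			if s == w {
				found = true
				break
			}
		}
		if !found {
			return false
		}
	}
	return true
}

func main() {
	store := []label{{"zone", "z1"}}
	fmt.Println(labelsMatch(store, nil))                      // true: no label set
	fmt.Println(labelsMatch(store, []label{{"zone", "z1"}})) // true: same zone
	fmt.Println(labelsMatch(store, []label{{"zone", "z2"}})) // false: other zone
}
```

Under this assumption, a client configured without labels classifies every request as local traffic, which matches the answer given in the thread.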
if staleReadCollector != nil && s.replicaSelector != nil {
	if target := s.replicaSelector.targetReplica(); target != nil {
		isLocalTraffic = target.store.IsLabelsMatch(s.replicaSelector.labels)
		staleReadCollector.onReq(req, isLocalTraffic)
Should we move the statistics code to after `s.sendReqToRegion` finishes with no RPC error?
No, since the function `staleReadCollector.onResp` will be called after the RPC finishes.
And we always need to record request QPS and traffic, regardless of whether the request succeeds.
* client-go: add some key range info to error when PD returned no region (#862) Signed-off-by: Chao Wang <cclcwangchao@hotmail.com> * *: refine non-global stale-read request retry logic (#863) Signed-off-by: crazycs520 <crazycs520@gmail.com> * Fix the issue that primary pessimistic lock may be left not cleared after GC (#866) * Fix the issue that primary pessimistic lock may be left not cleared after GC Signed-off-by: MyonKeminta <MyonKeminta@users.noreply.github.com> * Fix mysteriously shown up thing that makes compilation failed Signed-off-by: MyonKeminta <MyonKeminta@users.noreply.github.com> * Fix test effectiveness (forgot to set txn2 to pessimistic txn); add more strict checks Signed-off-by: MyonKeminta <MyonKeminta@users.noreply.github.com> * Address comments Signed-off-by: MyonKeminta <MyonKeminta@users.noreply.github.com> --------- Signed-off-by: MyonKeminta <MyonKeminta@users.noreply.github.com> Co-authored-by: MyonKeminta <MyonKeminta@users.noreply.github.com> * add explicit request source type to label the external request like lightning/br (#868) Signed-off-by: nolouch <nolouch@gmail.com> * use '%d' instead of '%q' for some int values in error message (#875) Signed-off-by: Chao Wang <cclcwangchao@hotmail.com> * format key in error message in method `scanRegions` (#876) Signed-off-by: Chao Wang <cclcwangchao@hotmail.com> * make cop request timeout a config paramter (#865) * update Signed-off-by: Spade A <u6748471@anu.edu.au> * update Signed-off-by: Spade A <u6748471@anu.edu.au> * update Signed-off-by: Spade A <u6748471@anu.edu.au> * update Signed-off-by: Spade A <u6748471@anu.edu.au> --------- Signed-off-by: Spade A <u6748471@anu.edu.au> * region_cache: support check pending tiflash peer (#821) Signed-off-by: guo-shaoge <shaoge1994@163.com> Co-authored-by: disksing <i@disksing.com> * *: add `SnapshotIterReverse` and make `iterReverse` supports `lowerBound` (#883) Signed-off-by: Jason Mo <mohangjie1995@gmail.com> * *: fix stale read ops metric (#878) 
(#889) Signed-off-by: crazycs520 <crazycs520@gmail.com> Co-authored-by: disksing <i@disksing.com> * add gc options (#828) Signed-off-by: weedge <weege007@gmail.com> Co-authored-by: disksing <i@disksing.com> * reload region cache when store is resolved from invalid status (#843) Signed-off-by: you06 <you1474600@gmail.com> Co-authored-by: disksing <i@disksing.com> * ci: update setup-go action (#904) Signed-off-by: disksing <i@disksing.com> * fix unexpected slow query during GC running after stop 1 tikv-server (#899) (#909) * fix unexpected slow query during GC running after stop 1 tikv-server Signed-off-by: crazycs520 <crazycs520@gmail.com> * fix test Signed-off-by: crazycs520 <crazycs520@gmail.com> --------- Signed-off-by: crazycs520 <crazycs520@gmail.com> * resource_manager: ignore ru metrics for background request (#872) Signed-off-by: husharp <jinhao.hu@pingcap.com> Co-authored-by: disksing <i@disksing.com> * add more log for diagnose (#915) * add more log for diagnose Signed-off-by: crazycs520 <crazycs520@gmail.com> * fix Signed-off-by: crazycs520 <crazycs520@gmail.com> * add more log for diagnose Signed-off-by: crazycs520 <crazycs520@gmail.com> * add more log Signed-off-by: crazycs520 <crazycs520@gmail.com> * address comment Signed-off-by: crazycs520 <crazycs520@gmail.com> --------- Signed-off-by: crazycs520 <crazycs520@gmail.com> * use context logger as much as possible (#908) * use context logger as much as possible Signed-off-by: crazycs520 <crazycs520@gmail.com> * refine Signed-off-by: crazycs520 <crazycs520@gmail.com> --------- Signed-off-by: crazycs520 <crazycs520@gmail.com> * Resume max retry time check for stale read retry with leader option(#903) (#911) * Resume max retry time check for stale read retry with leader option Signed-off-by: cfzjywxk <lsswxrxr@163.com> * add cancel Signed-off-by: cfzjywxk <lsswxrxr@163.com> --------- Signed-off-by: cfzjywxk <lsswxrxr@163.com> * request_source: remove default label (#890) * request_source: remove default 
label Signed-off-by: nolouch <nolouch@gmail.com> * add a function to set request source task type (#925) * add a function to set request source task type Signed-off-by: glorv <glorvs@163.com> * ci: update go version (#936) * ci: update go version Signed-off-by: crazycs520 <crazycs520@gmail.com> * fix test Signed-off-by: crazycs520 <crazycs520@gmail.com> --------- Signed-off-by: crazycs520 <crazycs520@gmail.com> * use tidb_kv_read_timeout as first kv request timeout (#919) * support tidb_kv_read_timeout as first round kv request timeout Signed-off-by: crazycs520 <crazycs520@gmail.com> * fix ci Signed-off-by: crazycs520 <crazycs520@gmail.com> * fix ci Signed-off-by: crazycs520 <crazycs520@gmail.com> * fix ci Signed-off-by: crazycs520 <crazycs520@gmail.com> * fix ci Signed-off-by: crazycs520 <crazycs520@gmail.com> * fix ci Signed-off-by: crazycs520 <crazycs520@gmail.com> * update comment Signed-off-by: crazycs520 <crazycs520@gmail.com> * refine test Signed-off-by: crazycs520 <crazycs520@gmail.com> --------- Signed-off-by: crazycs520 <crazycs520@gmail.com> * [pick] resource_control: bypass some internal urgent request (#938) * resource_control: bypass some internal urgent request (#884) Signed-off-by: nolouch <nolouch@gmail.com> * resourcecontrol: fix nil pointer (#900) Signed-off-by: nolouch <nolouch@gmail.com> --------- Signed-off-by: nolouch <nolouch@gmail.com> --------- Signed-off-by: Chao Wang <cclcwangchao@hotmail.com> Signed-off-by: crazycs520 <crazycs520@gmail.com> Signed-off-by: MyonKeminta <MyonKeminta@users.noreply.github.com> Signed-off-by: nolouch <nolouch@gmail.com> Signed-off-by: Spade A <u6748471@anu.edu.au> Signed-off-by: guo-shaoge <shaoge1994@163.com> Signed-off-by: Jason Mo <mohangjie1995@gmail.com> Signed-off-by: weedge <weege007@gmail.com> Signed-off-by: you06 <you1474600@gmail.com> Signed-off-by: disksing <i@disksing.com> Signed-off-by: husharp <jinhao.hu@pingcap.com> Signed-off-by: cfzjywxk <lsswxrxr@163.com> Signed-off-by: glorv 
<glorvs@163.com> Signed-off-by: iosmanthus <myosmanthustree@gmail.com> Co-authored-by: 王超 <cclcwangchao@hotmail.com> Co-authored-by: crazycs <crazycs520@gmail.com> Co-authored-by: MyonKeminta <9948422+MyonKeminta@users.noreply.github.com> Co-authored-by: MyonKeminta <MyonKeminta@users.noreply.github.com> Co-authored-by: ShuNing <nolouch@gmail.com> Co-authored-by: Spade A <71589810+SpadeA-Tang@users.noreply.github.com> Co-authored-by: guo-shaoge <shaoge1994@163.com> Co-authored-by: disksing <i@disksing.com> Co-authored-by: Hangjie Mo <mohangjie1995@gmail.com> Co-authored-by: weedge <weege007@gmail.com> Co-authored-by: you06 <you1474600@gmail.com> Co-authored-by: Hu# <jinhao.hu@pingcap.com> Co-authored-by: cfzjywxk <lsswxrxr@163.com> Co-authored-by: glorv <glorvs@163.com>
The stale read ops metric has been incorrect for some time. I found this issue while testing stale read and reloading all TiKV nodes.
This PR is based on #877.
With the fix in this PR, the `stale read ops`, `stale read req ops`, and `stale read traffic` metrics look like the following when all TiKV nodes are reloaded: