Support choose the replicas by labels #21540

Closed
Yisaer opened this issue Dec 7, 2020 · 10 comments
Labels
type/enhancement The issue or PR belongs to an enhancement.

Comments

@Yisaer
Contributor

Yisaer commented Dec 7, 2020

Development Task

In #21094, we are going to support stale reads for timestamp-bounded read-only transactions. With that in place, the tidb-server would be allowed to read data from a follower or learner peer, even though that peer may not have the latest data, as long as the target peer and the tidb-server are located in the same DC.

Here is one example:

[image]

Implementation

To achieve this, we need to solve two problems:

  1. How to know the DC information for each region peer and for the tidb-server.
  2. How to let the tidb-server send a tikvRequest using a DC selector (choose a replica in the same DC) instead of a role selector (only choose the leader).

For the first problem, we can compare the value of the zone label on the tidb-server with the zone label on the store where the target peer is located. For example, if tidb-server-A has the label zone=dc-1 and store-A also has the label zone=dc-1, we say they are located in the same DC.
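The zone comparison described above could be sketched as follows. This is only an illustration: the `sameDC` function and the plain `map[string]string` label representation are assumptions made for this example, not TiDB's actual types.

```go
package main

import "fmt"

// zoneLabelKey is the label key compared in this sketch; the issue uses "zone".
const zoneLabelKey = "zone"

// sameDC reports whether a tidb-server and a store carry the same non-empty
// value for the zone label. Both arguments are hypothetical label maps.
func sameDC(serverLabels, storeLabels map[string]string) bool {
	serverZone, ok := serverLabels[zoneLabelKey]
	if !ok || serverZone == "" {
		// Without a zone on the tidb-server there is nothing to match against.
		return false
	}
	return storeLabels[zoneLabelKey] == serverZone
}

func main() {
	tidbA := map[string]string{"zone": "dc-1"}
	storeA := map[string]string{"zone": "dc-1"}
	storeB := map[string]string{"zone": "dc-2"}

	fmt.Println(sameDC(tidbA, storeA)) // same zone label value
	fmt.Println(sameDC(tidbA, storeB)) // different zone label value
}
```

A missing or empty zone label on the tidb-server is treated as "no match" here, which is a design choice of the sketch rather than something the issue specifies.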

For the second problem, there are two ways to fetch data from the KV storage: the coprocessor and the snapshot. Each of them uses RegionStore to get the target store address for the region it wants to read. RegionStore maintains the stores of the region.

accessIndex    [NumAccessMode][]int

In RegionStore, accessIndex is used to control whether the kvrequest is sent to TiKV or TiFlash. Similarly, we can maintain a dcIndex that stores the region's store information grouped by dc-location. In store/tikv/RPCContext, we will maintain the label selector for each tikvRequest.
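A rough sketch of what grouping a region's stores by DC might look like is shown below. The `store` struct and `buildDCIndex` function are hypothetical and much simpler than the real RegionStore; the only idea borrowed from the issue is an index that maps a dc-location to positions in the region's store list, in the spirit of accessIndex.

```go
package main

import "fmt"

// store is a simplified, hypothetical stand-in for one store of a region.
type store struct {
	addr string
	zone string // value of the store's zone label
}

// buildDCIndex groups store positions by zone. The result maps each zone to
// the indexes (into stores) of the stores located in that zone.
func buildDCIndex(stores []store) map[string][]int {
	idx := make(map[string][]int)
	for i, s := range stores {
		idx[s.zone] = append(idx[s.zone], i)
	}
	return idx
}

func main() {
	stores := []store{
		{addr: "tikv-0:20160", zone: "dc-1"},
		{addr: "tikv-1:20160", zone: "dc-2"},
		{addr: "tikv-2:20160", zone: "dc-1"},
	}
	dcIndex := buildDCIndex(stores)
	for _, i := range dcIndex["dc-1"] {
		fmt.Println(stores[i].addr) // stores located in dc-1
	}
}
```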

@Yisaer Yisaer added the type/enhancement The issue or PR belongs to an enhancement. label Dec 7, 2020
@djshow832
Contributor

I suggest maintaining the labels in Store rather than adding a label selector. RPCContext already contains the peer that you have chosen.
Besides, consider the following cases:

  1. There is more than one store in the same DC. You should select a store randomly or in a round-robin style to avoid hotspots. (YugabyteDB selects one at random, see https://github.com/yugabyte/yugabyte-db/blob/master/src/yb/client/client-internal.cc#L318).
  2. There is no store in the same DC. Should you read the leader or randomly choose a replica?
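Both cases could be handled along the lines of the sketch below, which rotates across same-DC replicas and falls back to the leader when none match. The `replica` and `selector` types are hypothetical, and the leader fallback is only one of the two options raised in case 2, chosen here as an assumption.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// replica is a hypothetical, simplified view of one peer of a region.
type replica struct {
	addr     string
	zone     string
	isLeader bool
}

// selector picks replicas for a single region. next is a round-robin cursor
// shared by concurrent callers.
type selector struct {
	replicas []replica
	next     uint64
}

// pick returns a replica in the given zone, rotating across all matches to
// spread load. With no match it falls back to the leader. ok is false only
// when there is neither a matching replica nor a leader.
func (s *selector) pick(zone string) (r replica, ok bool) {
	var matched []int
	leader := -1
	for i, rep := range s.replicas {
		if rep.zone == zone {
			matched = append(matched, i)
		}
		if rep.isLeader {
			leader = i
		}
	}
	if len(matched) > 0 {
		n := atomic.AddUint64(&s.next, 1) - 1
		return s.replicas[matched[n%uint64(len(matched))]], true
	}
	if leader >= 0 {
		return s.replicas[leader], true
	}
	return replica{}, false
}

func main() {
	s := &selector{replicas: []replica{
		{addr: "tikv-0:20160", zone: "dc-1", isLeader: true},
		{addr: "tikv-1:20160", zone: "dc-2"},
		{addr: "tikv-2:20160", zone: "dc-2"},
	}}
	for i := 0; i < 3; i++ {
		r, _ := s.pick("dc-2") // alternates between the two dc-2 replicas
		fmt.Println(r.addr)
	}
	r, _ := s.pick("dc-3") // no replica in dc-3, so the leader is returned
	fmt.Println(r.addr)
}
```

Swapping the round-robin cursor for a random index would give the random variant; either way the goal is to avoid sending every request to the same store.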

@Yisaer
Contributor Author

Yisaer commented Dec 8, 2020

> There is no store in the same DC. Should you read the leader or randomly choose a replica?

I haven't decided yet. I think we can choose the leader directly as the fallback.

@nolouch
Member

nolouch commented Dec 9, 2020

We should consider how users will enable choosing replicas by labels:

  1. A session variable, as with follower read, where the user runs set @@tidb_replica_read = '<target value>';
  2. A special SQL statement, such as START TRANSACTION READ ONLY WITH TIMESTAMP BOUND EXACT STALENESS '00:00:05'; Will the replica be selected by label automatically, or are additional session variables needed?

@Yisaer
Contributor Author

Yisaer commented Dec 9, 2020

I tend to create a new session variable like @@tidb_match_label_replica_read (default empty).

  1. If it is defined, like set @@tidb_match_label_replica_read = 'zone=dc-1,key=value', the tidb-server will try to read the data from matched replicas.

  2. If tidb_match_label_replica_read is defined and tidb_replica_read is defined as leader, the tidb-server will directly read data from the leader without considering label matching.

  3. If tidb_match_label_replica_read is defined and tidb_replica_read is defined as follower, the tidb-server will select the follower replicas which match the labels to read.

WDYT? @nolouch @djshow832
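To make the proposed rules concrete, here is a small sketch of how a value such as 'zone=dc-1,key=value' might be parsed and combined with tidb_replica_read. All function names are hypothetical, and the behavior for tidb_replica_read values other than leader and follower is not covered by the proposal, so the sketch simply disables label matching for them.

```go
package main

import (
	"fmt"
	"strings"
)

// parseLabels parses "k1=v1,k2=v2" into a map. Empty input yields an empty
// map, which corresponds to the variable's proposed empty default.
func parseLabels(s string) (map[string]string, error) {
	labels := make(map[string]string)
	if strings.TrimSpace(s) == "" {
		return labels, nil
	}
	for _, pair := range strings.Split(s, ",") {
		kv := strings.SplitN(pair, "=", 2)
		if len(kv) != 2 || strings.TrimSpace(kv[0]) == "" {
			return nil, fmt.Errorf("invalid label %q", pair)
		}
		labels[strings.TrimSpace(kv[0])] = strings.TrimSpace(kv[1])
	}
	return labels, nil
}

// matchesAll reports whether storeLabels contains every wanted key/value pair.
func matchesAll(want, storeLabels map[string]string) bool {
	for k, v := range want {
		if got, ok := storeLabels[k]; !ok || got != v {
			return false
		}
	}
	return true
}

// useLabelMatching encodes rules 2 and 3: labels are ignored when reading
// from the leader and applied when reading from followers.
func useLabelMatching(replicaRead string, want map[string]string) bool {
	return len(want) > 0 && replicaRead == "follower"
}

func main() {
	want, err := parseLabels("zone=dc-1,key=value")
	if err != nil {
		panic(err)
	}
	store := map[string]string{"zone": "dc-1", "key": "value", "host": "h1"}
	fmt.Println(useLabelMatching("follower", want) && matchesAll(want, store))
	fmt.Println(useLabelMatching("leader", want))
}
```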

@nolouch
Member

nolouch commented Dec 10, 2020

Can we extend tidb_replica_read rather than add a new variable?

@Yisaer
Contributor Author

Yisaer commented Dec 10, 2020

What about tidb_replica_read=txn_scope? Then the tidb-server would read from replicas that have the zone=txnScope label, and read from the leader as the fallback.

@nolouch
Member

nolouch commented Dec 10, 2020

@Yisaer How do we limit the scope for stale reads?

@djshow832
Contributor

Another option:
After the user sets txn_scope in the TiDB configuration and a stale read transaction is started, TiDB will read the replica whose label contains txn_scope. That is, txn_scope replaces label for TiDB instances.
Pros: Users don't need to specify any session variables in applications, which reduces complexity.
Cons: The configuration name txn_scope is not suitable in this scenario.

@nolouch
Member

nolouch commented Dec 10, 2020

@djshow832 I think it's ok in the stale read scenario. In the other normal read scenario, I think maybe we also need to consider an option to switch to read on the matching label.

@djshow832
Contributor

> @djshow832 I think it's ok in the stale read scenario. In the other normal read scenario, I think maybe we also need to consider an option to switch to read on the matching label.

What are the normal read scenarios? Any examples?

@Yisaer Yisaer closed this as completed Aug 4, 2021