Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

server: enhancement of ts validity check for all types of requests #57786

Open
7 tasks
cfzjywxk opened this issue Nov 28, 2024 · 1 comment
Open
7 tasks

server: enhancement of ts validity check for all types of requests #57786

cfzjywxk opened this issue Nov 28, 2024 · 1 comment
Labels
affects-6.5 This bug affects the 6.5.x(LTS) versions. affects-7.1 This bug affects the 7.1.x(LTS) versions. affects-7.5 This bug affects the 7.5.x(LTS) versions. affects-8.1 This bug affects the 8.1.x(LTS) versions. affects-8.5 This bug affects the 8.5.x(LTS) versions. priority/P0 The issue has P0 priority. severity/major sig/transaction SIG:Transaction type/bug The issue is confirmed as a bug. type/enhancement The issue or PR belongs to an enhancement.

Comments

@cfzjywxk
Copy link
Contributor

cfzjywxk commented Nov 28, 2024

Enhancement

Currently, some KV request-related and timestamp-related interfaces are directly exposed to the upper-layer TiDB. This can lead to issues due to inappropriate usages, including:

  1. Incomplete KV Request Error Handling:
    For example, inadequate handling of various KV error types, such as backoff, retries, and others.

  2. Incorrect Manually Specified Timestamps:
    Providing timestamps that exceed the maximum pre-allocated by PD can break the linearizability of the entire cluster.

Similar issues have occurred, such as:

These problems highlight the need to refine these interfaces and enforce stricter controls.

For internal task execution, internal sessions and internal transactions should be used to operate, avoiding directly assembling and sending KV requests.

Possible enhancements

  1. Restrict the interfaces of distsql and its executorBuilder. For exmaple, a executor builder offer interface to set the startTS without any reminding and checking https://github.com/pingcap/tidb/blob/master/pkg/distsql/request_builder.go#L251.
  2. Restrict the usages of Storage interfaces. For example, a snapshot could be generated by any input timestampes https://github.com/pingcap/tidb/blob/master/pkg/kv/kv.go#L695.
  3. Adding ts validity check in both tidb and tikv. For example, checking if the being used timestamp is within the gap of some configured allowed or tolerated clockdifft, return error if it's a problematic ts.

There are mainly 3 interfaces provided for the usages in tidb from the persipective of kv-client and storage

  • The Storage interface, it exposes the Snapshot(TS) interface
  • The DistSQL interface, it exposes the SetStartTS interface
  • The Transaction interface

Detailed tasks

TiDB

  • Deprecate the exposed [SetStartTS](https://github.com/pingcap/tidb/blob/master/pkg/distsql/request_builder.go#L251) interface. Internal tasks should no longer set any timestamp value directly. Instead, provide options such as AllocatedTS or MaxUint64TS. For DAG builders within a transaction, introduce a builderWithTxn interface to enforce using the activated transaction's startTS.
  • Add constraint checks to ensure only PD-allocated timestamps or MaxUint64 are used when fetching snapshots through the Storage.Snapshot interface.
  • Unify timestamp validation logic for tidb_snapshot and AS OF TIMESTAMP features related to stale reads.
  • (Long term) Abstract a TimeStamp type similar to TiKV's implementation, adding a field to indicate the source of the timestamp (e.g., whether it is allocated by the PD server or calculated).

Client-go

  • Try to validate read ts for all RPC requests tikv/client-go#1513 Implement a timestamp validation mechanism to ensure the read timestamp is valid for all KV read request types(actually write request types are needed too). If the condition is violated, make the TiDB server panic (or use log.Fatal) and provide detailed source request information.
    • For new releases, consider adding a configuration option or system variable to set the allowable gap.
    • For stable releases, log errors with detailed debug information instead of triggering a panic.

TiKV

  • Enhancement: Add timestamp safety boundary to prevent unreasonable max_ts updates tikv/tikv#17916
    Add a timestamp validation mechanism to ensure that updates to the max_ts value remain within an allowable gap from the PD-allocated timestamp. If the gap is exceeded, make the TiKV server panic and log the source request details.
    • For stable releases, log detailed error information rather than panicking.
  • (Long term) Extend the TimeStamp type to include a field indicating its source. This field can be used to validate whether the timestamp is eligible for updating the max_ts.
@cfzjywxk cfzjywxk added type/enhancement The issue or PR belongs to an enhancement. priority/P0 The issue has P0 priority. labels Nov 28, 2024
@cfzjywxk cfzjywxk added type/bug The issue is confirmed as a bug. sig/transaction SIG:Transaction severity/major affects-6.5 This bug affects the 6.5.x(LTS) versions. affects-7.1 This bug affects the 7.1.x(LTS) versions. affects-7.5 This bug affects the 7.5.x(LTS) versions. affects-8.1 This bug affects the 8.1.x(LTS) versions. affects-8.5 This bug affects the 8.5.x(LTS) versions. labels Dec 9, 2024
@ti-chi-bot ti-chi-bot bot added may-affects-5.4 This bug maybe affects 5.4.x versions. may-affects-6.1 labels Dec 9, 2024
@cfzjywxk
Copy link
Contributor Author

cfzjywxk commented Dec 9, 2024

The bug label is used to pick these check enhancements to LTS release versions.

@cfzjywxk cfzjywxk removed may-affects-5.4 This bug maybe affects 5.4.x versions. may-affects-6.1 labels Dec 10, 2024
@cfzjywxk cfzjywxk changed the title server: prevent KV request-related interfaces and user-specified timestamp interfaces from being called directly server: enhancement of ts validity check for all types of requests Dec 13, 2024
ekexium pushed a commit to ekexium/tidb that referenced this issue Jan 14, 2025
ekexium pushed a commit to ekexium/tidb that referenced this issue Jan 14, 2025
ekexium pushed a commit to ekexium/tidb that referenced this issue Jan 14, 2025
ekexium pushed a commit to ekexium/tidb that referenced this issue Jan 14, 2025
ekexium pushed a commit to ekexium/tidb that referenced this issue Jan 15, 2025
ekexium pushed a commit to ekexium/tidb that referenced this issue Jan 15, 2025
ekexium pushed a commit to ekexium/tidb that referenced this issue Jan 15, 2025
ekexium pushed a commit to ekexium/tidb that referenced this issue Jan 15, 2025
ekexium pushed a commit to ekexium/tidb that referenced this issue Jan 15, 2025
ekexium pushed a commit to ekexium/tidb that referenced this issue Jan 15, 2025
ekexium pushed a commit to ekexium/tidb that referenced this issue Jan 15, 2025
ekexium pushed a commit to ekexium/tidb that referenced this issue Jan 15, 2025
ekexium pushed a commit to ekexium/tidb that referenced this issue Jan 15, 2025
ekexium pushed a commit to ekexium/tidb that referenced this issue Jan 15, 2025
ekexium pushed a commit to ekexium/tidb that referenced this issue Jan 16, 2025
ekexium pushed a commit to ekexium/tidb that referenced this issue Jan 16, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
affects-6.5 This bug affects the 6.5.x(LTS) versions. affects-7.1 This bug affects the 7.1.x(LTS) versions. affects-7.5 This bug affects the 7.5.x(LTS) versions. affects-8.1 This bug affects the 8.1.x(LTS) versions. affects-8.5 This bug affects the 8.5.x(LTS) versions. priority/P0 The issue has P0 priority. severity/major sig/transaction SIG:Transaction type/bug The issue is confirmed as a bug. type/enhancement The issue or PR belongs to an enhancement.
Projects
None yet
Development

No branches or pull requests

1 participant