docs: RFC for TTL tables #39264
Co-authored-by: Mattias Jonsson <mjonss@users.noreply.github.com>
- Range: [10m0s, 8760h0m0s]
- Default: 1h

- `tidb_ttl_job_schedule_window_start_time`
I still don't like the time window option. Data is very likely to be only partially deleted within each time window.
- In most cases, secondary indexes will not be created, to avoid insert hotspots. In this scenario, we can also reduce some unnecessary scans by caching statistical information. For example, we can cache the creation time of the oldest row for each region after a job finishes. When the next job starts, it can check all the regions; if the data of a region has no updates and its cached time has not expired, the job just skips that region.
- If a TTL table has TiFlash replicas, we can scan TiFlash instead of TiKV.
- In the future, we can schedule the tasks of one job across multiple nodes instead of executing them on only one node. This approach will improve the resource utilization of the cluster. It also means we can execute more tasks concurrently, which makes the scan and delete faster.
- If a table does not have any secondary index, we can do some further optimizations. One optimization is to push the scan and delete down to the TiKV side, without exchanging data between TiDB and TiKV. It is somewhat like what GCWorker does. In this approach, a new coprocessor command "TTLGC" will be introduced; when a job starts, TiDB will send "TTLGC" commands to each region, and TiKV will then scan and delete the expired rows (TiKV should delete expired rows in a non-transactional way).
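
The region-skipping idea in the first bullet can be sketched roughly as follows. This is only an illustration in Python, not TiDB code: `RegionStat`, `regions_to_scan`, and the per-region `version` counter used to detect updates are all hypothetical names and assumptions, since the RFC text does not say how "no updates" would be detected.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Dict, List, Optional


@dataclass
class RegionStat:
    """Hypothetical cache entry recorded for a region after a TTL job finishes."""
    oldest_row_time: Optional[datetime]  # creation time of the oldest remaining row; None if the region was empty
    version: int                         # assumed change counter; differs if the region was updated since caching


def regions_to_scan(
    regions: Dict[int, int],
    cache: Dict[int, RegionStat],
    ttl: timedelta,
    now: datetime,
) -> List[int]:
    """Return the IDs of regions the next TTL job still has to scan.

    `regions` maps region ID to its current (assumed) version counter.
    A region is skipped only when it is unchanged since the cached
    statistic was taken and its oldest cached row has not expired yet.
    """
    to_scan = []
    for region_id, version in sorted(regions.items()):
        stat = cache.get(region_id)
        unchanged = stat is not None and stat.version == version
        if unchanged:
            not_expired = stat.oldest_row_time is None or stat.oldest_row_time + ttl > now
            if not_expired:
                continue  # nothing in this region can have expired; skip it
        to_scan.append(region_id)
    return to_scan
```

Under these assumptions, a region with no cache entry, a changed version, or an expired oldest row is always scanned, so a stale or missing cache only costs extra work and never skips expired data.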
- scan with stale read to avoid lock contention with main traffic write
- scan and delete with LOW_PRIORITY?
integrate with resource control framework?
scan and delete with LOW_PRIORITY has already been implemented in the first version
LGTM
LGTM
/merge
This pull request has been accepted and is ready to merge. Commit hash: a202c34
/merge
This pull request has been accepted and is ready to merge. Commit hash: 87c3465
What problem does this PR solve?
Issue Number: close #39263