Optimizing hotspot small tables #25293

Closed
27 of 32 tasks
tiancaiamao opened this issue Jun 9, 2021 · 17 comments · Fixed by #29477
Labels: challenge-program, type/enhancement

Comments

@tiancaiamao
Contributor

tiancaiamao commented Jun 9, 2021

Focus this week:
- Disable DDL for some cases
- Finish performance test report

Current status:
Target: Make this feature GA
- KR1: Finish developing 95%
- KR2: Finish testing 90%
- KR3: Document, GA announce, etc 90%

Plan for the next 4 weeks:
- Finish all testing

Status indicator:
The previous week was spent on core bank performance testing, and the result is as expected.

Description

When a table is very small, all of its data is located in a single region. That region can become a hotspot, and such hotspots cause a performance bottleneck. Caching the small table's data directly in the TiDB layer solves such hotspot issues.

For a small, often used and rarely changed table, caching the whole table in memory in the TiDB server can improve performance.

Document Collection

Talent Challenge Program information

Milestones and action items

Milestone 1: Support schema change for caching table, Expected finish date: TBD

Milestone 2: Support reading and writing on cached table, Expected finish date: TBD

Milestone 3: Test with fault injection cases

  • Test by injecting failpoints during some DDL change phases
  • Chaos testing for the read/write operations

Misc:

Limitation & Known bugs

@tiancaiamao
Contributor Author

I'll use the comments on this issue for the weekly project progress reports.
The date of each comment is recorded, so I don't need to write it.
The latest progress is at the bottom.

@tiancaiamao
Contributor Author

tiancaiamao commented Oct 25, 2021

2021/10/25 This week:

@tiancaiamao
Contributor Author

tiancaiamao commented Nov 1, 2021

2021/11/01 This week:

@tiancaiamao
Contributor Author

2021/11/08 This week:

  • The first PR for the read operation is merged
  • A follow-up PR fixing the cache condition (an issue introduced by the first PR) is merged
  • The PR supporting the write operation is pushed and under active review. @lcwangchao gave some insightful comments, which surfaced several TODOs:
    • loadDataFromOriginalTable should use a separate transaction
    • The safety of the "begin; sleep(1h)..." case was not well considered in the past
    • The lock intend state should introduce an oldReadLease field to handle a corner case
    • Both read and write operations need to renew the leases for their locks
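
The lease handling mentioned in these TODOs can be illustrated with a small Go sketch. It is a guess at the general shape and not the code under review: it assumes four lock states (none, read, intend, write) and integer timestamps. Only the `oldReadLease` idea comes from the discussion above; `LockState`, `LockForRead`, `LockForWrite`, and `ErrReadLeaseHeld` are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
)

// LockType is a hypothetical enumeration of the table lock states.
type LockType int

const (
	LockNone LockType = iota
	LockRead
	LockIntend // a writer is waiting for the old read lease to expire
	LockWrite
)

// LockState is an illustrative stand-in for the per-table lock metadata.
// Lease and OldReadLease are timestamps in arbitrary integer ticks.
type LockState struct {
	Type         LockType
	Lease        uint64 // expiry of the current holder's lease
	OldReadLease uint64 // read lease that was valid when a writer arrived
}

var ErrReadLeaseHeld = errors.New("read lease still valid, writer must wait")

// LockForRead acquires or renews the read lease. It is refused while a
// writer (intending or writing) still holds an unexpired lease.
func (s *LockState) LockForRead(now, ttl uint64) bool {
	if (s.Type == LockIntend || s.Type == LockWrite) && now < s.Lease {
		return false
	}
	lease := now + ttl
	// Never shorten a lease that readers may still be relying on.
	if s.Type == LockRead && s.Lease > lease {
		lease = s.Lease
	}
	if s.Type == LockIntend && s.OldReadLease > lease {
		lease = s.OldReadLease
	}
	s.Type, s.Lease, s.OldReadLease = LockRead, lease, 0
	return true
}

// LockForWrite acquires or renews the write lock. If readers still hold a
// valid lease, it records that lease in OldReadLease, moves to the intend
// state, and asks the caller to retry later.
func (s *LockState) LockForWrite(now, ttl uint64) error {
	switch s.Type {
	case LockRead:
		if now < s.Lease {
			s.Type, s.OldReadLease, s.Lease = LockIntend, s.Lease, now+ttl
			return ErrReadLeaseHeld
		}
	case LockIntend:
		if now < s.OldReadLease {
			s.Lease = now + ttl // renew the intent while waiting
			return ErrReadLeaseHeld
		}
	}
	s.Type, s.Lease, s.OldReadLease = LockWrite, now+ttl, 0
	return nil
}

func main() {
	s := &LockState{}
	fmt.Println("read at 0:", s.LockForRead(0, 10))
	fmt.Println("write at 3:", s.LockForWrite(3, 5))
	fmt.Println("read at 4:", s.LockForRead(4, 10))
	fmt.Println("write at 10:", s.LockForWrite(10, 5))
	fmt.Println("read at 15:", s.LockForRead(15, 10))
}
```

The design point this sketch tries to capture is that a writer cannot proceed until every reader that might still trust the cached copy has seen its lease expire, which is why the old read lease is remembered separately from the writer's own lease.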

@JayLZhou
Contributor

@tiancaiamao please add compatibility checks with other features to this issue

@tiancaiamao
Contributor Author

tiancaiamao commented Dec 6, 2021

2021/12/06

  • No progress last week

Blocked on the review of PR #30066. It has been pending review for two weeks, and it has been 3 weeks since this work started. @bb7133

@tiancaiamao
Contributor Author

2021/12/11

  • No progress on development
  • The blocking PR #30066 finally got merged (after 3 weeks),
    so "renew write lock lease" #30206 is now ready for review.

@tiancaiamao
Contributor Author

2021/12/20

  • #30206 got merged, so the development is basically done
  • Planning for testing now

@tiancaiamao
Contributor Author

2021/12/27

  • The testing plan document was written last week
  • The design document PR was updated and is now ready for review
  • 2 new PRs were filed:
    • update the display of 'show create table' #30951
    • introduce a @@tidb_cache_table_lease sys variable #31018

@tiancaiamao
Contributor Author

2022/01/03

  • This feature will not be GA in this sprint (5.4), because the testing is not finished yet
  • Several PRs were filed this week for ecosystem tools compatibility: #31105, #31106, #31191
  • 1 bug (#31077) was found during testing and fixed
  • The project will be paused for the coming weeks because @tiancaiamao will be on call in the fire brigade team

@tiancaiamao
Contributor Author

2022/01/17

  • The project was paused for the last two weeks because of the on-call duty.
    This week the SQL infra team is in the bug jail again...

@tiancaiamao
Contributor Author

tiancaiamao commented Feb 14, 2022

2022/02/14

There was no update for a long time because of the Chinese New Year (Spring Festival).
Work restarted last week.

During this period, several bugs were found in the performance test for the core banking scenario.

  • Repeated renew-lease operations cause a performance decrease issue/31474
  • Support prepared plan cache for cached table issue/32003
  • Aggregation push down to coprocessor conflicts with cached table issue/32157

They are all fixed, and a basic QPS metric for cached table visits was added in pull/32171

@tiancaiamao
Contributor Author

2022/02/21

The performance test in the core banking scenario is done.
After caching some tables, latency decreased by 23.8% and throughput increased by 27.8% at the RC isolation level.
At the RR isolation level, latency decreased by 28.4% and throughput increased by 35%.

Three issues were found last week:

I've filed a PR for the first two, #32387, which is under review.
The last one is a performance issue. It is not caused by the cached table itself; it is a long-standing problem of the UnionScan executor.

@tiancaiamao
Contributor Author

tiancaiamao commented Feb 28, 2022

2022/02/28

There was not much progress last week.


3 participants