-
Notifications
You must be signed in to change notification settings - Fork 6.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Less contention in cache, part 4 #61250
Conversation
This is an automated comment for commit 3f9ca1c with description of existing statuses. It's updated for the latest CI running ❌ Click here to open a full report in a separate page
Successful checks
|
39d1c10
to
bb55a0c
Compare
…to less-contentaion-in-cache-part4
db0e2c0
to
3e4d9bf
Compare
src/Interpreters/Cache/FileCache.cpp
Outdated
if (limits_satisfied) | ||
keep_up_free_space_ratio_task->scheduleAfter(general_reschedule_ms); | ||
else | ||
keep_up_free_space_ratio_task->schedule(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure if it's a realistic scenario: if a task fails to free up the space repetitively (collectCandidatesForEviction
returns false
), it will be re-scheduled for immediate processing. Can it stuck in this state?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it is not a realistic scenario that we fail to free up required space because of not enough evictable/freeable cache entries, but limits_satisfied
will also be false
if we did not free up all the needed space because of the eviction batch size (by default it is quite small), so I thought it makes sense to schedule immediately in this case. I do not think it can get stuck in this state because this could possibly happen if cache constantly grows in size at a high speed, but in this case we should also get into lock contention case and get into another reschedule here:
ClickHouse/src/Interpreters/Cache/FileCache.cpp
Lines 969 to 973 in 3e4d9bf
auto lock = tryLockCache(); | |
/// To avoid deteriorating contention on cache, | |
/// proceed only if cache is not heavily used. | |
if (!lock) |
I will add profile event for the number of eviction tries, so we will be able to see if it is stuck in constant reschedule on the dashboard.
Looks like
02967_analyzer_fuzz
02967_analyzer_fuzz Stateless tests (release) — fail: 1, passed: 6497, skipped: 12 02967_analyzer_fuzz Stateless tests (tsan) [4/5] — fail: 1, passed: 1244, skipped: 18 02967_analyzer_fuzz Stateless tests (ubsan) [1/2] — fail: 1, passed: 3201, skipped: 31 02967_analyzer_fuzz
It complains about:
so not related to changes as well
|
Changelog category (leave one):
Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
Less contention in filesystem cache (part 4). Allow to keep filesystem cache not filled to the limit by doing additional eviction in the background (controlled by
keep_free_space_size(elements)_ratio
). This allows to release pressure from space reservation for queries (ontryReserve
method). Also this is done in a lock free way as much as possible, e.g. should not block normal cache usage.Contains commits from #61163, so should be merged/reviewed after.
Modify your CI run:
NOTE: If your merge the PR with modified CI you MUST KNOW what you are doing
NOTE: Checked options will be applied if set before CI RunConfig/PrepareRunConfig step
Include tests (required builds will be added automatically):
Exclude tests:
Extra options:
Only specified batches in multi-batch jobs: