
tiflash can't gc completely after set tiflash replica 0 #5975

Closed
lilinghai opened this issue Sep 21, 2022 · 2 comments · Fixed by #6010

Bug Report

Please answer these questions before submitting your issue. Thanks!

1. Minimal reproduce step (Required)

[screenshots of the reproduce steps attached in the original issue]

2. What did you expect to see? (Required)

3. What did you see instead (Required)

4. What is your TiFlash version? (Required)

master

@lilinghai lilinghai added the type/bug The issue is confirmed as a bug. label Sep 21, 2022
@breezewish breezewish self-assigned this Sep 21, 2022
breezewish (Member) commented Sep 21, 2022

Since we use a lot of logical split now, the estimated rows and size (which are based on pack statistics) are no longer accurate. The inaccurate rows and size mislead the GC and prevent DTFiles from being recycled.

We also need to continuously observe whether there are other problems when the estimated rows and size are very inaccurate.

There are two possible solution directions:

A. Make estimated rows and size more accurate.
B. Always use the real rows and size to drive segment GC (a rough sketch follows).
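
As a very rough illustration of direction B (not TiFlash code; `RealStat` and `shouldGcSegment` are hypothetical names), the GC decision could be driven by exactly counted rows, so that stale pack-statistics estimates cannot block recycling:

```cpp
#include <cstddef>

// Hypothetical, exactly-counted statistics for one segment (assumption: such
// real counts can be obtained, e.g. by scanning, instead of estimating them
// from pack statistics).
struct RealStat
{
    std::size_t valid_rows; // rows actually inside the segment's key range
    std::size_t total_rows; // rows stored in the underlying DTFiles
};

// Trigger segment GC only when the real invalid-data ratio crosses the
// threshold, so inaccurate estimates cannot prevent DTFile recycling.
bool shouldGcSegment(const RealStat & s, double invalid_data_ratio_threshold)
{
    if (s.total_rows == 0)
        return false;
    const double invalid_ratio = 1.0 - static_cast<double>(s.valid_rows) / static_cast<double>(s.total_rows);
    return invalid_ratio >= invalid_data_ratio_threshold;
}
```

The trade-off is that obtaining real counts may require reading data that the estimates were designed to avoid touching.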

JaySon-Huang (Contributor) commented

An example: a segment has the range [27192, 30593), while the DMFile in its stable contains two packs. The first pack contains rows in [16994, 23791) ∪ [33991, 35386) and the second pack contains rows in [35386, 37381).

No data under this segment is valid (it was generated by a logical split), but currently we cannot execute GC on a segment like this. The GC routine uses the min-max index for decision-making:

```cpp
bool shouldCompactStableWithTooMuchDataOutOfSegmentRange(const DMContext & context, //
                                                         const SegmentPtr & seg,
                                                         const SegmentSnapshotPtr & snap,
                                                         const SegmentPtr & prev_seg,
                                                         const SegmentPtr & next_seg,
                                                         double invalid_data_ratio_threshold,
                                                         const LoggerPtr & log)
{
    auto [first_pack_included, last_pack_included] = snap->stable->isFirstAndLastPackIncludedInRange(context, seg->getRowKeyRange());
    // Do a quick check about whether the DTFile is completely included in the segment range
    // ... (excerpt truncated in the original comment)
}
```

```cpp
std::pair<bool, bool> StableValueSpace::Snapshot::isFirstAndLastPackIncludedInRange(const DMContext & context, const RowKeyRange & range) const
{
    // Usually, this method will be called for some "cold" key ranges.
    // Loading the index into cache may pollute the cache and make the hot index cache invalid.
    // So don't refill the cache if the index does not exist.
    bool first_pack_included = false;
    bool last_pack_included = false;
    for (size_t i = 0; i < stable->files.size(); i++)
    {
        const auto & file = stable->files[i];
        auto filter = DMFilePackFilter::loadFrom(
            file,
            context.db_context.getGlobalContext().getMinMaxIndexCache(),
            /*set_cache_if_miss*/ false,
            {range},
            RSOperatorPtr{},
            IdSetPtr{},
            context.db_context.getFileProvider(),
            context.getReadLimiter(),
            context.tracing_id);
        const auto & use_packs = filter.getUsePacks();
        if (i == 0)
        {
            // TODO: this check may not be correct when support multiple files in a stable, let's just keep it now for simplicity
            first_pack_included = use_packs.empty() || use_packs[0];
        }
        if (i == stable->files.size() - 1)
        {
            // TODO: this check may not be correct when support multiple files in a stable, let's just keep it now for simplicity
            last_pack_included = use_packs.empty() || use_packs.back();
        }
    }
    return std::make_pair(first_pack_included, last_pack_included);
}
```
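
To make the failure mode concrete: a min-max index can only prove that a pack has no rows in a range when the pack's min-max interval and the range are disjoint; any overlap means "possibly in range". Below is a minimal sketch (hypothetical code, not TiFlash's implementation; `MinMax` and `mayIntersect` are illustrative names) using the numbers from the example above:

```cpp
#include <cassert>

struct MinMax { long min; long max; }; // half-open interval [min, max)

// Min-max filtering can only rule a pack out when the intervals are disjoint;
// any overlap reports "possibly in range", even if every actual row of the
// pack falls outside the segment range.
bool mayIntersect(const MinMax & pack, long range_begin, long range_end)
{
    return pack.min < range_end && range_begin < pack.max;
}

int main()
{
    // First pack holds rows in [16994, 23791) ∪ [33991, 35386),
    // so its min-max summary is the whole interval [16994, 35386).
    const MinMax first_pack{16994, 35386};
    // Segment range produced by the logical split.
    const long seg_begin = 27192, seg_end = 30593;

    // The check reports a possible hit, although no actual row of the pack
    // lies inside [27192, 30593) -- the gap [23791, 33991) covers it entirely.
    assert(mayIntersect(first_pack, seg_begin, seg_end));
    return 0;
}
```

Because the filter cannot rule the first pack out, `first_pack_included` is reported as true for this segment even though the segment contains no valid data.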

JaySon-Huang pushed a commit that referenced this issue Sep 22, 2022
close #5975

Signed-off-by: Wish <breezewish@outlook.com>
ti-chi-bot pushed a commit that referenced this issue Oct 5, 2022