More metrics about write stall caused by disk IO during applying raftlog #8448

Closed
JaySon-Huang opened this issue Dec 1, 2023 · 0 comments · Fixed by #8446
Labels
type/enhancement The issue or PR belongs to an enhancement.

JaySon-Huang (Contributor) commented Dec 1, 2023

Enhancement

In one investigation, we found that a write stall may be caused by the logic that directly generates a ColumnFileTiny persisted on PageStorage. This happens when the raft log contains more than 1k rows or 1MB of data.
Because this path involves disk IO on PageStorage, it can make the RaftLog apply slow, and a write stall happens.

bool is_small = limit < dm_context->delta_cache_limit_rows / 4
    && alloc_bytes < dm_context->delta_cache_limit_bytes / 4;
// For small column files, data is appended to MemTableSet, then flushed later.
// For large column files, data is directly written to PageStorage, while the ColumnFile entry is appended to MemTableSet.
if (is_small)
{
    if (segment->writeToCache(*dm_context, block, offset, limit))
    {
        updated_segments.push_back(segment);
        break;
    }
}
else
{
    // If the column file hasn't been written, or the pk range has changed since the last write, then write it and
    // delete the previously written column file.
    if (!write_column_file || (write_column_file && write_range != rowkey_range))
    {
        wbs.rollbackWrittenLogAndData();
        wbs.clear();
        // In this case we construct a ColumnFile that does not hold the block data in memory.
        // The block data has been written to PageStorage through wbs.
        write_column_file = ColumnFileTiny::writeColumnFile(*dm_context, block, offset, limit, wbs);
        wbs.writeLogAndData();
        write_range = rowkey_range;
    }
    // The write could fail, because other threads may have already updated the instance, e.g. split/merge or merge delta.
    if (segment->writeToDisk(*dm_context, write_column_file))
    {
        updated_segments.push_back(segment);
        break;
    }
}
}

We need more metrics to identify this situation and help with troubleshooting.
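One possible direction (a minimal sketch, not the actual TiFlash metrics API): count how often a write takes the large-column-file path that persists a ColumnFileTiny to PageStorage, and record how long that persistence takes, so the Grafana panels can show how much of the raftlog apply latency is spent on this disk IO. The names WriteStallMetrics, observeTinyFileWrite, and instrumentedLargeWrite below are hypothetical and only illustrate where such instrumentation could hook in; in TiFlash they would likely map to Prometheus counters/histograms in the existing metrics registry.

    #include <atomic>
    #include <chrono>
    #include <cstdio>

    // Hypothetical metrics holder; in TiFlash this would be a registered
    // Prometheus counter plus a duration histogram.
    struct WriteStallMetrics
    {
        std::atomic<uint64_t> tiny_file_writes{0};   // writes that took the ColumnFileTiny/PageStorage path
        std::atomic<uint64_t> tiny_file_write_ns{0}; // total time spent persisting to PageStorage

        void observeTinyFileWrite(std::chrono::nanoseconds elapsed)
        {
            tiny_file_writes.fetch_add(1, std::memory_order_relaxed);
            tiny_file_write_ns.fetch_add(elapsed.count(), std::memory_order_relaxed);
        }
    };

    // Sketch of instrumenting the slow path from the snippet above.
    // `write_to_page_storage` stands in for the
    // ColumnFileTiny::writeColumnFile(...) + wbs.writeLogAndData() sequence.
    template <typename WriteFn>
    void instrumentedLargeWrite(WriteStallMetrics & metrics, WriteFn && write_to_page_storage)
    {
        const auto start = std::chrono::steady_clock::now();
        write_to_page_storage();
        metrics.observeTinyFileWrite(std::chrono::steady_clock::now() - start);
    }

    int main()
    {
        WriteStallMetrics metrics;
        instrumentedLargeWrite(metrics, [] { /* pretend to persist a large block to PageStorage */ });
        std::printf("tiny file writes: %llu\n",
                    static_cast<unsigned long long>(metrics.tiny_file_writes.load()));
    }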
