r/stm_manager: yield to prevent busy loops when apply results in error
If an apply results in an error, the stm manager should yield to prevent
busy looping and consuming a lot of CPU cycles. Fixed yielding in the
background apply fiber and added a yield if the foreground apply didn't
make progress.

Signed-off-by: Michał Maślanka <michal@redpanda.com>
mmaslankaprv authored and bharathv committed May 22, 2024
1 parent 08af2b2 commit e121770
Showing 1 changed file with 18 additions and 3 deletions.
21 changes: 18 additions & 3 deletions src/v/raft/state_machine_manager.cc
@@ -363,6 +363,18 @@ ss::future<> state_machine_manager::try_apply_in_foreground() {
           batch_applicator(default_ctx, machines, _as, _log),
           model::no_timeout);
 
+        if (max_last_applied == model::offset{}) {
+            vlog(
+              _log.warn,
+              "no progress has been made during state machine apply. Current "
+              "next offset: {}",
+              _next);
+            /**
+             * If no progress has been made, yield to prevent busy looping
+             */
+            co_await ss::sleep_abortable(100ms, _as);
+            co_return;
+        }
         _next = std::max(model::next_offset(max_last_applied), _next);
         vlog(_log.trace, "updating _next offset with: {}", _next);
     } catch (const ss::timed_out_error&) {
@@ -433,10 +445,13 @@ ss::future<> state_machine_manager::background_apply_fiber(
         try {
             model::record_batch_reader reader = co_await _raft->make_reader(
               config);
-            co_await std::move(reader).consume(
+            auto last_applied_before = entry->stm->last_applied_offset();
+            auto last_applied_after = co_await std::move(reader).consume(
               batch_applicator(background_ctx, {entry}, _as, _log),
               model::no_timeout);
-
+            if (last_applied_before >= last_applied_after) {
+                error = true;
+            }
         } catch (...) {
             error = true;
             vlog(
@@ -446,7 +461,7 @@ ss::future<> state_machine_manager::background_apply_fiber(
               std::current_exception());
         }
         if (error) {
-            co_await ss::sleep_abortable(1s, _as);
+            co_await ss::sleep_abortable(100ms, _as);
         }
     }
     units.return_all();
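The idea behind both hunks (compare the last applied offset before and after an apply attempt, and sleep instead of retrying immediately when it did not advance) can be sketched in standard C++. This is only an illustration under stated assumptions, not the Redpanda implementation: `offset_t`, `apply_stats`, and `apply_until` are hypothetical names, the offset is a plain integer rather than `model::offset`, and a blocking `std::this_thread::sleep_for` stands in for the abortable `ss::sleep_abortable` used in the commit.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <functional>
#include <thread>

// Hypothetical stand-in for model::offset; a plain integer in this sketch.
using offset_t = std::int64_t;

// Hypothetical bookkeeping so the behavior of the loop can be observed.
struct apply_stats {
    int attempts = 0;          // number of calls to apply_once
    int backoffs = 0;          // number of sleeps caused by lack of progress
    offset_t last_applied = 0; // offset reached when the loop stopped
};

// Repeatedly calls apply_once until the offset reaches `target` or
// `max_attempts` calls have been made. apply_once receives the current
// last applied offset and returns the last applied offset after its attempt.
// If an attempt does not advance the offset, the loop sleeps for `backoff`
// before the next attempt instead of spinning.
inline apply_stats apply_until(
  offset_t target,
  offset_t start,
  const std::function<offset_t(offset_t)>& apply_once,
  std::chrono::milliseconds backoff,
  int max_attempts) {
    assert(backoff.count() >= 0);
    apply_stats stats;
    stats.last_applied = start;
    while (stats.last_applied < target && stats.attempts < max_attempts) {
        const offset_t before = stats.last_applied;
        const offset_t after = apply_once(before);
        ++stats.attempts;
        if (before >= after) {
            // No progress was made: back off to avoid a busy loop.
            ++stats.backoffs;
            std::this_thread::sleep_for(backoff);
            continue;
        }
        stats.last_applied = after;
    }
    return stats;
}
```

A caller would pass a callable that performs one apply attempt. In this sketch the back-off is a fixed interval, mirroring the 100ms used in the commit, and the `before >= after` comparison mirrors the check added to the background fiber. A real Seastar fiber would use a non-blocking, abortable sleep rather than blocking the thread.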