This repository was archived by the owner on Oct 18, 2023. It is now read-only.

xCheckpoint: move blocking ops to use bottomless context #702

Merged: 1 commit, Sep 26, 2023

@Horusiath (Contributor) commented Sep 26, 2023

Some context: we started observing panics on checkpoint calls, with the following trace:

2023-09-26T07:41:16.425109Z TRACE sqld: database checkpoint
2023-09-26T07:41:16.425266Z TRACE xCheckpoint{emode=3 busy_handler=Some(0x103e7e8d0) busy_arg=0x600003728000 sync_flags=10 n_buf=4096 z_buf=0x7ff12d04ac00 frames_in_wal=0x70000f7ee304 backfilled_frames=0x70000f7ee308}: sqld::replication::primary::logger: bottomless checkpoint
2023-09-26T07:41:16.425462Z ERROR xCheckpoint{emode=3 busy_handler=Some(0x103e7e8d0) busy_arg=0x600003728000 sync_flags=10 n_buf=4096 z_buf=0x7ff12d04ac00 frames_in_wal=0x70000f7ee304 backfilled_frames=0x70000f7ee308}: tracing_panic: A panic occurred panic.payload="there is no reactor running, must be called from the context of a Tokio 1.x runtime" panic.location="sqld/src/replication/primary/logger.rs:229:27"
2023-09-26T07:41:16.425636Z ERROR xCheckpoint{emode=3 busy_handler=Some(0x103e7e8d0) busy_arg=0x600003728000 sync_flags=10 n_buf=4096 z_buf=0x7ff12d04ac00 frames_in_wal=0x70000f7ee304 backfilled_frames=0x70000f7ee308}: sqld_libsql_bindings::wal_hook: panic in call to xframe: there is no reactor running, must be called from the context of a Tokio 1.x runtime:
   0: std::backtrace::Backtrace::create
   1: std::backtrace::Backtrace::force_capture
   2: sqld_libsql_bindings::wal_hook::xCheckpoint
   3: _sqlite3BtreeCheckpoint
   4: _sqlite3VdbeExec
   5: _sqlite3_step
   6: rusqlite::row::Rows::get_expected_row
   7: core::ops::function::FnOnce::call_once{{vtable.shim}}
   8: std::sys_common::backtrace::__rust_begin_short_backtrace
   9: core::ops::function::FnOnce::call_once{{vtable.shim}}
  10: std::sys::unix::thread::Thread::new::thread_start
  11: __pthread_start

2023-09-26T07:41:16.425968Z  WARN sqld: failed to execute checkpoint: Internal Error: `Failed to receive response via oneshot channel: channel closed

It turns out that no current Tokio runtime context exists when xCheckpoint is called. To fix this, we pull the runtime context from the bottomless hook.

The second fix is related to LibsqlConnection, which was using an OS thread spawn instead of a Tokio spawn.

@MarinPostma MarinPostma force-pushed the checkpoints-on-bottomless-runtime-context branch from ebd7148 to 5369e0f Compare September 26, 2023 09:25
@MarinPostma MarinPostma added this pull request to the merge queue Sep 26, 2023
Merged via the queue into main with commit 067c248 Sep 26, 2023
@MarinPostma MarinPostma deleted the checkpoints-on-bottomless-runtime-context branch September 26, 2023 10:07