feat(stages): recover senders for both DB and static file transactions #6089

shekhirin · 2024-01-16T13:47:02Z

With this PR, when recovering senders in the SenderRecovery pipeline stage, we look at both the Transactions database table and the static files for the Transactions segment. This becomes relevant once #5733 is merged.

One thing to note with the current implementation is that we no longer retrieve RawKey and RawValue from the database when querying transactions for sender recovery, but instead decode them before passing the transactions to threads for recovery. The bench showed there's no significant difference in performance between the two approaches.

Holesky benchmark (main vs this PR):

ubuntu@reth4:~/reth-snaps-alexey$ hyperfine './reth_main stage --chain holesky run --datadir ~/.local/share/reth/holesky --from 0 --to 808004 senders'
Benchmark 1: ./reth_main stage --chain holesky run --datadir ~/.local/share/reth/holesky --from 0 --to 808004 senders
  Time (mean ± σ):     48.451 s ±  0.335 s    [User: 1696.042 s, System: 6.879 s]
  Range (min … max):   47.847 s … 48.946 s    10 runs

ubuntu@reth4:~/reth-snaps-alexey$ hyperfine './reth_6089_raw stage --chain holesky run --datadir ~/.local/share/reth/holesky --from 0 --to 808004 senders'
Benchmark 1: ./reth_6089_raw stage --chain holesky run --datadir ~/.local/share/reth/holesky --from 0 --to 808004 senders
  Time (mean ± σ):     52.559 s ±  0.146 s    [User: 1762.202 s, System: 8.185 s]
  Range (min … max):   52.248 s … 52.708 s    10 runs

In addition, keep in mind that transactions, headers, and receipts can now live in static files, so when calculating the stage checkpoint we need to look in both the database and static files. I introduced a StatsReader trait for that and replaced all calls to tx.entries::<T>() with it.
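To illustrate the idea behind StatsReader, here is a minimal sketch of counting entries across both storage locations. Everything in it is an assumption made for illustration: the `Table` enum, the `Provider` struct, the `count_entries` method name, and the in-memory maps are hypothetical and do not reflect the actual trait definition in this PR.

```rust
use std::collections::HashMap;

/// Hypothetical identifier for the data a stage checkpoint cares about.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
enum Table {
    Headers,
    Transactions,
    Receipts,
}

/// Assumed shape of the trait: report the total number of entries for a
/// table, no matter where those entries are stored.
trait StatsReader {
    fn count_entries(&self, table: Table) -> usize;
}

/// Toy provider holding per-table entry counts for both storage locations.
struct Provider {
    database: HashMap<Table, usize>,
    static_files: HashMap<Table, usize>,
}

impl StatsReader for Provider {
    fn count_entries(&self, table: Table) -> usize {
        // A table missing from either location contributes zero entries.
        let in_database = self.database.get(&table).copied().unwrap_or(0);
        let in_static_files = self.static_files.get(&table).copied().unwrap_or(0);
        in_database + in_static_files
    }
}

/// Builds a provider where transactions are split across both locations
/// and headers live only in static files.
fn example_provider() -> Provider {
    let mut database = HashMap::new();
    database.insert(Table::Transactions, 50);

    let mut static_files = HashMap::new();
    static_files.insert(Table::Transactions, 100);
    static_files.insert(Table::Headers, 10);

    Provider { database, static_files }
}
```

With a database-only count, the checkpoint total for Transactions in this toy setup would be 50 instead of 150, which is the kind of undercount the combined lookup is meant to avoid.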

shekhirin · 2024-01-25T14:45:06Z

Okay, it performs worse... I need to bring back the raw query functionality

ubuntu@reth4:~/reth-snaps-alexey$ hyperfine './reth_main stage --chain holesky run --datadir ~/.local/share/reth/holesky --from 0 --to 808004 senders'
Benchmark 1: ./reth_main stage --chain holesky run --datadir ~/.local/share/reth/holesky --from 0 --to 808004 senders
  Time (mean ± σ):     48.451 s ±  0.335 s    [User: 1696.042 s, System: 6.879 s]
  Range (min … max):   47.847 s … 48.946 s    10 runs

ubuntu@reth4:~/reth-snaps-alexey$ hyperfine './reth_6089 stage --chain holesky run --datadir ~/.local/share/reth/holesky --from 0 --to 808004 senders'
Benchmark 1: ./reth_6089 stage --chain holesky run --datadir ~/.local/share/reth/holesky --from 0 --to 808004 senders
  Time (mean ± σ):     64.176 s ±  0.451 s    [User: 1700.714 s, System: 15.671 s]
  Range (min … max):   63.374 s … 64.920 s    10 runs

shekhirin · 2024-01-25T16:32:31Z

Raw query looks good

ubuntu@reth4:~/reth-snaps-alexey$ hyperfine './reth_main stage --chain holesky run --datadir ~/.local/share/reth/holesky --from 0 --to 808004 senders'
Benchmark 1: ./reth_main stage --chain holesky run --datadir ~/.local/share/reth/holesky --from 0 --to 808004 senders
  Time (mean ± σ):     48.451 s ±  0.335 s    [User: 1696.042 s, System: 6.879 s]
  Range (min … max):   47.847 s … 48.946 s    10 runs

ubuntu@reth4:~/reth-snaps-alexey$ hyperfine './reth_6089_raw stage --chain holesky run --datadir ~/.local/share/reth/holesky --from 0 --to 808004 senders'
Benchmark 1: ./reth_6089_raw stage --chain holesky run --datadir ~/.local/share/reth/holesky --from 0 --to 808004 senders
  Time (mean ± σ):     52.559 s ±  0.146 s    [User: 1762.202 s, System: 8.185 s]
  Range (min … max):   52.248 s … 52.708 s    10 runs
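As I read the thread, the "raw query" approach keeps values undecoded while reading and leaves decoding to the worker threads that also do the recovery. The sketch below shows that split using only the standard library. The `Tx` type, the 8-byte encoding, and `recover_sender` are hypothetical stand-ins, not the stage's real types or its signature recovery.

```rust
use std::thread;

/// Hypothetical decoded transaction; a real one carries a signature.
#[derive(Debug, PartialEq)]
struct Tx {
    nonce: u64,
}

/// Stand-in for decoding a raw value: expects exactly 8 little-endian bytes.
fn decode(raw: &[u8]) -> Option<Tx> {
    if raw.len() != 8 {
        return None;
    }
    let mut bytes = [0u8; 8];
    bytes.copy_from_slice(raw);
    Some(Tx { nonce: u64::from_le_bytes(bytes) })
}

/// Stand-in for the expensive sender recovery step.
fn recover_sender(tx: &Tx) -> u64 {
    tx.nonce.wrapping_mul(31)
}

/// The reading side only hands out raw bytes; every worker decodes and
/// recovers its own chunk. Returns None if any value fails to decode.
fn recover_from_raw(raw_values: &[Vec<u8>], workers: usize) -> Option<Vec<u64>> {
    let workers = workers.max(1);
    let chunk_size = ((raw_values.len() + workers - 1) / workers).max(1);

    let per_chunk: Vec<Option<Vec<u64>>> = thread::scope(|scope| {
        // Spawn all workers first so the chunks are processed in parallel.
        let handles: Vec<_> = raw_values
            .chunks(chunk_size)
            .map(|chunk| {
                scope.spawn(move || {
                    chunk
                        .iter()
                        .map(|raw| decode(raw).map(|tx| recover_sender(&tx)))
                        .collect::<Option<Vec<u64>>>()
                })
            })
            .collect();
        // Join in spawn order so results keep the input order.
        handles
            .into_iter()
            .map(|handle| handle.join().expect("worker thread panicked"))
            .collect()
    });

    let mut senders = Vec::with_capacity(raw_values.len());
    for chunk in per_chunk {
        senders.extend(chunk?);
    }
    Some(senders)
}
```

The design point is that the single reading thread does no decoding work, so it is less likely to become the bottleneck while the workers wait for input.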

onbjerg

lgtm

onbjerg · 2024-01-25T17:54:15Z

crates/stages/benches/criterion.rs

        setup::unwind_hashes,
        stage,
-        1..DEFAULT_NUM_BLOCKS,
+        1..=DEFAULT_NUM_BLOCKS,


was this a bug before?

or well, "bug", I guess this is in a benchmark, so it doesn't really matter

shekhirin · 2024-01-25

the behavior didn't change: even though we were passing an exclusive range, we constructed the StageRange struct with range.start and range.end
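A small sketch of why the switch from `1..N` to `1..=N` can leave behavior unchanged when only the two bounds are read. The `StageRange` struct here is a hypothetical stand-in with assumed field names, not the definition used in the benchmark code.

```rust
use std::ops::{Range, RangeInclusive};

/// Hypothetical stand-in for the StageRange struct mentioned above.
#[derive(Debug, PartialEq)]
struct StageRange {
    start: u64,
    end: u64,
}

/// Reads only the two bounds of an exclusive range.
fn from_exclusive(range: Range<u64>) -> StageRange {
    StageRange { start: range.start, end: range.end }
}

/// Reads only the two bounds of an inclusive range.
fn from_inclusive(range: RangeInclusive<u64>) -> StageRange {
    StageRange { start: *range.start(), end: *range.end() }
}
```

The two range types differ only when they are iterated: `1..100` yields 99 items and `1..=100` yields 100. Code that copies out the start and end bounds sees the same pair of numbers either way.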

onbjerg

lgtm

emhane

I like that you don't constrain generics before it's necessary; it improves the readability of the code. it also makes the code more flexible, and therefore easier to test, since generic types without constraints can easily be mocked.

in some places the code can panic by calling expect. in these situations, are you sure you want the node to stop execution? I'd suggest using debug_assert and putting as much state as could help with debugging into the message. in a release build, consider whether it's better to continue execution conditionally with if let Some(..) .. and if let Ok(..) ... in some cases it may indeed be best for the node to stop.

emhane · 2024-01-29T14:06:34Z

crates/stages/src/stages/sender_recovery.rs

-    let (tx_id, transaction) =
-        entry.map_err(|e| Box::new(SenderRecoveryStageError::StageError(e.into())))?;
-    let tx_id = tx_id.key().expect("key to be formated");
+    let tx = tx.value().expect("value to be formated");


for example here, could it be possible that one tx value is not formatted but other tx values are? in that case, it's better not to panic

shekhirin · 2024-01-29

not really, we expect the values in both the database and static files to always be decodable, so if decoding fails, it's a critical unrecoverable error and something went wrong: either the user interfered with the database, or we have a bug.

emhane · 2024-01-29

alright, in case of a bug, it would be nice to get more context, same advice you gave me last week
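One way to get that extra context without panicking is to return an error that carries the identifying state and let the caller decide what to do with it. The sketch below assumes a made-up `DecodeError` type and an 8-byte toy encoding; neither reflects the real error types in sender_recovery.rs.

```rust
use std::fmt;

/// Hypothetical error carrying the state needed to debug a bad value.
#[derive(Debug, PartialEq)]
struct DecodeError {
    tx_id: u64,
    len: usize,
}

impl fmt::Display for DecodeError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(
            f,
            "failed to decode transaction value: tx_id={} len={}",
            self.tx_id, self.len
        )
    }
}

impl std::error::Error for DecodeError {}

/// Toy decoder: a valid value is exactly 8 little-endian bytes.
fn decode_value(tx_id: u64, raw: &[u8]) -> Result<u64, DecodeError> {
    if raw.len() != 8 {
        return Err(DecodeError { tx_id, len: raw.len() });
    }
    let mut bytes = [0u8; 8];
    bytes.copy_from_slice(raw);
    Ok(u64::from_le_bytes(bytes))
}

/// Stops at the first undecodable value and passes the error upstream
/// instead of calling expect.
fn decode_all(entries: &[(u64, Vec<u8>)]) -> Result<Vec<u64>, DecodeError> {
    entries
        .iter()
        .map(|(tx_id, raw)| decode_value(*tx_id, raw))
        .collect()
}
```

The failure is still treated as fatal for the batch, but the caller now learns which transaction was affected and can log it before stopping.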

emhane · 2024-01-29

btw, here and in tracing, it's nice if the variable names match the log message keys exactly. it makes debugging easier. so,

    let my_var = 1;
    debug!(target: "..", my_var = my_var, "");

shekhirin · 2024-01-29T19:05:28Z

Okay there's some sort of a bug with ETA, probably related to stage_checkpoint calculation changes

2024-01-29T19:03:16.582026Z  INFO reth_node_core::events: Stage finished executing pipeline_stages=4/13 stage=SenderRecovery checkpoint=836683 target=836683 stage_progress=100.00% stage_eta=4341721years 8months 26days 23h 27m 58s

shekhirin · 2024-01-31T14:32:40Z

ETA bug is unrelated to these changes, the progress is calculated correctly

shekhirin added 4 commits January 16, 2024 13:28

feat(stages): recover senders for transactions in both db and static …

e017a7e

…files

query snapshots too

3ea1def

return bench targets back

e6875cb

fix test

0ca5eb7

shekhirin added the C-enhancement (New feature or request), A-staged-sync (Related to staged sync: pipelines and stages), and A-static-files (Related to static files) labels Jan 18, 2024

shekhirin added 2 commits January 18, 2024 12:23

count entries from both db and snapshot

c2c03ca

cargo fmt

057f9dc

shekhirin added the A-db (Related to the database) label Jan 18, 2024

shekhirin marked this pull request as ready for review January 18, 2024 13:40

shekhirin requested review from rakita, joshieDo, onbjerg and rkrasiuk as code owners January 18, 2024 13:40

shekhirin marked this pull request as draft January 18, 2024 13:40

shekhirin removed request for onbjerg, rakita, rkrasiuk and joshieDo January 18, 2024 13:40

shekhirin marked this pull request as ready for review January 18, 2024 13:42

shekhirin requested review from joshieDo, mattsse and onbjerg January 18, 2024 13:52

onbjerg approved these changes Jan 18, 2024

Merge remote-tracking branch 'origin/main' into alexey/sender-recover…

a159af1

…y-snapshot

shekhirin added 3 commits January 25, 2024 15:40

raw values

dd8740c

fmt

ed0c660

allocate capacity

98b9081

comment querying from snapshots to prevent slowdown

e9e217d

shekhirin force-pushed the alexey/sender-recovery-snapshot branch from ed39c9a to e9e217d Compare January 25, 2024 16:23

shekhirin requested a review from onbjerg January 25, 2024 16:44

shekhirin changed the base branch from main to feat/static-files January 25, 2024 17:39

shekhirin requested review from gakonst, DaniPopes, Rjected and Evalir as code owners January 25, 2024 17:39

shekhirin changed the base branch from feat/static-files to main January 25, 2024 17:40

shekhirin force-pushed the alexey/sender-recovery-snapshot branch from 7fd3598 to f394358 Compare January 25, 2024 17:43

shekhirin requested a review from emhane as a code owner January 25, 2024 17:43

shekhirin changed the base branch from main to feat/static-files January 25, 2024 17:43

shekhirin changed the base branch from feat/static-files to main January 25, 2024 17:43

shekhirin force-pushed the alexey/sender-recovery-snapshot branch from f394358 to 8394742 Compare January 25, 2024 17:44

Revert "comment querying from snapshots to prevent slowdown"

019d1bf

This reverts commit e9e217d.

shekhirin force-pushed the alexey/sender-recovery-snapshot branch from 8394742 to 019d1bf Compare January 25, 2024 17:45

shekhirin changed the base branch from main to feat/static-files January 25, 2024 17:49

onbjerg approved these changes Jan 25, 2024

Merge remote-tracking branch 'origin/feat/static-files' into alexey/s…

68c5296

…ender-recovery-snapshot

shekhirin force-pushed the alexey/sender-recovery-snapshot branch from 7d6bb56 to 68c5296 Compare January 26, 2024 13:14

onbjerg approved these changes Jan 29, 2024

emhane reviewed Jan 29, 2024

Merge remote-tracking branch 'origin/feat/static-files' into alexey/s…

f360e36

…ender-recovery-snapshot

pass tx.value() error upstream

8e8a059

shekhirin merged commit d1abb98 into feat/static-files Jan 31, 2024
24 checks passed

shekhirin deleted the alexey/sender-recovery-snapshot branch January 31, 2024 14:32
