Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(engine): introduce sync implementation of StateRootTask #12378

Merged
merged 13 commits into from
Nov 18, 2024

Conversation

fgimenez
Copy link
Member

@fgimenez fgimenez commented Nov 7, 2024

Towards #12053

Replaces async implementation with sync given that EngineApiTreeHandler is sync and runs in its own thread without setting up ny tokio runtime. We can always run this same implementation in an async context using spwan_blocking + send.

It also adds a bench to compare std::sync::mpsc::channel with crossbeam_channel::unbounded for receiving state updates.

use super::{StateRootConfig, StateRootHandle, StateRootResult};
use reth_trie::updates::TrieUpdates;
use revm_primitives::{EvmState, B256};
use std::sync::mpsc::{self, Receiver, RecvError};
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

worth checking if we get perf improvements by using crossbeam channels

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry for the previous message, i was benching the wrong thing (the channel for sending the final result instead of the channel for sending each state update).

ok, we have the right bench here 7239538 to compare crossbeam_channel::unbounded and std::sync::mpsc::channel() when sending EvmState (just sending items, not taking into account how they are processed), these are the results:

state_stream_channels/std_channel/1
                        time:   [49.237 µs 50.471 µs 51.552 µs]
state_stream_channels/crossbeam_channel/1
                        time:   [47.803 µs 50.518 µs 55.016 µs]
state_stream_channels/std_channel/10
                        time:   [286.21 µs 293.42 µs 297.39 µs]
Found 4 outliers among 10 measurements (40.00%)
  1 (10.00%) low severe
  1 (10.00%) low mild
  2 (20.00%) high mild
state_stream_channels/crossbeam_channel/10
                        time:   [285.84 µs 295.41 µs 306.75 µs]
state_stream_channels/std_channel/100
                        time:   [1.0524 ms 1.0585 ms 1.0670 ms]
Found 1 outliers among 10 measurements (10.00%)
  1 (10.00%) high mild
state_stream_channels/crossbeam_channel/100
                        time:   [1.0317 ms 1.0411 ms 1.0510 ms]

so, mostly the same performance for both implementations, slightly better crossbeam for big state changes (100 accounts), but this is not going to be very frequent.

given these results I think it would be better to keep std::syn::mpsc::channel, one dep less, wdyt?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, then this doesn't really matter.
crossbeam is already part of the dep tree so we can use it or not

Comment on lines 47 to 51
rayon::spawn(move || {
debug!(target: "engine::tree", "Starting sync state root task");
let result = self.run();
let _ = tx.send(result);
});
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think rayon is a bit problematic here because this does a lot of IO.

since we're always using this, we could use a dedicated background thread for this that loops/recv StateRootSyncTasks

similar to the existing persistence task

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

indeed, this is what we had initially 891edaa recovered, also switched from mpsc::channel to mpsc::sync_channel(1)

@fgimenez fgimenez added A-blockchain-tree Related to sidechains, reorgs and pending blocks A-trie Related to Merkle Patricia Trie implementation labels Nov 8, 2024
@fgimenez fgimenez added the C-perf A change motivated by improving speed, memory usage or disk footprint label Nov 8, 2024
@fgimenez fgimenez force-pushed the fgimenez/srt-sync-async-impls branch 3 times, most recently from d16d1c3 to d0f48d5 Compare November 11, 2024 19:06
@fgimenez fgimenez force-pushed the fgimenez/srt-sync-async-impls branch from d0f48d5 to fdb55e9 Compare November 12, 2024 15:34
@fgimenez fgimenez force-pushed the fgimenez/srt-sync-async-impls branch from fdb55e9 to 5d2048f Compare November 13, 2024 13:34
Copy link
Collaborator

@mattsse mattsse left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

Copy link
Member

@rkrasiuk rkrasiuk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@rkrasiuk rkrasiuk added this pull request to the merge queue Nov 18, 2024
Merged via the queue into main with commit 8339c71 Nov 18, 2024
41 checks passed
@rkrasiuk rkrasiuk deleted the fgimenez/srt-sync-async-impls branch November 18, 2024 14:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-blockchain-tree Related to sidechains, reorgs and pending blocks A-trie Related to Merkle Patricia Trie implementation C-perf A change motivated by improving speed, memory usage or disk footprint
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants