-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(revm): Integrate State #3512
Conversation
709867f
to
b41aaff
Compare
Wanted to do full sync with revm state as It passes first 5M blocks, and set target to 17M. It is a lot slower so I left a few timestamps to figure out where the time goes. Commit 4c5453d A number of executed blocks is hardcoded to 10k will do it for 50k and 200k just to compare numbers. One log is here:
66s for execution where 7s is writing/commiting data to the database. Out of 66s, 41s is spent in revm and 19s is spent on merging transitions. The great thing is that merging transitions can be done in parallel with evm execution. Few more logs to see consistency of times:
|
for hardcoded 50k blocks commit: One line:
Block fetch 22s In executor it is: Transition merge needs to be addresses and optimized. Multiple measurements to see consistency:
|
I forgot to remove some test cloning, now it is a lot better. Logs are with 50k blocks:
Time spend:
In total for this sample, it takes 474s. Things that can be done in parallel are:
Parallel work in total is 76s that is just 16% of execution stage, while evm transact is 320s which is 67%, writing to db is 61s or 12,8% and apply of revm output is 14.7s or 2.5%. We can additionally write changesets in the background it is just append operation while writing plain state at the end. |
86e1261
to
f4f232f
Compare
b865c01
to
a32615e
Compare
Codecov Report
... and 20 files with indirect coverage changes
Flags with carried forward coverage won't be shown. Click here to find out more.
|
This reverts commit 762245e.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a great start - thank you.
Main thought is we should split this up into a stacked PR, starting with the deeper changes:, and ensuring each type is well documented and we know why it's needed. Right now things are super verbose and require fine handling to set them up correctly, or a lot of context is needed in mind to use it properly.
Suggested path forward is to split into a few PRs which we stack on top of each other
- The glue code between Revm and Reth, mostly in
cache.rs
, with clear description on what the wrapper BundleState is doing - Modify the actual executor & processor code
- RPC & TxPool changes (if any required after our review)
Any driveby changes like some type casts to u128 etc. should go in separate PR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can see how this will work, and overall this is a big improvement.
I went over most of it and left a lot of comments on everything that caught my eye.
for the alias issues,
I'd like to avoid Revm
aliases, because this makes it very confusing if we don't use the revm aliases everywhere, perhaps a better solution here is if we need to have wrappers, then these could be prefixed with Reth
instead or something else.
re rpc related changes:
what's the motivation behind replacing CacheDB with RevmState if we always disable the bundle state anyway?
we're only interested in the statediff, does the statediff output of revmState and cacheDB now differ?
} | ||
|
||
if TAKE { | ||
if UNWIND { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what are these constants?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
UNWIND
(Or previously TAKE
) would remove (take/unwind) things from the database. We can maybe continue here: https://github.com/paradigmxyz/reth/pull/3512/files#r1309408492
crates/revm/src/processor.rs
Outdated
use std::{sync::Arc, time::Instant}; | ||
use tracing::{debug, trace}; | ||
|
||
/// Main block executor |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This needs docs that explain what this type does
/// Write reverts to database. | ||
/// | ||
/// Note:: Reverts will delete all wiped storage from plain state. | ||
pub fn write_to_db<'a, TX: DbTxMut<'a> + DbTx<'a>>( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the naming is very confusing to me, because writing reverts results in wiped storage, because this does not actually write anything and instead deletes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I inherited this, can bikeshed a little on this, should be fruitful.
little history, this pattern was used when we still were not sure how to handle Provider
and needed to decide on the pattern for how to write things to the database. At that point in time this was one of the promising paths that we had.
What we have decided and have now is @joshieDo had split the Provider
with the traits so I think this should be moved there and have BundeStateProvider
trait that will have commit_state(bundle: State)
and commit_reverts(reverts:Reverts)
. This feels nicer imo and it shouldn't be a big change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function both writes/cleans up empty plainstate, but it also appends (the reverts) to the changesets tables. Makes me wonder if we should call reverts Prestate's or something.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Created the issue for this as it is debt from before revm state: #4620
Co-authored-by: Alexey Shekhirin <a.shekhirin@gmail.com>
Co-authored-by: Bjerg <onbjerg@users.noreply.github.com>
Co-authored-by: Bjerg <onbjerg@users.noreply.github.com>
Co-authored-by: Bjerg <onbjerg@users.noreply.github.com>
Co-authored-by: Bjerg <onbjerg@users.noreply.github.com>
Co-authored-by: Bjerg <onbjerg@users.noreply.github.com>
Co-authored-by: Bjerg <onbjerg@users.noreply.github.com>
Co-authored-by: Bjerg <onbjerg@users.noreply.github.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
amazing
left my last (very) pedantic nits
@@ -950,10 +957,17 @@ impl<DB: Database, C: Consensus, EF: ExecutorFactory> BlockchainTree<DB, C, EF> | |||
let old_tip = self.block_indices.canonical_tip(); | |||
// Merge all chain into one chain. | |||
let mut new_canon_chain = chains_to_promote.pop().expect("There is at least one block"); | |||
trace!(target: "blockchain_tree", ?new_canon_chain, "Merging chains"); | |||
let mut have_append = false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's rename this variable, to something like chain_promoted|appended
post_state: &PostState, | ||
executor: &mut Executor<DB>, | ||
bundle_state: &BundleStateWithReceipts, | ||
client: &impl StateProviderFactory, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I dislike this refs to impl style,
would prefer &S
here and add generic to function
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would prefer &impl
here but don't mind changing it
pub fn populate_account_balance_nonce_diffs<DB, I>( | ||
state_diff: &mut StateDiff, | ||
db: DB, | ||
db: &DB, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
DB is autoimpl for &
db: &DB, | |
db: DB, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed it from few places: d3cec9f
crates/revm/src/processor.rs
Outdated
use std::{sync::Arc, time::Instant}; | ||
use tracing::{debug, trace}; | ||
|
||
/// Main block executor |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
still would like to see more docs here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I missed those, will add
} | ||
|
||
/// Creates a new executor from the given chain spec and database. | ||
pub fn new_with_db<DB: StateProvider + 'a>( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
style:
pub fn new_with_db<DB: StateProvider + 'a>( | |
pub fn with_db<DB: StateProvider + 'a>( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I prefer constructors to be prefixed with new_
, as I relate with_
to the builders.
@@ -278,6 +278,9 @@ pub enum RpcInvalidTransactionError { | |||
/// The transaction is before Spurious Dragon and has a chain ID | |||
#[error("Transactions before Spurious Dragon should not have a chain ID.")] | |||
OldLegacyChainId, | |||
/// The transitions is before Berlin and has access list |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/// The transitions is before Berlin and has access list | |
/// The transaction is before Berlin and has access list |
/// Type used to initialize revms bundle state. | ||
pub type BundleStateInit = | ||
HashMap<Address, (Option<Account>, Option<Account>, HashMap<H256, (U256, U256)>)>; | ||
|
||
/// Types used inside RevertsInit to initialize revms reverts. | ||
pub type AccountRevertInit = (Option<Option<Account>>, Vec<StorageEntry>); | ||
|
||
/// Type used to initialize revms reverts. | ||
pub type RevertsInit = HashMap<BlockNumber, HashMap<Address, AccountRevertInit>>; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still really want to get rid of these,
In any case they should not be public and it should be explained what these represent, but I'd prefer actual types over aliases
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would be work related to this: #4614
|
||
/// A change to the state of the world. | ||
#[derive(Default)] | ||
pub struct StateChange(pub StateChangeset); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
many changes?
pub struct StateChange(pub StateChangeset); | |
pub struct StateChanges(pub StateChangeset); |
pub execution_duration: Duration, | ||
/// The amount of time it took to apply state changes to cached state | ||
pub apply_state_duration: Duration, | ||
/// Apply Post state execution changes duration. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this doc is not very informative and uses outdated post state terminology
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It should have been post execution state change
as the name implies we have state change post execution.
Would update docs
/// Block execution statistics | ||
#[derive(Clone, Debug, Default)] | ||
pub struct BlockExecutorStats { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this type feels a bit out of place here in between the traits, can move to end of file?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Really excited for this work. Finally!
Main reaction is as usual...documentation & further simplification of the components / terminology. There's a few moving parts so the more we can document things like the processor the better.
Things I find confusing:
- change.rs file is big / a lot of things going on, would love more docs & refactoring in structs vs type re-exports and being sensitive about what gets exposed to consumers of libraries
- provider that has UNWIND/TAKE is still a bit "a lot of things in one place" so would love if we can break that down
Otherwise mainly nits and nothing from my side blocking this going forward.
I am also realizing an example on the Revm repo which shows how to the new API can be consumed, as well as comparing it with CacheDB could be powerful)
crates/revm/src/processor.rs
Outdated
/// Post execution state change that include block reward, withdrawals and iregular DAO hardfork | ||
/// state change. | ||
pub fn post_execution_state_change( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: post is a bit weird word here - apply
maybe
Cargo.toml
Outdated
@@ -59,7 +59,7 @@ resolver = "2" | |||
[workspace.package] | |||
version = "0.1.0-alpha.8" | |||
edition = "2021" | |||
rust-version = "1.70" # Remember to update .clippy.toml and README.md | |||
rust-version = "1.70" # Remember to update .clippy.toml and README.md |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rust-version = "1.70" # Remember to update .clippy.toml and README.md | |
rust-version = "1.70" # Remember to update .clippy.toml and README.md |
nit
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was auto formating by plugin, will revert back, but we should address what is our default formatting for toml. Had the same problem with revm where Dani has its own config (width 100 iirc)
[patch.crates-io] | ||
revm = { git = "https://github.com/bluealloy/revm", branch = "release/v25" } | ||
revm-interpreter = { git = "https://github.com/bluealloy/revm", branch = "release/v25" } | ||
revm-precompile = { git = "https://github.com/bluealloy/revm", branch = "release/v25" } | ||
revm-primitives = { git = "https://github.com/bluealloy/revm", branch = "release/v25" } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wooo
*address, | ||
Account { nonce: account.nonce.unwrap_or(0), balance: account.balance, bytecode_hash }, | ||
(Some(None), storage.keys().map(|k| StorageEntry::new(*k, U256::ZERO)).collect()), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit this Some(None)
is a bit overhead to think about - maybe the bundle builder helps
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be, linked the issue for crossreference #4614
bin/reth/src/init.rs
Outdated
let mut all_reverts_init: RevertsInit = HashMap::new(); | ||
all_reverts_init.insert(0, reverts_init); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let mut all_reverts_init: RevertsInit = HashMap::new(); | |
all_reverts_init.insert(0, reverts_init); | |
let all_reverts_init: RevertsInit = HashMap::from([(0, reverts_init)]); |
think this should work also
} | ||
|
||
// Create header log bloom. | ||
let logs_bloom = receipts_with_bloom.iter().fold(Bloom::zero(), |bloom, r| bloom | r.bloom); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit i'd just add a method on the struct which does this as it seems like a common operation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can be addressed by Receipts
type: #4621
gas_spent_by_tx: self | ||
.receipts | ||
.last() | ||
.map(|block_r| { | ||
block_r | ||
.iter() | ||
.enumerate() | ||
.map(|(id, tx_r)| { | ||
( | ||
id as u64, | ||
tx_r.as_ref() | ||
.expect("receipts have not been pruned") | ||
.cumulative_gas_used, | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would consider making a method on receipts
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can be addressed by Receipts
type: #4621
/// Receipts. | ||
receipts: Vec<Vec<Option<Receipt>>>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does this Vec<Vec<Option<_>> represent? Can we document what the outer/inner vec mean and when a Receipt can be none
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Documented it. And can be addressed by Receipts
type: #4621
|
||
/// Type used to initialize revms bundle state. | ||
pub type BundleStateInit = | ||
HashMap<Address, (Option<Account>, Option<Account>, HashMap<H256, (U256, U256)>)>; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the double option i wonder if we could replace it with an enum
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Related to this: #4614
/// Write reverts to database. | ||
/// | ||
/// Note:: Reverts will delete all wiped storage from plain state. | ||
pub fn write_to_db<'a, TX: DbTxMut<'a> + DbTx<'a>>( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function both writes/cleans up empty plainstate, but it also appends (the reverts) to the changesets tables. Makes me wonder if we should call reverts Prestate's or something.
a9d2e0f
to
79345c5
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lfg
Integrate revm State inside reth and remove instances of CacheDB and PostState and replace all functionality that PostState had.
Handling account status inside revm helped us to be more certain that transitions and changesets are correctly created.
revm PR that explains inner workings of State can be seen here: bluealloy/revm#499
Additionally, bump revm to the newest version.
Rename the
Executor
toEVMProcessor
and addtake_output
andtake_stats
to it, this allows us to get an aggregated output of the execution of multiple blocks andtake_stats
allows us to see how much time a particular operation took.🤖 Generated by Copilot at d017931
This pull request improves the
primitives
crate by removing unnecessary code from theBytecode
struct, fixing its deserialization logic, and suppressing a clippy lint warning in thebits.rs
module.🤖 Generated by Copilot at d017931