Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

update disk root more frequently #29565

Closed

Conversation

Brindrajsinh-Chauhan
Copy link

@Brindrajsinh-Chauhan Brindrajsinh-Chauhan commented Apr 17, 2024

This PR adds the ability to update the diskRoot after a time threshold along with the memory threshold.

Current Scenario

For permissioned/private networks, where there can be a lot of empty blocks, it can take a lot of time/transactions to meet the ~4MB memory limit before the disk root is updated. Thus when a chain experiences an unclean shutdown, the rewind logic which takes the state back to the disk Root causes a massive rewind which can take a long time to build back up.

Proposed Scenario

This PR adds a commit threshold that triggers a disk root update once the threshold has been crossed. It still maintains the concept of 128 layers above the disk root but provides the ability to merge the flatten layer with the disk layer more frequently for chain with less transaction activity. This ensure that the disk root is updated at a threshold thus preventing massive rollbacks. This is disabled by default and can be enabled by the following in the config file
AllowForceUpdate - toggle to enable/disable it
CommitThreshold - Desired commit threshold for diskRoot update

@holiman
Copy link
Contributor

holiman commented Apr 18, 2024

We do have this:

// SetTrieFlushInterval configures how often in-memory tries are persisted
// to disk. The value is in terms of block processing time, not wall clock.
// If the value is shorter than the block generation time, or even 0 or negative,
// the node will flush trie after processing each block (effectively archive mode).
func (api *DebugAPI) SetTrieFlushInterval(interval string) error {

Implemented by @s1na here

It makes it possible to set this via the debug namespace. Is it sufficient for your purpose?

@karalabe
Copy link
Member

@holiman OP is talking about the snapshot flush threshold, not the trie of. I'm a bit reluctant to touch that part since we want to merge snapshots and pathdb, so maybe less special code at this point would be nice.

@Brindrajsinh-Chauhan
Copy link
Author

Brindrajsinh-Chauhan commented Apr 18, 2024

@karalabe Thank you for the feedback. The pathDB part you mentioned would be referring to this right https://github.com/ethereum/go-ethereum/tree/master/triedb/pathdb. and the Issue mentioned here #28617. I can take a look at the and see how the implementation can be updated. Would be grateful to get some direction or pointers to move this forward.
Thank you

@Brindrajsinh-Chauhan
Copy link
Author

@karalabe @holiman Updated the implementation. Would be grateful if you could provide feedback on the same

PathDB already had a parameter force here which has a similar purpose of bypassing the limit comparison and flush to disk. This PR surfaced this parameter to the layerTree.cap method and adds the same parameter to snapshotTree.Cap method. This is believed to take out the special implementation from the tree layers which would prevent any issues when the implementations are merged in the future.

This PR also updates the time based threshold to number of commits to that tree instead to maintain similar implementations across both methods. Through the force boolean flag, it attempts to update the diskLayer after every n commits to the tree. If the layers are less than 128, it still does not commit it to disk thus maintaining the top 128 layers.

@Brindrajsinh-Chauhan
Copy link
Author

@karalabe @holiman @rjl493456442 Would be very grateful to get your feedback on the updated logic to and resolve this issue.
Really appreciate your support thus far.
Thank you

@holiman
Copy link
Contributor

holiman commented Sep 13, 2024

@holiman OP is talking about the snapshot flush threshold, not the trie of. I'm a bit reluctant to touch that part since we want to merge snapshots and pathdb, so maybe less special code at this point would be nice.

I agree with this. We have lots of plans for the future of our state machinery, and I don't think we should take on further complexity in those areas at this time.

@holiman holiman closed this Sep 13, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants