Stream reader for chain state #1519

Merged
MujkicA merged 273 commits into feature/regenesis-support from feature/snapshot_reading on Feb 21, 2024

@MujkicA (Contributor) commented Nov 27, 2023

Relevant issue: #1209

Relevant previous PRs: #1474

This PR builds upon the previous work by introducing a resumable, parallelized (re)genesis process.

Starting with the run command, users provide a snapshot metadata file containing paths to the chain config file and files containing chain state items (such as coins, messages, contracts, contract states, and balances), which are loaded via streaming.

Each item group in the genesis process is handled by a separate worker, allowing for parallel loading. Workers stream file contents in batches, with the batch size adaptable for future performance optimization.
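
The worker layout described above can be sketched roughly as follows. This is a minimal illustration using only the standard library, not the code from this PR: `Group`, `run_worker`, and `load_all` are hypothetical names, the items are plain integers, and a real worker would write each batch to the database instead of counting it.

```rust
use std::thread;

/// Hypothetical stand-in for one item group (coins, messages, ...).
pub struct Group {
    pub name: &'static str,
    pub items: Vec<u64>,
}

/// Streams `items` in batches of `batch_size` and returns
/// (number of batches, number of items) processed.
/// `batch_size` must be non-zero, because `chunks` panics on 0.
pub fn run_worker(items: &[u64], batch_size: usize) -> (usize, usize) {
    let mut batches = 0;
    let mut loaded = 0;
    for batch in items.chunks(batch_size) {
        // A real worker would insert `batch` into the database here.
        batches += 1;
        loaded += batch.len();
    }
    (batches, loaded)
}

/// Spawns one worker thread per group and waits for all of them.
/// Results come back in the same order as the input groups.
pub fn load_all(
    groups: Vec<Group>,
    batch_size: usize,
) -> Vec<(&'static str, usize, usize)> {
    let handles: Vec<_> = groups
        .into_iter()
        .map(|group| {
            thread::spawn(move || {
                let (batches, loaded) = run_worker(&group.items, batch_size);
                (group.name, batches, loaded)
            })
        })
        .collect();

    handles
        .into_iter()
        .map(|handle| handle.join().expect("worker panicked"))
        .collect()
}
```

Because the batch size is just a parameter here, it can be tuned later without touching the worker logic, which matches the intent stated above.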

A database transaction is committed every time an item group is successfully loaded. Resumability is achieved by recording the last loaded group index within the same db tx. If loading is aborted, the remaining workers are shut down. Upon restart, workers resume from the last processed group.
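
To make the resumability idea concrete, here is a small sketch under stated assumptions: the "database" is an in-memory map, a transaction is simulated by staging a copy and swapping it in on success, and `Db`, `commit_group`, and `load` are hypothetical names rather than anything from this PR. The point it illustrates is that the group's items and the progress marker are written together, so a restart can skip everything up to the recorded index.

```rust
use std::collections::BTreeMap;

/// Toy in-memory "database": loaded items plus a progress marker.
#[derive(Default, Debug, Clone, PartialEq)]
pub struct Db {
    pub items: BTreeMap<u64, u64>,
    /// Index of the last group that was committed, if any.
    pub last_group: Option<usize>,
}

impl Db {
    /// Applies a group and its progress marker together, or neither.
    pub fn commit_group(
        &mut self,
        index: usize,
        group: &[(u64, u64)],
    ) -> Result<(), String> {
        // Stand-in for a DB transaction: work on a staged copy.
        let mut staged = self.clone();
        for &(key, value) in group {
            if staged.items.insert(key, value).is_some() {
                return Err(format!("duplicate key {} in group {}", key, index));
            }
        }
        staged.last_group = Some(index);
        // "Commit": items and marker become visible at the same time.
        *self = staged;
        Ok(())
    }
}

/// Loads groups in order, skipping those already recorded as committed.
pub fn load(db: &mut Db, groups: &[Vec<(u64, u64)>]) -> Result<(), String> {
    let start = db.last_group.map_or(0, |index| index + 1);
    for (index, group) in groups.iter().enumerate().skip(start) {
        db.commit_group(index, group)?;
    }
    Ok(())
}
```

If `load` fails partway through, calling it again with corrected input continues from the group after `last_group` instead of starting over.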

Database tables used for the genesis process are cleared once it's finalized.

Contract States and Balances
The use of uniform-sized batches may result in batches containing items from multiple contracts. We expect that best performance is achieved by selecting a batch size that typically encompasses an entire contract's state or balance, allowing for immediate initialization of relevant Merkle trees through nodes_from_set.
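
As an illustration of why the batch size matters, the sketch below splits one batch into runs of consecutive entries that belong to the same contract. It assumes entries arrive sorted by contract, and `Entry` and `split_by_contract` are hypothetical names. It does not call `nodes_from_set`; it only shows how one could tell whether a batch covers a single contract (one run) or straddles several (multiple runs).

```rust
/// Hypothetical state entry: (contract id, storage key, value).
pub type Entry = (u32, u64, u64);

/// Splits a batch into consecutive runs that share a contract id.
/// Assumes the input is ordered by contract id.
pub fn split_by_contract(batch: &[Entry]) -> Vec<(u32, &[Entry])> {
    let mut runs = Vec::new();
    let mut start = 0;
    for i in 1..=batch.len() {
        // Close the current run at the end of the batch or when the
        // contract id changes.
        if i == batch.len() || batch[i].0 != batch[start].0 {
            runs.push((batch[start].0, &batch[start..i]));
            start = i;
        }
    }
    runs
}
```

A batch that yields exactly one run holds entries for a single contract, which is the case the paragraph above expects to perform best.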

Note:

  • The significant diff count primarily arises from configuring and updating the test contract with a large state.
  • The most significant changes start in genesis.rs, so it is a good entry point for the review. Analysing how workers process the batches will guide you through the majority of the changes. Coin and message configs are simply loaded into the db. Contract configs are also loaded, but the contracts tree root cannot be calculated until the contract states and balances workers have finished.
    When inserting contract states and balances, we also have to init/update the corresponding trees.
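
The ordering constraint in the note above (contract roots depend on both the states worker and the balances worker) could be sketched like this. Everything here is hypothetical: `placeholder_root` is a simple fold, not a Merkle root, and `finalize_contracts` only demonstrates that both workers are joined before any per-contract result is assembled.

```rust
use std::collections::BTreeMap;
use std::thread;

/// Placeholder digest over a contract's values. NOT a Merkle root;
/// it only stands in for "something computed from all entries".
fn placeholder_root(values: &[u64]) -> u64 {
    values
        .iter()
        .fold(0u64, |acc, v| acc.wrapping_mul(31).wrapping_add(*v))
}

/// Groups (contract id, value) pairs and computes one digest per contract.
fn roots_per_contract(items: Vec<(u32, u64)>) -> BTreeMap<u32, u64> {
    let mut grouped: BTreeMap<u32, Vec<u64>> = BTreeMap::new();
    for (contract, value) in items {
        grouped.entry(contract).or_default().push(value);
    }
    grouped
        .into_iter()
        .map(|(contract, values)| (contract, placeholder_root(&values)))
        .collect()
}

/// Runs the states and balances workers in parallel, then assembles
/// (contract id, state digest, balance digest) only after both finish.
pub fn finalize_contracts(
    contracts: &[u32],
    states: Vec<(u32, u64)>,
    balances: Vec<(u32, u64)>,
) -> Vec<(u32, u64, u64)> {
    let states_worker = thread::spawn(move || roots_per_contract(states));
    let balances_worker = thread::spawn(move || roots_per_contract(balances));

    // Both joins must complete before any contract entry is finalized.
    let state_roots = states_worker.join().expect("states worker panicked");
    let balance_roots = balances_worker.join().expect("balances worker panicked");

    contracts
        .iter()
        .map(|id| {
            (
                *id,
                state_roots.get(id).copied().unwrap_or(0),
                balance_roots.get(id).copied().unwrap_or(0),
            )
        })
        .collect()
}
```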

@xgreenx (Collaborator) left a comment

Super puper cool change, thanks a lot! =) Left some optional comments, but I think it's already in good shape to be merged into the feature branch =)

I think we need to merge it into master as soon as possible because it touches some areas with ongoing work

xgreenx added a commit that referenced this pull request Feb 17, 2024
)

Almost closes #1583

The change moves the `OwnedCoins` and `OwnedMessageIds` tables and their
implementation to the GraphQL module. Now, the off-chain worker is
responsible for tracking owners of coins and messages. For that purpose,
`Executor` generates execution events to track which messages/coins are
consumed and created during execution. The worker uses these events and
updates corresponding tables.

It is not required to produce these events from the executor's
perspective since anyone can calculate them based on the imported block
and events from the relayer (knowing the execution rules of the state
transition). However, I decided to embed that logic into the executor to
simplify support for it. In the future, if we decide to change how coins
and messages are created/spent, we don't need to update a bunch of
places.

Events from the `ExecutionResult` are insufficient to complete the
change since we also need to support coins and messages from the genesis
block. I added a new function into the `genesis` module to perform an
update of the `OwnedCoins` and `OwnedMessageIds` tables from the
`StateConfig` (in the future, it will be done by an off-chain regenesis
process). The initial idea was to emit events, but after reviewing #1519,
I realized that we can have so many events that it will be hard to fit
them into memory.
So, I decided to implement a separate function that later can work with
batches and be parallelizable.
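
A batch-friendly owner index of the kind described in this commit message might look like the sketch below. It is an assumption-laden toy, not the function added in that commit: `OwnedCoinsIndex`, `apply_batch`, and `index_genesis_coins` are hypothetical names, owners and coin ids are small integers, and the index lives in memory rather than in a database table. What it shows is that the input is consumed as an iterator and flushed in fixed-size batches, so the full set of coins never has to be materialized as events.

```rust
use std::collections::{BTreeMap, BTreeSet};

pub type Owner = u8;
pub type CoinId = u32;

/// Toy owner -> coins index (a stand-in for an on-disk table).
#[derive(Default)]
pub struct OwnedCoinsIndex {
    by_owner: BTreeMap<Owner, BTreeSet<CoinId>>,
}

impl OwnedCoinsIndex {
    /// Records every (owner, coin) pair from one batch.
    pub fn apply_batch(&mut self, batch: &[(Owner, CoinId)]) {
        for &(owner, coin) in batch {
            self.by_owner.entry(owner).or_default().insert(coin);
        }
    }

    /// Returns the coins of `owner` in ascending order.
    pub fn coins_of(&self, owner: Owner) -> Vec<CoinId> {
        self.by_owner
            .get(&owner)
            .map(|coins| coins.iter().copied().collect())
            .unwrap_or_default()
    }
}

/// Builds the index from a stream of coins, holding at most
/// `batch_size` entries in memory at a time (besides the index itself).
pub fn index_genesis_coins(
    coins: impl Iterator<Item = (Owner, CoinId)>,
    batch_size: usize,
) -> OwnedCoinsIndex {
    let mut index = OwnedCoinsIndex::default();
    let mut batch = Vec::with_capacity(batch_size);
    for coin in coins {
        batch.push(coin);
        if batch.len() >= batch_size {
            index.apply_batch(&batch);
            batch.clear();
        }
    }
    // Flush the final, possibly partial batch.
    index.apply_batch(&batch);
    index
}
```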

---------

Co-authored-by: Mitchell Turner <james.mitchell.turner@gmail.com>
Co-authored-by: Brandon Kite <brandonkite92@gmail.com>
@MujkicA MujkicA merged commit a4021f1 into feature/regenesis-support Feb 21, 2024
32 checks passed
@MujkicA MujkicA deleted the feature/snapshot_reading branch February 21, 2024 13:32
crypto523 pushed a commit to crypto523/fuel-core that referenced this pull request Oct 7, 2024
Part of the FuelLabs/fuel-core#1583.

The change moves the genesis block execution and commitment from the
`FuelService::new` to the `FuelService::Task::into_task`.

It allows us to notify other services about the genesis block because
all services are already subscribed to the block importer (it is what we
need for FuelLabs/fuel-core#1583 to process
new messages inside the off-chain worker). Plus, it adds support for the
`async` syntax (it will be used by the parallel regenesis process from
FuelLabs/fuel-core#1519).

Moving genesis block initialization from the constructor to the startup
stage breaks p2p because `P2PService` requires knowing the `Genesis`
type to create `FuelP2PService` (it is used to filter connections with
peers). Because of that, I moved the creation of the `FuelP2PService` to
`UninitializedTask::into_task`, where the genesis block is already
available.
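
The two-phase construction described in this commit message can be sketched as follows. This is a simplified, synchronous illustration with hypothetical types (`Genesis`, `P2p`, `UninitTask`, `Task`); the real change is asynchronous and uses the types named above. The idea shown is that the constructor stays cheap, and anything that depends on the genesis block is created inside `into_task`, after genesis is available.

```rust
/// Hypothetical genesis data produced at startup.
#[derive(Debug, Clone, PartialEq)]
pub struct Genesis {
    pub chain_id: u64,
}

/// Hypothetical p2p component that needs genesis to filter peers.
pub struct P2p {
    pub genesis: Genesis,
}

/// First phase: holds only configuration, no genesis work yet.
pub struct UninitTask {
    chain_id: u64,
}

/// Second phase: genesis has been executed and dependents are built.
pub struct Task {
    pub genesis: Genesis,
    pub p2p: P2p,
}

impl UninitTask {
    /// Cheap constructor: does not execute genesis.
    pub fn new(chain_id: u64) -> Self {
        Self { chain_id }
    }

    /// Executes genesis first, then creates the components that need it.
    pub fn into_task(self) -> Task {
        let genesis = Genesis { chain_id: self.chain_id };
        let p2p = P2p { genesis: genesis.clone() };
        Task { genesis, p2p }
    }
}
```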