Off-chain database migration #1619
Comments
I'm not sure if this is an issue or a feature. I think we should make a clearer distinction between those. Typically Scrum has Issues, Features, and Epics, where issues are the smallest unit, features capture a group of issues, and epics capture a long-term business objective.
This is a master issue to track progress on the migration of the off-chain tables. Since we decided to split the migration into two parts, on-chain and off-chain, it has its own deliverables. The whole database migration belongs to the regenesis feature, but because the regenesis feature itself took 4 months, we need smaller tickets to track the work =)
Mostly refactoring to allow for #1619. Not final but close. Some cleanup remains here and there; that will be bundled with the actual feature itself.

The `SnapshotReader` and `SnapshotWriter` (previously the `StateReader` and `StateWriter`) now:
* have a single `read` (or `write`) method generic over the table type
* no longer use `*Config` structs in the interface: all reading and writing is to be done over tables
* read and write the `ChainConfig` themselves (since it is actually part of the snapshot)

The genesis progress is now a generic `String` -> `u64` mapping with no enumeration for the key (previously we had an enum with a variant for each table).

Every table gets its own parquet file. JSON is still in a single file with the `StateConfig` schema. Depending on whether we want the off-chain tables in the `StateConfig`, we can either drop them or include them in the next PR.

I avoided enumerating the tables as much as possible to lessen coupling. A new table should ideally require:
1. a call to `write::<NewTable>` when generating the snapshot
2. an implementation of `ProcessState<NewTable>` to describe how to import it on regenesis
3. an implementation of `AsTable` to describe how (if at all) this table can be extracted from an in-memory `StateConfig`
4. a call to `workers.spawn::<NewTable>` to run a worker to import it from the snapshot

After this PR we'll add in the off-chain tables, then identify and regenerate dependent tables (ideally with batching and resumability).

---------

Co-authored-by: xgreenx <xgreenx9999@gmail.com>
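To illustrate the idea of a single `read`/`write` pair generic over the table type, together with progress keyed by a plain `String`, here is a minimal in-memory sketch. It is not the fuel-core implementation: the `Table` trait, `InMemorySnapshot`, `GenesisProgress`, and the placeholder `Coins` and `Messages` entry types are all hypothetical names invented for this example, and the per-table `Vec` of groups merely stands in for a per-table parquet file.

```rust
use std::any::Any;
use std::collections::HashMap;

/// Hypothetical trait: each table declares a name and an entry type.
pub trait Table {
    const NAME: &'static str;
    type Entry: Clone + 'static;
}

/// Placeholder tables; the entry types are made up for the example.
pub struct Coins;
impl Table for Coins {
    const NAME: &'static str = "coins";
    type Entry = (u64, u64); // (id, amount)
}

pub struct Messages;
impl Table for Messages {
    const NAME: &'static str = "messages";
    type Entry = String;
}

/// One list of groups per table name, standing in for one file per table.
#[derive(Default)]
pub struct InMemorySnapshot {
    groups: HashMap<&'static str, Vec<Box<dyn Any>>>,
}

impl InMemorySnapshot {
    /// Single write method, generic over the table type.
    pub fn write<T: Table>(&mut self, entries: Vec<T::Entry>) {
        self.groups.entry(T::NAME).or_default().push(Box::new(entries));
    }

    /// Single read method, generic over the table type.
    /// Returns all entries of the table in the order they were written.
    pub fn read<T: Table>(&self) -> Vec<T::Entry> {
        self.groups
            .get(T::NAME)
            .into_iter()
            .flatten()
            .filter_map(|group| group.downcast_ref::<Vec<T::Entry>>())
            .flat_map(|group| group.iter().cloned())
            .collect()
    }
}

/// Progress as a plain String -> u64 mapping, with no enum of tables.
#[derive(Default)]
pub struct GenesisProgress {
    last_group: HashMap<String, u64>,
}

impl GenesisProgress {
    /// Remember the index of the last imported group for table `T`.
    pub fn record<T: Table>(&mut self, group_index: u64) {
        self.last_group.insert(T::NAME.to_string(), group_index);
    }

    /// Index of the last imported group for table `T`, if any.
    pub fn last<T: Table>(&self) -> Option<u64> {
        self.last_group.get(T::NAME).copied()
    }
}
```

In this sketch, adding a table only requires a new `Table` impl plus calls such as `snapshot.write::<Coins>(entries)` and `snapshot.read::<Coins>()`; nothing enumerates the full set of tables, which is the decoupling the description aims for.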
Related to: #1619

Doesn't close the issue; a few tables still need to be migrated.

The following tables are now part of regenesis:

OnChain:
* Transactions (saved in snapshot)

OffChain:
* TransactionStatuses (saved in snapshot)
* OwnedTransactions (saved in snapshot)
* OwnedMessageIds (derived from Messages in snapshot)
* OwnedCoins (derived from Coins in snapshot)
* ContractsInfo (derived from Transactions in snapshot)

We have open questions for @xgreenx:
1. Should we regenesize `FuelBlockIdsToHeights`? We attempted it, but it caused issues with the "don't commit changes related to more than one block" guard.
2. What about restoring the following tables?
   * Metadata
   * Statistics
   * All relayer tables
   * ProcessedTransactions

There are opportunities for optimization: we're reading some snapshot data twice (e.g. Transactions are read once to restore the `Transactions` table and once to derive the `ContractsInfo` table). That could probably be done in one pass, writing to both the on-chain and off-chain tables at once.

---------

Co-authored-by: Hannes Karppila <hannes.karppila@gmail.com>
Co-authored-by: xgreenx <xgreenx9999@gmail.com>
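The single-pass optimization mentioned above could look roughly like the following sketch, which restores the on-chain `Transactions` table and derives the off-chain `ContractsInfo` table while iterating over the snapshot transactions only once. Everything here is an assumption made for illustration: the `Tx` enum, the `Restored` struct, the `restore_in_one_pass` function, and the choice of a salt as the `ContractsInfo` payload are placeholders, not the actual fuel-core types.

```rust
use std::collections::BTreeMap;

/// Hypothetical, heavily simplified transaction type.
#[derive(Clone, Debug, PartialEq)]
pub enum Tx {
    Script { id: u32 },
    Create { id: u32, contract_id: u32, salt: u64 },
}

/// Stand-ins for the two databases being restored.
#[derive(Default, Debug)]
pub struct Restored {
    /// On-chain table: transaction id -> transaction.
    pub transactions: BTreeMap<u32, Tx>,
    /// Off-chain table: contract id -> salt (placeholder payload).
    pub contracts_info: BTreeMap<u32, u64>,
}

/// Reads each snapshot transaction once and fills both tables.
pub fn restore_in_one_pass(snapshot_txs: impl IntoIterator<Item = Tx>) -> Restored {
    let mut out = Restored::default();
    for tx in snapshot_txs {
        // Derive the off-chain entry first, while we still hold the tx.
        if let Tx::Create { contract_id, salt, .. } = &tx {
            out.contracts_info.insert(*contract_id, *salt);
        }
        // Then restore the on-chain entry, moving the tx into the table.
        let id = match &tx {
            Tx::Script { id } | Tx::Create { id, .. } => *id,
        };
        out.transactions.insert(id, tx);
    }
    out
}
```

The trade-off is that one worker now writes to two databases, so committing and progress tracking would need to cover both in a consistent way; the sketch ignores that concern.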
Done as part of #1545.
No description provided.