feat(resharding): flat storage resharding mvp (#12164)
The PR adds early MVP capabilities of resharding flat storage (V3). 

The main addition is `FlatStorageResharder` and the tooling around
it. There are also traces of an early attempt to tie the resharder
into the existing flat storage code, mainly the flat storage
creator.

- `FlatStorageResharder` takes care of everything related to resharding
the flat storage.
- Its running tasks can be interrupted by a controller (integrated
with the existing resharding handle)
  - Uses the concept of a scheduler to run tasks in the background
- `ReshardingEventType` is a utility enum that represents the types of
resharding. There is only one for now, but the enum makes it easier to add more.
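
To illustrate how a controller, a scheduler and an event enum can fit together, here is a minimal sketch. It is not the code added by this PR: `EventType`, `Controller`, `Scheduler`, `ThreadScheduler`, `TaskOutcome` and `run_task` are hypothetical names chosen for the example, and the real `FlatStorageResharder` API may look quite different.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

/// Hypothetical stand-in for `ReshardingEventType`: a single variant for now,
/// but new kinds of resharding can be added without changing callers.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum EventType {
    /// A parent shard is split into two children.
    Split { parent: u64, left_child: u64, right_child: u64 },
}

/// Hypothetical cancellation handle shared by the controller and the task.
#[derive(Clone, Default)]
pub struct Controller {
    cancelled: Arc<AtomicBool>,
}

impl Controller {
    pub fn cancel(&self) {
        self.cancelled.store(true, Ordering::Relaxed);
    }
    pub fn is_cancelled(&self) -> bool {
        self.cancelled.load(Ordering::Relaxed)
    }
}

/// Hypothetical scheduler abstraction: it decides where a task runs.
pub trait Scheduler {
    fn schedule(&self, task: Box<dyn FnOnce() + Send>);
}

/// Runs every task on a freshly spawned background thread.
pub struct ThreadScheduler;

impl Scheduler for ThreadScheduler {
    fn schedule(&self, task: Box<dyn FnOnce() + Send>) {
        // The join handle is dropped, so the thread runs detached.
        let _ = std::thread::spawn(task);
    }
}

#[derive(Debug, PartialEq, Eq)]
pub enum TaskOutcome {
    Completed { batches: usize },
    Cancelled { batches: usize },
}

/// Processes work in batches and checks for cancellation before each one.
pub fn run_task(event: &EventType, controller: &Controller, total_batches: usize) -> TaskOutcome {
    match event {
        EventType::Split { .. } => {
            for done in 0..total_batches {
                if controller.is_cancelled() {
                    return TaskOutcome::Cancelled { batches: done };
                }
                // A real task would copy one batch of key-values here.
            }
            TaskOutcome::Completed { batches: total_batches }
        }
    }
}
```

Checking the flag between batches keeps interruption cheap and leaves the storage at a batch boundary. Hiding thread creation behind a trait lets tests swap in a scheduler that runs tasks inline.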

## Achievements
- Preparing flat storage's content for children after a shard split, for
account-id based keys only.
- Deletion of parent flat storage
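
The two achievements above can be pictured with a small sketch. It assumes a simplified model that is not taken from the PR: the parent flat storage is an in-memory `BTreeMap` keyed by account id, accounts that sort before a boundary account go to the left child, and all others go to the right child. `split_by_account` is a hypothetical helper; the real implementation works on the on-disk store.

```rust
use std::collections::BTreeMap;

/// Simplified, hypothetical model of a shard's flat storage:
/// account id mapped to a raw value.
pub type FlatMap = BTreeMap<String, Vec<u8>>;

/// Splits the parent's content into (left, right) children.
///
/// Assumption: accounts strictly below `boundary` belong to the left child,
/// accounts at or above it belong to the right child. The parent map is
/// consumed, mirroring the deletion of the parent flat storage.
pub fn split_by_account(parent: FlatMap, boundary: &str) -> (FlatMap, FlatMap) {
    let mut left = parent;
    // `split_off` moves every entry with key >= boundary into a new map.
    let right = left.split_off(boundary);
    (left, right)
}
```

Keys that are not account-id based cannot be routed by a plain ordering comparison like this one, which is why they are listed as a missing piece.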

## Missing pieces
- Catchup phase for children and creation of proper flat storage
- Handling more complex key-values (not account-id based)
- Integration with resharding manager and flat storage creator
- Additional tests
- Metrics

Missing pieces will likely be done in another PR.

---
EDIT: integrated with ShardLayoutV2, fixed all unit tests, re-arranged
description.
Trisfald authored Oct 14, 2024
1 parent 0c135f2 commit 04eb2c7
Showing 16 changed files with 1,314 additions and 30 deletions.
5 changes: 5 additions & 0 deletions chain/chain-primitives/src/error.rs
@@ -232,6 +232,9 @@ pub enum Error {
/// GC error.
#[error("GC Error: {0}")]
GCError(String),
/// Resharding error.
#[error("Resharding Error: {0}")]
ReshardingError(String),
/// Anything else
#[error("Other Error: {0}")]
Other(String),
@@ -269,6 +272,7 @@ impl Error {
| Error::CannotBeFinalized
| Error::StorageError(_)
| Error::GCError(_)
| Error::ReshardingError(_)
| Error::DBNotFoundErr(_) => false,
Error::InvalidBlockPastTime(_, _)
| Error::InvalidBlockFutureTime(_)
@@ -392,6 +396,7 @@ impl Error {
Error::NotAValidator(_) => "not_a_validator",
Error::NotAChunkValidator => "not_a_chunk_validator",
Error::InvalidChallengeRoot => "invalid_challenge_root",
Error::ReshardingError(_) => "resharding_error",
}
}
}
20 changes: 19 additions & 1 deletion chain/chain/src/flat_storage_creator.rs
@@ -9,6 +9,7 @@
//! `CatchingUp`: moves flat storage head forward, so it may reach chain final head.
//! `Ready`: flat storage is created and it is up-to-date.
use crate::flat_storage_resharder::FlatStorageResharder;
use crate::types::RuntimeAdapter;
use crate::{ChainStore, ChainStoreAccess};
use assert_matches::assert_matches;
@@ -388,6 +389,13 @@ impl FlatStorageShardCreator {
FlatStorageStatus::Disabled => {
panic!("initiated flat storage creation for shard {shard_id} while it is disabled");
}
// If the flat storage is undergoing resharding it means it was previously created
// successfully, but resharding itself hasn't been finished. This case is a no-op
// because the flat storage resharder has already been created in
// `create_flat_storage_for_current_epoch`.
FlatStorageStatus::Resharding(_) => {
return Ok(true);
}
};
Ok(false)
}
@@ -403,10 +411,13 @@ pub struct FlatStorageCreator {
impl FlatStorageCreator {
/// For each of tracked shards, either creates flat storage if it is already stored on DB,
/// or starts migration to flat storage which updates DB in background and creates flat storage afterwards.
///
/// Also resumes any resharding operation which was already in progress.
pub fn new(
epoch_manager: Arc<dyn EpochManagerAdapter>,
runtime: Arc<dyn RuntimeAdapter>,
chain_store: &ChainStore,
flat_storage_resharder: &FlatStorageResharder,
num_threads: usize,
) -> Result<Option<Self>, Error> {
let flat_storage_manager = runtime.get_flat_storage_manager();
@@ -420,6 +431,7 @@ impl FlatStorageCreator {
&epoch_manager,
&flat_storage_manager,
&runtime,
&flat_storage_resharder,
)?;

// Create flat storage for the shards in the next epoch. This only
@@ -447,6 +459,7 @@ impl FlatStorageCreator {
epoch_manager: &Arc<dyn EpochManagerAdapter>,
flat_storage_manager: &FlatStorageManager,
runtime: &Arc<dyn RuntimeAdapter>,
_flat_storage_resharder: &FlatStorageResharder,
) -> Result<HashMap<ShardUId, FlatStorageShardCreator>, Error> {
let epoch_id = &chain_head.epoch_id;
tracing::debug!(target: "store", ?epoch_id, "creating flat storage for the current epoch");
@@ -473,6 +486,10 @@ impl FlatStorageCreator {
);
}
FlatStorageStatus::Disabled => {}
FlatStorageStatus::Resharding(_status) => {
// TODO(Trisfald): call resume
// flat_storage_resharder.resume(shard_uid, &status, ...)?;
}
}
}

@@ -502,7 +519,8 @@ impl FlatStorageCreator {
}
FlatStorageStatus::Empty
| FlatStorageStatus::Creation(_)
| FlatStorageStatus::Disabled => {
| FlatStorageStatus::Disabled
| FlatStorageStatus::Resharding(_) => {
// The flat storage for children shards will be created
// separately in the resharding process.
}