Implement freezer database #508

Merged
merged 50 commits into from
Nov 26, 2019

Conversation

michaelsproul
Member

@michaelsproul michaelsproul commented Aug 23, 2019

Issue Addressed

#499

Proposed Changes

  • Freezer database for all states older than the last finalized checkpoint.
  • Frozen states are stored efficiently by breaking out their FixedVector<T, N> fields into tables.
  • A migration thread runs in the background to copy states from the hot database to the freezer. Care is taken to avoid leaving the database in a half-copied state.
  • Freezer database is opt-in via the --db disk option. The old disk database remains available under --db simple-disk.
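To illustrate the second bullet, here is a minimal sketch of how a fixed-length vector field could be broken out into a table. It assumes the field is a ring buffer indexed by `slot % N` (as `block_roots` is in the spec), so consecutive states share almost all entries and each value only needs to be stored once, keyed by absolute slot. `ChunkedTable`, `CHUNK_SIZE`, and `load_vector` are hypothetical names for this illustration and are not taken from `chunked_vector.rs`.

```rust
use std::collections::HashMap;

/// Number of values stored under one table key (hypothetical, tiny for demonstration).
const CHUNK_SIZE: u64 = 4;

/// Toy table for a single vector field, keyed by chunk index.
#[derive(Default)]
pub struct ChunkedTable {
    chunks: HashMap<u64, Vec<Option<u64>>>,
}

impl ChunkedTable {
    /// Record the value the field held for absolute slot `slot`.
    pub fn put(&mut self, slot: u64, value: u64) {
        let chunk = self
            .chunks
            .entry(slot / CHUNK_SIZE)
            .or_insert_with(|| vec![None; CHUNK_SIZE as usize]);
        chunk[(slot % CHUNK_SIZE) as usize] = Some(value);
    }

    /// Look up the value stored for `slot`, if any.
    pub fn get(&self, slot: u64) -> Option<u64> {
        self.chunks
            .get(&(slot / CHUNK_SIZE))
            .and_then(|chunk| chunk[(slot % CHUNK_SIZE) as usize])
    }

    /// Rebuild the length-`n` ring buffer as seen by a state at `state_slot`.
    /// Positions with no slot behind them yet (near genesis) keep `default`.
    /// Returns `None` if a required slot is missing from the table.
    pub fn load_vector(&self, state_slot: u64, n: u64, default: u64) -> Option<Vec<u64>> {
        let mut out = vec![default; n as usize];
        for slot in state_slot.saturating_sub(n)..state_slot {
            out[(slot % n) as usize] = self.get(slot)?;
        }
        Some(out)
    }
}
```

With this layout, storing one more frozen state costs one new table entry per vector field rather than a full copy of the vector.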

Shortcomings

  • No metrics for the freezer DB (yet)
  • The maximum distance by which the freezer may lag finalization isn't configurable yet. I wasn't sure whether it should be a command-line argument or just a config file parameter.

Additional Info

  • There are more fields of the BeaconState that we could store efficiently, but I'll leave that for future work.

Member Author

@michaelsproul michaelsproul left a comment

Self-review

beacon_node/rpc/src/lib.rs (outdated, resolved)
beacon_node/store/src/hot_cold_store.rs (outdated, resolved)
beacon_node/store/src/hot_cold_store.rs (outdated, resolved)
beacon_node/store/src/impls/beacon_state.rs (outdated, resolved)
eth2/types/src/beacon_state.rs (outdated, resolved)
@michaelsproul michaelsproul added the ready-for-review The code is ready for review label Aug 23, 2019
@michaelsproul michaelsproul self-assigned this Aug 23, 2019
@paulhauner paulhauner added under-review A reviewer has only partially completed a review. and removed ready-for-review The code is ready for review labels Aug 27, 2019
@paulhauner
Member

I'm going to start a review on this. I expect to be done tomorrow!

Member

@paulhauner paulhauner left a comment

I really like what you've done here! It's so exciting to see storage optimization!!

I've left a bunch of unimportant comments here, sorry for the noise. I think the only main thing I'd like to see is the RwLock pushed away from the "user".

In summary, awesome!

beacon_node/client/src/beacon_chain_types.rs (outdated, resolved)
beacon_node/rpc/src/validator.rs (outdated, resolved)
eth2/types/src/beacon_state.rs (outdated, resolved)
eth2/types/src/beacon_state.rs (outdated, resolved)
beacon_node/beacon_chain/src/beacon_chain.rs (outdated, resolved)
beacon_node/store/src/chunked_vector.rs (outdated, resolved)
beacon_node/store/src/hot_cold_store.rs (resolved)
}

/// Fetch a state from the store.
fn get_state<E: EthSpec>(
Member

@paulhauner paulhauner Aug 28, 2019

With regard to needing additional fields in put/get, I was thinking we could split StoreItem into separate get and put traits.

However, I'm not sure that's better than what you have here. I'm just mentioning it FYI.

Member Author

Why did you change your mind on this? I was thinking it might be quite nice, and would resolve your concern about phase1 as a first-class citizen.

Member

I'm not sure why I changed my mind on this, but I've changed it back again. I'm still down to have separate Put/Get traits. Perhaps not for this PR.
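For readers following this thread, a rough sketch of what separate Put/Get traits might look like is given below. The snippet originally posted in the review is not preserved here, so everything in this block is an assumption: `StorePut`, `StoreGet`, `MemoryStore`, and `ToyState` are hypothetical names, and the associated `Params` type is just one way a get could accept the additional fields mentioned above.

```rust
use std::collections::HashMap;
use std::convert::TryInto;

/// Minimal key-value backend standing in for a real store (hypothetical).
#[derive(Default)]
pub struct MemoryStore {
    db: HashMap<Vec<u8>, Vec<u8>>,
}

/// Items that can be written to the store.
pub trait StorePut {
    fn db_put(&self, store: &mut MemoryStore, key: &[u8]);
}

/// Items that can be read back. `Params` carries any extra lookup information.
pub trait StoreGet: Sized {
    type Params;
    fn db_get(store: &MemoryStore, key: &[u8], params: &Self::Params) -> Option<Self>;
}

/// Stand-in for a state; only the slot is stored.
#[derive(Debug, PartialEq)]
pub struct ToyState {
    pub slot: u64,
}

impl StorePut for ToyState {
    fn db_put(&self, store: &mut MemoryStore, key: &[u8]) {
        store.db.insert(key.to_vec(), self.slot.to_le_bytes().to_vec());
    }
}

impl StoreGet for ToyState {
    /// An optional expected slot: the kind of additional field a get might need.
    type Params = Option<u64>;

    fn db_get(store: &MemoryStore, key: &[u8], params: &Option<u64>) -> Option<Self> {
        let bytes = store.db.get(key)?;
        let slot = u64::from_le_bytes(bytes.as_slice().try_into().ok()?);
        match params {
            // Reject the item if the caller expected a different slot.
            Some(expected) if *expected != slot => None,
            _ => Some(ToyState { slot }),
        }
    }
}
```

Splitting the traits lets the read side grow parameters without forcing every writer to know about them.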

beacon_node/store/src/migrate.rs (outdated, resolved)
eth2/state_processing/src/per_epoch_processing.rs (outdated, resolved)
@paulhauner paulhauner added waiting-on-author The reviewer has suggested changes and awaits their implementation. and removed under-review A reviewer has only partially completed a review. labels Aug 28, 2019
@michaelsproul michaelsproul mentioned this pull request Aug 30, 2019
@paulhauner paulhauner added this to the Public Testnet milestone Nov 21, 2019
@michaelsproul michaelsproul added ready-for-review The code is ready for review and removed work-in-progress PR is a work-in-progress labels Nov 26, 2019
@michaelsproul
Member Author

michaelsproul commented Nov 26, 2019

This is ready for review now. The previous issue with non-zero genesis values has been resolved, with special handling for them in chunked_vector.rs, and tests in store_tests.rs. Other recent changes include:

  • Integrating with the new BeaconChain builders and CLI interface. The freezer database will default to ~/.lighthouse/freezer_db, but can be adjusted using --freezer-dir. It should play nicely with the random data directories and backup process too.
  • Storing the freezer database's "split slot" in the database, so that a node started from an existing database can re-establish the split slot, and know what data lies in the freezer and what data lies in the hot database.
  • PartialBeaconStates stored in the freezer DB no longer carry committee caches -- they're not required, and omitting them should save on space.
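The role of the persisted split slot can be sketched as follows. This is a toy model under stated assumptions, not the code in `hot_cold_store.rs`: `HotColdSketch`, `SPLIT_SLOT_KEY`, and the in-memory maps are hypothetical, states are plain strings, and the copy, record, delete ordering in `migrate` is one plausible way to avoid a half-copied database.

```rust
use std::collections::HashMap;

/// Metadata key under which the split slot is persisted (hypothetical).
const SPLIT_SLOT_KEY: &str = "split_slot";

/// Toy hot/cold store: states are strings keyed by slot.
#[derive(Default)]
pub struct HotColdSketch {
    hot: HashMap<u64, String>,
    cold: HashMap<u64, String>,
    /// Stands in for metadata written to disk.
    meta: HashMap<String, u64>,
    /// In-memory copy of the split slot.
    split_slot: u64,
}

impl HotColdSketch {
    /// New states always land in the hot database.
    pub fn put_state(&mut self, slot: u64, state: &str) {
        self.hot.insert(slot, state.to_string());
    }

    /// Move every state older than `finalized_slot` into the freezer.
    pub fn migrate(&mut self, finalized_slot: u64) {
        let to_move: Vec<u64> = self
            .hot
            .keys()
            .copied()
            .filter(|slot| *slot < finalized_slot)
            .collect();
        // 1. Copy first, so an interruption never loses a state.
        for slot in &to_move {
            let state = self.hot[slot].clone();
            self.cold.insert(*slot, state);
        }
        // 2. Record the new split slot.
        self.meta.insert(SPLIT_SLOT_KEY.to_string(), finalized_slot);
        self.split_slot = finalized_slot;
        // 3. Only then delete the hot copies.
        for slot in &to_move {
            self.hot.remove(slot);
        }
    }

    /// Route a lookup to the freezer or the hot database using the split slot.
    pub fn get_state(&self, slot: u64) -> Option<&String> {
        if slot < self.split_slot {
            self.cold.get(&slot)
        } else {
            self.hot.get(&slot)
        }
    }

    /// Simulate a restart: the in-memory split slot is re-read from metadata.
    pub fn reopen(self) -> Self {
        let split_slot = self.meta.get(SPLIT_SLOT_KEY).copied().unwrap_or(0);
        Self { split_slot, ..self }
    }
}
```

Without the persisted value, a restarted node in this model would fall back to a split slot of 0 and look for frozen states in the hot database.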

Member

@paulhauner paulhauner left a comment

Super clean! Not a whole lot has changed since the last review :)

I'm not recommending any changes, happy to squerge!

beacon_node/beacon_chain/tests/store_tests.rs (outdated, resolved)

/// Provides a wrapper for an iterator that returns a given `T` before it starts returning results of
/// the `Iterator`.
pub struct ReverseChainIterator<T, I> {
Member

Just to confirm, state iterators aren't optimized for KPDB at the moment?

Member Author

No, not yet, but they're not obviously suboptimal AFAICT
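For context, the behaviour described in the doc comment on `ReverseChainIterator` (yield a given `T` first, then defer to the wrapped iterator) can be sketched as below. Only the struct name and type parameters come from the diff; the fields, the `new` constructor, and the body are assumptions made for illustration.

```rust
/// Yields `first_value` once, then everything produced by `iter`.
pub struct ReverseChainIterator<T, I> {
    /// `Some` until the first call to `next`, then `None`.
    first_value: Option<T>,
    iter: I,
}

impl<T, I: Iterator<Item = T>> ReverseChainIterator<T, I> {
    pub fn new(first_value: T, iter: I) -> Self {
        Self {
            first_value: Some(first_value),
            iter,
        }
    }
}

impl<T, I: Iterator<Item = T>> Iterator for ReverseChainIterator<T, I> {
    type Item = T;

    fn next(&mut self) -> Option<T> {
        // Hand out the stored value first, then fall through to the inner iterator.
        self.first_value.take().or_else(|| self.iter.next())
    }
}
```

In this form the wrapper behaves like `std::iter::once(first).chain(iter)`, so it adds no lookups of its own; any database cost comes from the wrapped iterator.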

@@ -51,6 +72,20 @@ pub trait Store: Sync + Send + Sized {
I::db_delete(self, key)
}

/// Store a state in the store.
fn put_state<E: EthSpec>(
Member

I'm not suggesting any change here, but I was considering these functions and how the Store trait alone is a bit inflexible.

I was thinking that if we need to keep adding functionality to this (I'm not saying we will, it just seems possible) that perhaps we could use a builder instead.

At present, we have the following requirements:

  • Ability to have type-specific store/put impls (we have this already)
  • Ability to specify parameters of the request (we don't already have this).

E.g.,

let state: BeaconState<E> = RequestBuilder::new(store)
  .slot(42)
  .err_if_none()  // <-- seems cool
  .get()?;

Once again, I'm not suggesting a change to the PR, I'm just thinking out loud :)
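To make the builder idea concrete, here is one self-contained way it could be fleshed out. This is only a sketch: `ToyStore`, `Error`, and `RequiredRequest` are hypothetical, states are plain strings rather than `BeaconState<E>`, and having `err_if_none()` switch to a second type is just one way to let `get()` return the value directly.

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq)]
pub enum Error {
    /// `get` was called before a slot was chosen.
    SlotNotSet,
    /// `err_if_none` was requested and nothing is stored at this slot.
    NotFound(u64),
}

/// Toy store mapping slots to states (hypothetical).
#[derive(Default)]
pub struct ToyStore {
    states: HashMap<u64, String>,
}

impl ToyStore {
    pub fn insert(&mut self, slot: u64, state: &str) {
        self.states.insert(slot, state.to_string());
    }
}

/// Collects the parameters of a request before executing it.
pub struct RequestBuilder<'a> {
    store: &'a ToyStore,
    slot: Option<u64>,
}

impl<'a> RequestBuilder<'a> {
    pub fn new(store: &'a ToyStore) -> Self {
        Self { store, slot: None }
    }

    pub fn slot(mut self, slot: u64) -> Self {
        self.slot = Some(slot);
        self
    }

    /// Switch to a request type whose `get` treats a missing state as an error.
    pub fn err_if_none(self) -> RequiredRequest<'a> {
        RequiredRequest(self)
    }

    /// A missing state is reported as `Ok(None)`.
    pub fn get(self) -> Result<Option<String>, Error> {
        let slot = self.slot.ok_or(Error::SlotNotSet)?;
        Ok(self.store.states.get(&slot).cloned())
    }
}

/// A request that must find a state.
pub struct RequiredRequest<'a>(RequestBuilder<'a>);

impl<'a> RequiredRequest<'a> {
    pub fn get(self) -> Result<String, Error> {
        let slot = self.0.slot.ok_or(Error::SlotNotSet)?;
        self.0.get()?.ok_or(Error::NotFound(slot))
    }
}
```

A call such as `RequestBuilder::new(&store).slot(42).err_if_none().get()?` then mirrors the shape of the example above, and new request parameters can be added as builder methods without touching the `Store` trait.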

Member Author

Yeah that does seem cool! Would be worth investigating for sure

@paulhauner paulhauner added ready-to-merge and removed ready-for-review The code is ready for review labels Nov 26, 2019