Skip to content
This repository has been archived by the owner on Nov 6, 2020. It is now read-only.

Handle reorganizations in the state cache #2490

Merged
merged 7 commits into from
Oct 7, 2016
Merged

Conversation

arkpar
Copy link
Collaborator

@arkpar arkpar commented Oct 6, 2016

Tests pending

@arkpar arkpar added A0-pleasereview 🤓 Pull request needs code review. B0-patch labels Oct 6, 2016
@coveralls
Copy link

Coverage Status

Coverage increased (+0.05%) to 86.464% when pulling abbf0ef on state-cache-reorg into ecf098e on master.


// Check if the account can be returned from cache by matching current block parent hash against canonical
// state and filtering out account modified in later blocks.
fn is_allowed(addr: &Address, parent_hash: &Option<H256>, modifications: &VecDeque<BlockChanges>) -> bool {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

isn't this a relatively expensive check to do on each cache lookup? Can it be made more efficient by creating a view over a specific subset of the cache (excluding non-canonical entries based on a given block hash) which can be queried in constant time?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not expensive at all compared to the database or even HashMap lookup. In most cases it will break out on the first loop iteration. Current implementation support sharing the cache with other StateDB objects, which can commit independently. Creating a "view" would make it impossible.

@@ -459,6 +458,8 @@ impl Client {
enacted: route.enacted.clone(),
retracted: route.retracted.len()
});
let is_canon = route.omitted.len() == 0;
Copy link
Contributor

@gavofyork gavofyork Oct 6, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

weird way of determining canonicality - do you really mean is_canon? what's wrong with the method above? probably more direct to check whether this block is the last enacted.

@@ -486,7 +486,7 @@ impl State {
let mut addresses = self.cache.borrow_mut();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

commit_cache -> what does this even mean? a "cache" cannot be committed - it is simply a speedily accessible version of what is on a secondary storage device (i.e. disk). if it's something that has changed without its backing store then it's a "buffer".

@@ -486,7 +486,7 @@ impl State {
let mut addresses = self.cache.borrow_mut();
trace!("Committing cache {:?} entries", addresses.len());
for (address, a) in addresses.drain().filter(|&(_, ref a)| !a.is_dirty()) {
self.db.cache_account(address, a.account);
self.db.cache_account(address, a.account, a.state == AccountState::Commited);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cache_account doesn't particularly follow as an operation i might want to do as part of commit_cache. one suggests loading from disk into memory and the other suggests propagating from memory back to disk.

modifications: VecDeque<BlockChanges>,
}

struct CacheQueueItem {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

docs needed.

#[derive(Debug)]
// Accumulates a list of accounts changed in a block.
struct BlockChanges {
number: BlockNumber,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

more docs please

parent_hash: Option<H256>,
// Hash of the committing block
commit_hash: Option<H256>,
// Number of the committing block
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

explain what None means, too.

account_bloom: Arc<Mutex<Bloom>>,
// Hash of the block on top of which this instance was created
parent_hash: Option<H256>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

explain what None means, too.

// Hash of the block on top of which this instance was created
parent_hash: Option<H256>,
// Hash of the committing block
commit_hash: Option<H256>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

explain what None means, too.

@@ -34,6 +38,24 @@ pub const ACCOUNT_BLOOM_HASHCOUNT_KEY: &'static [u8] = b"account_hash_count";
struct AccountCache {
/// DB Account cache. `None` indicates that account is known to be missing.
accounts: LruCache<Address, Option<Account>>,
/// Accounts changed in recent blocks. Ordered by block number.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are they all strictly parent-child?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no, there can be multiple blocks with the same number

Ok(records)
}

/// Update canonical cache. This should be called after the block has been commited and the
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Update canonical cache" not a particularly enlightening description of what this mammoth function is meant to be doing. A lot more detail needed. What is its purpose and how does it achieve that?

@arkpar
Copy link
Collaborator Author

arkpar commented Oct 6, 2016

Renamed and documented a few methods

Ok(records)
}

/// Apply pending cache changes and synchronize canonical
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"pending cache changes" -> is it fair to say these are "cache changes" but "buffered changes" or "uncommitted changes"?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All the changes are already committed to the disk at this point. "Buffered" works

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe "recently committed cache"? if they are on disk then it's completely fair to call it a cache.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

again, most of these don't get "committed" anywhere. Thy are just transferred from State cache to StateDB cache in atomic manner.

self.cache_overlay.clear();
let mut cache = self.account_cache.lock();
cache.accounts.clear();
/// Add pending cache change.
Copy link
Contributor

@gavofyork gavofyork Oct 6, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would prefer to avoid "cache changes" or even "cache modifications". "buffered changes" or "uncommitted changes" are better. if we really are mixing caching and buffering into the same data structure then stlil prefer to rename "cache changes" to "dirty cache entries".

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure what you mean by "dirty". Some of these changes are just accounts cached in the State locally and were not modified at all.

Copy link
Contributor

@gavofyork gavofyork Oct 6, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Add pending cache change" suggests they were all modified.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cache change means here that cache is modified, not the data that is cached. I.e. something is read from disk and added to the cache: cache is modified.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i see - in that case call it cache_populated or more_data_cached?

// all its parent until we hit the canonical block,
// checking against all the intermediate modifications.
let mut iter = modifications.iter();
while let Some(ref m) = iter.next() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what if they're the same number and containing the account, but not the block in question? shouldn't it ignore it?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should check against changes made in all children of the parent block and canonical children of the first canonical parent of the parent block. Completely correct check would be to complicated and error-prone gaining little. This implementation is more simple although it may result in some of the accounts being re-cached in stale blocks. This is a rare case and it does not affect performance because most of the blocks access a lot of accounts, but only modify a few.

@coveralls
Copy link

Coverage Status

Coverage increased (+0.05%) to 86.457% when pulling aab0a59 on state-cache-reorg into ecf098e on master.

@arkpar arkpar force-pushed the state-cache-reorg branch from aab0a59 to 666b0be Compare October 6, 2016 15:48
@arkpar
Copy link
Collaborator Author

arkpar commented Oct 6, 2016

Renamed "pending" to buffered and added a baic test.

@coveralls
Copy link

Coverage Status

Coverage increased (+0.02%) to 86.434% when pulling 666b0be on state-cache-reorg into 6c1b2fb on master.

@arkpar arkpar force-pushed the state-cache-reorg branch from 666b0be to 79bbc26 Compare October 6, 2016 19:16
@coveralls
Copy link

Coverage Status

Coverage increased (+0.05%) to 86.488% when pulling 79bbc26 on state-cache-reorg into 0c7a287 on master.

@@ -523,11 +523,12 @@ impl State {
Ok(())
}

fn commit_cache(&mut self) {
/// Merge local cache into shared canonical state cache.
fn update_shared_cache(&mut self) {
Copy link
Contributor

@gavofyork gavofyork Oct 7, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

propagate_to_global_cache

Ok(records)
}

/// Apply buffered cache changes and synchronize canonical
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"buffered cache changes" what does this mean? "cache change" here presumably doesn't mean "population form disk"? "Apply" to what?

@coveralls
Copy link

Coverage Status

Coverage increased (+0.06%) to 86.494% when pulling d6480e3 on state-cache-reorg into 0c7a287 on master.

let mut cache = self.account_cache.lock();
cache.accounts.clear();
/// Add pending cache change.
/// The change is queued to be applied in `commit`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if it's a "change", why allow modified == false?

@arkpar arkpar force-pushed the state-cache-reorg branch from d6480e3 to 0f80b7b Compare October 7, 2016 10:40
struct AccountCache {
/// DB Account cache. `None` indicates that account is known to be missing.
accounts: LruCache<Address, Option<Account>>,
/// Accounts changed in recently committed blocks. Ordered by block number.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Information on the modifications in recently committed blocks; specifically which addresses
changed in which block."

@@ -42,28 +74,32 @@ struct AccountCache {
/// For canonical clones cache changes are accumulated and applied
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"cache changes"

@@ -42,28 +74,32 @@ struct AccountCache {
/// For canonical clones cache changes are accumulated and applied
/// on commit.
/// For non-canonical clones cache is cleared on commit.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"For non-canonical clones, any dirty entries are dropped on commit."

@gavofyork gavofyork added A8-looksgood 🦄 Pull request is reviewed well. and removed A0-pleasereview 🤓 Pull request needs code review. labels Oct 7, 2016
@arkpar arkpar force-pushed the state-cache-reorg branch from 3ce5988 to 59167f3 Compare October 7, 2016 11:00
@gavofyork gavofyork merged commit 72ec936 into master Oct 7, 2016
@gavofyork gavofyork deleted the state-cache-reorg branch October 7, 2016 11:34
@coveralls
Copy link

Coverage Status

Coverage increased (+0.05%) to 86.44% when pulling 59167f3 on state-cache-reorg into 4655fd0 on master.

jacogr added a commit that referenced this pull request Oct 7, 2016
* commit '72ec9366ad45526d16f6c78b5dfb32d16ba5aa6c':
  Handle reorganizations in the state cache (#2490)
  terminate after 30 seconds (#2513)
jacogr added a commit that referenced this pull request Oct 8, 2016
* js: (228 commits)
  registration in place
  Backports to master (#2530)
  lookup hash
  ethcore_hashContent call
  single input for commit/filename
  basic githubhint layout
  Handle reorganizations in the state cache (#2490)
  terminate after 30 seconds (#2513)
  allow updates of the secure token
  Using pending block only if not old (#2514)
  Caching optimizations (#2505)
  rework connection display
  basic test for manual token
  Fixed overflow panic in handshake_panic (#2495)
  Trim password from file (#2503)
  Fixing RPC Filter conversion to EthFilter (#2500)
  init token updates take place
  initial token connection - WIP
  Fixing error message for transactions (#2496)
  Adjustable stack size for EVM (#2483)
  ...

# Conflicts:
#	js/src/dapps/registry/Application/application.js
#	js/src/dapps/registry/Container.js
#	js/src/dapps/registry/actions.js
#	js/src/dapps/registry/reducers.js
jacogr added a commit that referenced this pull request Oct 8, 2016
* js: (23 commits)
  alignment back to normal
  Print backtrace on panic (#2535)
  NEVER export class instance functions
  stop wrongly using main app IdentityIcon
  pass value through as-is
  events is using a proper table
  fixes
  registration in place
  Backports to master (#2530)
  lookup hash
  ethcore_hashContent call
  single input for commit/filename
  basic githubhint layout
  Handle reorganizations in the state cache (#2490)
  terminate after 30 seconds (#2513)
  registry: fix setting A records
  registry: reducer returned invalid state
  registry: don't hash for A records
  registry: sha3 value first
  registry: connect record management component
  ...
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
A8-looksgood 🦄 Pull request is reviewed well.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants