Skip to content
This repository has been archived by the owner on Nov 6, 2020. It is now read-only.

Use im::HashMap for storage_changes #9465

Closed
wants to merge 7 commits into from
Closed

Conversation

sorpaas
Copy link
Collaborator

@sorpaas sorpaas commented Sep 4, 2018

rel #9427

This uses im::HashMap for Account::storage_changes. In the future we should refactor to use PDS for other account changes, but right now I'm still figuring out some entanglement related to state and state_db mod.

The previous bottleneck was in Account::clone_dirty, where we do a full clone of all changes at each checkpoint. This PR relieves that issue. The complexity comparison is as follows:

  • Get: O(1) -> O(logn)
  • Insert: O(1) -> O(logn)
  • Clone: O(n) -> O(1)

Note that O(logn) is actually a small number -- if a storage uses up all 256-bit variations and we consider the complexity to be O(log(2)n), then it will only take at most 256 hops.

I'm still thinking about how to do a benchmark on this change, but currently haven't thought of a good way.

@sorpaas sorpaas added A0-pleasereview 🤓 Pull request needs code review. M4-core ⛓ Core client code / Rust. labels Sep 4, 2018
@sorpaas sorpaas added this to the 2.1 milestone Sep 4, 2018
Copy link
Collaborator

@dvdplm dvdplm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool stuff. Exciting to see the benchmark of this!

use std::time::Duration;
use ethereum_types::{H128, H256, U256, Address};
use hash::keccak;
use im::{HashMap as IMHashMap};
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe just use im::HashMap as IMHashMap;

@@ -30,6 +30,7 @@ use trie::{Trie, Recorder};
use ethtrie::{TrieFactory, TrieDB, SecTrieDB, Result as TrieResult};
use pod_account::*;
use rlp::{RlpStream, encode};
use im::{HashMap as IMHashMap};
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here

@dvdplm
Copy link
Collaborator

dvdplm commented Sep 5, 2018

Do you plan to add benchmarks in this PR or in a separate one? The potential performance impact is my only concern at this point.

@ordian
Copy link
Collaborator

ordian commented Sep 5, 2018

I agree with David, that since the whole point of switching is performance optimization, it would be nice to have some benchmarks to justify the changes. Also note, that im is a thread-safe version of the library, there is also im-rc, which is thread unsafe and faster. Although maybe we can eliminate some locks with im.

Copy link
Contributor

@andresilva andresilva left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes LGTM but I'd also like to see some benchmarks. Also as @ordian mentioned maybe we can switch to the thread unsafe version since we won't be removing any locks at this point?

@5chdn 5chdn added A5-grumble 🔥 Pull request has minor issues that must be addressed before merging. and removed A0-pleasereview 🤓 Pull request needs code review. labels Sep 6, 2018
Copy link
Contributor

@lexfrl lexfrl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow, looks like this is a first usage of the persistent data structures in parity-ethereum! Would be great to take a look on some benchmarks also! 👍

for (k, v) in self.storage_changes.drain() {
let old_storage_changes = {
let mut tmp = im::HashMap::new();
::std::mem::swap(&mut tmp, &mut self.storage_changes);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not quite related to this specific PR, but I just wonder, what will happen if it'll fail to commit? Should we concern about this?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then that error will be propagate out and cause the block import to fail. I think we handled that correctly in Client.

@sorpaas
Copy link
Collaborator Author

sorpaas commented Sep 10, 2018

I did a benchmark on this, it's actually slower:

	#[test]
	fn storage_bench() {
		use std::time::{Duration, Instant};
		let now = Instant::now();

		let a = 10.into();

		let db = get_temp_state_db();
		let mut state = State::new(db, U256::from(0), Default::default());

		for i in 0..1000000 {
			state.set_storage(&a, H256::from(&U256::from(i)), H256::from(&U256::from(20u64))).unwrap();
		}

		state.checkpoint();
		state.set_storage(&a, H256::from(&U256::from(0)), H256::from(&U256::from(20u64))).unwrap();

		for i in 0..1000000 {
			state.storage_at(&a, &H256::from(&U256::from(i))).unwrap();
		}

		println!("Elapsed: {:?}", now.elapsed());
	}

I think the reason is due to Arc, whose locks give performance penalty. I'm re-thinking about the plan on #9427:

So for the short-term, I think it would be okay to keep the current state checkpointing logic as is -- I don't think that's part of our current performance bottleneck right now.

@sorpaas sorpaas closed this Sep 10, 2018
@sorpaas sorpaas mentioned this pull request Sep 10, 2018
@dvdplm
Copy link
Collaborator

dvdplm commented Sep 10, 2018

Thank you for trying this out, these things can be subtle! :)

Most storage changes are actually really small,

Is it true that while most might be really small, there are some that are really big?

@sorpaas
Copy link
Collaborator Author

sorpaas commented Sep 10, 2018

Is it true that while most might be really small, there are some that are really big?

Yes that's true. Some indeed can be big.

@5chdn 5chdn deleted the sp-state-refactor branch November 27, 2018 11:42
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
A5-grumble 🔥 Pull request has minor issues that must be addressed before merging. M4-core ⛓ Core client code / Rust.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants