
core: split the db blocks into components, move TD out top level #1779

Merged: 2 commits merged into ethereum:develop from split-block-storage-3000 on Sep 11, 2015

Conversation

@karalabe (Member) commented Sep 7, 2015

This PR features the following:

  • Split the block storage into separate headers and bodies (needed for a pure header mode of operation).
  • Support retrieving the raw RLP header/body data (during query, no need to parse and re-flatten).
  • Refactor eth to use direct header/body accesses instead of entire block reads from the database.
  • Factor the TD out of the blocks and store it in top level database entries (prep for header-only mode).
  • Drop direct database "by number" gets (don't duplicate the chain manager logic, which also has a cache).
  • Tests for all primitive (header, body, td, block) data storage and retrieval operations.
  • Tests for metadata (canonical number, head header, head block) data storage and retrieval.
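
For orientation, here is a rough sketch of what the split layout described above could look like. The key prefixes, helper names and the exact common.Database usage are assumptions for illustration, not necessarily what the PR implements:

```go
package core

import "github.com/ethereum/go-ethereum/common"

// Hypothetical key prefixes sketching the split schema: each block component
// gets its own database entry keyed by the block hash.
var (
	headerPrefix = []byte("header-") // headerPrefix + hash -> RLP encoded header
	bodyPrefix   = []byte("body-")   // bodyPrefix + hash   -> RLP encoded body (txs + uncles)
	tdPrefix     = []byte("td-")     // tdPrefix + hash     -> RLP encoded total difficulty
)

// getHeaderRLP returns the raw RLP of a header without decoding it, so a
// caller that only forwards the data never has to parse and re-flatten it.
func getHeaderRLP(db common.Database, hash common.Hash) []byte {
	data, _ := db.Get(append(headerPrefix, hash[:]...))
	return data
}

// getBodyRLP does the same for the block body.
func getBodyRLP(db common.Database, hash common.Hash) []byte {
	data, _ := db.Get(append(bodyPrefix, hash[:]...))
	return data
}
```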

@robotally

| Vote | Count | Reviewers |
| --- | --- | --- |
| 👍 | 2 | @obscuren @zsfelfoldi |
| 👎 | 0 | |

Updated: Fri Sep 11 15:10:42 UTC 2015

@karalabe added the core label Sep 7, 2015
@karalabe force-pushed the split-block-storage-3000 branch 2 times, most recently from 25fe510 to 2e5dfb9 on Sep 8, 2015 14:24
@karalabe (Member, Author) commented Sep 8, 2015

@fjl @obscuren Please review this PR so that maybe we could push it into the next release.

@karalabe changed the title from "Split block storage 3000" to "core: split the db blocks into primitive components, move TD out top level too" on Sep 8, 2015
@karalabe changed the title from "core: split the db blocks into primitive components, move TD out top level too" to "core: split the db blocks into components, move TD out top level" on Sep 8, 2015
@@ -84,13 +90,24 @@ type ChainManager struct {
}

func NewChainManager(chainDb common.Database, pow pow.PoW, mux *event.TypeMux) (*ChainManager, error) {
	cache, _ := lru.New(blockCacheLimit)
	headerCache, _ := lru.New(headerCacheLimit)
Reviewer
Is there a benchmark to show if/how much performance gain these caches give? Given that both leveldb and the OS will cache frequently used data I'm wary of adding yet another caching layer and the complexity that goes with it unless benchmarks show it's a bottleneck and would give significant performance boost.

Member Author
You are partially right, but there's a catch. LevelDB keeps its data sorted by key on disk, rearranging it when certain thresholds are crossed. As we're storing everything by hash, we are essentially doing random writes and random reads. From the write perspective this means that disk caches get invalidated frequently (i.e. if there's a leveldb compaction, I'm betting everything goes stale). From the read perspective it means that, as every read is at a completely different location than the previous ones, there's an extremely small chance of a disk cache hit, so we'll keep thrashing the disk unless we find a way to do smarter caching.

Maybe the values are off and not completely correct. Those should be benchmarked and experimented with somehow, but the current chain size is inappropriate for these purposes as it's tiny compared to the total available OS memory. Unless we start hitting 2-3x the OS memory limits, we won't see much action from these caches. However, I would not simply remove them just to re-add them when the time comes.

Reviewer
Right, makes sense in the context of our keys being effectively random, as they are (even cryptographically secure) hashes. Note I'm not doubting that it would give something - I know you researched the leveldb caching thoroughly before - it's more about the practice of having benchmarks to back up optimisations. In my book this is similar to adding tests proving the correctness of new functionality. Even if it's evident that caching here will give a boost, benchmarks will quantify it and inform us of where the bottlenecks and priorities are.

Member Author

Agreed, but sadly we don't have a big enough network to realistically test it. We could write some superficial benchmarks, but those can generate any result we want: a thrashing benchmark will show that it matters, a naive benchmark may show it doesn't, but neither will represent reality. The reason I suggest leaving it in for now is that we introduced it at a certain point in Olympic to lighten the db load, and I'm assuming there was some actual benefit. However, we cannot really test it out now, so we'll have to wait until high load appears again to verify it realistically.

Reviewer
If we cannot test it then we shouldn't add it, because we cannot verify how much benefit it gives. That would be like making a bug fix but having no way to verify whether the fix worked.

It's better to have some benchmarks to start with, even if they don't cover everything or are artificial in nature. At least they would provide a starting point for reasoning about how different caches and cache sizes affect performance.
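
A sketch of the kind of artificial starting-point benchmark being discussed here; the cache size, key set and benchmark name are made up, and it only exercises the LRU layer (a real benchmark would also compare against uncached, random-access leveldb reads):

```go
package core

import (
	"testing"

	"github.com/ethereum/go-ethereum/common"
	lru "github.com/hashicorp/golang-lru"
)

// BenchmarkHeaderCacheHit measures lookups against a warm LRU cache, as a
// baseline for reasoning about cache sizes. Deliberately artificial.
func BenchmarkHeaderCacheHit(b *testing.B) {
	cache, _ := lru.New(512) // hypothetical headerCacheLimit

	// Pre-populate the cache with fake header hashes and dummy payloads.
	hashes := make([]common.Hash, 256)
	for i := range hashes {
		hashes[i] = common.BytesToHash([]byte{byte(i)})
		cache.Add(hashes[i], []byte("header-rlp-placeholder"))
	}

	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		cache.Get(hashes[i%len(hashes)])
	}
}
```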

Member Author

Debate it with @obscuren, he's the one who originally added the cache :P. I just extended it for the split database layout.

Contributor

The cache is a relic. I'm completely fine removing it (and re-adding it if required + benchmarks)

Contributor

I shall pick up this task in core-refactor or in a separate PR #1788

@fjl mentioned this pull request Sep 9, 2015
@karalabe force-pushed the split-block-storage-3000 branch from 55a7d1a to 0e1433f on Sep 9, 2015 11:06
@obscuren (Contributor)
👍 requires squashing

@karalabe (Member, Author)
Will do after enough fingers :D

@zsfelfoldi (Contributor)
👍

// SendBlockBodiesRLP sends a batch of block contents to the remote peer from
// an already RLP encoded format.
func (p *peer) SendBlockBodiesRLP(bodies []*blockBodyRLP) error {
	return p2p.Send(p.rw, BlockBodiesMsg, blockBodiesRLPData(bodies))
}
Contributor

Please use rlp.RawValue instead of *blockBodyRLP when #1778 is on develop (edit: it is now).
The conversion to blockBodiesRLPData is also unnecessary (package rlp knows how to encode slices of encoder types).

Btw, I don't get why we have these methods. IMHO using p2p.Send(p.rw, ...) in place of the method is just as clear. They're just boilerplate.
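
For concreteness, this is roughly the simplification being suggested, assuming rlp.RawValue from #1778 and the peer/p2p plumbing from the quoted snippet; treat it as a sketch rather than the final code:

```go
// SendBlockBodiesRLP sends a batch of block contents to the remote peer from
// an already RLP encoded format. With rlp.RawValue there is no dedicated
// wrapper type: package rlp can encode a slice of already-encoded values
// directly.
func (p *peer) SendBlockBodiesRLP(bodies []rlp.RawValue) error {
	return p2p.Send(p.rw, BlockBodiesMsg, bodies)
}
```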

@fjl added this to the 1.2.0 milestone Sep 10, 2015
@karalabe force-pushed the split-block-storage-3000 branch from 0e1433f to c0a860b on Sep 11, 2015 14:27
@karalabe force-pushed the split-block-storage-3000 branch from dd20888 to cdc2662 on Sep 11, 2015 14:42
@karalabe (Member, Author)

@obscuren Rebased on develop, squashed into two commits (block split and extraction). Before merging please wait for a thumbs up from @fjl as he requested some minor changes that were enabled by his PR.

@fjl PTAL

@fjl (Contributor) commented Sep 11, 2015

I didn't really mean that the return types of GetHeaderRLP and GetBodyRLP should also be RawValue, but it might make it even clearer.

obscuren added a commit that referenced this pull request Sep 11, 2015:
core: split the db blocks into components, move TD out top level

@obscuren merged commit 0eac601 into ethereum:develop Sep 11, 2015
@obscuren removed the review label Sep 11, 2015
@fjl (Contributor) commented Sep 11, 2015

Just to make it clear: that part now LGTM. I must say that I only skimmed most of the PR, but others reviewed it in detail so it's probably OK too.

Some general background on the return types: Go's type system defines types as assignable when their underlying type is the same and they're not both named types. This is supposed to prevent bugs because a named type signals intent. As a consequence, returning a named type (rlp.RawValue) instead of a primitive type ([]byte) can have the unintended consequence that callers need to convert explicitly if they want to assign to some other named type. Returning a named type also makes it less usable in caller-defined interfaces, because callers need to import the type's package just to list the type in the interface definition. That doesn't apply here though, because these aren't methods.
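
A small self-contained example of the assignability rule described above; RawValue here is a stand-in for rlp.RawValue and HeaderRLP is a made-up caller-side type:

```go
package main

import "fmt"

// Two named types with the same underlying type, []byte.
type RawValue []byte  // stand-in for rlp.RawValue
type HeaderRLP []byte // hypothetical caller-defined type

func getHeaderRLP() RawValue {
	return RawValue{0xc0} // sample payload: an empty RLP list
}

func main() {
	r := getHeaderRLP()

	// Assigning to the unnamed underlying type works, because only one of
	// the two types involved is a named type.
	var b []byte = r
	fmt.Println(len(b))

	// Assigning to another named type does not compile, because both types
	// are named, even though their underlying types are identical:
	//
	//   var h HeaderRLP = r // cannot use r (type RawValue) as type HeaderRLP
	//
	// An explicit conversion is required instead.
	h := HeaderRLP(r)
	fmt.Println(len(h))
}
```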

fjl added a commit to fjl/go-ethereum that referenced this pull request Sep 22, 2015
Most eth RPC calls that work with blocks crashed when the block was not
found because they called Hash on a nil block. This is a regression
introduced in cdc2662 (ethereum#1779).

While here, remove the insane conversions in get*CountBy*. There is no need to construct a complete BlockRes and convert int->int64->*big.Int->[]byte->hexnum->string just to format the length of a slice as hex.
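
A minimal sketch of the guard that fix implies; the function name and the use of ChainManager.GetBlock here are illustrative, not the actual RPC code:

```go
package eth

import (
	"math/big"

	"github.com/ethereum/go-ethereum/common"
	"github.com/ethereum/go-ethereum/core"
)

// getBlockTransactionCount returns nil when the block lookup misses instead
// of calling methods on a nil block, and counts the transactions directly
// rather than round-tripping through BlockRes and hex formatting.
func getBlockTransactionCount(chain *core.ChainManager, hash common.Hash) *big.Int {
	block := chain.GetBlock(hash)
	if block == nil {
		return nil // block not found: report "no result" rather than crash
	}
	return big.NewInt(int64(len(block.Transactions())))
}
```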