core: split the db blocks into components, move TD out top level #1779
Conversation
Updated: Fri Sep 11 15:10:42 UTC 2015
Force-pushed 25fe510 to 2e5dfb9
@@ -84,13 +90,24 @@ type ChainManager struct {
}

func NewChainManager(chainDb common.Database, pow pow.PoW, mux *event.TypeMux) (*ChainManager, error) {
	cache, _ := lru.New(blockCacheLimit)
	headerCache, _ := lru.New(headerCacheLimit)
Is there a benchmark to show if, and by how much, these caches improve performance? Given that both leveldb and the OS will cache frequently used data, I'm wary of adding yet another caching layer and the complexity that goes with it unless benchmarks show it's a bottleneck and would give a significant performance boost.
You are partially right, but there's a catch. LevelDB keeps its data sorted by key on disk, rearranging it when certain thresholds are crossed. Since we store everything by hash, we are essentially doing random writes and random reads. From the write perspective this means that disk caches get invalidated frequently (i.e. when a leveldb compaction runs, I'm betting everything goes stale). From the read perspective it means that, because every read is at a completely different location than the previous ones, there's an extremely small chance of a disk cache hit, so we'll keep thrashing the disk unless we find a way to do smarter caching.
Maybe the values are off and not completely correct. They should be benchmarked and experimented with, but the current chain size is inappropriate for that purpose, as it's tiny compared to the total available OS memory. Unless we start hitting 2-3x the OS memory limits, we won't see much action from these caches. However, I would not simply remove them just to re-add them when the time comes.
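For readers following along, here is a minimal sketch of the caching idea under discussion, assuming the hashicorp/golang-lru package the diff above already uses; the store interface and getBlockRLP helper are hypothetical stand-ins, not the PR's actual ChainManager code.

```go
package core

import (
	lru "github.com/hashicorp/golang-lru"
)

const blockCacheLimit = 256 // illustrative size, not a benchmarked value

// store is a hypothetical stand-in for the leveldb-backed chain database.
type store interface {
	Get(key []byte) ([]byte, error)
}

// cachedStore fronts the hash-keyed database with an in-memory LRU. Because
// keys are hashes, database reads are effectively random, which makes them
// unfriendly to the OS page cache mentioned in the thread above.
type cachedStore struct {
	db    store
	cache *lru.Cache
}

func newCachedStore(db store) (*cachedStore, error) {
	cache, err := lru.New(blockCacheLimit)
	if err != nil {
		return nil, err
	}
	return &cachedStore{db: db, cache: cache}, nil
}

// getBlockRLP returns the raw block data for a hash, consulting the LRU
// before falling back to the database.
func (c *cachedStore) getBlockRLP(hash []byte) ([]byte, error) {
	if v, ok := c.cache.Get(string(hash)); ok {
		return v.([]byte), nil
	}
	data, err := c.db.Get(hash)
	if err != nil {
		return nil, err
	}
	c.cache.Add(string(hash), data)
	return data, nil
}
```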
Right, makes sense in the context of our keys being effectively random, since they are (cryptographically secure, even) hashes. Note that I'm not doubting it would give something; I know you researched the leveldb caching thoroughly before. It's more about the practice of having benchmarks to back up optimisations. In my book this is similar to adding tests that prove the correctness of new functionality. Even if it's evident that caching here will give a boost, benchmarks will quantify it and tell us where the bottlenecks and priorities are.
Agreed, but sadly we don't have a big enough network to test it realistically. We could write some superficial benchmarks, but those can generate any result we want: a thrashing benchmark will show that it matters, a naive benchmark may show it doesn't, and neither will represent reality. The reason I suggest leaving it in for now is that we introduced it at a certain point in Olympic to lighten the db load, and I'm assuming there was some actual benefit. However, we cannot really test that now, so we'll have to wait until high load appears again to verify it (realistically, I mean).
If we cannot test it then we shouldn't add it, because we cannot verify how much benefit it gives. That would be like making a bug fix with no way to verify that the fix worked.
It's better to have some benchmarks to start with, even if they don't cover everything or are artificial in nature. At least they would provide a starting point for reasoning about how different caches and cache sizes affect performance.
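To make that "starting point" concrete, here is a hedged benchmark sketch building on the hypothetical cachedStore above; mapStore and the key setup are made up, and, as the thread notes, the access pattern can be tuned to produce almost any result.

```go
package core

import (
	"crypto/rand"
	"testing"
)

// mapStore simulates the backing database; real leveldb behaviour
// (compactions, disk seeks) is exactly what a benchmark like this misses.
type mapStore map[string][]byte

func (m mapStore) Get(key []byte) ([]byte, error) { return m[string(key)], nil }

func benchmarkReads(b *testing.B, cached bool) {
	db := make(mapStore)
	keys := make([][]byte, 10000)
	for i := range keys {
		k := make([]byte, 32)
		rand.Read(k) // hash-like, uniformly random keys
		keys[i] = k
		db[string(k)] = make([]byte, 1024)
	}
	cs, _ := newCachedStore(db)
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		// Cycling over far more keys than the LRU holds yields ~0% hits;
		// shrink the key set or skew the access pattern and the cache looks
		// great instead. That tunability is the "any result we want" problem.
		key := keys[i%len(keys)]
		if cached {
			cs.getBlockRLP(key)
		} else {
			db.Get(key)
		}
	}
}

func BenchmarkCachedReads(b *testing.B)   { benchmarkReads(b, true) }
func BenchmarkUncachedReads(b *testing.B) { benchmarkReads(b, false) }
```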
Debate it with @obscuren, he's the one who originally added the cache :P. I just extended it for the split database layout.
The cache is a relic. I'm completely fine with removing it (and re-adding it later, with benchmarks, if required).
I shall pick up this task in core-refactor or in a separate PR: #1788
Force-pushed 55a7d1a to 0e1433f
👍 requires squashing
Will do after enough fingers :D
👍
// SendBlockBodiesRLP sends a batch of block contents to the remote peer from
// an already RLP encoded format.
func (p *peer) SendBlockBodiesRLP(bodies []*blockBodyRLP) error {
	return p2p.Send(p.rw, BlockBodiesMsg, blockBodiesRLPData(bodies))
Please use rlp.RawValue instead of *blockBodyRLP when #1778 is on develop (edit: it is now). The conversion to blockBodiesRLPData is also unnecessary (package rlp knows how to encode slices of encoder types).
Btw, I don't get why we have these methods. IMHO using p2p.Send(p.rw, ...) in place of the method is just as clear. They're just boilerplate.
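A hedged sketch of the suggested shape, reusing the peer type and BlockBodiesMsg constant from the snippet above and assuming rlp.RawValue from #1778; this is illustrative, not the code that ultimately landed.

```go
// SendBlockBodiesRLP sends a batch of block contents to the remote peer from
// an already RLP encoded format.
func (p *peer) SendBlockBodiesRLP(bodies []rlp.RawValue) error {
	// rlp.RawValue holds pre-encoded RLP, and package rlp can encode a slice
	// of them directly, so no wrapper type or conversion helper is needed.
	return p2p.Send(p.rw, BlockBodiesMsg, bodies)
}
```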
Force-pushed 0e1433f to c0a860b
Force-pushed dd20888 to cdc2662
core: split the db blocks into components, move TD out top level
Most eth RPC calls that work with blocks crashed when the block was not found, because they called Hash on a nil block. This is a regression introduced in cdc2662 (ethereum#1779). While here, remove the insane conversions in get*CountBy*. There is no need to construct a complete BlockRes and convert int -> int64 -> *big.Int -> []byte -> hexnum -> string just to format the length of a slice as hex.
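As an illustration of that simplification, here is a hedged sketch; the function name, the core.GetBlock accessor, and the error handling are assumptions for this example, not the actual RPC code.

```go
package eth

import (
	"fmt"

	"github.com/ethereum/go-ethereum/common"
	"github.com/ethereum/go-ethereum/core"
)

// getBlockTransactionCountByHash looks the block up, tolerates a missing
// block instead of calling Hash on nil, and formats len() as a hex quantity
// without any BlockRes / big.Int round trips.
func getBlockTransactionCountByHash(db common.Database, hash common.Hash) (string, error) {
	block := core.GetBlock(db, hash) // assumed accessor for the split-db layout
	if block == nil {
		return "", nil // caller can map this to a JSON null
	}
	return fmt.Sprintf("%#x", len(block.Transactions())), nil
}
```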
This PR features the following: