chain ancestry hash storage #19319

holiman · 2019-03-23T14:47:12Z

This PR implements a hash storage, which is a fixed-size slice (currently set to 288 x 32 bytes) that is used to store ancestor hashes. When a new block is imported, it is added to the storage.

Reorgs are handled internally, by just swapping out the hash at the given number and wiping information about its descendants (an O(1) operation).

This makes fetching ancestry hashes from a given reference header an O(1) operation. If not all the requested hashes are available, iteration can be done from the earliest stored point.
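To make the idea above concrete, here is a minimal sketch of such a buffer. It is not the code from this PR: `Hash` is a local stand-in for `common.Hash`, and `hashRing`, `Set` and `Get` are hypothetical names. The sketch also assumes the caller has already checked that a re-inserted block links to the stored parent.

```go
package main

// Hash is a local stand-in for go-ethereum's common.Hash (32 bytes).
type Hash [32]byte

// bufferElems mirrors the 288-entry size mentioned in the description.
const bufferElems = 288

// hashRing is a hypothetical fixed-size circular buffer of block hashes.
// It is never empty: it always holds at least the newest block.
type hashRing struct {
	data [bufferElems]Hash
	head uint64 // number of the newest block stored
	tail uint64 // number of the oldest block stored
}

// newHashRing creates a buffer that already contains one (number, hash) pair.
func newHashRing(number uint64, hash Hash) *hashRing {
	r := &hashRing{head: number, tail: number}
	r.data[number%bufferElems] = hash
	return r
}

// Set stores hash at number. Appending (number == head+1) and reorgs
// (tail <= number <= head) are both O(1): a reorg only moves head back,
// which makes the old descendants unreachable without touching them.
func (r *hashRing) Set(number uint64, hash Hash) {
	if number > r.head+1 || number < r.tail {
		// Gap, or older than anything stored: start over from this block.
		r.head, r.tail = number, number
	} else {
		r.head = number
		if r.head-r.tail >= bufferElems {
			// The slot of the oldest entry is about to be reused.
			r.tail = r.head - bufferElems + 1
		}
	}
	r.data[number%bufferElems] = hash
}

// Get returns the stored hash for number, or false if it is out of range.
func (r *hashRing) Get(number uint64) (Hash, bool) {
	if number < r.tail || number > r.head {
		return Hash{}, false
	}
	return r.data[number%bufferElems], true
}
```

When `Get` reports false, the caller falls back to walking parent hashes from the earliest stored block.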

This PR also adds a new method to the ChainContext interface, to expose this lookup to the EVM.
The lookup is currently only used by the blockhash operation, but can also be used instead of iterating when we receive a GetBlockHeadersMsg from a peer.

The structure is not thread-safe, but all current uses of it happen within sections that are already guarded by a mutex. For use in p2p communication, we'd probably need to make it larger. It's at 8k now, but even storing the last million hashes would only take up 32M of memory.

I'll follow up with charts later. This PR is an alternative to #19291

csplawn918 · 2019-03-23T15:26:31Z

Amazing solution great thinking. Problem solver a very important gift. Thank you for your commit---ment. Your fellow problem solver

holiman · 2019-03-25T10:01:29Z

This PR totally removes BLOCKHASH from the top 25. This PR is run3, run2 is #19291, and run1 is current master.

As for why SMOD jumps up, I believe it's one of the least common opcodes, and thus the stats for it have extreme variance.

holiman · 2019-03-25T10:27:40Z

holiman · 2019-03-25T15:52:31Z

@zsfelfoldi is this something that the LES client could make use of ?

Matthalp-zz

Using a circular buffer to cache block hash lookups makes sense to me (if the great performance results weren't enough!). I've left a few comments that you can feel free to address. There are some other nits, but figured I would start with more high-level questions/comments.

Matthalp-zz · 2019-03-26T18:29:10Z

core/blockchain.go

+	}
+	if target > number {
+		// Should never happen
+		log.Error("Ancestor number must be <= descendant", "ancestor target", target, "descendant", number)


It seems like this code base's convention is to use panic for these scenarios. If it were me, I would have introduced a second error return value, but that's not the way this code base handles error conditions like these.

Well, it's a caller error if so. The blockhash opcode has internal guards to prevent letting through a too-recent number, but I don't want to place a panic there -- if it's used from the p2p layer, I don't want a flaw in that code to crash the client.

Right. It may be worth adding a comment to this method to make it explicit that this method is designed solely for BLOCKHASH so future contributors know the scope of this method.

Well, it can be used for other purposes. However, you can't ask "give me the ancestor at 105 for [block 100, hash H]" and expect anything other than zeroes.

Matthalp-zz · 2019-03-26T18:39:35Z

core/hashstorage.go

+}
+
+// Newest returns the most recent (number, hash) stored
+func (hs *HashStorage) Newest() (uint64, common.Hash) {


Is there a future use for this method? From what I can tell it's only used in tests.

Only used in tests -- but it's very useful in those, and might be useful in the future

Matthalp-zz · 2019-03-26T18:43:12Z

core/blockchain.go

@@ -1736,3 +1743,47 @@ func (bc *BlockChain) SubscribeLogsEvent(ch chan<- []*types.Log) event.Subscript
 func (bc *BlockChain) SubscribeBlockProcessingEvent(ch chan<- bool) event.Subscription {
 	return bc.scope.Track(bc.blockProcFeed.Subscribe(ch))
 }
+
+// GetAncestorHash return the hash of ancestor at the given number
+func (bc *BlockChain) GetAncestorHash(ref *types.Header, target uint64) common.Hash {


Is there a reason not to put this logic in HeaderChain?

No, it should probably wind up there instead. Right now, however, it's only used for EVM execution of BLOCKHASH, and I was more confident in how the blockchain works than how the headerchain works.

If this becomes used from any other place, like p2p, then it should definitely go into headerchain instead.

Makes sense. Thanks for clarifying.

Matthalp-zz · 2019-03-26T18:52:57Z

core/blockchain.go

+		number = header.Number.Uint64() - 1
+		hash = header.ParentHash
+	}
+	return hash


Is it true that there are cases where hash will not correspond to target?

Yes, if we get an old sidechain. If the hash history contains e.g. blocks [1000 .. 1288], and we import a side block at 1100 which requests block 900, then the hash history can be used to jump back to block 1000, but the remaining 100 headers are iterated by this loop.

Right. Because to get here we have already established we are on the canonical chain. Thanks!

Since we've established we are on the canonical chain, doesn't this mean that the for loop can be replaced with rawdb.ReadCanonicalHash (or a wrapper around it)?

It could, but it creates the invariant that the hash storage is used to track canon blocks. Right now, there's no such invariant within the hash storage itself.

I'll think about it

Canonical blocks will be the common case, so it's something to consider.
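The fallback loop discussed in this thread can be sketched in isolation as follows. This is a simplified illustration, not the PR's code: `header` is a cut-down stand-in for `types.Header`, and `walkToAncestor` and the `getHeader` callback are hypothetical names.

```go
package main

// Hash is a local stand-in for go-ethereum's common.Hash (32 bytes).
type Hash [32]byte

// header is a simplified stand-in for types.Header.
type header struct {
	Number     uint64
	ParentHash Hash
}

// walkToAncestor follows ParentHash links from the block (hash, number)
// down to target and returns the hash of the block at target. It returns
// the zero hash if a header on the way cannot be found.
// getHeader is an assumed lookup by hash and number, such as a database read.
func walkToAncestor(hash Hash, number, target uint64, getHeader func(Hash, uint64) *header) Hash {
	for number > target {
		h := getHeader(hash, number)
		if h == nil {
			return Hash{}
		}
		// Step one block back along the parent link.
		hash = h.ParentHash
		number--
	}
	return hash
}
```

In the scenario above, the buffer would supply the hash at block 1000 and this walk would cover the remaining 100 headers down to block 900.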

holiman · 2019-03-27T08:00:24Z

Thanks for reviewing!

Matthalp-zz · 2019-03-27T14:11:32Z

core/headerchain.go

+// fast lookups for EVM execution. This should probably be improved if this
+// method becomes more used.
+func (hc *HeaderChain) GetAncestorHash(ref *types.Header, target uint64) common.Hash {
+	number := ref.Number.Uint64() - 1


I haven't checked whether this method is ever used on HeaderChain (or LightChain), but if it's not, then I would consider not including this code. The only reason I could see for it is to adhere to the ChainContext interface, but neither of these chains supports transaction processing (which is what ChainContext says it is for).

No, they're not used, but they had to have it to implement ChainContext. I didn't investigate further why, but these are totally unused methods.

Light clients can actually execute transactions on demand (i.e. CALL via RPC). It might be slow since it needs to pull state from remote nodes, but it should nonetheless be able to, so you do need the method.

karalabe · 2019-04-01T12:15:06Z

core/hashstorage.go

+}
+
+// Create a new storage with a header in it.
+func NewHashStorage(hdr *types.Header) *HashStorage {


It seems weird to me that to create a data structure, we already need an item to put into it. Why tie the two together? Can't we just have a constructor and a setter? Also, perhaps we could make the number of hashes it can contain a constructor parameter instead of hard-coding it?

+1. Also a naming nit: header is already short enough that hdr sits right on that shortness-readability line.

The reason is that since it's a circular buffer, it simplifies things if it cannot be empty. Otherwise we'd have to add checks to distinguish between a full and an empty buffer, basically.
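A small sketch of the trade-off in this thread: with inclusive `head` and `tail` block numbers, a buffer that always holds at least one entry needs no separate "empty" state, and the capacity can still be a constructor parameter. All names here (`sizedRing`, `newSizedRing`, `Push`) are hypothetical and not taken from the PR.

```go
package main

// Hash is a local stand-in for go-ethereum's common.Hash (32 bytes).
type Hash [32]byte

// sizedRing is a hypothetical never-empty ring with configurable capacity.
// head and tail are inclusive block numbers, so head == tail means exactly
// one element and no extra flag is needed to tell full from empty.
type sizedRing struct {
	data       []Hash
	head, tail uint64
}

// newSizedRing requires an initial (number, hash) pair so that the buffer
// is never empty. Sizes below 1 are rounded up to 1.
func newSizedRing(size int, number uint64, hash Hash) *sizedRing {
	if size < 1 {
		size = 1
	}
	r := &sizedRing{data: make([]Hash, size), head: number, tail: number}
	r.data[number%uint64(size)] = hash
	return r
}

// Len is always at least 1.
func (r *sizedRing) Len() uint64 { return r.head - r.tail + 1 }

// Contains needs no emptiness check because of the invariant above.
func (r *sizedRing) Contains(number uint64) bool {
	return number >= r.tail && number <= r.head
}

// Push appends the hash of block head+1, evicting the oldest when full.
func (r *sizedRing) Push(hash Hash) {
	r.head++
	if r.Len() > uint64(len(r.data)) {
		r.tail++
	}
	r.data[r.head%uint64(len(r.data))] = hash
}
```

A constructor-plus-setter design would instead have to track an element count or an explicit empty flag, and check it on every lookup.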

Matthalp-zz · 2019-04-02T20:46:40Z

Moving to [hashBufferElems]common.Hash LGTM

holiman · 2019-04-03T07:55:32Z

rebased to fix conflicts.
I'm not quite certain where to add it if we want to move it into headerchain. The headerchain has WriteHeader, but that one is only called from Blockchain.InsertHeaderChain, which is not called during regular block import.

holiman · 2019-11-19T08:04:24Z

Squashed and rebased on master

…p is not on canon chain

holiman requested review from karalabe and zsfelfoldi as code owners March 23, 2019 14:47

holiman changed the title ~~WIP: Chain ancestry hash storage~~ chain ancestry hash storage Mar 25, 2019
