chain ancestry hash storage #19319
Conversation
Amazing solution, great thinking. Problem solving is a very important gift. Thank you for your commitment. Your fellow problem solver
This PR totally removes As for why
@zsfelfoldi is this something that the LES client could make use of?
Using a circular buffer to cache block hash lookups makes sense to me (if the great performance results weren't enough!). I've left a few comments that you can feel free to address. There are some other nits, but figured I would start with more high-level questions/comments.
core/blockchain.go (outdated)

	}
	if target > number {
		// Should never happen
		log.Error("Ancestor number must be <= descendant", "ancestor target", target, "descendant", number)
It seems like this code base's convention is to use panic for these scenarios. If it were me, I would have introduced a second error return value, but that's not the way this code base handles error conditions like these.
Well, it's a caller error if so. The blockhash opcode has internal guards to prevent letting through a too-recent number, but I don't want to place a panic there -- if it's used from the p2p layer, I don't want a flaw in that code to crash the client.
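For context, a minimal sketch (in Go, not the actual geth opcode code) of the kind of guard being referred to: BLOCKHASH only honours requests for one of the 256 most recent ancestors, so an out-of-range number never reaches GetAncestorHash in the first place.

package sketch

// blockhashInRange mirrors the EVM rule that BLOCKHASH answers only for the
// 256 most recent ancestors of the current block; anything else yields a zero
// hash without ever touching the ancestry lookup. Names are illustrative.
func blockhashInRange(current, requested uint64) bool {
	return requested < current && current-requested <= 256
}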
Right. It may be worth adding a comment to this method to make it explicit that it is designed solely for BLOCKHASH, so future contributors know its scope.
Well, it can be used for other purposes; however, you can't ask "give me the ancestor at 105 for [block 100, hash H]" and expect anything other than zeroes...
core/hashstorage.go (outdated)

}

// Newest returns the most recent (number, hash) stored
func (hs *HashStorage) Newest() (uint64, common.Hash) {
Is there a future use for this method? From what I can tell it's only used in tests.
Only used in tests -- but it's very useful in those, and might be useful in the future
@@ -1736,3 +1743,47 @@ func (bc *BlockChain) SubscribeLogsEvent(ch chan<- []*types.Log) event.Subscription
func (bc *BlockChain) SubscribeBlockProcessingEvent(ch chan<- bool) event.Subscription {
	return bc.scope.Track(bc.blockProcFeed.Subscribe(ch))
}

// GetAncestorHash returns the hash of the ancestor at the given number
func (bc *BlockChain) GetAncestorHash(ref *types.Header, target uint64) common.Hash {
Is there a reason not to put this logic in HeaderChain?
No, it should probably wind up there instead. Right now, however, it's only used for EVM execution of BLOCKHASH, and I was more confident in how the blockchain works than how the headerchain works.
If this becomes used from any other place, like p2p, then it should definitely go into headerchain instead.
Makes sense. Thanks for clarifying.
core/blockchain.go (outdated)

		number = header.Number.Uint64() - 1
		hash = header.ParentHash
	}
	return hash
Is it true that there are cases where hash will not correspond to target?
Yes, if we get an old sidechain. If the hash history contains e.g. blocks [1000..1288], and we import a sideblock at 1100 which requests block 900, then the hash history can be used to jump back to block 1000, but the remaining 100 headers are iterated by this loop.
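To make the example concrete, here is a toy sketch (hypothetical names, plain strings instead of common.Hash, not the PR's code) of the two-phase lookup described: jump to the oldest entry the hash storage covers, then walk parent hashes for the remaining distance.

package sketch

// toyAncestor illustrates the scenario above: stored maps number -> hash for
// the window the hash storage covers (whose lowest number is storedFloor, e.g.
// 1000), and parentOf maps a block hash to its parent hash.
// (The fast path where the target itself sits inside the window is omitted.)
func toyAncestor(stored map[uint64]string, storedFloor uint64,
	parentOf map[string]string, hash string, number, target uint64) string {
	// Jump straight to the bottom of the stored window when the target lies
	// below it (block 1000 in the example).
	if target < storedFloor && number > storedFloor {
		hash, number = stored[storedFloor], storedFloor
	}
	// Walk the remaining headers one parent at a time (the 100 headers from
	// 1000 down to 900 in the example); the PR's loop does the same via
	// header.ParentHash.
	for number > target {
		parent, ok := parentOf[hash]
		if !ok {
			return "" // unknown ancestry; the real code returns a zero hash
		}
		hash = parent
		number--
	}
	return hash
}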
Right. Because to get here we have already established we are on the canonical chain. Thanks!
Since we've established we are on the canonical chain, doesn't this mean that the for loop can be replaced with rawdb.ReadCanonicalHash (or a wrapper around it)?
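For illustration, the suggested wrapper could look roughly like the sketch below (the method itself is hypothetical and only valid under the canonical-chain precondition discussed above; rawdb.ReadCanonicalHash is the existing number-to-canonical-hash accessor):

// canonicalAncestorHash is a hypothetical wrapper, usable only when the
// reference header is known to be on the canonical chain.
func (bc *BlockChain) canonicalAncestorHash(number uint64) common.Hash {
	return rawdb.ReadCanonicalHash(bc.db, number)
}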
It could, but it creates the invariant that the hash storage is used to track canon blocks. Right now, there's no such invariant within the hash storage itself
I'll think about it
Canonical blocks will be the common case, so it's something to consider.
Thanks for reviewing!
// fast lookups for EVM execution. This should probably be improved if this
// method becomes more used.
func (hc *HeaderChain) GetAncestorHash(ref *types.Header, target uint64) common.Hash {
	number := ref.Number.Uint64() - 1
I haven't checked if this method is ever used on HeaderChain (or LightChain), but if it's not, then I would consider not including this code. The only place I could see it maybe being used was to adhere to the ChainContext interface, but neither of these chains supports transaction processing (which is what ChainContext says it is for).
No, they're not used, but they had to have it to provide ChainContext. I didn't investigate further why, but these are totally unused methods.
Light clients can actually execute transactions on demand (i.e. CALL via RPC). It might be slow since it needs to pull state from remote nodes, but it should nonetheless be able to, so you do need the method.
core/hashstorage.go (outdated)

}

// Create a new storage with a header in it.
func NewHashStorage(hdr *types.Header) *HashStorage {
It seems weird to me that to create the data structure, we already need an item to put into it. Why tie the two together? Can't we just have a constructor and a setter? Also, perhaps we could make the number of hashes it can contain a constructor parameter instead of hard-coding it?
+1. Also a naming nit: header is already short enough that hdr sits right on the shortness-vs-readability line.
The reason is that since it's a circular buffer, it simplifies things if it cannot be empty; otherwise we basically have to add checks to distinguish whether it's full or empty.
Moving to |
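A tiny sketch of the ambiguity being avoided (a generic ring buffer, not the PR's type): if the buffer is allowed to be empty, the head index alone cannot distinguish "empty" from "wrapped around", so an extra count or flag is needed. Seeding the storage with one header at construction means the newest slot is always valid.

package sketch

// ring is a generic head-indexed buffer. The count field exists only because
// an empty buffer is allowed; a storage constructed with one entry and never
// drained can drop it and always treat slots[head] as valid.
type ring struct {
	slots []uint64
	head  int
	count int
}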
Rebased to fix conflicts.
Squashed and rebased on master.
…p is not on canon chain
This PR implements a hash storage, which is a fixed-size slice (currently set to 288 x 32 bytes) that is used to store ancestor hashes. When a new block is imported, it is added to the storage.
Reorgs are handled internally, by just swapping out the hash at the given number and wiping the information about descendants (an O(1) operation).
This makes fetching ancestry hashes from a given reference header an O(1) operation. In the case that not all hashes are available, iteration can be done from the earliest point.
This PR also adds a new method to the ChainContext interface, to expose this lookup to the EVM.
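A rough sketch of that idea, based on the description above rather than the PR's actual code (type, field, and method names are assumptions): a fixed-size ring keyed by block number modulo the ring size, where a reorg simply overwrites one slot and lowers the newest pointer.

package sketch

import "github.com/ethereum/go-ethereum/common"

const ringSize = 288 // the 288-entry size mentioned above

type entry struct {
	number uint64
	hash   common.Hash
}

type ancestryRing struct {
	entries [ringSize]entry
	newest  uint64 // highest block number currently stored
}

// add records the hash of a newly imported block. On a reorg, overwriting the
// slot and lowering newest implicitly wipes all descendants in O(1).
func (r *ancestryRing) add(number uint64, hash common.Hash) {
	r.entries[number%ringSize] = entry{number, hash}
	r.newest = number
}

// get returns the stored hash for number, or false if the number is newer than
// the current head, already evicted, or left over from a replaced branch.
func (r *ancestryRing) get(number uint64) (common.Hash, bool) {
	if number > r.newest {
		return common.Hash{}, false
	}
	if e := r.entries[number%ringSize]; e.number == number {
		return e.hash, true
	}
	return common.Hash{}, false
}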
The lookup is currently only used by the blockhash operation, but it could also be used instead of iterating when we receive a GetBlockHeadersMsg from a peer. The structure is not threadsafe, but all current uses of it happen from within already mutexed places. For use in p2p communication, we'd probably need to make it larger. It's at 8k now, but even storing the last million hashes would only take up 32M of memory.
I'll follow up with charts later. This PR is an alternative to #19291.
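For reference, the arithmetic behind those figures (a back-of-the-envelope check, not taken from the PR itself): 1,000,000 hashes x 32 bytes = 32,000,000 bytes, i.e. roughly 32M, while the current 288-entry buffer comes to 288 x 32 = 9,216 bytes, in the same ballpark as the quoted 8k.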