-
Notifications
You must be signed in to change notification settings - Fork 2.6k
Fix missing block number issue on forced canonicalization #12949
Conversation
There is this issue about missing block numbers on forced canonicalization. I looked over the code now 10000 times and there are possible ways this can be triggered, but I don't really know how this is triggered. So, this pr is going to solve the symptom and not the cause. The block number to hash mapping is set when we import a new best block. Forced canonicalization will now stop at the best block and it will canonicalize the other blocks later when the best block moved. As the error reports indicated that this issue mainly happened on major sync, there should not be any forks, so not doing the canonicalization directly shouldn't be that harmful. All known implementations should import all blocks as best block on major sync anyway (I mean somewhere there is the bug, but I didn't yet found it). I will also do some changes to Cumulus around some potential culprit for this issue. Closes: #12613
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
calling force canonicalize on the whole range of blocks sounds like a good idea to me, but on this decision I might be missing some subtleties.
a bit out of context, but there is one thing I sometime would find useful when working on these part of code, would be a better fork-tree (storing parent-child relation bidirectional and managed from client db with a good caching layer), not strictly needed but would make some things generally simpler and help debugging. (mentioning it as in the past I remember trying to implement that in some experimental branch)
IMO it would be simpler for |
But then |
Okay, I just checked the |
Sorry, I mean |
Yeah, I just wanted to hint this :P |
client/db/src/lib.rs
Outdated
let best_canonical = || self.storage.state_db.best_canonical().unwrap_or(0); | ||
let info = self.blockchain.info(); | ||
let best_number: u64 = self.blockchain.info().best_number.saturated_into(); | ||
let best_hash = info.best_hash; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could be moved to line 1332 below, or removed entirely
bot merge |
Waiting for commit status. |
This pull request has been mentioned on Polkadot Forum. There might be relevant details there: https://forum.polkadot.network/t/polkadot-release-analysis-v0-9-37/1736/1 |
…#12949) * Fix missing block number issue on forced canonicalization There is this issue about missing block numbers on forced canonicalization. I looked over the code now 10000 times and there are possible ways this can be triggered, but I don't really know how this is triggered. So, this pr is going to solve the symptom and not the cause. The block number to hash mapping is set when we import a new best block. Forced canonicalization will now stop at the best block and it will canonicalize the other blocks later when the best block moved. As the error reports indicated that this issue mainly happened on major sync, there should not be any forks, so not doing the canonicalization directly shouldn't be that harmful. All known implementations should import all blocks as best block on major sync anyway (I mean somewhere there is the bug, but I didn't yet found it). I will also do some changes to Cumulus around some potential culprit for this issue. Closes: paritytech#12613 * Add some docs * Fix fix * Review comments * Review comments
…#12949) * Fix missing block number issue on forced canonicalization There is this issue about missing block numbers on forced canonicalization. I looked over the code now 10000 times and there are possible ways this can be triggered, but I don't really know how this is triggered. So, this pr is going to solve the symptom and not the cause. The block number to hash mapping is set when we import a new best block. Forced canonicalization will now stop at the best block and it will canonicalize the other blocks later when the best block moved. As the error reports indicated that this issue mainly happened on major sync, there should not be any forks, so not doing the canonicalization directly shouldn't be that harmful. All known implementations should import all blocks as best block on major sync anyway (I mean somewhere there is the bug, but I didn't yet found it). I will also do some changes to Cumulus around some potential culprit for this issue. Closes: paritytech#12613 * Add some docs * Fix fix * Review comments * Review comments
There is this issue about missing block numbers on forced canonicalization. I looked over the code now 10000 times and there are possible ways this can be triggered, but I don't really know how this is triggered. So, this pr is going to solve the symptom and not the cause. The block number to hash mapping is set when we import a new best block. Forced canonicalization will now stop at the best block and it will canonicalize the other blocks later when the best block moved. As the error reports indicated that this issue mainly happened on major sync, there should not be any forks, so not doing the canonicalization directly shouldn't be that harmful. All known implementations should import all blocks as best block on major sync anyway (I mean somewhere there is the bug, but I didn't yet found it).
I will also do some changes to Cumulus around some potential culprit for this issue.
Closes: #12613