
[EC-343] Add block header validation during fast sync #380

Merged
merged 6 commits into from
Jan 11, 2018

Conversation

LukasGasior1
Contributor

No description provided.

@@ -584,6 +626,11 @@ object FastSync {
validators: Validators, peerEventBus: ActorRef, etcPeerManager: ActorRef, syncConfig: SyncConfig, scheduler: Scheduler): Props =
Props(new FastSync(fastSyncStateStorage, appStateStorage, blockchain, validators, peerEventBus, etcPeerManager, syncConfig, scheduler))

// validation parameters (see: https://github.com/ethereum/go-ethereum/pull/1889)
Contributor
Even if these values seem to be defined in the referenced PR, wouldn't it be a good idea to put them in application.conf? (hidden from the normal user)

Contributor Author

done
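(For illustration, a sketch of how such values might be loaded from application.conf via Typesafe Config. The key names mirror the config fragment added in this PR, but the case class and loader below are assumptions, not the actual Mantis `SyncConfig` code.)

```scala
// Sketch only: the case class and loader are illustrative, not the
// actual Mantis SyncConfig. Key names match the application.conf
// fragment discussed later in this PR.
import com.typesafe.config.Config

final case class FastSyncValidationParams(k: Int, n: Int, x: Int)

object FastSyncValidationParams {
  def apply(config: Config): FastSyncValidationParams =
    FastSyncValidationParams(
      k = config.getInt("fast-sync-block-validation-k"),
      n = config.getInt("fast-sync-block-validation-n"),
      x = config.getInt("fast-sync-block-validation-x")
    )
}
```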

log.warning(s"Block header validation failed during fast sync at block ${header.number}: $error")
blacklist(peer.id, blacklistDuration, "block header validation failed")

// discard last N blocks
Contributor

I don't quite follow this. If we found that a specific block is invalid, why are the previous N dropped? They might be valid, or not?

Contributor Author

Since we only validate every ~Xth block (+ some randomization) we cannot be sure these are valid blocks.
See ethereum/go-ethereum#1889 for comprehensive explanation. In short:

With this caveat calculated, the fast sync should be modified so that up to the pivoting point - X, only every K=100-th header should be verified (at random), after which all headers up to pivot point + X should be fully verified before starting state database downloading. Note: if a sync fails due to header verification the last N headers must be discarded as they cannot be trusted enough.
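A minimal sketch of that rule (the helper name and signature are assumptions, not the code in this PR): headers more than X before the target are validated with probability roughly 1/K, while the final X headers are always validated.

```scala
import scala.util.Random

// Sketch: decide whether a header should be validated during fast sync.
// Before (target - X): validate roughly one header in K, at random.
// From (target - X) onward: validate every header.
def shouldValidateHeader(blockNumber: BigInt, targetBlock: BigInt,
                         k: Int, x: Int, random: Random): Boolean =
  if (blockNumber >= targetBlock - x) true
  else random.nextInt(k) == 0
```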

Contributor

Should it always be N when N are available?

In other words: forget the beginning of the chain, I'm talking about a situation when we are in the [target - X, target] range. Shouldn't it be:

val numberOfBlocksToDiscard = if (target - current <= X) N + X - (target - current) else N

(maths may need double-checking)

So, N plus the number of blocks already validated in the X phase?

Contributor Author

Why do you think we should drop the extra blocks? For me it doesn't seem to make a difference in this context. (Is there a difference if validation fails at block target/2 or target - 10? I think it's sufficient to drop N in both cases.)

Contributor

Somehow I thought we should revert to a fully validated block, but since N is not divisible by K that is not the case. Anyway, the way you put it - no, I don't think it will make a difference 👍


// discard last N blocks
(header.number to ((header.number - N) max 1) by -1).foreach { n =>
blockchain.getBlockHeaderByNumber(n).foreach { header =>
Contributor

Is this getBlockHeaderByNumber needed if we already have header.hash, and removeBlock doesn't fail (AFAIK) if the hash does not exist?

Contributor Author

Good catch, it's not needed. (I don't expect this piece of code to be called too often, but still.)

Contributor Author

Actually it is needed here. header was shadowed inside the foreach; we only have one header (the one we start with). I changed the name to make that clearer.
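A sketch of the fix (illustrative names; the stand-in types below are assumptions, not the real Mantis `BlockHeader` and `Blockchain`): renaming the inner parameter makes it obvious that the removed headers are the ones fetched by number, not the failed header from the enclosing scope.

```scala
// Minimal stand-ins for the real Mantis types, just for this sketch:
trait BlockHeader { def number: BigInt; def hash: Array[Byte] }
trait Blockchain {
  def getBlockHeaderByNumber(n: BigInt): Option[BlockHeader]
  def removeBlock(hash: Array[Byte]): Unit
}

// Illustrative sketch, not the exact PR code: discard the last N block
// headers starting from the one that failed validation. The lambda
// parameter is named headerToRemove so it no longer shadows `header`.
def discardLastNBlocks(header: BlockHeader, n: Int, blockchain: Blockchain): Unit =
  (header.number to ((header.number - n) max BigInt(1)) by -1).foreach { num =>
    blockchain.getBlockHeaderByNumber(num).foreach { headerToRemove =>
      blockchain.removeBlock(headerToRemove.hash)
    }
  }
```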

@KonradStaniec KonradStaniec self-requested a review January 8, 2018 10:59
Contributor

@KonradStaniec left a comment

Code looks good, but I have one question.

According to:

With this caveat calculated, the fast sync should be modified so that up to the pivoting point - X, only every K=100-th header should be verified (at random), after which all headers up to pivot point + X should be fully verified before starting state database downloading.

Shouldn't we also verify all blocks from targetBlock to targetBlock + X to be sure our targetBlock is safe?

@LukasGasior1
Contributor Author

@KonradStaniec
I'm not sure I follow this part. After reaching targetBlock we switch to regular sync, which validates each block anyway. Also, regarding "should be fully verified before starting state database downloading" - we start state database downloading at the beginning, in parallel with block download, so this doesn't seem to make sense. On the other hand, if we detect the target block is invalid, do we need to download the state again? Choose a different target block? Or just try with a different peer?

@KonradStaniec
Contributor

@LukasGasior1 I see, my misunderstanding comes from the fact that we have a different algorithm than geth.
But there is a valid point in there: redownloading the full state again seems like overkill if the target block is invalid. Is it possible to switch the algorithm to first downloading the blockchain and then the state? (I think I remember that it was switched in the past but I don't remember why.)

@LukasGasior1
Contributor Author

It was because we want the state as soon as possible; otherwise other peers may start pruning it and we end up with a partially downloaded state that we cannot continue downloading (in fact this still may happen). Btw, I think geth also does it this way (i.e. it downloads the state first).

@rtkaczyk rtkaczyk self-requested a review January 8, 2018 15:24
@rtkaczyk
Contributor

rtkaczyk commented Jan 8, 2018

@LukasGasior1

After reaching targetBlock we switch to regular sync, which validates each block anyway.

Right, but if we discover a block was invalid after switching to regular we won't be able to recover, will we?

@rtkaczyk
Contributor

rtkaczyk commented Jan 8, 2018

Wait, now I'm not sure I follow...

@KonradStaniec wrote:

Shouldn't we also verify all blocks from targetBlock to targetBlock+x to be sure our targetblock is safe ?

We should fully validate from targetBlock - X to targetBlock, which we are doing. Or am I missing something?

@LukasGasior1
Contributor Author

@rtkaczyk it comes from this sentence (in geth PR):

With this caveat calculated, the fast sync should be modified so that up to the pivoting point - X, only every K=100-th header should be verified (at random), after which all headers up to pivot point + X should be fully verified before starting state database downloading

According to it, we should also validate pivot + X blocks before starting the state download, which doesn't seem to make sense in our case.

# See: https://github.com/ethereum/go-ethereum/pull/1889
fast-sync-block-validation-k = 100
fast-sync-block-validation-n = 2048
fast-sync-block-validation-x = 50
Contributor

How did you choose this X value? In the PR (white paper) X=24

Contributor

Ahh, so you doubled it because we treat target/pivot differently?

Contributor Author

Yes, I was not sure how to interpret that sentence, but my intuition is that validating pivot - 2X to pivot is the same as validating pivot - X to pivot + X (and validation after pivot happens anyway in regular sync).

@KonradStaniec
Contributor

@LukasGasior1 I am not sure how it's implemented in geth (I will look it up to be sure) and I am still unsure how it should work, but this paragraph makes me a little unsure:

Using this caveat however would mean, that the pivot point can be considered secure only after N headers have been imported after the pivot itself. To prove the pivot safe faster, we stop the "gapped verifications" X headers before the pivot point, and verify every single header onward, including an additional X headers post-pivot before accepting the pivot's state.

As I understand it, our targetBlock and its state should not be considered safe to download unless there are at least X verified blocks after it. So if we accept and download it (this state) and after, let's say, X - 1 blocks some block fails validation, shouldn't we redownload the whole state from some other block?

Also, these are maybe questions for some future PR, as until now we were doing pretty fine without validation at all :)

@rtkaczyk
Contributor

rtkaczyk commented Jan 9, 2018

So if we accept and download it (this state) and after, let's say, X - 1 blocks some block fails validation, shouldn't we redownload the whole state from some other block?

I think we should address that. At the very least we should check whether our state root hash equals the one in the accepted target block, and otherwise declare a failure to sync (and plan something better for the future).

@LukasGasior1
Contributor Author

LukasGasior1 commented Jan 10, 2018

So what we need to do is:

  • validate target block (check if our state hash == its state hash)
  • if this is fine we can be pretty sure our fast sync is valid (as target -X blocks were also validated)
  • then if we fail at block target+1 in regular sync, I think we should just drop this one block (in other words, RegularSync should behave as it currently does)

But if validation fails on the target block (i.e. our state != its state) we declare a failure (and drop the whole blockchain?). Note that if validation fails at something like target - 10 it's sufficient to just drop N blocks (as the gap between blocks we validate is smaller than N).

wdyt @KonradStaniec @rtkaczyk ?

@rtkaczyk
Contributor

sounds good

@LukasGasior1
Contributor Author

Actually it doesn't sound good. validate target block (check if our state hash == its state hash) - there is no such thing as "our state hash". We can only do regular validation on the target block (which we do); the question is, do we need to drop the whole blockchain and declare a failure? Isn't it enough to drop N blocks? (Note that if validation fails at something like target - 10 it's sufficient to just drop N blocks, as the gap between blocks we validate is smaller than N. I think this should also apply to the target block == no need to declare failure(?))

@LukasGasior1
Contributor Author

Let's consider a peer that sends us an invalid target block when we start fast sync:
since we validate every K blocks, he should be able to provide us with valid blocks up to some point (if he can do that up to the target block then it's a valid chain). Let's say validation fails at some block B (which may be as soon as block 1); we go back N blocks, ending up at B - N. Now if validation fails again at this point, we go another N back, and so on, ultimately reaching block 1 (which we may as well reach on the first failure). And I think here's where we're missing some logic: in this case we should restart fast sync and try to choose a different target block. Currently we'll reach block 1 and retry the sync, still keeping the same target.
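The missing logic described above could be sketched as follows (the helper name and error handling are assumptions, not existing code):

```scala
// Sketch: after header validation fails at block `failedBlock`, resume
// header download from failedBlock - N. If that bottoms out at block 1,
// the chain cannot be trusted at all and fast sync should restart with
// a different peer / target block (the logic currently missing).
def handleValidationFailure(failedBlock: BigInt, n: Int): Either[String, BigInt] = {
  val resumeFrom = (failedBlock - n) max BigInt(1)
  if (resumeFrom == BigInt(1)) Left("restart fast sync and choose a new target block")
  else Right(resumeFrom)
}
```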

@rtkaczyk
Contributor

Since we start with downloading state, we first need to obtain the state root hash for the target block - this is what I meant by our state root hash. But this block cannot be validated until we download all preceding blocks, and when we do, it may surface that our state root hash was incorrect.

So rather than restarting fast sync, or redownloading the state, or whatever, we could do an easy/dumb thing and declare a failure.

As a follow-up we can come up with something better. Here's one idea:

  • download state after the blocks
  • upon reaching the target block, determine a new updated target (500 blocks behind the current top of the chain)
  • repeat above until, upon reaching the updated target block, we are in fact 500 blocks behind (+- some margin, say 10) the top of the chain
  • now downloading of the state is the same with regard to pruning in the source nodes, yet the state root hash has been validated
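A rough sketch of that loop (all names are assumptions; `downloadBlocksUpTo` is an illustrative callback, not an existing method):

```scala
// Sketch of the follow-up idea: keep moving the target to
// (top of chain - offset) until block download has caught up to within
// `margin` blocks of the top; only then is the (already validated)
// state root safe to use for state download.
@annotation.tailrec
def catchUpToTarget(
    ourBest: BigInt,
    peersBest: () => BigInt,              // current top of the chain as seen by peers
    downloadBlocksUpTo: BigInt => BigInt, // downloads blocks, returns our new best block
    offset: Int = 500,
    margin: Int = 10
): BigInt = {
  val target = peersBest() - offset
  if (target - ourBest <= margin) target  // close enough: download state for this target
  else catchUpToTarget(downloadBlocksUpTo(target), peersBest, downloadBlocksUpTo, offset, margin)
}
```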

@LukasGasior1
Contributor Author

Ok, it makes sense. If the target fails validation we really should declare a failure. Besides that, I think we should also implement what I mentioned (choose a different target if the current (failed) block is less than N).

@rtkaczyk
Contributor

choose different target if current (failed) block is less than N

Sorry I did not get that. Do you mean choosing different target number, or different target header in the initialSyncState?

@LukasGasior1
Contributor Author

I mean restarting fast sync from scratch, i.e. asking a different peer for a (possibly) different block. But now I'm not sure it'll be needed anymore, since we're going to fail if the target is invalid anyway.

@rtkaczyk
Contributor

rtkaczyk commented Jan 10, 2018

So there are a few ways you can go about restarting:

  1. Redownload the state from the target block that is now validated - but run into the risk of missing node data due to pruning
  2. Choose new target block number - but run into the risk the obtained stateRootHash will be incorrect again, requiring another restart
  3. Continue fast sync block download to a newer target block, and then redownload the state

1. and 2. are not perfect, and 3. is similar to my idea above - if that were implemented we wouldn't need to restart at all.

@LukasGasior1
Contributor Author

  1. would also make it possible to update our target block while fast sync is running fine (no validation errors), so that we don't have to worry about state pruning on the nodes we sync with, but this would require even more changes. For now I'll go with just declaring a failure and create a task for this improvement.

@LukasGasior1 LukasGasior1 merged commit f397e16 into phase/release1_1 Jan 11, 2018
@LukasGasior1 LukasGasior1 deleted the feature/fastSyncPowValidation branch January 11, 2018 11:59