Stage 5 block verification failed for #12733 #39

Open
SurfingNerd opened this issue Aug 29, 2021 · 3 comments
@SurfingNerd
Collaborator

SurfingNerd commented Aug 29, 2021

We experienced the same error on the same block on different machines, while other machines with similar hardware could process the block.

2021-08-29 12:26:36  Verifier #1 TRACE consensus  calling reward function for block 12733 isEpochEnd? false on address: 0x2000…0001
2021-08-29 12:26:36  Verifier #1 WARN client  Stage 5 block verification failed for #12733 (0xe487…f28d)
Error: Error(Block(InvalidStateRoot(Mismatch { expected: 0x4fe75d2d73f1be58188694f1a4377a38b06caa44fc92ea8da977c4af9a39f758, found: 0x6d79a20d1d5e7cd1afa35169ba3a474e79b66361f1dff02f7c654114ace431d9 })), State { next_error: None, backtrace: InternalBacktrace { backtrace: None } })
2021-08-29 12:26:36  Verifier #1 ERROR client
Bad block detected: Error(Block(InvalidStateRoot(Mismatch { expected: 0x4fe75d2d73f1be58188694f1a4377a38b06caa44fc92ea8da977c4af9a39f758, found: 0x6d79a20d1d5e7cd1afa35169ba3a474e79b66361f1dff02f7c654114ace431d9 })), State { next_error: None, backtrace: InternalBacktrace { backtrace: None } })
RLP: f9023af90235a0c8c91e45031e5a42eec4154168e130d2893afe5fb258c47812b1ee1e45970209a01dcc4de8dec75d7aab85b567b6ccd41ad312451b948a7413f0a142fd40d49347940000000000000000000000000000000000000000a04fe75d2d73f1be58188694f1a4377a38b06caa44fc92ea8da977c4af9a39f758a056e81f171bcc55a6ff8345e692c0f86e5b48e01b996cadc001622fb5e363b421a056e81f171bcc55a6ff8345e692c0f86e5b48e01b996cadc001622fb5e363b421bbd843b9aca008084612b1a7686506172697479b8608485ac5bb0219a4c38850b72cf79bb66c047a380f24174dd57cf3dabd19d6637c5a97c89f7710e3f8cf7a892c9195c8f17a5ac5620fe0e5163bac94bf8d6aa9017dfa0895b3b5b005cf7180ce31ae89570e4f423d7c3410fa194688dca0428c5c0c0
Header: Header { parent_hash: 0xc8c91e45031e5a42eec4154168e130d2893afe5fb258c47812b1ee1e45970209, timestamp: 1630214774, number: 12733, author: 0x0000000000000000000000000000000000000000, transactions_root: 0x56e81f171bcc55a6ff8345e692c0f86e5b48e01b996cadc001622fb5e363b421, uncles_hash: 0x1dcc4de8dec75d7aab85b567b6ccd41ad312451b948a7413f0a142fd40d49347, extra_data: [80, 97, 114, 105, 116, 121], state_root: 0x4fe75d2d73f1be58188694f1a4377a38b06caa44fc92ea8da977c4af9a39f758, receipts_root: 0x56e81f171bcc55a6ff8345e692c0f86e5b48e01b996cadc001622fb5e363b421, log_bloom: 0xgas_used: 0, gas_limit: 1000000000, difficulty: 1, seal: [[184, 96, 132, 133, 172, 91, 176, 33, 154, 76, 56, 133, 11, 114, 207, 121, 187, 102, 192, 71, 163, 128, 242, 65, 116, 221, 87, 207, 61, 171, 209, 157, 102, 55, 197, 169, 124, 137, 247, 113, 14, 63, 140, 247, 168, 146, 201, 25, 92, 143, 23, 165, 172, 86, 32, 254, 14, 81, 99, 186, 201, 75, 248, 214, 170, 144, 23, 223, 160, 137, 91, 59, 91, 0, 92, 247, 24, 12, 227, 26, 232, 149, 112, 228, 244, 35, 215, 195, 65, 15, 161, 148, 104, 141, 202, 4, 40, 197]], hash: Some(0xe487e091510ab3fc1e52c0c3162f748e1a79a8907699c501cde6a628d1b0f28d) }

Documentation for the import of an existing block: https://openethereum.github.io/Trace-NewBlock
Branch with additional logging information about what's going on: https://github.com/SurfingNerd/openethereum-3.x/tree/surfingnerd/stage-5-logging

@SurfingNerd
Collaborator Author

We can now confirm that the error is reproducible when resyncing a node that was part of the active validator set in the past.
When syncing as an RPC node, the chain syncs without problems.
When syncing as a regular validator, this problem can occur.

@SurfingNerd
Collaborator Author

Stage 5 verification fails because the is_end_epoch argument cannot be calculated.
Why:

On close block, do_keygen() is called.
We have pending validators, so we try to figure out whether the key generation is ready.
initalize_synckeygen reads the part_of_address for every validator.

In the section "// We are a validator: Decrypt and deserialize our row and compare it to the commitment."
".decrypt(&rows[our_idx as usize])"

This rows array holds the Part data.
Each Part contains a secret message for every pending validator.

It looks like the validator "0x5C132de048605FB4e90B97a14cd25DA230dAf218" wrote a Part
with information that we ("0x40703fe50d7cbd36bc70295d29a93ec05badc6bd") are unable to decrypt.

@dforsten
Collaborator

All blocks are verified on import, requiring all contract calls to be executed exactly the same way on all nodes.
This includes system calls, which also have to be called with exactly the same arguments on all nodes.

Stage 5 verification executes all contract calls as well as all system calls to verify that they produce the same state as specified in the block's header.

If the state after all calls differs from the state root specified in the block's header, stage 5 verification fails!
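As a minimal sketch of the check described above (not the actual OpenEthereum code), the comparison boils down to matching the locally computed state root against the one in the header. The names `Hash`, `Mismatch` and `verify_state_root` are illustrative assumptions; only the `expected`/`found` field names are taken from the log output in this issue.

```rust
/// Illustrative stand-in for a 32-byte state root hash (assumption, not the client's type).
type Hash = [u8; 32];

/// Mirrors the shape of the `Mismatch { expected, found }` seen in the log output.
#[derive(Debug, PartialEq)]
struct Mismatch {
    expected: Hash,
    found: Hash,
}

/// Hypothetical helper: compare the state root from the block header with the
/// state root obtained by re-executing all contract and system calls locally.
fn verify_state_root(header_state_root: Hash, computed_state_root: Hash) -> Result<(), Mismatch> {
    if header_state_root == computed_state_root {
        Ok(())
    } else {
        Err(Mismatch {
            expected: header_state_root,
            found: computed_state_root,
        })
    }
}
```

Any difference in the arguments of a system call changes the computed root, so the check fails even though the block itself may be valid.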

The block rewards contract is executed when the block is getting closed, and only accepts a single boolean argument: is_end_epoch

If for some reason this argument is false on some pending validators when it should be true, the result is a stage 5 verification error, which marks the block as a "bad block" and prevents it from being imported.

If that block in fact was correct, syncing will fail indefinitely because this block is falsely marked as "bad" and no following block can be imported either.
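The effect on syncing can be illustrated with a toy model. This is only a sketch under simplifying assumptions: `Block`, `verifies_locally` and `sync` are hypothetical names, and a real client tracks blocks by hash rather than by number.

```rust
use std::collections::HashSet;

/// Toy block: only its number and whether local stage 5 verification succeeds.
struct Block {
    number: u64,
    verifies_locally: bool,
}

/// Import blocks in order on top of `start`.
/// Returns the highest imported block number and the set of blocks marked as bad.
fn sync(blocks: &[Block], start: u64) -> (u64, HashSet<u64>) {
    let mut best = start;
    let mut bad = HashSet::new();
    for block in blocks {
        // A block can only be imported directly on top of the current best block.
        if block.number != best + 1 {
            continue;
        }
        if !block.verifies_locally {
            // Marked as bad: never imported, so no child can follow either.
            bad.insert(block.number);
            continue;
        }
        best = block.number;
    }
    (best, bad)
}
```

In this model, a chain in which block 12733 fails local verification stays stuck at 12732 no matter how many later blocks are offered.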

So how can this happen?

The on_close_block() function calls do_keygen() to determine if there are enough "Parts" and "Acks" to finish the KeyGen Phase and start a new epoch with a new validator set.

do_keygen() calls initalize_synckeygen(), which uses the part_of_address() function to read all available Parts and passes them to the hbbft library for processing. The library uses the private key of the pending validator to decrypt the information other pending validators have written for it into their Part contributions.

On decrypting the information written into the Part for us by other pending validators we encounter a decryption error. This happens in the handle_part_or_fault() function in the hbbft library:

    // We are a validator: Decrypt and deserialize our row and compare it to the commitment.
    let ser_row = self
        .sec_key
        .decrypt(&rows[our_idx as usize])
        .map_err(|_| PartFault::DecryptRow)?;

rows is the array of information encrypted with the public keys of each pending validator, contained in the "Part" written into the KeyGen contract by each pending validator.

It looks like the validator "0x5C132de048605FB4e90B97a14cd25DA230dAf218" wrote a part, with information that we ("0x40703fe50d7cbd36bc70295d29a93ec05badc6bd") are unable to decrypt.

This fails the generation of the new hbbft Keys even if technically there would be enough Parts and Acks written to finish Key generation.
This causes the block reward system call to be called with the is_end_epoch argument as false, when it should be true, thus causing a different block state, failing stage 5 verification.
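The chain of events can be sketched with a simplified model. Everything here is an assumption made for illustration: `Part`, `handle_part` and `is_end_epoch` are hypothetical stand-ins, a row that cannot be decrypted is modelled as `None`, and key generation is treated as complete only if every Part could be handled. The real hbbft library uses actual encryption and its own completion rules.

```rust
#[derive(Debug, PartialEq)]
enum PartFault {
    DecryptRow,
}

/// Hypothetical stand-in for a Part: one row per pending validator.
/// `None` models a row that the addressed validator cannot decrypt.
struct Part {
    rows: Vec<Option<Vec<u8>>>,
}

/// Validators (`Some(idx)`) must decrypt their own row; observers (`None`) skip that step.
fn handle_part(part: &Part, our_idx: Option<usize>) -> Result<(), PartFault> {
    match our_idx {
        None => Ok(()),
        Some(idx) => match part.rows.get(idx) {
            Some(Some(_row)) => Ok(()),
            _ => Err(PartFault::DecryptRow),
        },
    }
}

/// Simplifying assumption: the local node only considers key generation
/// finished (and the epoch ended) if it could handle every Part.
fn is_end_epoch(parts: &[Part], our_idx: Option<usize>) -> bool {
    parts.iter().all(|part| handle_part(part, our_idx).is_ok())
}
```

In this model a node without a validator index never attempts decryption, which is consistent with the observation earlier in the thread that syncing as an RPC node works, while the validator addressed by the undecryptable row computes `false` and diverges from the rest of the network.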

@SurfingNerd SurfingNerd added this to the v4.0 milestone Jan 20, 2022
@SurfingNerd SurfingNerd removed the v4.0 label Jan 20, 2022