Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unable to restore network using export/import blocks - finality is lagging #1669

Open
2 tasks done
serkixenos opened this issue Sep 21, 2023 · 2 comments
Open
2 tasks done
Labels
I2-bug The node fails to follow expected behavior. I10-unconfirmed Issue might be valid, but it's not yet known.

Comments

@serkixenos
Copy link

Is there an existing issue?

  • I have searched the existing issues

Experiencing problems? Have you tried our Stack Exchange first?

  • This is not a support question.

Description of bug

If substrate node runs dev network for some time (problem observed with ~600 blocks), after exporting blocks and importing them into a new base-path, last finalized block is moved back, and finality lags. Basically, user is unable to backup and restore the network by using export/import blocks subcommands.

Steps to reproduce

  • Build substrate sdk, polkadot-v1.1.0 (no significant changes in latest master 93bb492)
cargo build
  • Start substrate-node in dev mode
./target/debug/substrate-node --base-path ./ --dev
  • Leave it running until ~600 blocks are built, and stop the node. Note the last finalized block (585 in the log)
2023-09-21 18:39:09 ✨ Imported #588 (0xdc33…bfa4)    
2023-09-21 18:39:09 💤 Idle (0 peers), best: #588 (0xdc33…bfa4), finalized #585 (0xd0d3…4939), ⬇ 0 ⬆ 0    
2023-09-21 18:39:12 🙌 Starting consensus session on top of parent 0xdc330fe9b0bf8e2919c458b0ebfa8d4c5a73551d17b1796957de053b3722bfa4    
2023-09-21 18:39:12 🎁 Prepared block for proposing at 589 (3 ms) [hash: 0x98b843c00f67c94aa16a18bca7b17c65585a7fb51b8b63a7f471d614916710d2; parent_hash: 0xdc33…bfa4; extrinsics (1): [0x7857…793b]    
2023-09-21 18:39:12 🔖 Pre-sealed block for proposal at 589. Hash now 0x42e21ee430fff282e21b3f1bc445b0833ae8bec4347c38d7350547a50d3419e5, previously 0x98b843c00f67c94aa16a18bca7b17c65585a7fb51b8b63a7f471d614916710d2.    
2023-09-21 18:39:12 ✨ Imported #589 (0x42e2…19e5)    
  • Export blocks
./target/debug/substrate-node export-blocks --base-path ./ --dev ./blocks.json
2023-09-21 18:39:18 [0] 💸 generated 1 npos voters, 1 from validators and 0 nominators    
2023-09-21 18:39:18 [0] 💸 generated 1 npos targets    
2023-09-21 18:39:22 DB path: ./chains/dev/db/full    
2023-09-21 18:39:22 Exporting blocks from #1 to #589    
2023-09-21 18:39:22 Essential task `babe-worker` failed. Shutting down service.    
2023-09-21 18:39:22 Essential task `transaction-pool-task-0` failed. Shutting down service.    
2023-09-21 18:39:22 Essential task `basic-block-import-worker` failed. Shutting down service.    
2023-09-21 18:39:22 Essential task `txpool-background` failed. Shutting down service.    
2023-09-21 18:39:22 Essential task `transaction-pool-task-1` failed. Shutting down service.    
  • Import blocks into new base-path
./target/debug/substrate-node import-blocks --base-path ./restore --dev ./blocks.json
2023-09-21 18:39:52 [0] 💸 generated 1 npos voters, 1 from validators and 0 nominators    
2023-09-21 18:39:52 [0] 💸 generated 1 npos targets    
2023-09-21 18:39:56 🔨 Initializing Genesis block/state (state: 0x227f…8203, header-hash: 0xce89…4562)    
2023-09-21 18:39:56 👴 Loading GRANDPA authority set from genesis on what appears to be first startup.    
2023-09-21 18:40:01 👶 Creating empty BABE epoch changes on what appears to be first startup.    
2023-09-21 18:40:01 Essential task `babe-worker` failed. Shutting down service.    
2023-09-21 18:40:01 👶 New epoch 0 launching at block 0x1d29…e190 (block slot 565102030 >= start slot 565102030).    
2023-09-21 18:40:01 👶 Next epoch starts at slot 565102230    
2023-09-21 18:40:02 👶 New epoch 1 launching at block 0x6727…bb50 (block slot 565102230 >= start slot 565102230).    
2023-09-21 18:40:02 👶 Next epoch starts at slot 565102430    
2023-09-21 18:40:03 👶 Epoch(s) skipped: from 2 to 11    
2023-09-21 18:40:03 👶 New epoch 11 launching at block 0x9cf4…55fb (block slot 565104419 >= start slot 565104230).    
2023-09-21 18:40:03 👶 Next epoch starts at slot 565104430    
2023-09-21 18:40:03 👶 New epoch 12 launching at block 0xdf7d…ae73 (block slot 565104430 >= start slot 565104430).    
2023-09-21 18:40:03 👶 Next epoch starts at slot 565104630    
2023-09-21 18:40:04 👶 New epoch 13 launching at block 0x8884…a741 (block slot 565104630 >= start slot 565104630).    
2023-09-21 18:40:04 👶 Next epoch starts at slot 565104830    
2023-09-21 18:40:05 [535] 💸 generated 1 npos targets    
2023-09-21 18:40:05 [535] 💸 generated 1 npos voters, 1 from validators and 0 nominators    
2023-09-21 18:40:05 [#535] 🗳  creating a snapshot with metadata SolutionOrSnapshotSize { voters: 1, targets: 1 }    
2023-09-21 18:40:05 [#535] 🗳  Starting phase Signed, round 1.    
2023-09-21 18:40:06 [#585] 🗳  Starting phase Unsigned((true, 585)), round 1.    
2023-09-21 18:40:06 [#586] 🗳  queued unsigned solution with score ElectionScore { minimal_stake: 1538461538461538, sum_stake: 1538461538461538, sum_stake_squared: 2366863905325442366863905325444 }    
2023-09-21 18:40:06 🎉 Imported 589 blocks. Best: #589    
  • Start the node using new base-path
./target/debug/substrate-node --base-path ./restore --dev
2023-09-21 18:40:19 Substrate Node    
2023-09-21 18:40:19 ✌️  version 3.0.0-dev-f60318f6868    
2023-09-21 18:40:19 ❤️  by Parity Technologies <admin@parity.io>, 2017-2023    
2023-09-21 18:40:19 📋 Chain specification: Development    
2023-09-21 18:40:19 🏷  Node name: unbecoming-cabbage-0634    
2023-09-21 18:40:19 👤 Role: AUTHORITY    
2023-09-21 18:40:19 💾 Database: RocksDb at ./restore/chains/dev/db/full    
2023-09-21 18:40:23 [0] 💸 generated 1 npos voters, 1 from validators and 0 nominators    
2023-09-21 18:40:23 [0] 💸 generated 1 npos targets    
2023-09-21 18:40:27 Using default protocol ID "sup" because none is configured in the chain specs    
2023-09-21 18:40:27 🏷  Local node identity is: 12D3KooWMjw9RTnA62vGnBF4znUZtLdmBpoXYGFMtTWi4U33SD9G    
2023-09-21 18:40:27 💻 Operating system: linux    
2023-09-21 18:40:27 💻 CPU architecture: x86_64    
2023-09-21 18:40:27 💻 Target environment: gnu    
2023-09-21 18:40:27 💻 CPU: Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz    
2023-09-21 18:40:27 💻 CPU cores: 6    
2023-09-21 18:40:27 💻 Memory: 31867MB    
2023-09-21 18:40:27 💻 Kernel: 5.15.0-83-generic    
2023-09-21 18:40:27 💻 Linux distribution: Ubuntu 20.04.6 LTS    
2023-09-21 18:40:27 💻 Virtual machine: no    
2023-09-21 18:40:27 📦 Highest known block at #589    
2023-09-21 18:40:27 〽️ Prometheus exporter started at 127.0.0.1:9615    
2023-09-21 18:40:27 Running JSON-RPC server: addr=127.0.0.1:9944, allowed origins=["*"]    
2023-09-21 18:40:27 🏁 CPU score: 1.29 GiBs    
2023-09-21 18:40:27 🏁 Memory score: 12.94 GiBs    
2023-09-21 18:40:27 🏁 Disk score (seq. writes): 1.49 GiBs    
2023-09-21 18:40:27 🏁 Disk score (rand. writes): 529.82 MiBs    
2023-09-21 18:40:27 👶 Starting BABE Authorship worker    
2023-09-21 18:40:29 Accepting new connection 1/100
2023-09-21 18:40:30 🙌 Starting consensus session on top of parent 0x42e21ee430fff282e21b3f1bc445b0833ae8bec4347c38d7350547a50d3419e5    
2023-09-21 18:40:30 🎁 Prepared block for proposing at 590 (3 ms) [hash: 0x7a3d5670baed2d4410bbeb0e92a811ce60199b58599fd153fb29372890f965f9; parent_hash: 0x42e2…19e5; extrinsics (1): [0x02bf…0b5e]    
2023-09-21 18:40:30 🔖 Pre-sealed block for proposal at 590. Hash now 0xfc45cd19c0a7a456442788302b35e6b306f4f59be87fd1b733d328c5050bf097, previously 0x7a3d5670baed2d4410bbeb0e92a811ce60199b58599fd153fb29372890f965f9.    
2023-09-21 18:40:30 ✨ Imported #590 (0xfc45…f097)    
2023-09-21 18:40:32 💤 Idle (0 peers), best: #590 (0xfc45…f097), finalized #512 (0x5f0d…9aa8), ⬇ 0 ⬆ 0    
2023-09-21 18:40:33 Backing off claiming new slot for block authorship: finality is lagging.    
2023-09-21 18:40:36 Backing off claiming new slot for block authorship: finality is lagging.    
2023-09-21 18:40:37 💤 Idle (0 peers), best: #590 (0xfc45…f097), finalized #512 (0x5f0d…9aa8), ⬇ 0 ⬆ 0    
2023-09-21 18:40:39 Backing off claiming new slot for block authorship: finality is lagging.    
2023-09-21 18:40:42 Backing off claiming new slot for block authorship: finality is lagging.    
  • Note that last finalized block is not what it was when we stopped the node (moved from 585 to 512 in the log)
  • Note that finality is lagging
@serkixenos serkixenos added I10-unconfirmed Issue might be valid, but it's not yet known. I2-bug The node fails to follow expected behavior. labels Sep 21, 2023
@altonen altonen moved this to Backlog 🗒 in Networking Sep 22, 2023
@bkchr
Copy link
Member

bkchr commented Sep 26, 2023

512 should be the block that had a justification stored and thus, it was finalized when loading it from the exported blocks. I assume that the problem you are seeing here is maybe related to grandpa running only with one authority. @andresilva maybe any idea?

@bkchr bkchr added this to SDK Node Sep 26, 2023
@github-project-automation github-project-automation bot moved this to backlog in SDK Node Sep 26, 2023
@bkchr bkchr removed this from Networking Sep 26, 2023
@andresilva
Copy link
Contributor

andresilva commented Sep 26, 2023

Yes, by default since we are only persisting justifications every 512 blocks (and for blocks that change the authorities) when you export the block data the latest justification data will not be available and hence why finality "moves back" after import. The node should start finalizing again after the import though (starting from block 512), not sure why that isn't happening. I will have a look.

bkchr pushed a commit that referenced this issue Apr 10, 2024
* batch transactions in message relay: API prototype

* get rid of Box<dyn BatchTransaction> and actually submit it

* test batch transactions

* message_lane_loop_works_with_batch_transactions

* removed logger

* BatchConfirmationTransaction + BatchDeliveryTransaction

* more prototyping

* fmt

* continue with batch calls

* impl BatchCallBuilder for ()

* BatchDeliveryTransaction impl

* BundledBatchCallBuilder

* proper impl of BundledBatchCallBuilder + use it in RialtoParachain -> Millau

* impl prove_header in OnDemandHeadersRelay

* impl OnDemandParachainsRelay::prove_header (needs extensive tests)

* added a couple of TODOs

* return Result<Option<BatchTx>> when asking for more headers

* prove headers when reauire_* is called && return proper headers from required_header_id

* split parachains::prove_header and test select_headers_to_prove

* more traces and leave TODOs

* use finality stream in SubstrateFinalitySource::prove_block_finality

* prove parachain head at block, selected by headers relay

* const ANCIENT_BLOCK_THRESHOLD

* TODO -> proof

* clippy and spelling

* BatchCallBuilder::build_batch_call() returns Result

* read first proof from two streams

* FailedToFindFinalityProof -> FinalityProofNotFound

* changed select_headers_to_prove to version from PR review
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
I2-bug The node fails to follow expected behavior. I10-unconfirmed Issue might be valid, but it's not yet known.
Projects
Status: backlog
Development

No branches or pull requests

3 participants