Skip to content
This repository has been archived by the owner on Nov 15, 2023. It is now read-only.

Disputes Causing Unresponsivenss #6412

Closed
wpank opened this issue Dec 9, 2022 · 49 comments · Fixed by #6440
Closed

Disputes Causing Unresponsivenss #6412

wpank opened this issue Dec 9, 2022 · 49 comments · Fixed by #6440

Comments

@wpank
Copy link
Contributor

wpank commented Dec 9, 2022

Many had logs similar to this with some parachain subsystems failing, but the node doesn't crash.

@dbpatty
Copy link

dbpatty commented Dec 9, 2022

when I restarted my node first time I had this log (it wasn't responde):
Dec 09 00:38:54 Ubuntu-2004-focal-64-minimal polkadot[4186922]: 2022-12-09 00:38:54.997 ERROR tokio-runtime-worker sc_service::task_manager: Essential task overseer failed. Shutting down service.
Dec 09 00:38:54 Ubuntu-2004-focal-64-minimal polkadot[4186922]: 2022-12-09 00:38:54.997 WARN tokio-runtime-worker slots: Unable to author block in slot. No best block header: Request for leaves from chain selection got canceled
Dec 09 00:38:54 Ubuntu-2004-focal-64-minimal polkadot[4186922]: 2022-12-09 00:38:54.997 INFO tokio-runtime-worker parachain::dispute-coordinator: New dispute initiated for candidate. candidate_hash=0xe1e677e00a3a3411c4bad16559c4753c47cd51043159b1f4399caea155bfb91b session=26550 traceID=300272958686085533188489404059870328124
Dec 09 00:38:54 Ubuntu-2004-focal-64-minimal polkadot[4186922]: 2022-12-09 00:38:54.997 INFO tokio-runtime-worker parachain::dispute-coordinator: Dispute on candidate concluded with 'valid' result candidate_hash=0xe1e677e00a3a3411c4bad16559c4753c47cd51043159b1f4399caea155bfb91b session=26550 traceID=300272958686085533188489404059870328124
Dec 09 00:38:54 Ubuntu-2004-focal-64-minimal polkadot[4186922]: 2022-12-09 00:38:54.998 ERROR tokio-runtime-worker polkadot_overseer: subsystem exited with error subsystem="statement-distribution-subsystem" err=FromOrigin { origin: "statement-distribution", source: SubsystemReceive(Generated(Context("Signal channel is terminated and empty."))) }
Dec 09 00:38:55 Ubuntu-2004-focal-64-minimal polkadot[4186922]: 2022-12-09 00:38:54.999 WARN tokio-runtime-worker parachain::dispute-coordinator: Fetch for approval votes got cancelled, only expected during shutdown!
Dec 09 00:38:55 Ubuntu-2004-focal-64-minimal polkadot[4186922]: 2022-12-09 00:38:54.999 INFO tokio-runtime-worker parachain::dispute-coordinator: New dispute initiated for candidate. candidate_hash=0xfcd08ae3f48e2eb589415d6cecffe6e65cfd8a7fb587323deae45faf8552ac67 session=26550 traceID=336048269717378469227419513318138701542
Dec 09 00:38:55 Ubuntu-2004-focal-64-minimal polkadot[4186922]: 2022-12-09 00:38:54.999 INFO tokio-runtime-worker parachain::dispute-coordinator: Dispute on candidate concluded with 'valid' result candidate_hash=0xfcd08ae3f48e2eb589415d6cecffe6e65cfd8a7fb587323deae45faf8552ac67 session=26550 traceID=336048269717378469227419513318138701542
Dec 09 00:38:55 Ubuntu-2004-focal-64-minimal polkadot[4186922]: 2022-12-09 00:38:55.014 ERROR tokio-runtime-worker polkadot_overseer: subsystem exited with error subsystem="runtime-api-subsystem" err=Generated(Context("Signal channel is terminated and empty."))
Dec 09 00:38:55 Ubuntu-2004-focal-64-minimal polkadot[4186922]: 2022-12-09 00:38:55.031 ERROR tokio-runtime-worker polkadot_overseer: subsystem exited with error subsystem="dispute-coordinator-subsystem" err=FromOrigin { origin: "dispute-coordinator", source: ChainApiSenderDropped }

@tugytur
Copy link
Contributor

tugytur commented Dec 9, 2022

logs from our validators

First I had thousands of these

2022-12-09 00:34:18 error=AuthorityFlooding(Public(82d52cc7b29907658987ceb5ca8774cadac4f2f2dff14b5fd7257929ea8f9931

and shortly after:

2022-12-09 00:39:33 💤 Idle (40 peers), best: #15675155 (0x6365…21c0), finalized #15675123 (0x1032…cc38), ⬇ 3.2MiB/s ⬆ 947.1kiB/s
2022-12-09 00:39:36 ✨ Imported #15675156 (0x2050…cc1b)
2022-12-09 00:39:38 💤 Idle (40 peers), best: #15675156 (0x2050…cc1b), finalized #15675123 (0x1032…cc38), ⬇ 596.6kiB/s ⬆ 335.7kiB/s
2022-12-09 00:39:39 ♻️  Reorg on #15675156,0x2050…cc1b to #15675151,0xdbda…58bf, common ancestor #15675150,0x85f8…bcf9
2022-12-09 00:39:39 ✨ Imported #15675151 (0xdbda…58bf)
2022-12-09 00:39:42 Subsystem dispute-coordinator-subsystem appears unresponsive. origin=polkadot_service::relay_chain_selection::SelectRelayChainInner<sc_client_db::Backend<sp_runtime::generic::block::Block<sp_runtime::generic::header::Header<u32, sp_runtime::traits::BlakeTwo256>, sp_runtime::OpaqueExtrinsic>>, polkadot_overseer::Handle>
2022-12-09 00:39:42 Call to `DetermineUndisputedChain` failed error=DetermineUndisputedChainCanceled(Canceled)
2022-12-09 00:39:42 err=ContextChannelClosed
2022-12-09 00:39:42 subsystem exited with error subsystem="pvf-checker-subsystem" err=FromOrigin { origin: "pvf-checker", source: Generated(Context("Signal channel is terminated and empty.")) }
2022-12-09 00:39:42 subsystem exited with error subsystem="candidate-validation-subsystem" err=FromOrigin { origin: "candidate-validation", source: Generated(Context("Signal channel is terminated and empty.")) }
2022-12-09 00:39:42 subsystem exited with error subsystem="bitfield-signing-subsystem" err=FromOrigin { origin: "bitfield-signing", source: Generated(Context("Signal channel is terminated and empty.")) }
2022-12-09 00:39:42 err=Subsystem(Generated(Context("Signal channel is terminated and empty.")))
2022-12-09 00:39:42 subsystem exited with error subsystem="chain-api-subsystem" err=FromOrigin { origin: "chain-api", source: Generated(Context("Signal channel is terminated and empty.")) }
2022-12-09 00:39:42 subsystem exited with error subsystem="availability-distribution-subsystem" err=FromOrigin { origin: "availability-distribution", source: IncomingMessageChannel(Generated(Context("Signal channel is terminated and empty."))) }
2022-12-09 00:39:42 subsystem exited with error subsystem="availability-recovery-subsystem" err=FromOrigin { origin: "availability-recovery", source: Generated(Context("Signal channel is terminated and empty.")) }
2022-12-09 00:39:42 Failed to receive a message from Overseer, exiting err=Generated(Context("Signal channel is terminated and empty."))
2022-12-09 00:39:42 subsystem exited with error subsystem="network-bridge-subsystem" err=FromOrigin { origin: "network-bridge", source: SubsystemError(Generated(Context("Signal channel is terminated and empty."))) }
2022-12-09 00:39:42 subsystem exited with error subsystem="network-bridge-subsystem" err=FromOrigin { origin: "network-bridge", source: SubsystemError(Generated(Context("Signal channel is terminated and empty."))) }
2022-12-09 00:39:42 error receiving message from subsystem context: Generated(Context("Signal channel is terminated and empty.")) err=Generated(Context("Signal channel is terminated and empty."))
2022-12-09 00:39:42 Fetch for approval votes got cancelled, only expected during shutdown!
2022-12-09 00:39:42 subsystem exited with error subsystem="approval-voting-subsystem" err=FromOrigin { origin: "approval-voting", source: Generated(Context("Signal channel is terminated and empty.")) }
2022-12-09 00:39:42 Essential task `overseer` failed. Shutting down service.
2022-12-09 00:39:42 Unable to author block in slot. No best block header: Request for leaves from chain selection got canceled
2022-12-09 00:39:42 New dispute initiated for candidate. candidate_hash=0x76f50ee7b0b011c8476ff06137758119f14621ed56fc8eddeddcab972075d418 session=26550 traceID=158121318543083314337622651639539532057
2022-12-09 00:39:42 Dispute on candidate concluded with 'valid' result candidate_hash=0x76f50ee7b0b011c8476ff06137758119f14621ed56fc8eddeddcab972075d418 session=26550 traceID=158121318543083314337622651639539532057
2022-12-09 00:39:42 Fetch for approval votes got cancelled, only expected during shutdown!
2022-12-09 00:39:42 New dispute initiated for candidate. candidate_hash=0xb135a789e5406bed8b799a8edc1f7ded19c6d443c35c0cab94190cb4802e1e6c session=26550 traceID=235551945075044477828146618989736328685
2022-12-09 00:39:42 Dispute on candidate concluded with 'valid' result candidate_hash=0xb135a789e5406bed8b799a8edc1f7ded19c6d443c35c0cab94190cb4802e1e6c session=26550 traceID=235551945075044477828146618989736328685
2022-12-09 00:39:42 Fetch for approval votes got cancelled, only expected during shutdown!
2022-12-09 00:39:42 New dispute initiated for candidate. candidate_hash=0xe1e677e00a3a3411c4bad16559c4753c47cd51043159b1f4399caea155bfb91b session=26550 traceID=300272958686085533188489404059870328124
2022-12-09 00:39:42 Dispute on candidate concluded with 'valid' result candidate_hash=0xe1e677e00a3a3411c4bad16559c4753c47cd51043159b1f4399caea155bfb91b session=26550 traceID=300272958686085533188489404059870328124
2022-12-09 00:39:42 Fetch for approval votes got cancelled, only expected during shutdown!
2022-12-09 00:39:42 New dispute initiated for candidate. candidate_hash=0xfcd08ae3f48e2eb589415d6cecffe6e65cfd8a7fb587323deae45faf8552ac67 session=26550 traceID=336048269717378469227419513318138701542
2022-12-09 00:39:42 Dispute on candidate concluded with 'valid' result candidate_hash=0xfcd08ae3f48e2eb589415d6cecffe6e65cfd8a7fb587323deae45faf8552ac67 session=26550 traceID=336048269717378469227419513318138701542
2022-12-09 00:39:42 subsystem exited with error subsystem="statement-distribution-subsystem" err=FromOrigin { origin: "statement-distribution", source: SubsystemReceive(Generated(Context("Signal channel is terminated and empty."))) }
2022-12-09 00:39:42 subsystem exited with error subsystem="runtime-api-subsystem" err=Generated(Context("Signal channel is terminated and empty."))
2022-12-09 00:39:42 subsystem exited with error subsystem="dispute-coordinator-subsystem" err=FromOrigin { origin: "dispute-coordinator", source: ChainApiSenderDropped }
2022-12-09 00:39:42 GRANDPA voter error: safety invariant has been violated: `voter_commands_rx` was closed.
2022-12-09 00:39:42 Essential task `grandpa-voter` failed. Shutting down service

another validator:

2022-12-08 23:40:23 💤 Idle (40 peers), best: #15675153 (0x594b…c937), finalized #15675123 (0x1032…cc38), ⬇ 358.1kiB/s ⬆ 280.6kiB/s
2022-12-08 23:40:24 Subsystem dispute-coordinator-subsystem appears unresponsive. origin=polkadot_service::relay_chain_selection::SelectRelayChainInner<sc_client_db::Backend<sp_runtime::generic::block::Block<sp_runtime::generic::header::Header<u32, sp_runtime::traits::BlakeTwo256>, sp_runtime::OpaqueExtrinsic>>, polkadot_overseer::Handle>
2022-12-08 23:40:24 Call to `DetermineUndisputedChain` failed error=DetermineUndisputedChainCanceled(Canceled)
2022-12-08 23:40:24 err=ContextChannelClosed
2022-12-08 23:40:24 Unable to author block in slot. No best block header: Request for leaves from chain selection got canceled
2022-12-08 23:40:24 err=Subsystem(Generated(Context("Signal channel is terminated and empty.")))
2022-12-08 23:40:24 subsystem exited with error subsystem="network-bridge-subsystem" err=FromOrigin { origin: "network-bridge", source: SubsystemError(Generated(Context("Signal channel is terminated and empty."))) }
2022-12-08 23:40:24 subsystem exited with error subsystem="pvf-checker-subsystem" err=FromOrigin { origin: "pvf-checker", source: Generated(Context("Signal channel is terminated and empty.")) }
2022-12-08 23:40:24 subsystem exited with error subsystem="availability-recovery-subsystem" err=FromOrigin { origin: "availability-recovery", source: Generated(Context("Signal channel is terminated and empty.")) }
2022-12-08 23:40:24 subsystem exited with error subsystem="availability-distribution-subsystem" err=FromOrigin { origin: "availability-distribution", source: IncomingMessageChannel(Generated(Context("Signal channel is terminated and empty."))) }
2022-12-08 23:40:24 subsystem exited with error subsystem="approval-voting-subsystem" err=FromOrigin { origin: "approval-voting", source: Generated(Context("Signal channel is terminated and empty.")) }
2022-12-08 23:40:24 subsystem exited with error subsystem="chain-api-subsystem" err=FromOrigin { origin: "chain-api", source: Generated(Context("Signal channel is terminated and empty.")) }
2022-12-08 23:40:24 Fetch for approval votes got cancelled, only expected during shutdown!
2022-12-08 23:40:24 subsystem exited with error subsystem="statement-distribution-subsystem" err=FromOrigin { origin: "statement-distribution", source: SubsystemReceive(Generated(Context("Signal channel is terminated and empty."))) }
2022-12-08 23:40:24 subsystem exited with error subsystem="network-bridge-subsystem" err=FromOrigin { origin: "network-bridge", source: SubsystemError(Generated(Context("Signal channel is terminated and empty."))) }
2022-12-08 23:40:24 subsystem exited with error subsystem="bitfield-signing-subsystem" err=FromOrigin { origin: "bitfield-signing", source: Generated(Context("Signal channel is terminated and empty.")) }
2022-12-08 23:40:24 Essential task `overseer` failed. Shutting down service.
2022-12-08 23:40:24 subsystem exited with error subsystem="candidate-validation-subsystem" err=FromOrigin { origin: "candidate-validation", source: Generated(Context("Signal channel is terminated and empty.")) }
2022-12-08 23:40:24 error receiving message from subsystem context: Generated(Context("Signal channel is terminated and empty.")) err=Generated(Context("Signal channel is terminated and empty."))
2022-12-08 23:40:24 Failed to receive a message from Overseer, exiting err=Generated(Context("Signal channel is terminated and empty."))
2022-12-08 23:40:24 subsystem exited with error subsystem="runtime-api-subsystem" err=Generated(Context("Signal channel is terminated and empty."))
2022-12-08 23:40:24 New dispute initiated for candidate. candidate_hash=0xfcd08ae3f48e2eb589415d6cecffe6e65cfd8a7fb587323deae45faf8552ac67 session=26550 traceID=336048269717378469227419513318138701542
2022-12-08 23:40:24 Dispute on candidate concluded with 'valid' result candidate_hash=0xfcd08ae3f48e2eb589415d6cecffe6e65cfd8a7fb587323deae45faf8552ac67 session=26550 traceID=336048269717378469227419513318138701542
2022-12-08 23:40:24 subsystem exited with error subsystem="dispute-coordinator-subsystem" err=FromOrigin { origin: "dispute-coordinator", source: ChainApiSenderDropped }

I only posted these two because the rest seem to be identical

@rphmeier
Copy link
Contributor

rphmeier commented Dec 9, 2022

Thanks for the logs, folks.

The working hypothesis is that the dispute-coordinator got overloaded, and is becoming unresponsive. We have a few corroborations of this theory, now, from a few different places. If anyone has more logs which contain the phrase "appears unresponsive", please post them here. And if by any chance there are folks running with parachain=debug, it would be fantastic to have a few logs showing the dispute coordinator's actions leading up to the stall.

It is interesting to note in both @tugytur's examples, that the dispute coordinator was considered unresponsive and then afterwards issued a log "Fetch for approval votes cancelled" followed by "New dispute initiated for candidate". This might indicate the code branch where the dispute coordinator is stalling.

@CertHum-Jim

This comment was marked as outdated.

@rphmeier
Copy link
Contributor

rphmeier commented Dec 9, 2022

Thanks @CertHum-Jim - hiding to reduce noise.

Please do not post logs unless your node specifically has gone down or has initiated a dispute, to avoid noise, but if in doubt please post and we will hide if not useful.

@senseless
Copy link
Contributor

senseless commented Dec 9, 2022

@rphmeier
Copy link
Contributor

rphmeier commented Dec 9, 2022

@senseless Thanks. The nodes here didn't go down, but did seize up and failed to sync at a reasonable rate. Presumably related to the "Flooding" logs.

@paradox-tt
Copy link

paradox-tt commented Dec 9, 2022

Posting from one node,

-- Logs begin at Fri 2022-12-02 11:29:36 UTC, end at Fri 2022-12-09 03:18:18 UTC. --

Dec 08 23:39:13 ns3219400 polkadot[364299]: 2022-12-08 23:39:13 dropping (Stream e9a274e4/23838) because buffer is full
Dec 08 23:39:13 ns3219400 polkadot[364299]: 2022-12-08 23:39:13 dropping (Stream e9a274e4/23840) because buffer is full
Dec 08 23:39:13 ns3219400 polkadot[364299]: 2022-12-08 23:39:13 dropping (Stream e9a274e4/23842) because buffer is full
Dec 08 23:39:13 ns3219400 polkadot[364299]: 2022-12-08 23:39:13 dropping (Stream e9a274e4/23844) because buffer is full
Dec 08 23:39:13 ns3219400 polkadot[364299]: 2022-12-08 23:39:13 dropping (Stream e9a274e4/23846) because buffer is full
Dec 08 23:39:13 ns3219400 polkadot[364299]: 2022-12-08 23:39:13 dropping (Stream e9a274e4/23848) because buffer is full
Dec 08 23:39:17 ns3219400 polkadot[364299]: 2022-12-08 23:39:17 💤 Idle (40 peers), best: #15675154 (0x799a…afb4), finalized #15675123 (0x1032…cc38), ⬇ 2.1MiB/s ⬆ 718.4kiB/s
Dec 08 23:39:19 ns3219400 polkadot[364299]: 2022-12-08 23:39:19 Creating inherent data providers took more time than we had left for the slot.
Dec 08 23:39:22 ns3219400 polkadot[364299]: 2022-12-08 23:39:22 💤 Idle (40 peers), best: #15675154 (0x799a…afb4), finalized #15675123 (0x1032…cc38), ⬇ 702.0kiB/s ⬆ 270.7kiB/s
Dec 08 23:39:26 ns3219400 polkadot[364299]: 2022-12-08 23:39:26 Call to `DetermineUndisputedChain` failed error=DetermineUndisputedChainCanceled(Canceled)
Dec 08 23:39:26 ns3219400 polkadot[364299]: 2022-12-08 23:39:26 Subsystem dispute-coordinator-subsystem appears unresponsive. origin=polkadot_service::relay_chain_selection::SelectRelayChainInner<sc_client_db::Backend<sp_runtime::generic::block::Block<sp_runtime::generic::header::Header<u32, sp_runtime::traits::BlakeTwo256>, sp_runtime::OpaqueExtrinsic>>, polkadot_overseer::Handle>
Dec 08 23:39:26 ns3219400 polkadot[364299]: 2022-12-08 23:39:26 err=ContextChannelClosed
Dec 08 23:39:26 ns3219400 polkadot[364299]: 2022-12-08 23:39:26 Failed to receive a message from Overseer, exiting err=Generated(Context("Signal channel is terminated and empty."))
Dec 08 23:39:26 ns3219400 polkadot[364299]: 2022-12-08 23:39:26 subsystem exited with error subsystem="pvf-checker-subsystem" err=FromOrigin { origin: "pvf-checker", source: Generated(Context("Signal channel is terminated and empty.")) }
Dec 08 23:39:26 ns3219400 polkadot[364299]: 2022-12-08 23:39:26 subsystem exited with error subsystem="approval-voting-subsystem" err=FromOrigin { origin: "approval-voting", source: Generated(Context("Signal channel is terminated and empty.")) }
Dec 08 23:39:26 ns3219400 polkadot[364299]: 2022-12-08 23:39:26 subsystem exited with error subsystem="candidate-validation-subsystem" err=FromOrigin { origin: "candidate-validation", source: Generated(Context("Signal channel is terminated and empty.")) }
Dec 08 23:39:26 ns3219400 polkadot[364299]: 2022-12-08 23:39:26 subsystem exited with error subsystem="network-bridge-subsystem" err=FromOrigin { origin: "network-bridge", source: SubsystemError(Generated(Context("Signal channel is terminated and empty."))) }
Dec 08 23:39:26 ns3219400 polkadot[364299]: 2022-12-08 23:39:26 subsystem exited with error subsystem="network-bridge-subsystem" err=FromOrigin { origin: "network-bridge", source: SubsystemError(Generated(Context("Signal channel is terminated and empty."))) }
Dec 08 23:39:26 ns3219400 polkadot[364299]: 2022-12-08 23:39:26 error receiving message from subsystem context: Generated(Context("Signal channel is terminated and empty.")) err=Generated(Context("Signal channel is terminated and empty."))
Dec 08 23:39:26 ns3219400 polkadot[364299]: 2022-12-08 23:39:26 Essential task `overseer` failed. Shutting down service.
Dec 08 23:39:26 ns3219400 polkadot[364299]: 2022-12-08 23:39:26 Unable to author block in slot. No best block header: Request for leaves from chain selection got canceled
Dec 08 23:39:26 ns3219400 polkadot[364299]: 2022-12-08 23:39:26 err=Subsystem(Generated(Context("Signal channel is terminated and empty.")))
Dec 08 23:39:26 ns3219400 polkadot[364299]: 2022-12-08 23:39:26 subsystem exited with error subsystem="statement-distribution-subsystem" err=FromOrigin { origin: "statement-distribution", source: SubsystemReceive(Generated(Context("Signal channel is terminated and empty."))) }
Dec 08 23:39:26 ns3219400 polkadot[364299]: 2022-12-08 23:39:26 subsystem exited with error subsystem="bitfield-signing-subsystem" err=FromOrigin { origin: "bitfield-signing", source: Generated(Context("Signal channel is terminated and empty.")) }
Dec 08 23:39:26 ns3219400 polkadot[364299]: 2022-12-08 23:39:26 GRANDPA voter error: safety invariant has been violated: `voter_commands_rx` was closed.
Dec 08 23:39:26 ns3219400 polkadot[364299]: 2022-12-08 23:39:26 Essential task `grandpa-voter` failed. Shutting down service.
Dec 08 23:39:26 ns3219400 polkadot[364299]: 2022-12-08 23:39:26 subsystem exited with error subsystem="availability-recovery-subsystem" err=FromOrigin { origin: "availability-recovery", source: Generated(Context("Signal channel is terminated and empty.")) }
Dec 08 23:39:26 ns3219400 polkadot[364299]: 2022-12-08 23:39:26 subsystem exited with error subsystem="runtime-api-subsystem" err=Generated(Context("Signal channel is terminated and empty."))
Dec 08 23:39:26 ns3219400 polkadot[364299]: 2022-12-08 23:39:26 subsystem exited with error subsystem="availability-distribution-subsystem" err=FromOrigin { origin: "availability-distribution", source: IncomingMessageChannel(Generated(Context("Signal channel is terminated and empty."))) }
Dec 08 23:49:43 ns3219400 systemd[1]: Stopping Kusama Node...

I invoked the stop manually when the error was detected.

I should have logs from two other nodes, shall I post as well?

@tdimitrov
Copy link
Contributor

I should have logs from two other nodes, shall I post as well?

Yes, please.

@senseless
Copy link
Contributor

@senseless Thanks. The nodes here didn't go down, but did seize up and failed to sync at a reasonable rate. Presumably related to the "Flooding" logs.

When they seized they failed to output any additional logs in the log file until I restarted the node.

@rphmeier
Copy link
Contributor

rphmeier commented Dec 9, 2022

Were those the absolute last logs in the log file? If so, something must have been cut off...

@rphmeier
Copy link
Contributor

rphmeier commented Dec 9, 2022

@paradox-tt, to confirm, the node outputting those logs did not actually shut down but just hung for 10 minutes until stopped manually?

@senseless
Copy link
Contributor

Were those the absolute last logs in the log file? If so, something must have been cut off...

Yes, the last line was the last line in the log file.

@paradox-tt
Copy link

paradox-tt commented Dec 9, 2022

To reduce on noise I used ... to replace a few lines of "dropping (Stream aaae890a/9699) because buffer is full".

This node hung to the point that it did not produce a block or heartbeat in the session and was slashed. I was able to restart and re-declare the intent for the next session.

Timing:

Dec 08 23:40:24 ns3219400 polkadot[527355]: 2022-12-08 23:40:24 subsystem exited with error subsystem="dispute-coordinator-subsystem" err=FromOrigin { origin: "dispute-coordinator", source: ChainApiSenderDropped }
Dec 09 00:45:15 ns3219400 systemd[1]: Stopping Kusama Node...
-- Logs begin at Fri 2022-12-02 11:29:36 UTC, end at Fri 2022-12-09 03:23:59 UTC. --
...
Dec 08 23:39:37 ns3219400 polkadot[527355]: 2022-12-08 23:39:37 💤 Idle (40 peers), best: #15675156 (0x2050…cc1b), finalized #15675123 (0x1032…cc38), ⬇ 1.2MiB/s ⬆ 763.1kiB/s
Dec 08 23:39:39 ns3219400 polkadot[527355]: 2022-12-08 23:39:39 ♻�  Reorg on #15675156,0x2050…cc1b to #15675151,0xdbda…58bf, common ancestor #15675150,0x85f8…bcf9
Dec 08 23:39:39 ns3219400 polkadot[527355]: 2022-12-08 23:39:39 ✨ Imported #15675151 (0xdbda…58bf)
Dec 08 23:39:42 ns3219400 polkadot[527355]: 2022-12-08 23:39:42 💤 Idle (40 peers), best: #15675151 (0xdbda…58bf), finalized #15675123 (0x1032…cc38), ⬇ 359.8kiB/s ⬆ 216.8kiB/s
Dec 08 23:39:47 ns3219400 polkadot[527355]: 2022-12-08 23:39:47 💤 Idle (40 peers), best: #15675151 (0xdbda…58bf), finalized #15675123 (0x1032…cc38), ⬇ 618.7kiB/s ⬆ 485.4kiB/s
Dec 08 23:39:50 ns3219400 polkadot[527355]: 2022-12-08 23:39:50 ✨ Imported #15675152 (0xa37a…ec14)
Dec 08 23:39:50 ns3219400 polkadot[527355]: 2022-12-08 23:39:50 dropping (Stream 1df5effb/11727) because buffer is full
Dec 08 23:39:50 ns3219400 polkadot[527355]: 2022-12-08 23:39:50 dropping (Stream 1df5effb/11729) because buffer is full
Dec 08 23:39:51 ns3219400 polkadot[527355]: 2022-12-08 23:39:51 dropping (Stream 044350ef/4833) because buffer is full
Dec 08 23:39:51 ns3219400 polkadot[527355]: 2022-12-08 23:39:51 dropping (Stream 044350ef/4835) because buffer is full
Dec 08 23:39:51 ns3219400 polkadot[527355]: 2022-12-08 23:39:51 dropping (Stream 044350ef/4837) because buffer is full
Dec 08 23:39:51 ns3219400 polkadot[527355]: 2022-12-08 23:39:51 dropping (Stream fb8e0f0c/6483) because buffer is full
Dec 08 23:39:51 ns3219400 polkadot[527355]: 2022-12-08 23:39:51 dropping (Stream fb8e0f0c/6485) because buffer is full
Dec 08 23:39:51 ns3219400 polkadot[527355]: 2022-12-08 23:39:51 dropping (Stream fb8e0f0c/6487) because buffer is full
...
Dec 08 23:39:52 ns3219400 polkadot[527355]: 2022-12-08 23:39:52 dropping (Stream aaae890a/10097) because buffer is full
Dec 08 23:39:52 ns3219400 polkadot[527355]: 2022-12-08 23:39:52 dropping (Stream 0771cf3d/15771) because buffer is full
Dec 08 23:39:52 ns3219400 polkadot[527355]: 2022-12-08 23:39:52 💤 Idle (40 peers), best: #15675152 (0xa37a…ec14), finalized #15675123 (0x1032…cc38), ⬇ 2.7MiB/s ⬆ 1013.4kiB/s
Dec 08 23:39:53 ns3219400 polkadot[527355]: 2022-12-08 23:39:53 dropping (Stream 2d1fa8d5/11258) because buffer is full
Dec 08 23:39:53 ns3219400 polkadot[527355]: 2022-12-08 23:39:53 dropping (Stream 2d1fa8d5/11260) because buffer is full
...
Dec 08 23:39:53 ns3219400 polkadot[527355]: 2022-12-08 23:39:53 dropping (Stream eb168d87/2984) because buffer is full
Dec 08 23:39:53 ns3219400 polkadot[527355]: 2022-12-08 23:39:53 dropping (Stream eb168d87/2986) because buffer is full
...
Dec 08 23:40:07 ns3219400 polkadot[527355]: 2022-12-08 23:40:07 dropping (Stream 8b88c072/3419) because buffer is full
Dec 08 23:40:07 ns3219400 polkadot[527355]: 2022-12-08 23:40:07 dropping (Stream 8b88c072/3421) because buffer is full
Dec 08 23:40:07 ns3219400 polkadot[527355]: 2022-12-08 23:40:07 💤 Idle (40 peers), best: #15675153 (0x594b…c937), finalized #15675123 (0x1032…cc38), ⬇ 2.3MiB/s ⬆ 978.8kiB/s
Dec 08 23:40:08 ns3219400 polkadot[527355]: 2022-12-08 23:40:08 dropping (Stream 0771cf3d/16009) because buffer is full
Dec 08 23:40:08 ns3219400 polkadot[527355]: 2022-12-08 23:40:08 dropping (Stream 0771cf3d/16011) because buffer is full
...
Dec 08 23:40:08 ns3219400 polkadot[527355]: 2022-12-08 23:40:08 dropping (Stream e3affccc/20524) because buffer is full
Dec 08 23:40:08 ns3219400 polkadot[527355]: 2022-12-08 23:40:08 dropping (Stream e3affccc/20526) because buffer is full
Dec 08 23:40:08 ns3219400 polkadot[527355]: 2022-12-08 23:40:08 dropping (Stream e3affccc/20528) because buffer is full
Dec 08 23:40:08 ns3219400 polkadot[527355]: 2022-12-08 23:40:08 dropping (Stream e3affccc/20530) because buffer is full
Dec 08 23:40:12 ns3219400 polkadot[527355]: 2022-12-08 23:40:12 💤 Idle (40 peers), best: #15675153 (0x594b…c937), finalized #15675123 (0x1032…cc38), ⬇ 944.4kiB/s ⬆ 346.2kiB/s
Dec 08 23:40:17 ns3219400 polkadot[527355]: 2022-12-08 23:40:17 💤 Idle (40 peers), best: #15675153 (0x594b…c937), finalized #15675123 (0x1032…cc38), ⬇ 383.1kiB/s ⬆ 299.2kiB/s
Dec 08 23:40:20 ns3219400 polkadot[527355]: 2022-12-08 23:40:20 dropping (Stream 68611826/9015) because buffer is full
Dec 08 23:40:20 ns3219400 polkadot[527355]: 2022-12-08 23:40:20 dropping (Stream 68611826/9017) because buffer is full
Dec 08 23:40:20 ns3219400 polkadot[527355]: 2022-12-08 23:40:20 dropping (Stream 68611826/9019) because buffer is full
Dec 08 23:40:20 ns3219400 polkadot[527355]: 2022-12-08 23:40:20 dropping (Stream 68611826/9035) because buffer is full
Dec 08 23:40:22 ns3219400 polkadot[527355]: 2022-12-08 23:40:22 💤 Idle (40 peers), best: #15675153 (0x594b…c937), finalized #15675123 (0x1032…cc38), ⬇ 286.7kiB/s ⬆ 240.8kiB/s
Dec 08 23:40:24 ns3219400 polkadot[527355]: 2022-12-08 23:40:24 Call to `DetermineUndisputedChain` failed error=DetermineUndisputedChainCanceled(Canceled)
Dec 08 23:40:24 ns3219400 polkadot[527355]: 2022-12-08 23:40:24 Subsystem dispute-coordinator-subsystem appears unresponsive. origin=polkadot_service::relay_chain_selection::SelectRelayChainInner<sc_client_db::Backend<sp_runtime::generic::block::Block<sp_runtime::generic::header::Header<u32, sp_runtime::traits::BlakeTwo256>, sp_runtime::OpaqueExtrinsic>>, polkadot_overseer::Handle>
Dec 08 23:40:24 ns3219400 polkadot[527355]: 2022-12-08 23:40:24 err=ContextChannelClosed
Dec 08 23:40:24 ns3219400 polkadot[527355]: 2022-12-08 23:40:24 subsystem exited with error subsystem="bitfield-signing-subsystem" err=FromOrigin { origin: "bitfield-signing", source: Generated(Context("Signal channel is terminated and empty.")) }
Dec 08 23:40:24 ns3219400 polkadot[527355]: 2022-12-08 23:40:24 subsystem exited with error subsystem="availability-distribution-subsystem" err=FromOrigin { origin: "availability-distribution", source: IncomingMessageChannel(Generated(Context("Signal channel is terminated and empty."))) }
Dec 08 23:40:24 ns3219400 polkadot[527355]: 2022-12-08 23:40:24 subsystem exited with error subsystem="availability-recovery-subsystem" err=FromOrigin { origin: "availability-recovery", source: Generated(Context("Signal channel is terminated and empty.")) }
Dec 08 23:40:24 ns3219400 polkadot[527355]: 2022-12-08 23:40:24 Failed to receive a message from Overseer, exiting err=Generated(Context("Signal channel is terminated and empty."))
Dec 08 23:40:24 ns3219400 polkadot[527355]: 2022-12-08 23:40:24 subsystem exited with error subsystem="network-bridge-subsystem" err=FromOrigin { origin: "network-bridge", source: SubsystemError(Generated(Context("Signal channel is terminated and empty."))) }
Dec 08 23:40:24 ns3219400 polkadot[527355]: 2022-12-08 23:40:24 subsystem exited with error subsystem="network-bridge-subsystem" err=FromOrigin { origin: "network-bridge", source: SubsystemError(Generated(Context("Signal channel is terminated and empty."))) }
Dec 08 23:40:24 ns3219400 polkadot[527355]: 2022-12-08 23:40:24 error receiving message from subsystem context: Generated(Context("Signal channel is terminated and empty.")) err=Generated(Context("Signal channel is terminated and empty."))
Dec 08 23:40:24 ns3219400 polkadot[527355]: 2022-12-08 23:40:24 subsystem exited with error subsystem="approval-voting-subsystem" err=FromOrigin { origin: "approval-voting", source: Generated(Context("Signal channel is terminated and empty.")) }
Dec 08 23:40:24 ns3219400 polkadot[527355]: 2022-12-08 23:40:24 Fetch for approval votes got cancelled, only expected during shutdown!
Dec 08 23:40:24 ns3219400 polkadot[527355]: 2022-12-08 23:40:24 Essential task `overseer` failed. Shutting down service.
Dec 08 23:40:24 ns3219400 polkadot[527355]: 2022-12-08 23:40:24 New dispute initiated for candidate. candidate_hash=0xe1e677e00a3a3411c4bad16559c4753c47cd51043159b1f4399caea155bfb91b session=26550 traceID=300272958686085533188489404059870328124
Dec 08 23:40:24 ns3219400 polkadot[527355]: 2022-12-08 23:40:24 Dispute on candidate concluded with 'valid' result candidate_hash=0xe1e677e00a3a3411c4bad16559c4753c47cd51043159b1f4399caea155bfb91b session=26550 traceID=300272958686085533188489404059870328124
Dec 08 23:40:24 ns3219400 polkadot[527355]: 2022-12-08 23:40:24 Unable to author block in slot. No best block header: Request for leaves from chain selection got canceled
Dec 08 23:40:24 ns3219400 polkadot[527355]: 2022-12-08 23:40:24 Fetch for approval votes got cancelled, only expected during shutdown!
Dec 08 23:40:24 ns3219400 polkadot[527355]: 2022-12-08 23:40:24 New dispute initiated for candidate. candidate_hash=0xfcd08ae3f48e2eb589415d6cecffe6e65cfd8a7fb587323deae45faf8552ac67 session=26550 traceID=336048269717378469227419513318138701542
Dec 08 23:40:24 ns3219400 polkadot[527355]: 2022-12-08 23:40:24 Dispute on candidate concluded with 'valid' result candidate_hash=0xfcd08ae3f48e2eb589415d6cecffe6e65cfd8a7fb587323deae45faf8552ac67 session=26550 traceID=336048269717378469227419513318138701542
Dec 08 23:40:24 ns3219400 polkadot[527355]: 2022-12-08 23:40:24 subsystem exited with error subsystem="runtime-api-subsystem" err=Generated(Context("Signal channel is terminated and empty."))
Dec 08 23:40:24 ns3219400 polkadot[527355]: 2022-12-08 23:40:24 subsystem exited with error subsystem="statement-distribution-subsystem" err=FromOrigin { origin: "statement-distribution", source: SubsystemReceive(Generated(Context("Signal channel is terminated and empty."))) }
Dec 08 23:40:24 ns3219400 polkadot[527355]: 2022-12-08 23:40:24 GRANDPA voter error: safety invariant has been violated: `voter_commands_rx` was closed.
Dec 08 23:40:24 ns3219400 polkadot[527355]: 2022-12-08 23:40:24 Essential task `grandpa-voter` failed. Shutting down service.
Dec 08 23:40:24 ns3219400 polkadot[527355]: 2022-12-08 23:40:24 subsystem exited with error subsystem="candidate-validation-subsystem" err=FromOrigin { origin: "candidate-validation", source: Generated(Context("Signal channel is terminated and empty.")) }
Dec 08 23:40:24 ns3219400 polkadot[527355]: 2022-12-08 23:40:24 err=Subsystem(Generated(Context("Signal channel is terminated and empty.")))
Dec 08 23:40:24 ns3219400 polkadot[527355]: 2022-12-08 23:40:24 subsystem exited with error subsystem="chain-api-subsystem" err=FromOrigin { origin: "chain-api", source: Generated(Context("Signal channel is terminated and empty.")) }
Dec 08 23:40:24 ns3219400 polkadot[527355]: 2022-12-08 23:40:24 subsystem exited with error subsystem="pvf-checker-subsystem" err=FromOrigin { origin: "pvf-checker", source: Generated(Context("Signal channel is terminated and empty.")) }
Dec 08 23:40:24 ns3219400 polkadot[527355]: 2022-12-08 23:40:24 subsystem exited with error subsystem="dispute-coordinator-subsystem" err=FromOrigin { origin: "dispute-coordinator", source: ChainApiSenderDropped }

@paradox-tt
Copy link

@paradox-tt, to confirm, the node outputting those logs did not actually shut down but just hung for 10 minutes until stopped manually?

That is correct.

@paradox-tt
Copy link

One more, very similar to the above:

-- Logs begin at Fri 2022-12-02 08:37:39 UTC, end at Fri 2022-12-09 03:38:50 UTC. --
Dec 08 23:40:10 ns3218828 polkadot[456133]: 2022-12-08 23:40:10 dropping (Stream a4e06d61/3477) because buffer is full
Dec 08 23:40:10 ns3218828 polkadot[456133]: 2022-12-08 23:40:10 dropping (Stream a4e06d61/3479) because buffer is full
Dec 08 23:40:10 ns3218828 polkadot[456133]: 2022-12-08 23:40:10 dropping (Stream a4e06d61/3481) because buffer is full
Dec 08 23:40:10 ns3218828 polkadot[456133]: 2022-12-08 23:40:10 dropping (Stream a4e06d61/3483) because buffer is full
Dec 08 23:40:10 ns3218828 polkadot[456133]: 2022-12-08 23:40:10 dropping (Stream a4e06d61/3485) because buffer is full
Dec 08 23:40:10 ns3218828 polkadot[456133]: 2022-12-08 23:40:10 dropping (Stream a4e06d61/3487) because buffer is full
Dec 08 23:40:10 ns3218828 polkadot[456133]: 2022-12-08 23:40:10 dropping (Stream a4e06d61/3489) because buffer is full
Dec 08 23:40:10 ns3218828 polkadot[456133]: 2022-12-08 23:40:10 💤 Idle (40 peers), best: #15675153 (0x594b…c937), finalized #15675123 (0x1032…cc38), ⬇ 2.4MiB/s ⬆ 963.8kiB/s
Dec 08 23:40:16 ns3218828 polkadot[456133]: 2022-12-08 23:40:15 💤 Idle (40 peers), best: #15675153 (0x594b…c937), finalized #15675123 (0x1032…cc38), ⬇ 346.5kiB/s ⬆ 258.7kiB/s
Dec 08 23:40:21 ns3218828 polkadot[456133]: 2022-12-08 23:40:20 💤 Idle (40 peers), best: #15675153 (0x594b…c937), finalized #15675123 (0x1032…cc38), ⬇ 425.6kiB/s ⬆ 245.7kiB/s
Dec 08 23:40:24 ns3218828 polkadot[456133]: 2022-12-08 23:40:24 dropping (Stream 79eebc38/3803) because buffer is full
Dec 08 23:40:24 ns3218828 polkadot[456133]: 2022-12-08 23:40:24 dropping (Stream 79eebc38/3805) because buffer is full
Dec 08 23:40:24 ns3218828 polkadot[456133]: 2022-12-08 23:40:24 dropping (Stream 79eebc38/3807) because buffer is full
Dec 08 23:40:24 ns3218828 polkadot[456133]: 2022-12-08 23:40:24 dropping (Stream 79eebc38/3809) because buffer is full
Dec 08 23:40:24 ns3218828 polkadot[456133]: 2022-12-08 23:40:24 dropping (Stream 79eebc38/3811) because buffer is full
Dec 08 23:40:24 ns3218828 polkadot[456133]: 2022-12-08 23:40:24 dropping (Stream 79eebc38/3813) because buffer is full
Dec 08 23:40:24 ns3218828 polkadot[456133]: 2022-12-08 23:40:24 dropping (Stream 79eebc38/3815) because buffer is full
Dec 08 23:40:24 ns3218828 polkadot[456133]: 2022-12-08 23:40:24 dropping (Stream 79eebc38/3817) because buffer is full
Dec 08 23:40:24 ns3218828 polkadot[456133]: 2022-12-08 23:40:24 dropping (Stream 79eebc38/3819) because buffer is full
Dec 08 23:40:24 ns3218828 polkadot[456133]: 2022-12-08 23:40:24 dropping (Stream 79eebc38/3821) because buffer is full
Dec 08 23:40:24 ns3218828 polkadot[456133]: 2022-12-08 23:40:24 Subsystem dispute-coordinator-subsystem appears unresponsive. origin=polkadot_service::relay_chain_selection::SelectRelayChainInner<sc_client_db::Backend<sp_runtime::generic::block::Block<sp_runtime::generic::header::Header<u32, sp_runtime::traits::BlakeTwo256>, sp_runtime::OpaqueExtrinsic>>, polkadot_overseer::Handle>
Dec 08 23:40:24 ns3218828 polkadot[456133]: 2022-12-08 23:40:24 Call to `DetermineUndisputedChain` failed error=DetermineUndisputedChainCanceled(Canceled)
Dec 08 23:40:24 ns3218828 polkadot[456133]: 2022-12-08 23:40:24 err=ContextChannelClosed
Dec 08 23:40:24 ns3218828 polkadot[456133]: 2022-12-08 23:40:24 Failed to receive a message from Overseer, exiting err=Generated(Context("Signal channel is terminated and empty."))
Dec 08 23:40:24 ns3218828 polkadot[456133]: 2022-12-08 23:40:24 error receiving message from subsystem context: Generated(Context("Signal channel is terminated and empty.")) err=Generated(Context("Signal channel is terminated and empty."))
Dec 08 23:40:24 ns3218828 polkadot[456133]: 2022-12-08 23:40:24 subsystem exited with error subsystem="candidate-validation-subsystem" err=FromOrigin { origin: "candidate-validation", source: Generated(Context("Signal channel is terminated and empty.")) }
Dec 08 23:40:24 ns3218828 polkadot[456133]: 2022-12-08 23:40:24 subsystem exited with error subsystem="bitfield-signing-subsystem" err=FromOrigin { origin: "bitfield-signing", source: Generated(Context("Signal channel is terminated and empty.")) }
Dec 08 23:40:24 ns3218828 polkadot[456133]: 2022-12-08 23:40:24 err=Subsystem(Generated(Context("Signal channel is terminated and empty.")))
Dec 08 23:40:24 ns3218828 polkadot[456133]: 2022-12-08 23:40:24 subsystem exited with error subsystem="network-bridge-subsystem" err=FromOrigin { origin: "network-bridge", source: SubsystemError(Generated(Context("Signal channel is terminated and empty."))) }
Dec 08 23:40:24 ns3218828 polkadot[456133]: 2022-12-08 23:40:24 subsystem exited with error subsystem="availability-recovery-subsystem" err=FromOrigin { origin: "availability-recovery", source: Generated(Context("Signal channel is terminated and empty.")) }
Dec 08 23:40:24 ns3218828 polkadot[456133]: 2022-12-08 23:40:24 subsystem exited with error subsystem="pvf-checker-subsystem" err=FromOrigin { origin: "pvf-checker", source: Generated(Context("Signal channel is terminated and empty.")) }
Dec 08 23:40:24 ns3218828 polkadot[456133]: 2022-12-08 23:40:24 subsystem exited with error subsystem="chain-api-subsystem" err=FromOrigin { origin: "chain-api", source: Generated(Context("Signal channel is terminated and empty.")) }
Dec 08 23:40:24 ns3218828 polkadot[456133]: 2022-12-08 23:40:24 subsystem exited with error subsystem="approval-voting-subsystem" err=FromOrigin { origin: "approval-voting", source: Generated(Context("Signal channel is terminated and empty.")) }
Dec 08 23:40:24 ns3218828 polkadot[456133]: 2022-12-08 23:40:24 subsystem exited with error subsystem="availability-distribution-subsystem" err=FromOrigin { origin: "availability-distribution", source: IncomingMessageChannel(Generated(Context("Signal channel is terminated and empty."))) }
Dec 08 23:40:24 ns3218828 polkadot[456133]: 2022-12-08 23:40:24 Fetch for approval votes got cancelled, only expected during shutdown!
Dec 08 23:40:24 ns3218828 polkadot[456133]: 2022-12-08 23:40:24 Essential task `overseer` failed. Shutting down service.
Dec 08 23:40:24 ns3218828 polkadot[456133]: 2022-12-08 23:40:24 Unable to author block in slot. No best block header: Request for leaves from chain selection got canceled
Dec 08 23:40:24 ns3218828 polkadot[456133]: 2022-12-08 23:40:24 New dispute initiated for candidate. candidate_hash=0xe1e677e00a3a3411c4bad16559c4753c47cd51043159b1f4399caea155bfb91b session=26550 traceID=300272958686085533188489404059870328124
Dec 08 23:40:24 ns3218828 polkadot[456133]: 2022-12-08 23:40:24 Dispute on candidate concluded with 'valid' result candidate_hash=0xe1e677e00a3a3411c4bad16559c4753c47cd51043159b1f4399caea155bfb91b session=26550 traceID=300272958686085533188489404059870328124
Dec 08 23:40:24 ns3218828 polkadot[456133]: 2022-12-08 23:40:24 subsystem exited with error subsystem="network-bridge-subsystem" err=FromOrigin { origin: "network-bridge", source: SubsystemError(Generated(Context("Signal channel is terminated and empty."))) }
Dec 08 23:40:24 ns3218828 polkadot[456133]: 2022-12-08 23:40:24 subsystem exited with error subsystem="statement-distribution-subsystem" err=FromOrigin { origin: "statement-distribution", source: SubsystemReceive(Generated(Context("Signal channel is terminated and empty."))) }
Dec 08 23:40:24 ns3218828 polkadot[456133]: 2022-12-08 23:40:24 GRANDPA voter error: safety invariant has been violated: `voter_commands_rx` was closed.
Dec 08 23:40:24 ns3218828 polkadot[456133]: 2022-12-08 23:40:24 Essential task `grandpa-voter` failed. Shutting down service.
Dec 08 23:40:24 ns3218828 polkadot[456133]: 2022-12-08 23:40:24 subsystem exited with error subsystem="runtime-api-subsystem" err=Generated(Context("Signal channel is terminated and empty."))
Dec 08 23:40:24 ns3218828 polkadot[456133]: 2022-12-08 23:40:24 Fetch for approval votes got cancelled, only expected during shutdown!
Dec 08 23:40:24 ns3218828 polkadot[456133]: 2022-12-08 23:40:24 New dispute initiated for candidate. candidate_hash=0xfcd08ae3f48e2eb589415d6cecffe6e65cfd8a7fb587323deae45faf8552ac67 session=26550 traceID=336048269717378469227419513318138701542
Dec 08 23:40:24 ns3218828 polkadot[456133]: 2022-12-08 23:40:24 Dispute on candidate concluded with 'valid' result candidate_hash=0xfcd08ae3f48e2eb589415d6cecffe6e65cfd8a7fb587323deae45faf8552ac67 session=26550 traceID=336048269717378469227419513318138701542
Dec 08 23:40:24 ns3218828 polkadot[456133]: 2022-12-08 23:40:24 subsystem exited with error subsystem="dispute-coordinator-subsystem" err=FromOrigin { origin: "dispute-coordinator", source: ChainApiSenderDropped }
Dec 09 00:35:17 ns3218828 systemd[1]: Stopping Kusama Node...

I'm staying up for another 10 minutes then heading to sleep (its 3:40 AM). Don't hesitate to post any follow-up questions, I'll glance and action before retiring for the night.

@carumusan
Copy link

2022-12-08 23:38:55 subsystem exited with error subsystem="dispute-coordinator-subsystem" err=FromOrigin { origin: "dispute-coordinator", source: ChainApiSenderDropped } |  
  |   | 2022-12-08 23:38:55 subsystem exited with error subsystem="statement-distribution-subsystem" err=FromOrigin { origin: "statement-distribution", source: SubsystemReceive(Generated(Context("Signal channel is terminated and empty."))) } |  
  |   | 2022-12-08 23:38:55 Dispute on candidate concluded with 'valid' result candidate_hash=0xfcd08ae3f48e2eb589415d6cecffe6e65cfd8a7fb587323deae45faf8552ac67 session=26550 traceID=336048269717378469227419513318138701542 |  
  |   | 2022-12-08 23:38:55 subsystem exited with error subsystem="network-bridge-subsystem" err=FromOrigin { origin: "network-bridge", source: SubsystemError(Generated(Context("Signal channel is terminated and empty."))) } |  
  |   | 2022-12-08 23:38:55 Unable to author block in slot. No best block header: Request for leaves from chain selection got canceled |  
  |   | 2022-12-08 23:38:55 New dispute initiated for candidate. candidate_hash=0xfcd08ae3f48e2eb589415d6cecffe6e65cfd8a7fb587323deae45faf8552ac67 session=26550 traceID=336048269717378469227419513318138701542 |  
  |   | 2022-12-08 23:38:55 Fetch for approval votes got cancelled, only expected during shutdown! |  
  |   | 2022-12-08 23:38:55 Essential task `overseer` failed. Shutting down service. |  
  |   | 2022-12-08 23:38:55 subsystem exited with error subsystem="chain-api-subsystem" err=FromOrigin { origin: "chain-api", source: Generated(Context("Signal channel is terminated and empty.")) } |  
  |   | 2022-12-08 23:38:55 subsystem exited with error subsystem="runtime-api-subsystem" err=Generated(Context("Signal channel is terminated and empty.")) |  
  |   | 2022-12-08 23:38:55 subsystem exited with error subsystem="network-bridge-subsystem" err=FromOrigin { origin: "network-bridge", source: SubsystemError(Generated(Context("Signal channel is terminated and empty."))) } |  
  |   | 2022-12-08 23:38:55 error receiving message from subsystem context: Generated(Context("Signal channel is terminated and empty.")) err=Generated(Context("Signal channel is terminated and empty.")) |  
  |   | 2022-12-08 23:38:55 subsystem exited with error subsystem="availability-recovery-subsystem" err=FromOrigin { origin: "availability-recovery", source: Generated(Context("Signal channel is terminated and empty.")) } |  
  |   | 2022-12-08 23:38:55 subsystem exited with error subsystem="availability-distribution-subsystem" err=FromOrigin { origin: "availability-distribution", source: IncomingMessageChannel(Generated(Context("Signal channel is terminated and empty."))) } |  
  |   | 2022-12-08 23:38:55 err=ContextChannelClosed |  
  |   | 2022-12-08 23:38:55 Failed to receive a message from Overseer, exiting err=Generated(Context("Signal channel is terminated and empty.")) |  
  |   | 2022-12-08 23:38:55 subsystem exited with error subsystem="approval-voting-subsystem" err=FromOrigin { origin: "approval-voting", source: Generated(Context("Signal channel is terminated and empty.")) } |  
  |   | 2022-12-08 23:38:55 subsystem exited with error subsystem="pvf-checker-subsystem" err=FromOrigin { origin: "pvf-checker", source: Generated(Context("Signal channel is terminated and empty.")) } |  
  |   | 2022-12-08 23:38:55 err=Subsystem(Generated(Context("Signal channel is terminated and empty."))) |  
  |   | 2022-12-08 23:38:55 subsystem exited with error subsystem="candidate-validation-subsystem" err=FromOrigin { origin: "candidate-validation", source: Generated(Context("Signal channel is terminated and empty.")) } |  
  |   | 2022-12-08 23:38:55 subsystem exited with error subsystem="bitfield-signing-subsystem" err=FromOrigin { origin: "bitfield-signing", source: Generated(Context("Signal channel is terminated and empty.")) } |  
  |   | 2022-12-08 23:38:55 Call to `DetermineUndisputedChain` failed error=DetermineUndisputedChainCanceled(Canceled) |  
  |   | 2022-12-08 23:38:55 Subsystem dispute-coordinator-subsystem appears unresponsive. origin=polkadot_service::relay_chain_selection::SelectRelayChainInner<sc_client_db::Backend<sp_runtime::generic::block::Block<sp_runtime::generic::header::Header<u32, sp_runtime::traits::BlakeTwo256>, sp_runtime::OpaqueExtrinsic>>, polkadot_overseer::Handle> |  
  |   | 

@StakeHulk
Copy link

2022-12-08 23:40:56 New dispute initiated for candidate. candidate_hash=0x5799d2aff22f22b837e6fa474724ba95587de8f65ae84c3afc5d348f59eb6e7c session=26550 traceID=11644153029854107226582139140959769>
Dec 08 23:40:56 C87BB98 polkadot[1194]: 2022-12-08 23:40:56 Dispute on candidate concluded with 'valid' result candidate_hash=0x5799d2aff22f22b837e6fa474724ba95587de8f65ae84c3afc5d348f59eb6e7c session=26550 traceID=116441530298541072265>
Dec 08 23:40:59 C87BB98 polkadot[1194]: 2022-12-08 23:40:59 Creating inherent data providers took more time than we had left for the slot.
Dec 08 23:40:59 C87BB98 polkadot[1194]: 2022-12-08 23:40:59 💤 Idle (40 peers), best: #15675157 (0xaf6d…a970), finalized #15675123 (0x1032…cc38), ⬇ 3.4MiB/s ⬆ 1.2MiB/s
Dec 08 23:41:00 C87BB98 polkadot[1194]: 2022-12-08 23:41:00 ✨ Imported #15675158 (0x6f55…0cde)
Dec 08 23:41:00 C87BB98 polkadot[1194]: 2022-12-08 23:41:00 dropping (Stream 38b130d2/52460) because buffer is full
Dec 08 23:41:00 C87BB98 polkadot[1194]: 2022-12-08 23:41:00 dropping (Stream 38b130d2/52462) because buffer is full
Dec 08 23:41:00 C87BB98 polkadot[1194]: 2022-12-08 23:41:00 dropping (Stream 38b130d2/52464) because buffer is full
Dec 08 23:41:04 C87BB98 polkadot[1194]: 2022-12-08 23:41:04 💤 Idle (40 peers), best: #15675158 (0x6f55…0cde), finalized #15675123 (0x1032…cc38), ⬇ 671.3kiB/s ⬆ 290.9kiB/s
Dec 08 23:41:06 C87BB98 polkadot[1194]: 2022-12-08 23:41:06 ✨ Imported #15675159 (0x9522…2d9d)
Dec 08 23:41:09 C87BB98 polkadot[1194]: 2022-12-08 23:41:09 💤 Idle (40 peers), best: #15675159 (0x9522…2d9d), finalized #15675123 (0x1032…cc38), ⬇ 783.8kiB/s ⬆ 355.7kiB/s
Dec 08 23:41:14 C87BB98 polkadot[1194]: 2022-12-08 23:41:14 💤 Idle (40 peers), best: #15675159 (0x9522…2d9d), finalized #15675123 (0x1032…cc38), ⬇ 421.3kiB/s ⬆ 203.5kiB/s
Dec 08 23:41:16 C87BB98 polkadot[1194]: 2022-12-08 23:41:16 Call to DetermineUndisputedChain failed error=DetermineUndisputedChainCanceled(Canceled)
Dec 08 23:41:16 C87BB98 polkadot[1194]: 2022-12-08 23:41:16 Subsystem dispute-coordinator-subsystem appears unresponsive. origin=polkadot_service::relay_chain_selection::SelectRelayChainInner<sc_client_db::Backend<sp_runtime::generic::b>
Dec 08 23:41:16 C87BB98 polkadot[1194]: 2022-12-08 23:41:16 subsystem exited with error subsystem="candidate-validation-subsystem" err=FromOrigin { origin: "candidate-validation", source: Generated(Context("Signal channel is terminated >
Dec 08 23:41:16 C87BB98 polkadot[1194]: 2022-12-08 23:41:16 subsystem exited with error subsystem="bitfield-signing-subsystem" err=FromOrigin { origin: "bitfield-signing", source: Generated(Context("Signal channel is terminated and empt>
Dec 08 23:41:16 C87BB98 polkadot[1194]: 2022-12-08 23:41:16 subsystem exited with error subsystem="pvf-checker-subsystem" err=FromOrigin { origin: "pvf-checker", source: Generated(Context("Signal channel is terminated and empty.")) }
Dec 08 23:41:16 C87BB98 polkadot[1194]: 2022-12-08 23:41:16 subsystem exited with error subsystem="availability-distribution-subsystem" err=FromOrigin { origin: "availability-distribution", source: IncomingMessageChannel(Generated(Conte>
Dec 08 23:41:16 C87BB98 polkadot[1194]: 2022-12-08 23:41:16 subsystem exited with error subsystem="availability-recovery-subsystem" err=FromOrigin { origin: "availability-recovery", source: Generated(Context("Signal channel is terminate>
Dec 08 23:41:16 C87BB98 polkadot[1194]: 2022-12-08 23:41:16 Failed to receive a message from Overseer, exiting err=Generated(Context("Signal channel is terminated and empty."))
Dec 08 23:41:16 C87BB98 polkadot[1194]: 2022-12-08 23:41:16 err=ContextChannelClosed
Dec 08 23:41:16 C87BB98 polkadot[1194]: 2022-12-08 23:41:16 subsystem exited with error subsystem="statement-distribution-subsystem" err=FromOrigin { origin: "statement-distribution", source: SubsystemReceive(Generated(Context("Signal c>
Dec 08 23:41:16 C87BB98 polkadot[1194]: 2022-12-08 23:41:16 subsystem exited with error subsystem="runtime-api-subsystem" err=Generated(Context("Signal channel is terminated and empty."))
Dec 08 23:41:16 C87BB98 polkadot[1194]: 2022-12-08 23:41:16 subsystem exited with error subsystem="chain-api-subsystem" err=FromOrigin { origin: "chain-api", source: Generated(Context("Signal channel is terminated and empty.")) }
Dec 08 23:41:16 C87BB98 polkadot[1194]: 2022-12-08 23:41:16 err=Subsystem(Generated(Context("Signal channel is terminated and empty.")))
Dec 08 23:41:16 C87BB98 polkadot[1194]: 2022-12-08 23:41:16 error receiving message from subsystem context: Generated(Context("Signal channel is terminated and empty.")) err=Generated(Context("Signal channel is terminated and empty."))
Dec 08 23:41:16 C87BB98 polkadot[1194]: 2022-12-08 23:41:16 subsystem exited with error subsystem="network-bridge-subsystem" err=FromOrigin { origin: "network-bridge", source: SubsystemError(Generated(Context("Signal channel is terminat>
Dec 08 23:41:16 C87BB98 polkadot[1194]: 2022-12-08 23:41:16 subsystem exited with error subsystem="approval-voting-subsystem" err=FromOrigin { origin: "approval-voting", source: Generated(Context("Signal channel is terminated and empty.>
Dec 08 23:41:16 C87BB98 polkadot[1194]: 2022-12-08 23:41:16 Essential task overseer failed. Shutting down service.
Dec 08 23:41:16 C87BB98 polkadot[1194]: 2022-12-08 23:41:16 Unable to author block in slot. No best block header: Request for leaves from chain selection got canceled
Dec 08 23:41:16 C87BB98 polkadot[1194]: 2022-12-08 23:41:16 subsystem exited with error subsystem="network-bridge-subsystem" err=FromOrigin { origin: "network-bridge", source: SubsystemError(Generated(Context("Signal channel is terminat>
Dec 08 23:41:16 C87BB98 polkadot[1194]: 2022-12-08 23:41:16 Fetch for approval votes got cancelled, only expected during shutdown!
Dec 08 23:41:16 C87BB98 polkadot[1194]: 2022-12-08 23:41:16 New dispute initiated for candidate. candidate_hash=0x8ac8628da77e8de96bffc5299ff73107e5daf607e095159d6fbcccfbc7d53c27 session=26550 traceID=18447392168917441981438830011589428>
Dec 08 23:41:16 C87BB98 polkadot[1194]: 2022-12-08 23:41:16 Dispute on candidate concluded with 'valid' result candidate_hash=0x8ac8628da77e8de96bffc5299ff73107e5daf607e095159d6fbcccfbc7d53c27 session=26550 traceID=184473921689174419814>
Dec 08 23:41:16 C87BB98 polkadot[1194]: 2022-12-08 23:41:16 Fetch for approval votes got cancelled, only expected during shutdown!
Dec 08 23:41:16 C87BB98 polkadot[1194]: 2022-12-08 23:41:16 New dispute initiated for candidate. candidate_hash=0xa47e62f05fdf22271f8bef07d363b5a0268e3b878e2c13e31a43c313542761f4 session=26550 traceID=21864962743347258039752022323588469>
Dec 08 23:41:16 C87BB98 polkadot[1194]: 2022-12-08 23:41:16 Dispute on candidate concluded with 'valid' result candidate_hash=0xa47e62f05fdf22271f8bef07d363b5a0268e3b878e2c13e31a43c313542761f4 session=26550 traceID=218649627433472580397>
Dec 08 23:41:16 C87BB98 polkadot[1194]: 2022-12-08 23:41:16 subsystem exited with error subsystem="dispute-coordinator-subsystem" err=FromOrigin { origin: "dispute-coordinator", source: ChainApiSenderDropped }
lines 431-500/500 (END)

@StakeHulk
Copy link

and the service won't start now

@rphmeier
Copy link
Contributor

rphmeier commented Dec 9, 2022

@StakeHulk What logs do you get when starting the service?

@StakeHulk
Copy link

and the service won't start now

now it has started successfully

seeing this log now

@StakeHulk

This comment was marked as outdated.

@StakeHulk

This comment was marked as outdated.

@rphmeier
Copy link
Contributor

rphmeier commented Dec 9, 2022

@StakeHulk Thanks for posting. this appears ~unrelated to the issue directly but is worth filing in another issue. I will hide those comments in this thread.

@stakeworld
Copy link
Contributor

stakeworld commented Dec 9, 2022

I had a crashed node, kusama paritydb around 01 am with Essential task overseer failed. Shutting down service.. It did do an automatic restart but i still didn't see it in my monitoring so restarted and after that everything went back to normal.

It started with a lot of disputes, then runtimerequest not supported:

Dec 09 00:54:48 polkadot[3713604]: 2022-12-09 00:54:48 New dispute initiated for candidate. candidate_hash=0xf3082efd31306ceff90c159ab952e18da27c2a2037856a3149e3372922541fdb session=26550 traceID=32304>
Dec 09 00:54:48 polkadot[3713604]: 2022-12-09 00:54:48 cannot query the runtime API version: UnknownBlock: State already discarded for 0xf5a042d9ea567a1a8cee9aee0da59651d50ad48fe490f2b161e54171d558a30b
Dec 09 00:54:48 polkadot[3713604]: 2022-12-09 00:54:48 error=Runtime(RuntimeRequest(NotSupported { runtime_api_name: "candidate_events" }))

Then function almost normal until a real crash:

Dec 09 01:03:18 polkadot[3713604]: 2022-12-09 01:03:18 💤 Idle (20 peers), best: #15675255 (0x0ae6…1b6d), finalized #15675252 (0xef1c…be11), ⬇ 639.6kiB/s ⬆ 585.1kiB/s
Dec 09 01:03:18 polkadot[3713604]: 2022-12-09 01:03:18 Subsystem dispute-coordinator-subsystem appears unresponsive. origin=polkadot_service::relay_chain_selection::SelectRelayChainInner<sc_client_db::>
Dec 09 01:03:18 polkadot[3713604]: 2022-12-09 01:03:18 subsystem exited with error subsystem="bitfield-signing-subsystem" err=FromOrigin { origin: "bitfield-signing", source: Generated(Context("Signal >
Dec 09 01:03:18 polkadot[3713604]: 2022-12-09 01:03:18 subsystem exited with error subsystem="chain-api-subsystem" err=FromOrigin { origin: "chain-api", source: Generated(Context("Signal channel is ter>
Dec 09 01:03:18 polkadot[3713604]: 2022-12-09 01:03:18 err=Subsystem(Generated(Context("Signal channel is terminated and empty.")))
Dec 09 01:03:18 polkadot[3713604]: 2022-12-09 01:03:18 subsystem exited with error subsystem="approval-voting-subsystem" err=FromOrigin { origin: "approval-voting", source: Generated(Context("Signal ch>
Dec 09 01:03:18 polkadot[3713604]: 2022-12-09 01:03:18 subsystem exited with error subsystem="pvf-checker-subsystem" err=FromOrigin { origin: "pvf-checker", source: Generated(Context("Signal channel is>
Dec 09 01:03:18 polkadot[3713604]: 2022-12-09 01:03:18 subsystem exited with error subsystem="candidate-validation-subsystem" err=FromOrigin { origin: "candidate-validation", source: Generated(Context(>
Dec 09 01:03:18 polkadot[3713604]: 2022-12-09 01:03:18 Fetch for approval votes got cancelled, only expected during shutdown!
Dec 09 01:03:18 polkadot[3713604]: 2022-12-09 01:03:18 Dispute on candidate concluded with 'valid' result candidate_hash=0xa89fed42bb605287564b01d767d3cccd7dc3ac092cc58198ac1fcbc2e67a970f session=26550>
Dec 09 01:03:18 polkadot[3713604]: 2022-12-09 01:03:18 subsystem exited with error subsystem="statement-distribution-subsystem" err=FromOrigin { origin: "statement-distribution", source: SubsystemRecei>
Dec 09 01:03:18 polkadot[3713604]: 2022-12-09 01:03:18 subsystem exited with error subsystem="availability-distribution-subsystem" err=FromOrigin { origin: "availability-distribution", source: Incoming>
Dec 09 01:03:18 polkadot[3713604]: 2022-12-09 01:03:18 subsystem exited with error subsystem="availability-recovery-subsystem" err=FromOrigin { origin: "availability-recovery", source: Generated(Contex>
Dec 09 01:03:18 polkadot[3713604]: 2022-12-09 01:03:18 Failed to receive a message from Overseer, exiting err=Generated(Context("Signal channel is terminated and empty."))
Dec 09 01:03:18 polkadot[3713604]: 2022-12-09 01:03:18 subsystem exited with error subsystem="dispute-coordinator-subsystem" err=FromOrigin { origin: "dispute-coordinator", source: ChainApiSenderDroppe>
Dec 09 01:03:18 polkadot[3713604]: 2022-12-09 01:03:18 Unable to author block in slot. No best block header: Request for leaves from chain selection got canceled
Dec 09 01:03:18 polkadot[3713604]: 2022-12-09 01:03:18 Encountered error finding best chain containing 0x5542ed413f2bc10d7c9608d84ff9c91b9f6008912351cd48ba42bc72223883aa: Request for leaves from chain >
Dec 09 01:03:18 polkadot[3713604]: 2022-12-09 01:03:18 Could not cast prevote: previously known block 0x5542ed413f2bc10d7c9608d84ff9c91b9f6008912351cd48ba42bc72223883aa has disappeared
Dec 09 01:03:18 polkadot[3713604]: 2022-12-09 01:03:18 err=ContextChannelClosed
Dec 09 01:03:18 polkadot[3713604]: 2022-12-09 01:03:18 subsystem exited with error subsystem="runtime-api-subsystem" err=Generated(Context("Signal channel is terminated and empty."))
Dec 09 01:03:18 polkadot[3713604]: 2022-12-09 01:03:18 Essential task `overseer` failed. Shutting down service.
Dec 09 01:12:00 systemd[1]: Stopping Kusama STAKEWORLD/02...
Dec 09 01:13:30 systemd[1]: stakeworld-02k.service: State 'stop-sigterm' timed out. Killing.
Dec 09 01:13:30 systemd[1]: stakeworld-02k.service: Killing process 3713604 (polkadot) with signal SIGKILL.
Dec 09 01:13:30 systemd[1]: stakeworld-02k.service: Killing process 3713609 (tokio-runtime-w) with signal SIGKILL.

@alko89
Copy link

alko89 commented Dec 9, 2022

I noticed the first dispute logs appear at 00:25, at 00:40 the node completely hanged. Uploading the log dump from that time period, hope it helps.

kusama-2022-12-09.txt

@openbitlab2
Copy link

Logs from our validator:

Dec  9 00:03:47 kusama-mene1 polkadot[172438]: 2022-12-09 00:03:47 💤 Idle (40 peers), best: #15675258 (0x063d…08d7), finalized #15675255 (0x0ae6…1b6d), ⬇ 1.8MiB/s ⬆ 909.6kiB/s
Dec  9 00:03:48 kusama-mene1 polkadot[172438]: 2022-12-09 00:03:48 ✨ Imported #15675259 (0x1f43…4683)
Dec  9 00:03:52 kusama-mene1 polkadot[172438]: 2022-12-09 00:03:52 💤 Idle (40 peers), best: #15675259 (0x1f43…4683), finalized #15675255 (0x0ae6…1b6d), ⬇ 872.3kiB/s ⬆ 339.6kiB/s
Dec  9 00:03:54 kusama-mene1 polkadot[172438]: 2022-12-09 00:03:54 ✨ Imported #15675260 (0x44f8…a22e)
Dec  9 00:03:54 kusama-mene1 polkadot[172438]: 2022-12-09 00:03:54 Dispute on candidate concluded with 'valid' result candidate_hash=0x2cd649044dca4e9ea15c8c3cd9166c42f070ad94e49ae85954f058a16d2ee8b1 session=26550 traceID=59598664299151388191875610784774384706
Dec  9 00:03:54 kusama-mene1 polkadot[172438]: 2022-12-09 00:03:54 Missing block weight for new head. Skipping chain. hash=0xca7eb522bdc431d0975fcdd2060dde5e4ca061e1b1ebeb8c8e5f9a7aaeeadb09
Dec  9 00:03:57 kusama-mene1 polkadot[172438]: 2022-12-09 00:03:57 💤 Idle (40 peers), best: #15675260 (0x44f8…a22e), finalized #15675256 (0x8475…da1f), ⬇ 866.9kiB/s ⬆ 808.8kiB/s
Dec  9 00:04:02 kusama-mene1 polkadot[172438]: 2022-12-09 00:04:02 💤 Idle (40 peers), best: #15675260 (0x44f8…a22e), finalized #15675256 (0x8475…da1f), ⬇ 467.8kiB/s ⬆ 431.7kiB/s
Dec  9 00:04:04 kusama-mene1 polkadot[172438]: 2022-12-09 00:04:04 Missing block weight for new head. Skipping chain. hash=0xca7eb522bdc431d0975fcdd2060dde5e4ca061e1b1ebeb8c8e5f9a7aaeeadb09
Dec  9 00:04:04 kusama-mene1 polkadot[172438]: 2022-12-09 00:04:04 cannot query the runtime API version: UnknownBlock: State already discarded for 0x75fecc2074c4fb48f565a70b2a64b8f461326d70e67a012bea4c388a63a548ea
Dec  9 00:04:04 kusama-mene1 polkadot[172438]: 2022-12-09 00:04:04 cannot query the runtime API version: UnknownBlock: State already discarded for 0x75fecc2074c4fb48f565a70b2a64b8f461326d70e67a012bea4c388a63a548ea
Dec  9 00:04:04 kusama-mene1 polkadot[172438]: 2022-12-09 00:04:04 cannot query the runtime API version: UnknownBlock: State already discarded for 0x75fecc2074c4fb48f565a70b2a64b8f461326d70e67a012bea4c388a63a548ea
Dec  9 00:04:04 kusama-mene1 polkadot[172438]: 2022-12-09 00:04:04 Our view contains 0x75fe…48ea, but not in active heads added=0x75fe…48ea
Dec  9 00:04:07 kusama-mene1 polkadot[172438]: 2022-12-09 00:04:07 💤 Idle (40 peers), best: #15675260 (0x44f8…a22e), finalized #15675258 (0x063d…08d7), ⬇ 539.4kiB/s ⬆ 571.4kiB/s
Dec  9 00:04:12 kusama-mene1 polkadot[172438]: 2022-12-09 00:04:12 ✨ Imported #15675261 (0x018b…f453)
Dec  9 00:04:12 kusama-mene1 polkadot[172438]: 2022-12-09 00:04:12 💤 Idle (40 peers), best: #15675261 (0x018b…f453), finalized #15675258 (0x063d…08d7), ⬇ 409.8kiB/s ⬆ 586.8kiB/s
Dec  9 00:04:17 kusama-mene1 polkadot[172438]: 2022-12-09 00:04:17 💤 Idle (40 peers), best: #15675261 (0x018b…f453), finalized #15675258 (0x063d…08d7), ⬇ 685.6kiB/s ⬆ 925.0kiB/s
Dec  9 00:04:22 kusama-mene1 polkadot[172438]: 2022-12-09 00:04:22 💤 Idle (40 peers), best: #15675261 (0x018b…f453), finalized #15675258 (0x063d…08d7), ⬇ 560.7kiB/s ⬆ 570.8kiB/s
Dec  9 00:04:24 kusama-mene1 polkadot[172438]: 2022-12-09 00:04:24 ✨ Imported #15675262 (0x2116…b541)
Dec  9 00:04:24 kusama-mene1 polkadot[172438]: 2022-12-09 00:04:24 Missing block weight for new head. Skipping chain. hash=0xca7eb522bdc431d0975fcdd2060dde5e4ca061e1b1ebeb8c8e5f9a7aaeeadb09
Dec  9 00:04:27 kusama-mene1 polkadot[172438]: 2022-12-09 00:04:27 💤 Idle (40 peers), best: #15675262 (0x2116…b541), finalized #15675258 (0x063d…08d7), ⬇ 412.2kiB/s ⬆ 715.2kiB/s
Dec  9 00:04:30 kusama-mene1 polkadot[172438]: 2022-12-09 00:04:30 ✨ Imported #15675263 (0xfe82…b975)
Dec  9 00:04:32 kusama-mene1 polkadot[172438]: 2022-12-09 00:04:32 💤 Idle (40 peers), best: #15675263 (0xfe82…b975), finalized #15675259 (0x1f43…4683), ⬇ 851.7kiB/s ⬆ 460.4kiB/s
Dec  9 00:04:34 kusama-mene1 polkadot[172438]: 2022-12-09 00:04:34 New dispute initiated for candidate. candidate_hash=0x099b526589c28af01cfed1bca487222464723152a71fecddb18ce0a61ce4f9a9 session=26550 traceID=12769529177403688586641415157907399204
Dec  9 00:04:34 kusama-mene1 polkadot[172438]: 2022-12-09 00:04:34 Dispute on candidate concluded with 'valid' result candidate_hash=0x099b526589c28af01cfed1bca487222464723152a71fecddb18ce0a61ce4f9a9 session=26550 traceID=12769529177403688586641415157907399204
Dec  9 00:04:34 kusama-mene1 polkadot[172438]: 2022-12-09 00:04:34 Missing block weight for new head. Skipping chain. hash=0xca7eb522bdc431d0975fcdd2060dde5e4ca061e1b1ebeb8c8e5f9a7aaeeadb09
Dec  9 00:04:36 kusama-mene1 polkadot[172438]: 2022-12-09 00:04:36 ✨ Imported #15675264 (0x4f9e…4a06)
Dec  9 00:04:37 kusama-mene1 polkadot[172438]: 2022-12-09 00:04:37 💤 Idle (40 peers), best: #15675264 (0x4f9e…4a06), finalized #15675259 (0x1f43…4683), ⬇ 829.5kiB/s ⬆ 953.5kiB/s
Dec  9 00:04:42 kusama-mene1 polkadot[172438]: 2022-12-09 00:04:42 ✨ Imported #15675265 (0x7e83…9436)
Dec  9 00:04:42 kusama-mene1 polkadot[172438]: 2022-12-09 00:04:42 💤 Idle (40 peers), best: #15675265 (0x7e83…9436), finalized #15675260 (0x44f8…a22e), ⬇ 762.6kiB/s ⬆ 602.7kiB/s
Dec  9 00:04:44 kusama-mene1 polkadot[172438]: 2022-12-09 00:04:44 Subsystem dispute-coordinator-subsystem appears unresponsive. origin=polkadot_service::relay_chain_selection::SelectRelayChainInner<sc_client_db::Backend<sp_runtime::generic::block::Block<sp_runtime::generic::header::Header<u32, sp_runtime::traits::BlakeTwo256>, sp_runtime::OpaqueExtrinsic>>, polkadot_overseer::Handle>
Dec  9 00:04:44 kusama-mene1 polkadot[172438]: 2022-12-09 00:04:44 err=ContextChannelClosed
Dec  9 00:04:44 kusama-mene1 polkadot[172438]: 2022-12-09 00:04:44 Fetch for approval votes got cancelled, only expected during shutdown!
Dec  9 00:04:44 kusama-mene1 polkadot[172438]: 2022-12-09 00:04:44 Unable to author block in slot. No best block header: Request for leaves from chain selection got canceled
Dec  9 00:04:44 kusama-mene1 polkadot[172438]: 2022-12-09 00:04:44 Essential task `overseer` failed. Shutting down service.
Dec  9 00:04:44 kusama-mene1 polkadot[172438]: 2022-12-09 00:04:44 New dispute initiated for candidate. candidate_hash=0x6eccee7c701aa440a12b6e5e008df32a975d21809df8f384c52d0884acc1633d session=26550 traceID=147279145167954200553586695736319144746
Dec  9 00:04:44 kusama-mene1 polkadot[172438]: 2022-12-09 00:04:44 Dispute on candidate concluded with 'valid' result candidate_hash=0x6eccee7c701aa440a12b6e5e008df32a975d21809df8f384c52d0884acc1633d session=26550 traceID=147279145167954200553586695736319144746
Dec  9 00:04:44 kusama-mene1 polkadot[172438]: 2022-12-09 00:04:44 err=Subsystem(Generated(Context("Signal channel is terminated and empty.")))
Dec  9 00:04:44 kusama-mene1 polkadot[172438]: 2022-12-09 00:04:44 subsystem exited with error subsystem="approval-voting-subsystem" err=FromOrigin { origin: "approval-voting", source: Generated(Context("Signal channel is terminated and empty.")) }
Dec  9 00:04:44 kusama-mene1 polkadot[172438]: 2022-12-09 00:04:44 subsystem exited with error subsystem="chain-api-subsystem" err=FromOrigin { origin: "chain-api", source: Generated(Context("Signal channel is terminated and empty.")) }
Dec  9 00:04:44 kusama-mene1 polkadot[172438]: 2022-12-09 00:04:44 subsystem exited with error subsystem="bitfield-signing-subsystem" err=FromOrigin { origin: "bitfield-signing", source: Generated(Context("Signal channel is terminated and empty.")) }
Dec  9 00:04:44 kusama-mene1 polkadot[172438]: 2022-12-09 00:04:44 subsystem exited with error subsystem="pvf-checker-subsystem" err=FromOrigin { origin: "pvf-checker", source: Generated(Context("Signal channel is terminated and empty.")) }
Dec  9 00:04:44 kusama-mene1 polkadot[172438]: 2022-12-09 00:04:44 Encountered error finding best chain containing 0x063dc2d1a8b8ab1c8ee9ffcf23ed5edfbd8affe2dfd7503add992882865c08d7: Request for leaves from chain selection got canceled
Dec  9 00:04:44 kusama-mene1 polkadot[172438]: 2022-12-09 00:04:44 Could not cast prevote: previously known block 0x063dc2d1a8b8ab1c8ee9ffcf23ed5edfbd8affe2dfd7503add992882865c08d7 has disappeared
Dec  9 00:04:44 kusama-mene1 polkadot[172438]: 2022-12-09 00:04:44 subsystem exited with error subsystem="statement-distribution-subsystem" err=FromOrigin { origin: "statement-distribution", source: SubsystemReceive(Generated(Context("Signal channel is terminated and empty."))) }
Dec  9 00:04:44 kusama-mene1 polkadot[172438]: 2022-12-09 00:04:44 subsystem exited with error subsystem="runtime-api-subsystem" err=Generated(Context("Signal channel is terminated and empty."))
Dec  9 00:04:44 kusama-mene1 polkadot[172438]: 2022-12-09 00:04:44 subsystem exited with error subsystem="candidate-validation-subsystem" err=FromOrigin { origin: "candidate-validation", source: Generated(Context("Signal channel is terminated and empty.")) }
Dec  9 00:04:44 kusama-mene1 polkadot[172438]: 2022-12-09 00:04:44 subsystem exited with error subsystem="dispute-coordinator-subsystem" err=FromOrigin { origin: "dispute-coordinator", source: ChainApiSenderDropped }


@ilhanu
Copy link

ilhanu commented Dec 9, 2022

Logs for our slashed validator:

Dec 09 00:02:00 ms2737 polkadot[860701]: 2022-12-09 00:02:00 💤 Idle (40 peers), best: #15675250 (0x115c…bcf8), finalized #15675244 (0x5542…83aa), ⬇ 754.9kiB/s ⬆ 892.8kiB/s
Dec 09 00:02:04 ms2737 polkadot[860701]: 2022-12-09 00:02:04 Missing block weight for new head. Skipping chain. hash=0x03935ef18e7c8aef501e31b10c63f689116dba2feb85f0b25bc8d8a2d40f0e3e
Dec 09 00:02:05 ms2737 polkadot[860701]: 2022-12-09 00:02:05 💤 Idle (40 peers), best: #15675250 (0x115c…bcf8), finalized #15675244 (0x5542…83aa), ⬇ 671.6kiB/s ⬆ 767.3kiB/s
Dec 09 00:02:06 ms2737 polkadot[860701]: 2022-12-09 00:02:06 ✨ Imported #15675251 (0xf56b…e853)
Dec 09 00:02:06 ms2737 polkadot[860701]: 2022-12-09 00:02:06 ♻️  Reorg on #15675251,0xf56b…e853 to #15675251,0x5b22…b097, common ancestor #15675250,0x115c…bcf8
Dec 09 00:02:06 ms2737 polkadot[860701]: 2022-12-09 00:02:06 ✨ Imported #15675251 (0x5b22…b097)
Dec 09 00:02:10 ms2737 polkadot[860701]: 2022-12-09 00:02:10 💤 Idle (40 peers), best: #15675251 (0x5b22…b097), finalized #15675244 (0x5542…83aa), ⬇ 995.8kiB/s ⬆ 2.1MiB/s
Dec 09 00:02:12 ms2737 polkadot[860701]: 2022-12-09 00:02:12 ✨ Imported #15675252 (0xef1c…be11)
Dec 09 00:02:12 ms2737 polkadot[860701]: 2022-12-09 00:02:12 ✨ Imported #15675252 (0x9613…e0a9)
Dec 09 00:02:14 ms2737 polkadot[860701]: 2022-12-09 00:02:14 New dispute initiated for candidate. candidate_hash=0x8aaeb37ff89f835d5eea9c9791c15b5a5b3f9917064c61826d26c76396df24fd session=26550 traceID=184340563761944265422074650637983898458
Dec 09 00:02:14 ms2737 polkadot[860701]: 2022-12-09 00:02:14 Missing block weight for new head. Skipping chain. hash=0x03935ef18e7c8aef501e31b10c63f689116dba2feb85f0b25bc8d8a2d40f0e3e
Dec 09 00:02:15 ms2737 polkadot[860701]: 2022-12-09 00:02:15 💤 Idle (40 peers), best: #15675252 (0xef1c…be11), finalized #15675244 (0x5542…83aa), ⬇ 856.5kiB/s ⬆ 2.4MiB/s
Dec 09 00:02:20 ms2737 polkadot[860701]: 2022-12-09 00:02:20 💤 Idle (40 peers), best: #15675252 (0xef1c…be11), finalized #15675244 (0x5542…83aa), ⬇ 834.5kiB/s ⬆ 577.5kiB/s
Dec 09 00:02:24 ms2737 polkadot[860701]: 2022-12-09 00:02:24 Subsystem dispute-coordinator-subsystem appears unresponsive. origin=polkadot_service::relay_chain_selection::SelectRelayChainInner<sc_client_db::Backend<sp_runtime::generic::block::Block<>
Dec 09 00:02:24 ms2737 polkadot[860701]: 2022-12-09 00:02:24 subsystem exited with error subsystem="candidate-validation-subsystem" err=FromOrigin { origin: "candidate-validation", source: Generated(Context("Signal channel is terminated and empty.")>
Dec 09 00:02:24 ms2737 polkadot[860701]: 2022-12-09 00:02:24 subsystem exited with error subsystem="pvf-checker-subsystem" err=FromOrigin { origin: "pvf-checker", source: Generated(Context("Signal channel is terminated and empty.")) }
Dec 09 00:02:24 ms2737 polkadot[860701]: 2022-12-09 00:02:24 subsystem exited with error subsystem="bitfield-signing-subsystem" err=FromOrigin { origin: "bitfield-signing", source: Generated(Context("Signal channel is terminated and empty.")) }
Dec 09 00:02:24 ms2737 polkadot[860701]: 2022-12-09 00:02:24 subsystem exited with error subsystem="chain-api-subsystem" err=FromOrigin { origin: "chain-api", source: Generated(Context("Signal channel is terminated and empty.")) }
Dec 09 00:02:24 ms2737 polkadot[860701]: 2022-12-09 00:02:24 subsystem exited with error subsystem="availability-distribution-subsystem" err=FromOrigin { origin: "availability-distribution", source: IncomingMessageChannel(Generated(Context("Signal c>
Dec 09 00:02:24 ms2737 polkadot[860701]: 2022-12-09 00:02:24 err=ContextChannelClosed
Dec 09 00:02:24 ms2737 polkadot[860701]: 2022-12-09 00:02:24 subsystem exited with error subsystem="approval-voting-subsystem" err=FromOrigin { origin: "approval-voting", source: Generated(Context("Signal channel is terminated and empty.")) }
Dec 09 00:02:24 ms2737 polkadot[860701]: 2022-12-09 00:02:24 subsystem exited with error subsystem="availability-recovery-subsystem" err=FromOrigin { origin: "availability-recovery", source: Generated(Context("Signal channel is terminated and empty.>
Dec 09 00:02:24 ms2737 polkadot[860701]: 2022-12-09 00:02:24 Failed to receive a message from Overseer, exiting err=Generated(Context("Signal channel is terminated and empty."))
Dec 09 00:02:24 ms2737 polkadot[860701]: 2022-12-09 00:02:24 err=Subsystem(Generated(Context("Signal channel is terminated and empty.")))
Dec 09 00:02:24 ms2737 polkadot[860701]: 2022-12-09 00:02:24 Unable to author block in slot. No best block header: Request for leaves from chain selection got canceled
Dec 09 00:02:24 ms2737 polkadot[860701]: 2022-12-09 00:02:24 Essential task `overseer` failed. Shutting down service.
Dec 09 00:02:24 ms2737 polkadot[860701]: 2022-12-09 00:02:24 subsystem exited with error subsystem="statement-distribution-subsystem" err=FromOrigin { origin: "statement-distribution", source: SubsystemReceive(Generated(Context("Signal channel is te>
Dec 09 00:02:24 ms2737 polkadot[860701]: 2022-12-09 00:02:24 subsystem exited with error subsystem="network-bridge-subsystem" err=FromOrigin { origin: "network-bridge", source: SubsystemError(Generated(Context("Signal channel is terminated and empty>
Dec 09 00:02:24 ms2737 polkadot[860701]: 2022-12-09 00:02:24 subsystem exited with error subsystem="network-bridge-subsystem" err=FromOrigin { origin: "network-bridge", source: SubsystemError(Generated(Context("Signal channel is terminated and empty>
Dec 09 00:02:24 ms2737 polkadot[860701]: 2022-12-09 00:02:24 error receiving message from subsystem context: Generated(Context("Signal channel is terminated and empty.")) err=Generated(Context("Signal channel is terminated and empty."))
Dec 09 00:02:24 ms2737 polkadot[860701]: 2022-12-09 00:02:24 Fetch for approval votes got cancelled, only expected during shutdown!
Dec 09 00:02:24 ms2737 polkadot[860701]: 2022-12-09 00:02:24 Dispute on candidate concluded with 'valid' result candidate_hash=0x8aaeb37ff89f835d5eea9c9791c15b5a5b3f9917064c61826d26c76396df24fd session=26550 traceID=184340563761944265422074650637983>
Dec 09 00:02:24 ms2737 polkadot[860701]: 2022-12-09 00:02:24 Fetch for approval votes got cancelled, only expected during shutdown!
Dec 09 00:02:24 ms2737 polkadot[860701]: 2022-12-09 00:02:24 New dispute initiated for candidate. candidate_hash=0x026644741bcdb8533ed728170de17fa8119affe7adabb8307940bfcad2a43224 session=26550 traceID=3189458674065079641407768009149349800
Dec 09 00:02:24 ms2737 polkadot[860701]: 2022-12-09 00:02:24 Fetch for approval votes got cancelled, only expected during shutdown!
Dec 09 00:02:24 ms2737 polkadot[860701]: 2022-12-09 00:02:24 New dispute initiated for candidate. candidate_hash=0x98af2fa19f956948ed9e6bdfb6018dcebb9aadc269e567ed9906b333d312cdcf session=26550 traceID=202952273387925087643924749837603540430
Dec 09 00:02:24 ms2737 polkadot[860701]: 2022-12-09 00:02:24 Dispute on candidate concluded with 'valid' result candidate_hash=0x98af2fa19f956948ed9e6bdfb6018dcebb9aadc269e567ed9906b333d312cdcf session=26550 traceID=202952273387925087643924749837603>
Dec 09 00:02:24 ms2737 polkadot[860701]: 2022-12-09 00:02:24 subsystem exited with error subsystem="runtime-api-subsystem" err=Generated(Context("Signal channel is terminated and empty."))
Dec 09 00:02:24 ms2737 polkadot[860701]: 2022-12-09 00:02:24 subsystem exited with error subsystem="dispute-coordinator-subsystem" err=FromOrigin { origin: "dispute-coordinator", source: ChainApiSenderDropped }
Dec 09 00:02:24 ms2737 polkadot[860701]: 2022-12-09 00:02:24 error=SendResponses([PeerId("12D3KooWPd6HxZQhhtU2BUYbnpC1DXxhqoPgX648Dkb1GwiE6UqA")])
Dec 09 00:02:24 ms2737 polkadot[860701]: 2022-12-09 00:02:24 error=SendResponses([PeerId("12D3KooWS8s2Dx5MePfpLao5a9Lp15NcjB89JEoHdGhEXRYwqJ6L")])
Dec 09 00:02:24 ms2737 polkadot[860701]: 2022-12-09 00:02:24 error=SendResponses([PeerId("12D3KooWPd6HxZQhhtU2BUYbnpC1DXxhqoPgX648Dkb1GwiE6UqA")])
Dec 09 00:02:24 ms2737 polkadot[860701]: 2022-12-09 00:02:24 error=SendResponses([PeerId("12D3KooWBPjbGpX19NQ5sezPQ9NxRsCc57Cpv3k4cBMV2WM9jMCC")])
Dec 09 00:02:24 ms2737 polkadot[860701]: 2022-12-09 00:02:24 error=SendResponses([PeerId("12D3KooWQDM4cczHd5LqQStZwTPrk4D283kgMEwC2N9ip9GMPS3c")])
Dec 09 00:02:24 ms2737 polkadot[860701]: 2022-12-09 00:02:24 error=SendResponses([PeerId("12D3KooWGjrHw4kTb3uCQT5FvHE8D8NWa6n2i6EiZg4gSCq9zYUL")])
Dec 09 00:02:24 ms2737 polkadot[860701]: 2022-12-09 00:02:24 error=SendResponses([PeerId("12D3KooWQW6RWdtJVY5Ku4n6AahMLzxoHjMFhFiPTnY55dM421Rb")])

@pmensik
Copy link
Contributor

pmensik commented Dec 9, 2022

We had two nodes in the active set, both of them crashed and one of them was also subsequently slashed as a result of this event.

The slashed one

Dec 08 23:39:22 kusama-parallel-1 polkadot[2002078]: 2022-12-08 23:39:22 💤 Idle (40 peers), best: #15675154 (0x799a…afb4), finalized #15675123 (0x1032…cc38), ⬇ 498.6kiB/s ⬆ 151.1kiB/s
Dec 08 23:39:26 kusama-parallel-1 polkadot[2002078]: 2022-12-08 23:39:26 Subsystem dispute-coordinator-subsystem appears unresponsive. origin=polkadot_service::relay_chain_selection::SelectRelayChainInner<sc_client_db::Backend<sp_runtime::generic::block::Block<sp_runtime::generic::header::Header<u32, sp_runtime::traits::BlakeTwo256>, sp_runtime::OpaqueExtrinsic>>, polkadot_overseer::Handle>
Dec 08 23:39:26 kusama-parallel-1 polkadot[2002078]: 2022-12-08 23:39:26 Call to `DetermineUndisputedChain` failed error=DetermineUndisputedChainCanceled(Canceled)
Dec 08 23:39:26 kusama-parallel-1 polkadot[2002078]: 2022-12-08 23:39:26 err=ContextChannelClosed
Dec 08 23:39:26 kusama-parallel-1 polkadot[2002078]: 2022-12-08 23:39:26 subsystem exited with error subsystem="candidate-validation-subsystem" err=FromOrigin { origin: "candidate-validation", source: Generated(Context("Signal channel is terminated and empty.")) }
Dec 08 23:39:26 kusama-parallel-1 polkadot[2002078]: 2022-12-08 23:39:26 subsystem exited with error subsystem="pvf-checker-subsystem" err=FromOrigin { origin: "pvf-checker", source: Generated(Context("Signal channel is terminated and empty.")) }
Dec 08 23:39:26 kusama-parallel-1 polkadot[2002078]: 2022-12-08 23:39:26 Failed to receive a message from Overseer, exiting err=Generated(Context("Signal channel is terminated and empty."))
Dec 08 23:39:26 kusama-parallel-1 polkadot[2002078]: 2022-12-08 23:39:26 subsystem exited with error subsystem="availability-distribution-subsystem" err=FromOrigin { origin: "availability-distribution", source: IncomingMessageChannel(Generated(Context("Signal channel is terminated and empty."))) }
Dec 08 23:39:26 kusama-parallel-1 polkadot[2002078]: 2022-12-08 23:39:26 error receiving message from subsystem context: Generated(Context("Signal channel is terminated and empty.")) err=Generated(Context("Signal channel is terminated and empty."))
Dec 08 23:39:26 kusama-parallel-1 polkadot[2002078]: 2022-12-08 23:39:26 subsystem exited with error subsystem="bitfield-signing-subsystem" err=FromOrigin { origin: "bitfield-signing", source: Generated(Context("Signal channel is terminated and empty.")) }
Dec 08 23:39:26 kusama-parallel-1 polkadot[2002078]: 2022-12-08 23:39:26 subsystem exited with error subsystem="network-bridge-subsystem" err=FromOrigin { origin: "network-bridge", source: SubsystemError(Generated(Context("Signal channel is terminated and empty."))) }
Dec 08 23:39:26 kusama-parallel-1 polkadot[2002078]: 2022-12-08 23:39:26 err=Subsystem(Generated(Context("Signal channel is terminated and empty.")))
Dec 08 23:39:26 kusama-parallel-1 polkadot[2002078]: 2022-12-08 23:39:26 subsystem exited with error subsystem="availability-recovery-subsystem" err=FromOrigin { origin: "availability-recovery", source: Generated(Context("Signal channel is terminated and empty.")) }
Dec 08 23:39:26 kusama-parallel-1 polkadot[2002078]: 2022-12-08 23:39:26 subsystem exited with error subsystem="network-bridge-subsystem" err=FromOrigin { origin: "network-bridge", source: SubsystemError(Generated(Context("Signal channel is terminated and empty."))) }
Dec 08 23:39:26 kusama-parallel-1 polkadot[2002078]: 2022-12-08 23:39:26 subsystem exited with error subsystem="approval-voting-subsystem" err=FromOrigin { origin: "approval-voting", source: Generated(Context("Signal channel is terminated and empty.")) }
Dec 08 23:39:26 kusama-parallel-1 polkadot[2002078]: 2022-12-08 23:39:26 subsystem exited with error subsystem="chain-api-subsystem" err=FromOrigin { origin: "chain-api", source: Generated(Context("Signal channel is terminated and empty.")) }
Dec 08 23:39:26 kusama-parallel-1 polkadot[2002078]: 2022-12-08 23:39:26 Fetch for approval votes got cancelled, only expected during shutdown!
Dec 08 23:39:26 kusama-parallel-1 polkadot[2002078]: 2022-12-08 23:39:26 Essential task `overseer` failed. Shutting down service.
Dec 08 23:39:26 kusama-parallel-1 polkadot[2002078]: 2022-12-08 23:39:26 Unable to author block in slot. No best block header: Request for leaves from chain selection got canceled
Dec 08 23:39:26 kusama-parallel-1 polkadot[2002078]: 2022-12-08 23:39:26 New dispute initiated for candidate. candidate_hash=0xe1e677e00a3a3411c4bad16559c4753c47cd51043159b1f4399caea155bfb91b session=26550 traceID=300272958686085533188489404059870328124
Dec 08 23:39:26 kusama-parallel-1 polkadot[2002078]: 2022-12-08 23:39:26 Dispute on candidate concluded with 'valid' result candidate_hash=0xe1e677e00a3a3411c4bad16559c4753c47cd51043159b1f4399caea155bfb91b session=26550 traceID=300272958686085533188489404059870328124
Dec 08 23:39:26 kusama-parallel-1 polkadot[2002078]: 2022-12-08 23:39:26 Fetch for approval votes got cancelled, only expected during shutdown!
Dec 08 23:39:26 kusama-parallel-1 polkadot[2002078]: 2022-12-08 23:39:26 New dispute initiated for candidate. candidate_hash=0xfcd08ae3f48e2eb589415d6cecffe6e65cfd8a7fb587323deae45faf8552ac67 session=26550 traceID=336048269717378469227419513318138701542
Dec 08 23:39:26 kusama-parallel-1 polkadot[2002078]: 2022-12-08 23:39:26 Dispute on candidate concluded with 'valid' result candidate_hash=0xfcd08ae3f48e2eb589415d6cecffe6e65cfd8a7fb587323deae45faf8552ac67 session=26550 traceID=336048269717378469227419513318138701542
Dec 08 23:39:26 kusama-parallel-1 polkadot[2002078]: 2022-12-08 23:39:26 subsystem exited with error subsystem="statement-distribution-subsystem" err=FromOrigin { origin: "statement-distribution", source: SubsystemReceive(Generated(Context("Signal channel is terminated and empty."))) }
Dec 08 23:39:26 kusama-parallel-1 polkadot[2002078]: 2022-12-08 23:39:26 subsystem exited with error subsystem="runtime-api-subsystem" err=Generated(Context("Signal channel is terminated and empty."))
Dec 08 23:39:26 kusama-parallel-1 polkadot[2002078]: 2022-12-08 23:39:26 subsystem exited with error subsystem="runtime-api-subsystem" err=Generated(Context("Signal channel is terminated and empty."))
Dec 08 23:39:26 kusama-parallel-1 polkadot[2002078]: 2022-12-08 23:39:26 subsystem exited with error subsystem="dispute-coordinator-subsystem" err=FromOrigin { origin: "dispute-coordinator", source: ChainApiSenderDropped }
Dec 08 23:39:27 kusama-parallel-1 polkadot[2002078]: 2022-12-08 23:39:27 GRANDPA voter error: safety invariant has been violated: `voter_commands_rx` was closed.
Dec 08 23:39:27 kusama-parallel-1 polkadot[2002078]: 2022-12-08 23:39:27 Essential task `grandpa-voter` failed. Shutting down service.
Dec 09 00:51:14 kusama-parallel-1 systemd[1]: Stopping kusama node...

The other one

Dec 09 00:04:33 kusama-parallel-2 polkadot[1899843]: 2022-12-09 00:04:33 💤 Idle (33 peers), best: #15675263 (0xfe82…b975), finalized #15675259 (0x1f43…4683), ⬇ 242.5kiB/s ⬆ 325.3kiB/s
Dec 09 00:04:34 kusama-parallel-2 polkadot[1899843]: 2022-12-09 00:04:34 Subsystem dispute-coordinator-subsystem appears unresponsive. origin=polkadot_service::relay_chain_selection::SelectRelayChainInner<sc_client_db::Backend<sp_runtime::generic::block::Block<sp_runtime::generic::header::Header<u32, sp_runtime::traits::BlakeTwo256>, sp_runtime::OpaqueExtrinsic>>, polkadot_overseer::Handle>
Dec 09 00:04:34 kusama-parallel-2 polkadot[1899843]: 2022-12-09 00:04:34 subsystem exited with error subsystem="candidate-validation-subsystem" err=FromOrigin { origin: "candidate-validation", source: Generated(Context("Signal channel is terminated and empty.")) }
Dec 09 00:04:34 kusama-parallel-2 polkadot[1899843]: 2022-12-09 00:04:34 err=Subsystem(Generated(Context("Signal channel is terminated and empty.")))
Dec 09 00:04:34 kusama-parallel-2 polkadot[1899843]: 2022-12-09 00:04:34 subsystem exited with error subsystem="pvf-checker-subsystem" err=FromOrigin { origin: "pvf-checker", source: Generated(Context("Signal channel is terminated and empty.")) }
Dec 09 00:04:34 kusama-parallel-2 polkadot[1899843]: 2022-12-09 00:04:34 subsystem exited with error subsystem="chain-api-subsystem" err=FromOrigin { origin: "chain-api", source: Generated(Context("Signal channel is terminated and empty.")) }
Dec 09 00:04:34 kusama-parallel-2 polkadot[1899843]: 2022-12-09 00:04:34 subsystem exited with error subsystem="bitfield-signing-subsystem" err=FromOrigin { origin: "bitfield-signing", source: Generated(Context("Signal channel is terminated and empty.")) }
Dec 09 00:04:34 kusama-parallel-2 polkadot[1899843]: 2022-12-09 00:04:34 err=ContextChannelClosed
Dec 09 00:04:34 kusama-parallel-2 polkadot[1899843]: 2022-12-09 00:04:34 subsystem exited with error subsystem="availability-distribution-subsystem" err=FromOrigin { origin: "availability-distribution", source: IncomingMessageChannel(Generated(Context("Signal channel is terminated and empty."))) }
Dec 09 00:04:34 kusama-parallel-2 polkadot[1899843]: 2022-12-09 00:04:34 Failed to receive a message from Overseer, exiting err=Generated(Context("Signal channel is terminated and empty."))
Dec 09 00:04:34 kusama-parallel-2 polkadot[1899843]: 2022-12-09 00:04:34 subsystem exited with error subsystem="availability-recovery-subsystem" err=FromOrigin { origin: "availability-recovery", source: Generated(Context("Signal channel is terminated and empty.")) }
Dec 09 00:04:34 kusama-parallel-2 polkadot[1899843]: 2022-12-09 00:04:34 subsystem exited with error subsystem="network-bridge-subsystem" err=FromOrigin { origin: "network-bridge", source: SubsystemError(Generated(Context("Signal channel is terminated and empty."))) }
Dec 09 00:04:34 kusama-parallel-2 polkadot[1899843]: 2022-12-09 00:04:34 subsystem exited with error subsystem="network-bridge-subsystem" err=FromOrigin { origin: "network-bridge", source: SubsystemError(Generated(Context("Signal channel is terminated and empty."))) }
Dec 09 00:04:34 kusama-parallel-2 polkadot[1899843]: 2022-12-09 00:04:34 subsystem exited with error subsystem="approval-voting-subsystem" err=FromOrigin { origin: "approval-voting", source: Generated(Context("Signal channel is terminated and empty.")) }
Dec 09 00:04:34 kusama-parallel-2 polkadot[1899843]: 2022-12-09 00:04:34 error receiving message from subsystem context: Generated(Context("Signal channel is terminated and empty.")) err=Generated(Context("Signal channel is terminated and empty."))
Dec 09 00:04:34 kusama-parallel-2 polkadot[1899843]: 2022-12-09 00:04:34 Fetch for approval votes got cancelled, only expected during shutdown!
Dec 09 00:04:34 kusama-parallel-2 polkadot[1899843]: 2022-12-09 00:04:34 Essential task `overseer` failed. Shutting down service.
Dec 09 00:04:34 kusama-parallel-2 polkadot[1899843]: 2022-12-09 00:04:34 Unable to author block in slot. No best block header: Request for leaves from chain selection got canceled
Dec 09 00:04:34 kusama-parallel-2 polkadot[1899843]: 2022-12-09 00:04:34 New dispute initiated for candidate. candidate_hash=0xc515980b2fa8fa4298a2f188e504bd59902a53c0e702efd13cb1650efd30251a session=26550 traceID=261970037216177277872464205827121921369
Dec 09 00:04:34 kusama-parallel-2 polkadot[1899843]: 2022-12-09 00:04:34 Encountered error finding best chain containing 0x1f435c5b5b0276b01025e021ec98dd6897644807a315d6525d69e72388074683: Request for leaves from chain selection got canceled
Dec 09 00:04:34 kusama-parallel-2 polkadot[1899843]: 2022-12-09 00:04:34 Could not cast prevote: previously known block 0x1f435c5b5b0276b01025e021ec98dd6897644807a315d6525d69e72388074683 has disappeared
Dec 09 00:04:34 kusama-parallel-2 polkadot[1899843]: 2022-12-09 00:04:34 subsystem exited with error subsystem="statement-distribution-subsystem" err=FromOrigin { origin: "statement-distribution", source: SubsystemReceive(Generated(Context("Signal channel is terminated and empty."))) }
Dec 09 00:04:34 kusama-parallel-2 polkadot[1899843]: 2022-12-09 00:04:34 Dispute on candidate concluded with 'valid' result candidate_hash=0xc515980b2fa8fa4298a2f188e504bd59902a53c0e702efd13cb1650efd30251a session=26550 traceID=261970037216177277872464205827121921369
Dec 09 00:04:34 kusama-parallel-2 polkadot[1899843]: 2022-12-09 00:04:34 subsystem exited with error subsystem="dispute-coordinator-subsystem" err=FromOrigin { origin: "dispute-coordinator", source: ChainApiSenderDropped }
Dec 09 00:04:34 kusama-parallel-2 polkadot[1899843]: 2022-12-09 00:04:34 subsystem exited with error subsystem="runtime-api-subsystem" err=Generated(Context("Signal channel is terminated and empty."))
Dec 09 00:52:39 kusama-parallel-2 systemd[1]: Stopping kusama node...

Also, both nodes show quite a lot of lines exactly every hour saying - and this keeps repeating itself for around one minute

Dec 09 03:54:19 kusama-parallel-2 polkadot[2013179]: 2022-12-09 03:54:19 dropping (Stream 4628c5f7/114) because buffer is full
Dec 09 03:54:19 kusama-parallel-2 polkadot[2013179]: 2022-12-09 03:54:19 dropping (Stream 4628c5f7/116) because buffer is full
Dec 09 03:54:19 kusama-parallel-2 polkadot[2013179]: 2022-12-09 03:54:19 dropping (Stream eba1c19d/56) because buffer is full

@matteo-SBond
Copy link

Caught a small portion of logs before service shutdwon before the server was rebooted.

Dec 09 00:03:14 ns554290 polkadot[243207]: 2022-12-09 00:03:14 err=Subsystem(Generated(Context("Signal channel is terminated and empty.")))
Dec 09 00:03:14 ns554290 polkadot[243207]: 2022-12-09 00:03:14 subsystem exited with error subsystem="approval-voting-subsystem" err=FromOrigin { origin: "approval-voting", source: Generated(Context("Signal channel is terminated and empty.")) }
Dec 09 00:03:14 ns554290 polkadot[243207]: 2022-12-09 00:03:14 subsystem exited with error subsystem="candidate-validation-subsystem" err=FromOrigin { origin: "candidate-validation", source: Generated(Context("Signal channel is terminated and empty.")) }
Dec 09 00:03:14 ns554290 polkadot[243207]: 2022-12-09 00:03:14 subsystem exited with error subsystem="availability-distribution-subsystem" err=FromOrigin { origin: "availability-distribution", source: IncomingMessageChannel(Generated(Context("Signal channel is terminated and empty."))) }
Dec 09 00:03:14 ns554290 polkadot[243207]: 2022-12-09 00:03:14 subsystem exited with error subsystem="availability-recovery-subsystem" err=FromOrigin { origin: "availability-recovery", source: Generated(Context("Signal channel is terminated and empty.")) }
Dec 09 00:03:14 ns554290 polkadot[243207]: 2022-12-09 00:03:14 Failed to receive a message from Overseer, exiting err=Generated(Context("Signal channel is terminated and empty."))
Dec 09 00:03:14 ns554290 polkadot[243207]: 2022-12-09 00:03:14 Essential task `overseer` failed. Shutting down service.
Dec 09 00:03:14 ns554290 polkadot[243207]: 2022-12-09 00:03:14 Unable to author block in slot. No best block header: Request for leaves from chain selection got canceled
Dec 09 00:03:14 ns554290 polkadot[243207]: 2022-12-09 00:03:14 subsystem exited with error subsystem="statement-distribution-subsystem" err=FromOrigin { origin: "statement-distribution", source: SubsystemReceive(Generated(Context("Signal channel is terminated and empty."))) }
Dec 09 00:03:14 ns554290 polkadot[243207]: 2022-12-09 00:03:14 subsystem exited with error subsystem="runtime-api-subsystem" err=Generated(Context("Signal channel is terminated and empty."))

@luckyve
Copy link

luckyve commented Dec 9, 2022

Log from block #15675064 to #15675448.
No errors or alerts from monitors.
kusama log.txt

@gregorst3
Copy link

Problems started around 00:29 (GMT +1) after block #1567503.
The node stopped listening on 30333 around 00:55 (Flooded by logs).
The monitor reported something was wrong and, on my way to the PC, I tried to restart around 00:59
The restart was failing (no response) and I did a hard reboot.
The node didn't crash until a manual action to restart.
The validator did not slash because the node recovered in time.
Logs can be seen here:
log.txt

@tunkas
Copy link

tunkas commented Dec 9, 2022

Here are my 5 cents, experienced much the same chaos as the rest.
Attaching the file with most important excerpts from logs of my node.
After 3rd restart (first 2 were automatic, 3rd was manual) it started behaving normally. I am, however, still to see first block produced :/
Validator is marked for slashing (hopefully will overturn this with Gov2).

Good luck with resolving this guys! 🤞🤞

Kusama chaos Dec 08 2022.docx

@Generic-Chain
Copy link

Generic-Chain commented Dec 9, 2022

panic after the incident, times are UTC

To be noted: node did not crash, it was not chilled and was not slashed (I think the last 2 were hand in hand but just for clarity), it resumed afterwards and still running fine at the moment without a restart

Dec 09 00:09:54  polkadot[1058]: ====================
Dec 09 00:09:54  polkadot[1058]: Version: 0.9.33-c7d6c21242f
Dec 09 00:09:54  polkadot[1058]:    0: sp_panic_handler::set::{{closure}}
Dec 09 00:09:54  polkadot[1058]:    1: std::panicking::rust_panic_with_hook
Dec 09 00:09:54  polkadot[1058]:              at /rustc/897e37553bba8b42751c67658967889d11ecd120/library/std/src/panicking.rs:702:17
Dec 09 00:09:54  polkadot[1058]:    2: std::panicking::begin_panic_handler::{{closure}}
Dec 09 00:09:54  polkadot[1058]:              at /rustc/897e37553bba8b42751c67658967889d11ecd120/library/std/src/panicking.rs:588:13
Dec 09 00:09:54  polkadot[1058]:    3: std::sys_common::backtrace::__rust_end_short_backtrace
Dec 09 00:09:54  polkadot[1058]:              at /rustc/897e37553bba8b42751c67658967889d11ecd120/library/std/src/sys_common/backtrace.rs:138:18
Dec 09 00:09:54  polkadot[1058]:    4: rust_begin_unwind
Dec 09 00:09:54  polkadot[1058]:              at /rustc/897e37553bba8b42751c67658967889d11ecd120/library/std/src/panicking.rs:584:5
Dec 09 00:09:54  polkadot[1058]:    5: core::panicking::panic_fmt
Dec 09 00:09:54  polkadot[1058]:              at /rustc/897e37553bba8b42751c67658967889d11ecd120/library/core/src/panicking.rs:142:14
Dec 09 00:09:54  polkadot[1058]:    6: core::result::unwrap_failed
Dec 09 00:09:54  polkadot[1058]:              at /rustc/897e37553bba8b42751c67658967889d11ecd120/library/core/src/result.rs:1785:5
Dec 09 00:09:54  polkadot[1058]:    7: <sp_state_machine::ext::Ext<H,B> as sp_externalities::Externalities>::next_storage_key
Dec 09 00:09:54  polkadot[1058]:    8: sp_io::storage::next_key_version_1
Dec 09 00:09:54  polkadot[1058]:    9: sp_io::storage::ExtStorageNextKeyVersion1::call
Dec 09 00:09:54  polkadot[1058]:   10: <F as wasmtime::func::IntoFunc<T,(wasmtime::func::Caller<T>,A1),R>>::into_func::wasm_to_host_shim
Dec 09 00:09:54  polkadot[1058]:   11: <unknown>
Dec 09 00:09:54  polkadot[1058]:   12: <unknown>
Dec 09 00:09:54  polkadot[1058]:   13: <unknown>
Dec 09 00:09:54  polkadot[1058]:   14: <unknown>
Dec 09 00:09:54  polkadot[1058]:   15: <unknown>
Dec 09 00:09:54  polkadot[1058]:   16: <unknown>
Dec 09 00:09:54  polkadot[1058]:   17: <unknown>
Dec 09 00:09:54  polkadot[1058]:   18: wasmtime_runtime::traphandlers::catch_traps::call_closure
Dec 09 00:09:54  polkadot[1058]:   19: wasmtime_setjmp
Dec 09 00:09:54  polkadot[1058]:   20: sc_executor_wasmtime::runtime::perform_call
Dec 09 00:09:54  polkadot[1058]:   21: <sc_executor_wasmtime::runtime::WasmtimeInstance as sc_executor_common::wasm_runtime::WasmInstance>::call_with_allocation_stats
Dec 09 00:09:54  polkadot[1058]:   22: sc_executor_common::wasm_runtime::WasmInstance::call_export
Dec 09 00:09:54  polkadot[1058]:   23: sc_executor::native_executor::WasmExecutor<H>::with_instance::{{closure}}
Dec 09 00:09:54  polkadot[1058]:   24: <sc_executor::native_executor::NativeElseWasmExecutor<D> as sp_core::traits::CodeExecutor>::call
Dec 09 00:09:54  polkadot[1058]:   25: sp_state_machine::execution::StateMachine<B,H,Exec>::execute_aux
Dec 09 00:09:54  polkadot[1058]:   26: sp_state_machine::execution::StateMachine<B,H,Exec>::execute_using_consensus_failure_handler
Dec 09 00:09:54  polkadot[1058]:   27: <sc_service::client::client::Client<B,E,Block,RA> as sp_api::CallApiAt<Block>>::call_api_at
Dec 09 00:09:54  polkadot[1058]:   28: <kusama_runtime::RuntimeApiImpl<__SR_API_BLOCK__,RuntimeApiImplCall> as sp_api::Core<__SR_API_BLOCK__>>::__runtime_api_internal_call_api_at
Dec 09 00:09:54  polkadot[1058]:   29: <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
Dec 09 00:09:54  polkadot[1058]:   30: <sc_finality_grandpa::import::GrandpaBlockImport<BE,Block,Client,SC> as sc_consensus::block_import::BlockImport<Block>>::import_block::{{closure}}
Dec 09 00:09:54  polkadot[1058]:   31: <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
Dec 09 00:09:54  polkadot[1058]:   32: <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
Dec 09 00:09:54  polkadot[1058]:   33: <sc_consensus_babe::BabeBlockImport<Block,Client,Inner> as sc_consensus::block_import::BlockImport<Block>>::import_block::{{closure}}
Dec 09 00:09:54  polkadot[1058]:   34: <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
Dec 09 00:09:54  polkadot[1058]:   35: <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
Dec 09 00:09:54  polkadot[1058]:   36: sc_consensus::import_queue::basic_queue::BlockImportWorker<B>::new::{{closure}}
Dec 09 00:09:54  polkadot[1058]:   37: <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
Dec 09 00:09:54  polkadot[1058]:   38: <futures_util::future::future::Map<Fut,F> as core::future::future::Future>::poll
Dec 09 00:09:54  polkadot[1058]:   39: <tracing_futures::Instrumented<T> as core::future::future::Future>::poll
Dec 09 00:09:54  polkadot[1058]:   40: tokio::runtime::task::raw::poll
Dec 09 00:09:54  polkadot[1058]:   41: std::sys_common::backtrace::__rust_begin_short_backtrace
Dec 09 00:09:54  polkadot[1058]:   42: core::ops::function::FnOnce::call_once{{vtable.shim}}
Dec 09 00:09:54  polkadot[1058]:   43: <alloc::boxed::Box<F,A> as core::ops::function::FnOnce<Args>>::call_once
Dec 09 00:09:54  polkadot[1058]:              at /rustc/897e37553bba8b42751c67658967889d11ecd120/library/alloc/src/boxed.rs:1940:9
Dec 09 00:09:54  polkadot[1058]:       <alloc::boxed::Box<F,A> as core::ops::function::FnOnce<Args>>::call_once
Dec 09 00:09:54  polkadot[1058]:              at /rustc/897e37553bba8b42751c67658967889d11ecd120/library/alloc/src/boxed.rs:1940:9
Dec 09 00:09:54  polkadot[1058]:       std::sys::unix::thread::Thread::new::thread_start
Dec 09 00:09:54  polkadot[1058]:              at /rustc/897e37553bba8b42751c67658967889d11ecd120/library/std/src/sys/unix/thread.rs:108:17
Dec 09 00:09:54  polkadot[1058]:   44: start_thread
Dec 09 00:09:54  polkadot[1058]:   45: clone
Dec 09 00:09:54  polkadot[1058]: Thread 'tokio-runtime-worker' panicked at 'Externalities not allowed to fail within runtime: "TrieDB iterator next error: Database missing expected key: 0x>
Dec 09 00:09:54  polkadot[1058]: This is a bug. Please report it at:
Dec 09 00:09:54  polkadot[1058]:         https://github.com/paritytech/polkadot/issues/new
Dec 09 00:09:54  polkadot[1058]: 2022-12-09 00:09:54 Block prepare storage changes error: Error at calling runtime api: Execution failed: Execution aborted due to trap: host code panicked >
Dec 09 00:09:54  polkadot[1058]: WASM backtrace:
Dec 09 00:09:54  polkadot[1058]:     0: 0x54cc33 - <unknown>!sp_io::storage::extern_host_function_impls::next_key::hbfc53d13cd2fed36
Dec 09 00:09:54  polkadot[1058]:     1: 0x28caba - <unknown>!<frame_support::storage::PrefixIterator<T,OnRemoval> as core::iter::traits::iterator::Iterator>::next::h2e1b5d1ee62654da
Dec 09 00:09:54  polkadot[1058]:     2: 0x39d4e9 - <unknown>!<polkadot_runtime_parachains::disputes::pallet::Pallet<T> as polkadot_runtime_parachains::disputes::DisputesHandler<<T as frame>
Dec 09 00:09:54  polkadot[1058]:     3: 0xaf69e - <unknown>!<(TupleElement0,TupleElement1,TupleElement2,TupleElement3,TupleElement4,TupleElement5,TupleElement6,TupleElement7,TupleElement8,>
Dec 09 00:09:54  polkadot[1058]:     4: 0x150cf4 - <unknown>!frame_executive::Executive<System,Block,Context,UnsignedValidator,AllPalletsWithSystem,COnRuntimeUpgrade>::initialize_block::hf>
Dec 09 00:09:54  polkadot[1058]:     5: 0x150082 - <unknown>!frame_executive::Executive<System,Block,Context,UnsignedValidator,AllPalletsWithSystem,COnRuntimeUpgrade>::execute_block::hb966>
Dec 09 00:09:54  polkadot[1058]:     6: 0x50706b - <unknown>!Core_execute_block
Dec 09 00:09:54  polkadot[1058]:
Dec 09 00:09:54  polkadot[1058]: 2022-12-09 00:09:54 💔 Error importing block 0xf1ea481f90a70ddf9bf8d9624c23e9a5c39be95510dea71423be6d3b751e59f5: consensus error: Import failed: Import fai>
Dec 09 00:09:54  polkadot[1058]: WASM backtrace:
Dec 09 00:09:54  polkadot[1058]:     0: 0x54cc33 - <unknown>!sp_io::storage::extern_host_function_impls::next_key::hbfc53d13cd2fed36
Dec 09 00:09:54  polkadot[1058]:     1: 0x28caba - <unknown>!<frame_support::storage::PrefixIterator<T,OnRemoval> as core::iter::traits::iterator::Iterator>::next::h2e1b5d1ee62654da
Dec 09 00:09:54  polkadot[1058]:     2: 0x39d4e9 - <unknown>!<polkadot_runtime_parachains::disputes::pallet::Pallet<T> as polkadot_runtime_parachains::disputes::DisputesHandler<<T as frame>
Dec 09 00:09:54  polkadot[1058]:     3: 0xaf69e - <unknown>!<(TupleElement0,TupleElement1,TupleElement2,TupleElement3,TupleElement4,TupleElement5,TupleElement6,TupleElement7,TupleElement8,>
Dec 09 00:09:54  polkadot[1058]:     4: 0x150cf4 - <unknown>!frame_executive::Executive<System,Block,Context,UnsignedValidator,AllPalletsWithSystem,COnRuntimeUpgrade>::initialize_block::hf>
Dec 09 00:09:54  polkadot[1058]:     5: 0x150082 - <unknown>!frame_executive::Executive<System,Block,Context,UnsignedValidator,AllPalletsWithSystem,COnRuntimeUpgrade>::execute_block::hb966>
Dec 09 00:09:54  polkadot[1058]:     6: 0x50706b - <unknown>!Core_execute_block

@paradox-tt
Copy link

Hey team,

Is there any reason some validators went offline while others didn't? Were these disputes handled evenly by each validator or is it that some handle these disputes while other didn't?

@eskimor
Copy link
Member

eskimor commented Dec 12, 2022

Not all validators are parachain validators, I guess that's the reason. Only those will actually participate in disputes, the others will just include votes in blocks.

@paradox-tt
Copy link

I suspected that not all validators participate in disputes, thanks for confirming.

@eskimor
Copy link
Member

eskimor commented Dec 12, 2022

For the warning "Missing block weight for new head", it looks like it is a symptom not a cause, on one of our validators:



2022-12-09 09:31:41 | 2022-12-09 08:31:41.417 DEBUG tokio-runtime-worker parachain::pvf-checker: Examining activation of leaf 0xe272a78c813fc8c885b21aacf42ca4c89d9e2cc87178b36c9f2f0527de6d96c5 (15679391)


2022-12-09 09:31:51 | 2022-12-09 08:31:51.438  WARN tokio-runtime-worker parachain::chain-selection: Missing block weight for new head. Skipping chain. hash=0xe272a78c813fc8c885b21aacf42ca4c89d9e2cc87178b36c9f2f0527de6d96c5
-- | --


We can see that there is a full 10 seconds, between first seeing the leaf and the import happening in in chain-selection. This suggests that chain-selection had trouble keeping up. Why that would happen is a different story though - it does seem to only depend on the chain-api subsystem.
-- | --
  |  

@ybstaked
Copy link

@sandreim
Copy link
Contributor

No, this is a runtime only release, the fix for this issue will come in the client.

@ybstaked
Copy link

No, this is a runtime only release, the fix for this issue will come in the client.

Who typically uses the runtime only release? For validators, we should wait for a node client release? When is that intended?

@sandreim
Copy link
Contributor

This runtime will soon be proposed for upgrade. The client fix is currently being developed, the PR doing so will link to this issue.

@sandreim
Copy link
Contributor

sandreim commented Dec 14, 2022

#6440 is a potential fix for the issue. Initial testing looking good (Versi), but we need to bump up the dispute rate more aggressively to be sure.

@eskimor
Copy link
Member

eskimor commented Dec 15, 2022

Fix confirmed.

@eskimor eskimor closed this as completed Dec 15, 2022
@infrachris
Copy link

infrachris commented Dec 20, 2022

I'm unable to determine if what follows directly relates to this GitHub issue, however the errors are similar so I thought it may be beneficial to post here.

I'm getting the 'dropping stream because buffer is full' error repeatedly/constantly on both of my validators immediately after upgrading from v0.9.33 to v0.9.36. "dropping (Stream a5718300/101) because buffer is full"

Additionally, the polkadot service is in an "activating (auto-restart)" state so this leads me to believe both validators are effectively down.

Dec 05 09:58:53 host15 polkadot[79378]: 2022-12-05 09:58:53 dropping (Stream ea849616/1089) because buffer is full
Dec 05 09:58:53 host15 polkadot[79378]: 2022-12-05 09:58:53 dropping (Stream ea849616/1091) because buffer is full
Dec 05 09:58:53 host15 polkadot[79378]: 2022-12-05 09:58:53 dropping (Stream ea849616/1093) because buffer is full
Dec 05 09:58:53 host15 polkadot[79378]: 2022-12-05 09:58:53 dropping (Stream ea849616/1095) because buffer is full
Dec 05 09:58:53 host15 polkadot[79378]: 2022-12-05 09:58:53 dropping (Stream ea849616/1097) because buffer is full
Dec 05 09:58:53 host15 polkadot[79378]: 2022-12-05 09:58:53 dropping (Stream ea849616/1099) because buffer is full
Dec 05 09:58:53 host15 polkadot[79378]: 2022-12-05 09:58:53 dropping (Stream ea849616/1101) because buffer is full
Dec 05 09:58:53 host15 polkadot[79378]: 2022-12-05 09:58:53 dropping (Stream ea849616/1103) because buffer is full
Dec 05 09:58:53 host15 polkadot[79378]: 2022-12-05 09:58:53 dropping (Stream ea849616/1105) because buffer is full
Dec 05 09:58:53 host15 polkadot[79378]: 2022-12-05 09:58:53 dropping (Stream ea849616/1107) because buffer is full
Dec 05 09:58:53 host15 polkadot[79378]: 2022-12-05 09:58:53 dropping (Stream ea849616/1109) because buffer is full
Dec 05 09:58:53 host15 polkadot[79378]: 2022-12-05 09:58:53 dropping (Stream ea849616/1111) because buffer is full
Dec 05 09:58:53 host15 polkadot[79378]: 2022-12-05 09:58:53 dropping (Stream ea849616/1113) because buffer is full
Dec 05 09:58:53 host15 polkadot[79378]: 2022-12-05 09:58:53 dropping (Stream ea849616/1115) because buffer is full
Dec 05 09:58:53 host15 polkadot[79378]: 2022-12-05 09:58:53 dropping (Stream ea849616/1117) because buffer is full
Dec 05 09:58:53 host15 polkadot[79378]: 2022-12-05 09:58:53 dropping (Stream ea849616/1119) because buffer is full
Dec 05 09:58:53 host15 polkadot[79378]: 2022-12-05 09:58:53 dropping (Stream ea849616/1121) because buffer is full
Dec 05 09:58:53 host15 polkadot[79378]: 2022-12-05 09:58:53 dropping (Stream ea849616/1123) because buffer is full
Dec 05 09:58:53 host15 polkadot[79378]: 2022-12-05 09:58:53 dropping (Stream ea849616/1125) because buffer is full
Dec 05 09:58:53 host15 polkadot[79378]: 2022-12-05 09:58:53 dropping (Stream ea849616/1127) because buffer is full
Dec 05 09:58:53 host15 polkadot[79378]: 2022-12-05 09:58:53 dropping (Stream ea849616/1129) because buffer is full
Dec 05 09:58:53 host15 polkadot[79378]: 2022-12-05 09:58:53 dropping (Stream ea849616/1131) because buffer is full
Dec 05 09:58:54 host15 polkadot[79378]: 2022-12-05 09:58:54 ✨ Imported #15628775 (0x5069…b35f)
Dec 05 09:58:57 host15 polkadot[79378]: 2022-12-05 09:58:57 💤 Idle (40 peers), best: #15628775 (0x5069…b35f), finalized #15628771 (0xc87e…58f8), ⬇ 1.7MiB/s ⬆ 2.0MiB/s
Dec 05 09:59:00 host15 polkadot[79378]: 2022-12-05 09:59:00 ✨ Imported #15628776 (0x5512…1946)
Dec 05 09:59:02 host15 polkadot[79378]: 2022-12-05 09:59:02 💤 Idle (40 peers), best: #15628776 (0x5512…1946), finalized #15628773 (0x79ac…b31c), ⬇ 1.4MiB/s ⬆ 1.6MiB/s
Dec 05 09:59:06 host15 polkadot[79378]: 2022-12-05 09:59:06 ✨ Imported #15628777 (0x4acd…bdfc)
Dec 05 09:59:07 host15 polkadot[79378]: 2022-12-05 09:59:07 💤 Idle (40 peers), best: #15628777 (0x4acd…bdfc), finalized #15628773 (0x79ac…b31c), ⬇ 960.5kiB/s ⬆ 1.1MiB/s
Dec 05 09:59:12 host15 polkadot[79378]: 2022-12-05 09:59:12 💤 Idle (40 peers), best: #15628777 (0x4acd…bdfc), finalized #15628774 (0x8e20…2f37), ⬇ 1.0MiB/s ⬆ 1.1MiB/s
Dec 05 09:59:12 host15 polkadot[79378]: 2022-12-05 09:59:12 ✨ Imported #15628778 (0x840b…4865)
Dec 05 09:59:17 host15 polkadot[79378]: 2022-12-05 09:59:17 💤 Idle (40 peers), best: #15628778 (0x840b…4865), finalized #15628774 (0x8e20…2f37), ⬇ 1.1MiB/s ⬆ 1.1MiB/s
Dec 05 09:59:18 host15 polkadot[79378]: 2022-12-05 09:59:18 ✨ Imported #15628779 (0x1007…460e)
Dec 05 09:59:22 host15 polkadot[79378]: 2022-12-05 09:59:22 💤 Idle (40 peers), best: #15628779 (0x1007…460e), finalized #15628775 (0x5069…b35f), ⬇ 1.8MiB/s ⬆ 1.7MiB/s
Dec 05 09:59:24 host15 polkadot[79378]: 2022-12-05 09:59:24 ✨ Imported #15628780 (0x3a52…2acd)
Dec 05 09:59:27 host15 polkadot[79378]: 2022-12-05 09:59:27 💤 Idle (40 peers), best: #15628780 (0x3a52…2acd), finalized #15628777 (0x4acd…bdfc), ⬇ 1.5MiB/s ⬆ 1.8MiB/s
Dec 05 09:59:30 host15 polkadot[79378]: 2022-12-05 09:59:30 ✨ Imported #15628781 (0x2c67…e1d6)
Dec 05 09:59:32 host15 polkadot[79378]: 2022-12-05 09:59:32 💤 Idle (40 peers), best: #15628781 (0x2c67…e1d6), finalized #15628777 (0x4acd…bdfc), ⬇ 1.3MiB/s ⬆ 1.6MiB/s
Dec 05 09:59:36 host15 polkadot[79378]: 2022-12-05 09:59:36 ✨ Imported #15628782 (0x86ab…60a0)
Dec 05 09:59:37 host15 polkadot[79378]: 2022-12-05 09:59:37 💤 Idle (40 peers), best: #15628782 (0x86ab…60a0), finalized #15628778 (0x840b…4865), ⬇ 1.5MiB/s ⬆ 1.2MiB/s
Dec 05 09:59:42 host15 polkadot[79378]: 2022-12-05 09:59:42 💤 Idle (40 peers), best: #15628782 (0x86ab…60a0), finalized #15628778 (0x840b…4865), ⬇ 662.1kiB/s ⬆ 1.1MiB/s
Dec 05 09:59:42 host15 polkadot[79378]: 2022-12-05 09:59:42 ✨ Imported #15628783 (0x135b…9098)
Dec 05 09:59:42 host15 polkadot[79378]: 2022-12-05 09:59:42 ♻️ Reorg on #15628783,0x135b…9098 to #15628783,0x8d4f…3a1d, common ancestor #15628782,0x86ab…60a0
Dec 05 09:59:42 host15 polkadot[79378]: 2022-12-05 09:59:42 ✨ Imported #15628783 (0x8d4f…3a1d)
Dec 05 09:59:47 host15 polkadot[79378]: 2022-12-05 09:59:47 💤 Idle (40 peers), best: #15628783 (0x8d4f…3a1d), finalized #15628779 (0x1007…460e), ⬇ 2.8MiB/s ⬆ 3.3MiB/s
Dec 05 09:59:48 host15 polkadot[79378]: 2022-12-05 09:59:48 ✨ Imported #15628784 (0x2fb8…9486)
Dec 05 09:59:52 host15 polkadot[79378]: 2022-12-05 09:59:52 💤 Idle (40 peers), best: #15628784 (0x2fb8…9486), finalized #15628781 (0x2c67…e1d6), ⬇ 1.4MiB/s ⬆ 1.4MiB/s
Dec 05 09:59:54 host15 polkadot[79378]: 2022-12-05 09:59:54 ✨ Imported #15628785 (0x2cbb…4195)
Dec 05 09:59:54 host15 polkadot[79378]: 2022-12-05 09:59:54 ♻️ Reorg on #15628785,0x2cbb…4195 to #15628785,0xc6e5…15c4, common ancestor #15628784,0x2fb8…9486
Dec 05 09:59:54 host15 polkadot[79378]: 2022-12-05 09:59:54 ✨ Imported #15628785 (0xc6e5…15c4)
Dec 05 09:59:57 host15 polkadot[79378]: 2022-12-05 09:59:57 💤 Idle (40 peers), best: #15628785 (0xc6e5…15c4), finalized #15628781 (0x2c67…e1d6), ⬇ 2.2MiB/s ⬆ 2.2MiB/s
Dec 05 10:00:00 host15 polkadot[79378]: 2022-12-05 10:00:00 ✨ Imported #15628786 (0x53e0…19f9)
Dec 05 10:00:00 host15 polkadot[79378]: 2022-12-05 10:00:00 ♻️ Reorg on #15628786,0x53e0…19f9 to #15628786,0x07ff…4928, common ancestor #15628785,0xc6e5…15c4
Dec 05 10:00:00 host15 polkadot[79378]: 2022-12-05 10:00:00 ✨ Imported #15628786 (0x07ff…4928)
Dec 05 10:00:02 host15 polkadot[79378]: 2022-12-05 10:00:02 💤 Idle (40 peers), best: #15628786 (0x07ff…4928), finalized #15628782 (0x86ab…60a0), ⬇ 2.9MiB/s ⬆ 2.4MiB/s
Dec 05 10:00:06 host15 polkadot[79378]: 2022-12-05 10:00:06 ✨ Imported #15628787 (0x1422…f51c)
Dec 05 10:00:06 host15 polkadot[79378]: 2022-12-05 10:00:06 ♻️ Reorg on #15628787,0x1422…f51c to #15628787,0x348a…8b3e, common ancestor #15628786,0x07ff…4928
Dec 05 10:00:06 host15 polkadot[79378]: 2022-12-05 10:00:06 ✨ Imported #15628787 (0x348a…8b3e)
Dec 05 10:00:07 host15 polkadot[79378]: 2022-12-05 10:00:07 💤 Idle (40 peers), best: #15628787 (0x348a…8b3e), finalized #15628784 (0x2fb8…9486), ⬇ 2.6MiB/s ⬆ 1.9MiB/s
Dec 05 10:00:12 host15 polkadot[79378]: 2022-12-05 10:00:12 💤 Idle (40 peers), best: #15628787 (0x348a…8b3e), finalized #15628784 (0x2fb8…9486), ⬇ 2.5MiB/s ⬆ 2.8MiB/s
Dec 05 10:00:12 host15 polkadot[79378]: 2022-12-05 10:00:12 ✨ Imported #15628788 (0x60d1…6769)
Dec 05 10:00:12 host15 polkadot[79378]: 2022-12-05 10:00:12 ✨ Imported #15628788 (0x4536…e56e)
Dec 05 10:00:17 host15 polkadot[79378]: 2022-12-05 10:00:17 💤 Idle (40 peers), best: #15628788 (0x60d1…6769), finalized #15628785 (0xc6e5…15c4), ⬇ 2.5MiB/s ⬆ 2.0MiB/s
Dec 05 10:00:18 host15 polkadot[79378]: 2022-12-05 10:00:18 ✨ Imported #15628789 (0xd3a8…bd5e)
Dec 05 10:00:19 host15 polkadot[79378]: 2022-12-05 10:00:19 dropping (Stream f9c74d5a/54) because buffer is full
Dec 05 10:00:19 host15 polkadot[79378]: 2022-12-05 10:00:19 dropping (Stream f9c74d5a/56) because buffer is full
Dec 05 10:00:19 host15 polkadot[79378]: 2022-12-05 10:00:19 dropping (Stream f9c74d5a/58) because buffer is full
Dec 05 10:00:19 host15 polkadot[79378]: 2022-12-05 10:00:19 dropping (Stream f9c74d5a/60) because buffer is full
Dec 05 10:00:19 host15 polkadot[79378]: 2022-12-05 10:00:19 dropping (Stream f9c74d5a/62) because buffer is full
Dec 05 10:00:19 host15 polkadot[79378]: 2022-12-05 10:00:19 dropping (Stream f9c74d5a/64) because buffer is full
Dec 05 10:00:19 host15 polkadot[79378]: 2022-12-05 10:00:19 dropping (Stream f9c74d5a/66) because buffer is full
Dec 05 10:00:19 host15 polkadot[79378]: 2022-12-05 10:00:19 dropping (Stream f9c74d5a/68) because buffer is full
Dec 05 10:00:19 host15 polkadot[79378]: 2022-12-05 10:00:19 dropping (Stream f9c74d5a/70) because buffer is full
Dec 05 10:00:19 host15 polkadot[79378]: 2022-12-05 10:00:19 dropping (Stream f9c74d5a/72) because buffer is full
Dec 05 10:00:19 host15 polkadot[79378]: 2022-12-05 10:00:19 dropping (Stream f9c74d5a/74) because buffer is full
Dec 05 10:00:19 host15 polkadot[79378]: 2022-12-05 10:00:19 dropping (Stream f9c74d5a/76) because buffer is full
Dec 05 10:00:19 host15 polkadot[79378]: 2022-12-05 10:00:19 dropping (Stream f9c74d5a/78) because buffer is full
Dec 05 10:00:19 host15 polkadot[79378]: 2022-12-05 10:00:19 dropping (Stream f9c74d5a/80) because buffer is full
Dec 05 10:00:19 host15 polkadot[79378]: 2022-12-05 10:00:19 dropping (Stream f9c74d5a/82) because buffer is full
Dec 05 10:00:19 host15 polkadot[79378]: 2022-12-05 10:00:19 dropping (Stream f9c74d5a/84) because buffer is full
Dec 05 10:00:19 host15 polkadot[79378]: 2022-12-05 10:00:19 dropping (Stream f9c74d5a/86) because buffer is full
Dec 05 10:00:19 host15 polkadot[79378]: 2022-12-05 10:00:19 dropping (Stream f9c74d5a/88) because buffer is full
Dec 05 10:00:19 host15 polkadot[79378]: 2022-12-05 10:00:19 dropping (Stream f9c74d5a/90) because buffer is full
Dec 05 10:00:19 host15 polkadot[79378]: 2022-12-05 10:00:19 dropping (Stream f9c74d5a/92) because buffer is full
Dec 05 10:00:19 host15 polkadot[79378]: 2022-12-05 10:00:19 dropping (Stream f9c74d5a/94) because buffer is full
Dec 05 10:00:19 host15 polkadot[79378]: 2022-12-05 10:00:19 dropping (Stream f9c74d5a/96) because buffer is full
Dec 05 10:00:19 host15 polkadot[79378]: 2022-12-05 10:00:19 dropping (Stream f9c74d5a/98) because buffer is full
Dec 05 10:00:19 host15 polkadot[79378]: 2022-12-05 10:00:19 dropping (Stream f9c74d5a/100) because buffer is full
Dec 05 10:00:19 host15 polkadot[79378]: 2022-12-05 10:00:19 dropping (Stream f9c74d5a/102) because buffer is full
Dec 05 10:00:19 host15 polkadot[79378]: 2022-12-05 10:00:19 dropping (Stream f9c74d5a/104) because buffer is full
Dec 05 10:00:19 host15 polkadot[79378]: 2022-12-05 10:00:19 dropping (Stream f9c74d5a/106) because buffer is full
Dec 05 10:00:19 host15 polkadot[79378]: 2022-12-05 10:00:19 dropping (Stream f9c74d5a/108) because buffer is full
Dec 05 10:00:19 host15 polkadot[79378]: 2022-12-05 10:00:19 dropping (Stream f9c74d5a/110) because buffer is full
Dec 05 10:00:19 host15 polkadot[79378]: 2022-12-05 10:00:19 dropping (Stream f9c74d5a/112) because buffer is full
Dec 05 10:00:19 host15 polkadot[79378]: 2022-12-05 10:00:19 dropping (Stream f9c74d5a/114) because buffer is full
Dec 05 10:00:19 host15 polkadot[79378]: 2022-12-05 10:00:19 dropping (Stream f9c74d5a/116) because buffer is full

@infrachris
Copy link

I was able to get around this issue. While stopping and restarting the polkadot service had no effect on the issue on either node, rebooting the server itself appears to have resolved the issue on both of my nodes which is quiet interesting. It's as if something is left in memory after the polkadot service is stopped.

@tunkas
Copy link

tunkas commented Dec 20, 2022

Dec 05 09:58:53 host15 polkadot[79378]: 2022-12-05 09:58:53 dropping (Stream ea849616/1089) because buffer is full

Why is your date stamp on these messages December 5th?

A

@infrachris
Copy link

infrachris commented Dec 20, 2022

Dec 05 09:58:53 host15 polkadot[79378]: 2022-12-05 09:58:53 dropping (Stream ea849616/1089) because buffer is full

Why is your date stamp on these messages December 5th?

A

Thanks for the call out, I don't know how I failed to notice that. I recall the time was incrementing and matching the current time each time I would run 'sudo journalctl -u polkadot.service -b --no-pager' so I believe my log output was current/real-time. Perhaps the date was incorrect on the server itself and the restart fixed that. I will look into this more, thank you.

@Polkadot-Forum
Copy link

This issue has been mentioned on Polkadot Forum. There might be relevant details there:

https://forum.polkadot.network/t/polkadot-release-analysis-v0-9-36/1529/1

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.