Lotus' api ChainNotify sometimes keeps blocking #6883
We often hit the same problem when batching and aggregating messages. |
@freek99 Check the Lotus log. Is there anything like "head change sub is slow, has %d buffered entries"? |
workerlog:
minerlog:
|
@Stebalien |
@freek99 a stack trace when this is happening would be really helpful. The (hidden) |
@Stebalien (The complete output file is fairly large, more than 50 MB. If you need it, how can I get it to you?)
|
Please run both and post the full logs. That single stack trace shows me that your call to chain notify isn't getting any results but doesn't tell me where it's stuck on the other side. |
(sorry, wrong button) |
{"level":"warn","ts":"2021-07-31T15:02:20.464+0800","logger":"miner","caller":"miner/miner.go:507","msg":"completed mineOne","tookMilliseconds":27,"forRound":980285,"baseEpoch":980284,"baseDeltaSeconds":6
,"nullRounds":0,"lateStart":true,"beaconEpoch":1076129,"lookbackEpochs":900,"networkPowerAtLookback":"9575549025580548096","minerPowerAtLookback":"82944408420352","isEligible":true,"isWinner":false,"erro
r":null}
{"level":"warn","ts":"2021-07-31T15:02:50.235+0800","logger":"miner","caller":"miner/miner.go:507","msg":"completed mineOne","tookMilliseconds":6,"forRound":980286,"baseEpoch":980285,"baseDeltaSeconds":6
,"nullRounds":0,"lateStart":true,"beaconEpoch":1076130,"lookbackEpochs":900,"networkPowerAtLookback":"9575549025580548096","minerPowerAtLookback":"82944408420352","isEligible":true,"isWinner":false,"error
":null}
{"level":"warn","ts":"2021-07-31T15:03:20.105+0800","logger":"miner","caller":"miner/miner.go:507","msg":"completed mineOne","tookMilliseconds":5,"forRound":980287,"baseEpoch":980286,"baseDeltaSeconds":6
,"nullRounds":0,"lateStart":true,"beaconEpoch":1076131,"lookbackEpochs":900,"networkPowerAtLookback":"9575606578142314496","minerPowerAtLookback":"82944408420352","isEligible":true,"isWinner":false,"error
":null}
{"level":"error","ts":"2021-07-31T15:03:55.971+0800","logger":"miner","caller":"miner/miner.go:256","msg":"failed to get best mining candidate: handler: websocket connection closed"}
{"level":"warn","ts":"2021-07-31T15:03:55.971+0800","logger":"storageminer","caller":"storage/wdpost_sched.go:113","msg":"window post scheduler notifs channel closed"}
{"level":"warn","ts":"2021-07-31T15:03:55.971+0800","logger":"events","caller":"events/events.go:106","msg":"listenHeadChanges quit"}
{"level":"warn","ts":"2021-07-31T15:03:55.971+0800","logger":"events","caller":"events/events.go:106","msg":"listenHeadChanges quit"}
{"level":"error","ts":"2021-07-31T15:03:55.972+0800","logger":"storageminer","caller":"storage/wdpost_sched.go:101","msg":"ChainNotify error: handler: websocket connection closed"}
{"level":"info","ts":"2021-07-31T15:03:56.972+0800","logger":"events","caller":"events/events.go:115","msg":"restarting listenHeadChanges"}
{"level":"error","ts":"2021-07-31T15:03:56.973+0800","logger":"events","caller":"events/events.go:104","msg":"listen head changes errored: listenHeadChanges ChainNotify call failed: handler: websocket con
nection closed"}
{"level":"info","ts":"2021-07-31T15:03:56.972+0800","logger":"events","caller":"events/events.go:115","msg":"restarting listenHeadChanges"}
{"level":"error","ts":"2021-07-31T15:03:56.973+0800","logger":"events","caller":"events/events.go:104","msg":"listen head changes errored: listenHeadChanges ChainNotify call failed: handler: websocket con
nection closed"}
{"level":"info","ts":"2021-07-31T15:03:57.973+0800","logger":"events","caller":"events/events.go:115","msg":"restarting listenHeadChanges"}
{"level":"info","ts":"2021-07-31T15:03:57.973+0800","logger":"events","caller":"events/events.go:115","msg":"restarting listenHeadChanges"}
{"level":"warn","ts":"2021-07-31T15:04:20.065+0800","logger":"miner","caller":"miner/miner.go:507","msg":"completed mineOne","tookMilliseconds":5,"forRound":980289,"baseEpoch":980288,"baseDeltaSeconds":6
,"nullRounds":0,"lateStart":true,"beaconEpoch":1076133,"lookbackEpochs":900,"networkPowerAtLookback":"9575766694523109376","minerPowerAtLookback":"82944408420352","isEligible":true,"isWinner":false,"error
":null} |
I still need the pprof profiles taken when this is happening.
|
I want to try resetting the Lotus datastore to solve this problem. I know this may be wishful thinking.
root@miner:/media/nvme# lotus chain export --skip-old-msgs --recent-stateroots 900 chain.car
2021-08-01T13:12:23.568+0800 ERROR rpc go-jsonrpc@v0.1.4-0.20210217175800-45ea43ac2bec/websocket.go:498 sending ping message: write tcp 10.10.20.109:49080->10.10.20.109:1234: use of closed network connection
2021-08-01T13:12:23.568+0800 ERROR rpc go-jsonrpc@v0.1.4-0.20210217175800-45ea43ac2bec/websocket.go:667 Connection timeout {"remote": "10.10.20.109:1234"}
2021-08-01T13:12:23.568+0800 WARN rpc go-jsonrpc@v0.1.4-0.20210217175800-45ea43ac2bec/websocket.go:678 failed to write close message: write tcp 10.10.20.109:49080->10.10.20.109:1234: use of closed network connection
2021-08-01T13:12:23.568+0800 WARN rpc go-jsonrpc@v0.1.4-0.20210217175800-45ea43ac2bec/websocket.go:681 websocket close error {"error": "close tcp 10.10.20.109:49080->10.10.20.109:1234: use of closed network connection"}
ERROR: incomplete export (remote connection lost?)
{"level":"warn","ts":"2021-08-01T13:11:20.053+0800","logger":"miner","caller":"miner/miner.go:507","msg":"completed mineOne","tookMilliseconds":6,"forRound":982943,"baseEpoch":982942,"baseDeltaSeconds":6,"nullRounds":0,"lateStart":true,"beaconEpoch":1078787,"lookbackEpochs":900,"networkPowerAtLookback":"9640644729113444352","minerPowerAtLookback":"82944408420352","isEligible":true,"isWinner":false,"error":null}
{"level":"warn","ts":"2021-08-01T13:11:50.234+0800","logger":"miner","caller":"miner/miner.go:507","msg":"completed mineOne","tookMilliseconds":6,"forRound":982944,"baseEpoch":982943,"baseDeltaSeconds":6,"nullRounds":0,"lateStart":true,"beaconEpoch":1078788,"lookbackEpochs":900,"networkPowerAtLookback":"9640644729113444352","minerPowerAtLookback":"82944408420352","isEligible":true,"isWinner":false,"error":null}
{"level":"warn","ts":"2021-08-01T13:12:27.568+0800","logger":"storageminer","caller":"storage/wdpost_sched.go:113","msg":"window post scheduler notifs channel closed"}
{"level":"error","ts":"2021-08-01T13:12:27.568+0800","logger":"storageminer","caller":"storage/wdpost_sched.go:101","msg":"ChainNotify error: handler: websocket connection closed"}
{"level":"warn","ts":"2021-08-01T13:12:27.568+0800","logger":"events","caller":"events/events.go:106","msg":"listenHeadChanges quit"}
{"level":"warn","ts":"2021-08-01T13:12:27.568+0800","logger":"events","caller":"events/events.go:106","msg":"listenHeadChanges quit"}
{"level":"error","ts":"2021-08-01T13:12:27.568+0800","logger":"miner","caller":"miner/miner.go:256","msg":"failed to get best mining candidate: handler: websocket connection closed"}
{"level":"info","ts":"2021-08-01T13:12:28.569+0800","logger":"events","caller":"events/events.go:115","msg":"restarting listenHeadChanges"}
{"level":"error","ts":"2021-08-01T13:12:28.569+0800","logger":"events","caller":"events/events.go:104","msg":"listen head changes errored: listenHeadChanges ChainNotify call failed: handler: websocket con
nection closed"}
{"level":"info","ts":"2021-08-01T13:12:28.569+0800","logger":"events","caller":"events/events.go:115","msg":"restarting listenHeadChanges"}
{"level":"error","ts":"2021-08-01T13:12:28.569+0800","logger":"events","caller":"events/events.go:104","msg":"listen head changes errored: listenHeadChanges ChainNotify call failed: handler: websocket con
nection closed"}
{"level":"info","ts":"2021-08-01T13:12:29.569+0800","logger":"events","caller":"events/events.go:115","msg":"restarting listenHeadChanges"}
{"level":"info","ts":"2021-08-01T13:12:29.569+0800","logger":"events","caller":"events/events.go:115","msg":"restarting listenHeadChanges"}
{"level":"error","ts":"2021-08-01T13:12:29.570+0800","logger":"events","caller":"events/events.go:104","msg":"listen head changes errored: listenHeadChanges ChainNotify call failed: handler: websocket con
nection closed"}
{"level":"error","ts":"2021-08-01T13:12:29.570+0800","logger":"events","caller":"events/events.go:104","msg":"listen head changes errored: listenHeadChanges ChainNotify call failed: handler: websocket con
nection closed"}
{"level":"info","ts":"2021-08-01T13:12:30.570+0800","logger":"events","caller":"events/events.go:115","msg":"restarting listenHeadChanges"}
{"level":"error","ts":"2021-08-01T13:12:30.570+0800","logger":"events","caller":"events/events.go:104","msg":"listen head changes errored: listenHeadChanges ChainNotify call failed: handler: websocket con
nection closed"}
{"level":"info","ts":"2021-08-01T13:12:30.570+0800","logger":"events","caller":"events/events.go:115","msg":"restarting listenHeadChanges"}
{"level":"error","ts":"2021-08-01T13:12:30.570+0800","logger":"events","caller":"events/events.go:104","msg":"listen head changes errored: listenHeadChanges ChainNotify call failed: handler: websocket con
nection closed"}
{"level":"info","ts":"2021-08-01T13:12:31.571+0800","logger":"events","caller":"events/events.go:115","msg":"restarting listenHeadChanges"}
{"level":"error","ts":"2021-08-01T13:12:31.571+0800","logger":"events","caller":"events/events.go:104","msg":"listen head changes errored: listenHeadChanges ChainNotify call failed: handler: websocket con
nection closed"}
{"level":"info","ts":"2021-08-01T13:12:31.571+0800","logger":"events","caller":"events/events.go:115","msg":"restarting listenHeadChanges"}
{"level":"error","ts":"2021-08-01T13:12:31.571+0800","logger":"events","caller":"events/events.go:104","msg":"listen head changes errored: listenHeadChanges ChainNotify call failed: handler: websocket con
nection closed"}
{"level":"error","ts":"2021-08-01T13:12:32.569+0800","logger":"miner","caller":"miner/miner.go:256","msg":"failed to get best mining candidate: handler: websocket connection closed"}
{"level":"info","ts":"2021-08-01T13:12:32.572+0800","logger":"events","caller":"events/events.go:115","msg":"restarting listenHeadChanges"}
{"level":"error","ts":"2021-08-01T13:12:32.572+0800","logger":"events","caller":"events/events.go:104","msg":"listen head changes errored: listenHeadChanges ChainNotify call failed: handler: websocket con
nection closed"}
{"level":"info","ts":"2021-08-01T13:12:32.572+0800","logger":"events","caller":"events/events.go:115","msg":"restarting listenHeadChanges"}
{"level":"error","ts":"2021-08-01T13:12:32.572+0800","logger":"events","caller":"events/events.go:104","msg":"listen head changes errored: listenHeadChanges ChainNotify call failed: handler: websocket con
nection closed"}
{"level":"info","ts":"2021-08-01T13:12:33.572+0800","logger":"events","caller":"events/events.go:115","msg":"restarting listenHeadChanges"}
{"level":"error","ts":"2021-08-01T13:12:33.572+0800","logger":"events","caller":"events/events.go:104","msg":"listen head changes errored: listenHeadChanges ChainNotify call failed: handler: websocket con
nection closed"}
{"level":"info","ts":"2021-08-01T13:12:33.572+0800","logger":"events","caller":"events/events.go:115","msg":"restarting listenHeadChanges"}
{"level":"error","ts":"2021-08-01T13:12:33.572+0800","logger":"events","caller":"events/events.go:104","msg":"listen head changes errored: listenHeadChanges ChainNotify call failed: handler: websocket con
nection closed"}
{"level":"info","ts":"2021-08-01T13:12:34.573+0800","logger":"events","caller":"events/events.go:115","msg":"restarting listenHeadChanges"}
{"level":"error","ts":"2021-08-01T13:12:34.573+0800","logger":"events","caller":"events/events.go:104","msg":"listen head changes errored: listenHeadChanges ChainNotify call failed: handler: websocket con
nection closed"}
{"level":"info","ts":"2021-08-01T13:12:34.573+0800","logger":"events","caller":"events/events.go:115","msg":"restarting listenHeadChanges"}
{"level":"error","ts":"2021-08-01T13:12:34.573+0800","logger":"events","caller":"events/events.go:104","msg":"listen head changes errored: listenHeadChanges ChainNotify call failed: handler: websocket con
nection closed"}
{"level":"info","ts":"2021-08-01T13:12:35.573+0800","logger":"events","caller":"events/events.go:115","msg":"restarting listenHeadChanges"}
{"level":"error","ts":"2021-08-01T13:12:35.573+0800","logger":"events","caller":"events/events.go:104","msg":"listen head changes errored: listenHeadChanges ChainNotify call failed: handler: websocket con
nection closed"}
{"level":"info","ts":"2021-08-01T13:12:35.573+0800","logger":"events","caller":"events/events.go:115","msg":"restarting listenHeadChanges"}
{"level":"error","ts":"2021-08-01T13:12:35.573+0800","logger":"events","caller":"events/events.go:104","msg":"listen head changes errored: listenHeadChanges ChainNotify call failed: handler: websocket con
nection closed"}
{"level":"info","ts":"2021-08-01T13:12:36.574+0800","logger":"events","caller":"events/events.go:115","msg":"restarting listenHeadChanges"}
{"level":"info","ts":"2021-08-01T13:12:36.574+0800","logger":"events","caller":"events/events.go:115","msg":"restarting listenHeadChanges"}
{"level":"error","ts":"2021-08-01T13:12:36.574+0800","logger":"events","caller":"events/events.go:104","msg":"listen head changes errored: listenHeadChanges ChainNotify call failed: handler: websocket con
nection closed"}
{"level":"error","ts":"2021-08-01T13:12:36.574+0800","logger":"events","caller":"events/events.go:104","msg":"listen head changes errored: listenHeadChanges ChainNotify call failed: handler: websocket con
nection closed"}
{"level":"info","ts":"2021-08-01T13:12:37.574+0800","logger":"events","caller":"events/events.go:115","msg":"restarting listenHeadChanges"}
{"level":"info","ts":"2021-08-01T13:12:37.574+0800","logger":"events","caller":"events/events.go:115","msg":"restarting listenHeadChanges"}
{"level":"warn","ts":"2021-08-01T13:12:37.576+0800","logger":"events","caller":"events/events.go:151","msg":"tsc.add: adding current tipset failed: tipSetCache.add: expected new tipset height to be at lea
st 982944, was 982943"}
{"level":"warn","ts":"2021-08-01T13:12:37.577+0800","logger":"events","caller":"events/events.go:151","msg":"tsc.add: adding current tipset failed: tipSetCache.add: expected new tipset height to be at lea
st 982944, was 982943"}
{"level":"warn","ts":"2021-08-01T13:12:50.703+0800","logger":"miner","caller":"miner/miner.go:507","msg":"completed mineOne","tookMilliseconds":19,"forRound":982946,"baseEpoch":982945,"baseDeltaSeconds":6
,"nullRounds":0,"lateStart":true,"beaconEpoch":1078790,"lookbackEpochs":900,"networkPowerAtLookback":"9640694000978264064","minerPowerAtLookback":"82944408420352","isEligible":true,"isWinner":false,"erro
r":null} |
We are waiting to collect goroutine information the next time the problem occurs.
Many RPC API calls here take a long time.
|
Fixed in #7000. That won't fix the slow subscribers, but it will prevent them from blocking everything else. |
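To illustrate the general idea of keeping a slow subscriber from blocking everyone else, here is a minimal sketch in Go. It is not the change made in #7000 and not Lotus code; Broadcaster, HeadChange, Subscribe and Publish are hypothetical names. The assumption is simply that each subscriber gets its own buffered channel and that the publisher never waits on a full one.

```go
package main

import (
	"fmt"
	"sync"
)

// HeadChange is a hypothetical stand-in for a chain head notification.
type HeadChange struct {
	Height int64
}

// Broadcaster fans notifications out to subscribers without ever
// blocking on any single one of them.
type Broadcaster struct {
	mu   sync.Mutex
	subs map[int]chan HeadChange
	next int
}

func NewBroadcaster() *Broadcaster {
	return &Broadcaster{subs: make(map[int]chan HeadChange)}
}

// Subscribe registers a subscriber with its own buffer of the given size.
func (b *Broadcaster) Subscribe(buffer int) (int, <-chan HeadChange) {
	b.mu.Lock()
	defer b.mu.Unlock()
	id := b.next
	b.next++
	ch := make(chan HeadChange, buffer)
	b.subs[id] = ch
	return id, ch
}

// Publish delivers hc to every subscriber whose buffer has room.
// A subscriber whose buffer is full is closed and removed instead of
// being waited on. The ids of removed subscribers are returned.
func (b *Broadcaster) Publish(hc HeadChange) []int {
	b.mu.Lock()
	defer b.mu.Unlock()
	var dropped []int
	for id, ch := range b.subs {
		select {
		case ch <- hc:
		default:
			close(ch)
			delete(b.subs, id)
			dropped = append(dropped, id)
		}
	}
	return dropped
}

func main() {
	b := NewBroadcaster()
	_, fast := b.Subscribe(8)
	slowID, _ := b.Subscribe(1) // never drained in this demo

	for h := int64(1); h <= 3; h++ {
		if dropped := b.Publish(HeadChange{Height: h}); len(dropped) > 0 {
			fmt.Println("dropped subscribers:", dropped, "slow id:", slowID)
		}
	}
	fmt.Println("fast subscriber has", len(fast), "queued notifications")
}
```

Dropping a subscriber is one possible policy; as the comment above notes, isolating slow subscribers does not make them any faster, so a dropped client still has to resubscribe and catch up.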
I am seeing the same issue. Why does it happen, and how can I solve it? The network is fine, with no dropped packets. |
@psweiweimao no, that looks unrelated. |
Could you tell me why it happened? Thanks. |
@psweiweimao I have no idea, please ask on Slack. |
I have the same problem. Lotus version: 1.10.2. How can I solve it? |
Hey! Just wanted to update everybody in this thread that this issue will be tracked in #8362 going forward. It's an issue that is on our radar, and one that we really want to find a fix for, but it's also one where we would certainly need additional help from everybody.
|
Lotus' ChainNotify API is called in WindowPoStScheduler.Run. Sometimes the call keeps blocking, which prevents WindowPoSt from running normally.
I added some logging to track whether ChainNotify is blocked:
Normally it prints:
But sometimes it just prints:
When this happens, WindowPoSt cannot work normally.
The following is the log information:
System info: