
lotus daemon stops synchronizing after running for a while [v1.25.0] #11489

Closed
5 of 11 tasks
Brian44913 opened this issue Dec 6, 2023 · 3 comments
Labels
kind/bug Kind: Bug

Comments

@Brian44913

Checklist

  • This is not a security-related bug/issue. If it is, please follow the security policy.
  • I have searched on the issue tracker and the lotus forum, and there is no existing related issue or discussion.
  • I am running the latest release, the most recent RC (release candidate) for the upcoming release, or the dev branch (master), or have an issue updating to any of these.
  • I did not make any code changes to lotus.

Lotus component

  • lotus daemon - chain sync
  • lotus fvm/fevm - Lotus FVM and FEVM interactions
  • lotus miner/worker - sealing
  • lotus miner - proving (WindowPoSt/WinningPoSt)
  • lotus JSON-RPC API
  • lotus message management (mpool)
  • Other

Lotus Version

lotus version 1.25.0+mainnet+git.7e6b20cd9

Repro Steps

  1. Run 'lotus daemon'

Describe the Bug

When synchronization stopped, the warning logs below were recorded. Sync remained interrupted for more than 30 minutes and did not recover until the lotus daemon process was restarted.

Logging Information

{"level":"warn","ts":"2023-12-06T13:25:27.226+0800","logger":"sub","caller":"sub/incoming.go:102","msg":"received block with large delay from miner","block":"bafy2bzacebojci5qkons34sbubjzmafe3rowfwizfgrnldu744xkhvuyg63jg","delay":27,"miner":"f034566"}
{"level":"info","ts":"2023-12-06T13:25:30.649+0800","logger":"consensus-common","caller":"consensus/compute_state.go:306","msg":"ApplyBlocks stats","early":0.00000241,"earlyCronGas":0,"vmMsg":0.331308156,"msgGas":12257130321,"vmCron":0.869174182,"cronGas":48702510243,"vmFlush":0.055473393,"epoch":"3451130","tsk":"{bafy2bzacec7se66uofey4tpbm2gwcvzhocgydrfe3ewk5rmtvqijqltwggxum,bafy2bzacebojci5qkons34sbubjzmafe3rowfwizfgrnldu744xkhvuyg63jg,bafy2bzacedl7ipn56ucxlaqbmpauzx5lfuhrgmzm5a5urxs6qik5mrjvroqss,bafy2bzacebz7j6rh4xbd5n527dj2wtqyjggn2bf65rh4nuz3f4zksz7ov4uho}"}
{"level":"warn","ts":"2023-12-06T13:25:30.680+0800","logger":"chain","caller":"chain/sync.go:852","msg":"(fork detected) synced header chain ([bafy2bzacedeciss3nrmtzqs5heopydx3p72djv4xbaqo25fckppaphlfsgpts] - 3451131) does not link to our best block ([bafy2bzacec7se66uofey4tpbm2gwcvzhocgydrfe3ewk5rmtvqijqltwggxum bafy2bzacebojci5qkons34sbubjzmafe3rowfwizfgrnldu744xkhvuyg63jg bafy2bzacedl7ipn56ucxlaqbmpauzx5lfuhrgmzm5a5urxs6qik5mrjvroqss bafy2bzacebz7j6rh4xbd5n527dj2wtqyjggn2bf65rh4nuz3f4zksz7ov4uho] - 3451130)"}
{"level":"warn","ts":"2023-12-06T13:25:33.166+0800","logger":"chain","caller":"chain/sync.go:852","msg":"(fork detected) synced header chain ([bafy2bzacecj3au5qpnmgvq4g6lobxxc4rwhrjitzk4smcd3zducfirpmeckj4] - 3451131) does not link to our best block ([bafy2bzacedeciss3nrmtzqs5heopydx3p72djv4xbaqo25fckppaphlfsgpts] - 3451131)"}
{"level":"info","ts":"2023-12-06T13:25:34.103+0800","logger":"consensus-common","caller":"consensus/compute_state.go:306","msg":"ApplyBlocks stats","early":0.00000244,"earlyCronGas":0,"vmMsg":0.276702354,"msgGas":8334757503,"vmCron":0.666357752,"cronGas":39165944554,"vmFlush":0.069529157,"epoch":"3451131","tsk":"{bafy2bzacedeciss3nrmtzqs5heopydx3p72djv4xbaqo25fckppaphlfsgpts}"}
{"level":"warn","ts":"2023-12-06T13:25:37.749+0800","logger":"sub","caller":"sub/incoming.go:102","msg":"received block with large delay from miner","block":"bafy2bzacea6tmqwbnp5qpuhilpr53oq5vtkejgrifvzt2cjnpk6wrsqtpkivw","delay":7,"miner":"f02232007"}
{"level":"warn","ts":"2023-12-06T13:25:38.154+0800","logger":"sub","caller":"sub/incoming.go:102","msg":"received block with large delay from miner","block":"bafy2bzaceaimhjc7qiym3nz3ku7pdeeshn7fboqye5hu5cntpg2zga2bzhism","delay":8,"miner":"f01945216"}
{"level":"warn","ts":"2023-12-06T13:25:48.478+0800","logger":"sub","caller":"sub/incoming.go:102","msg":"received block with large delay from miner","block":"bafy2bzacebrkem63cqxy7etxrvx4kmbpg2i3okl2cju6quhzzylggpsbgmjws","delay":18,"miner":"f01969339"}
{"level":"info","ts":"2023-12-06T13:25:58.251+0800","logger":"canonical-log","caller":"swarm/swarm_listen.go:136","msg":"CANONICAL_PEER_STATUS: peer=12D3KooWDiwG7q86vsDVQJKg5vRf58hE4Eb7UjAigodpgfKfsGeq addr=/ip4/113.90.157.85/tcp/16188 sample_rate=100 connection_status=\"established\" dir=\"inbound\""}
{"level":"info","ts":"2023-12-06T13:26:03.971+0800","logger":"canonical-log","caller":"swarm/swarm_listen.go:136","msg":"CANONICAL_PEER_STATUS: peer=12D3KooWLcSxkP5RCXgDu7Rv2jnFKNkAz8VnR27Ct2imWt9gyrNj addr=/ip4/65.49.28.210/tcp/28025 sample_rate=100 connection_status=\"established\" dir=\"inbound\""}
{"level":"warn","ts":"2023-12-06T13:26:05.599+0800","logger":"hello","caller":"hello/hello.go:108","msg":"other peer hasnt completed libp2p identify, waiting a bit"}
{"level":"info","ts":"2023-12-06T13:26:34.502+0800","logger":"net/identify","caller":"identify/id.go:427","msg":"failed negotiate identify protocol with peer","peer":"12D3KooWSohkEsPDsEwnnKLJWPRoNMNRM39NxUehNDivC6nmwgas","error":"stream reset"}
{"level":"warn","ts":"2023-12-06T13:26:34.502+0800","logger":"net/identify","caller":"identify/id.go:399","msg":"failed to identify 12D3KooWSohkEsPDsEwnnKLJWPRoNMNRM39NxUehNDivC6nmwgas: stream reset"}
{"level":"info","ts":"2023-12-06T13:26:45.269+0800","logger":"net/identify","caller":"identify/id.go:427","msg":"failed negotiate identify protocol with peer","peer":"12D3KooWNeqRA65xKW9zpsghUfwTNbHc6uXYXxUXqR2ezQ6VBcDs","error":"stream reset"}
{"level":"warn","ts":"2023-12-06T13:26:45.269+0800","logger":"net/identify","caller":"identify/id.go:399","msg":"failed to identify 12D3KooWNeqRA65xKW9zpsghUfwTNbHc6uXYXxUXqR2ezQ6VBcDs: stream reset"}
{"level":"info","ts":"2023-12-06T13:27:00.228+0800","logger":"canonical-log","caller":"swarm/swarm_listen.go:136","msg":"CANONICAL_PEER_STATUS: peer=12D3KooWBtMDTadZVzicLpmWPoD7FSgJbcwsCrw1r9fWPYpvyQMY addr=/ip4/162.219.38.92/tcp/35457 sample_rate=100 connection_status=\"established\" dir=\"inbound\""}
{"level":"info","ts":"2023-12-06T13:27:54.487+0800","logger":"canonical-log","caller":"swarm/swarm_listen.go:136","msg":"CANONICAL_PEER_STATUS: peer=12D3KooWBsxoo9dZoBoe6SLmV3HYFKebMucBLJ4TJjRxsoqE9dJf addr=/ip4/103.9.208.55/tcp/24001 sample_rate=100 connection_status=\"established\" dir=\"inbound\""}
{"level":"warn","ts":"2023-12-06T13:28:03.129+0800","logger":"hello","caller":"hello/hello.go:108","msg":"other peer hasnt completed libp2p identify, waiting a bit"}
{"level":"warn","ts":"2023-12-06T13:28:03.992+0800","logger":"hello","caller":"hello/hello.go:108","msg":"other peer hasnt completed libp2p identify, waiting a bit"}
{"level":"warn","ts":"2023-12-06T13:28:11.591+0800","logger":"hello","caller":"hello/hello.go:108","msg":"other peer hasnt completed libp2p identify, waiting a bit"}
{"level":"info","ts":"2023-12-06T13:28:56.059+0800","logger":"net/identify","caller":"identify/id.go:427","msg":"failed negotiate identify protocol with peer","peer":"12D3KooWH13ekG2PsnW6iC9kCVniWm6HUkbNcefqgfADxyo6y43F","error":"stream reset"}
{"level":"warn","ts":"2023-12-06T13:28:56.059+0800","logger":"net/identify","caller":"identify/id.go:399","msg":"failed to identify 12D3KooWH13ekG2PsnW6iC9kCVniWm6HUkbNcefqgfADxyo6y43F: stream reset"}
{"level":"info","ts":"2023-12-06T13:29:30.380+0800","logger":"canonical-log","caller":"swarm/swarm_listen.go:136","msg":"CANONICAL_PEER_STATUS: peer=12D3KooWMUqRR1vfGLdiDwiJxhobk63wKCkXej2rSxLLRD421gFg addr=/ip4/18.166.69.239/tcp/34169 sample_rate=100 connection_status=\"established\" dir=\"inbound\""}
{"level":"info","ts":"2023-12-06T13:30:01.022+0800","logger":"bitswap","caller":"bitswap/bitswap.go:169","msg":"Bitswap Client ReceiveError: Application error 0x0 (remote)"}
{"level":"info","ts":"2023-12-06T13:30:03.089+0800","logger":"net/identify","caller":"identify/id.go:427","msg":"failed negotiate identify protocol with peer","peer":"12D3KooWFN7c6VAtzsEuw2r4cjhpU6AhWQVVBH6HAZUieXZohHy2","error":"stream reset"}
{"level":"warn","ts":"2023-12-06T13:30:03.089+0800","logger":"net/identify","caller":"identify/id.go:399","msg":"failed to identify 12D3KooWFN7c6VAtzsEuw2r4cjhpU6AhWQVVBH6HAZUieXZohHy2: stream reset"}
{"level":"info","ts":"2023-12-06T13:30:30.095+0800","logger":"bs:sess","caller":"session/session.go:460","msg":"No peers - broadcasting","session":22,"want-count":1}
{"level":"info","ts":"2023-12-06T13:31:00.196+0800","logger":"bitswap_network","caller":"network/ipfs_impl.go:187","msg":"send message to 12D3KooWLu7iCqdFVauMmbNPDNM4yKqKgEwUTyechNgyRZMYkZW6 failed but context was not Done: identify failed to complete: context deadline exceeded"}
{"level":"info","ts":"2023-12-06T13:31:00.196+0800","logger":"bitswap_network","caller":"network/ipfs_impl.go:187","msg":"send message to 12D3KooWRWLCiuiDbKqQy4eX3qg2veqqtunbxzbCYTrP7po3jvFS failed but context was not Done: identify failed to complete: context deadline exceeded"}
{"level":"info","ts":"2023-12-06T13:31:00.196+0800","logger":"bitswap_network","caller":"network/ipfs_impl.go:187","msg":"send message to 12D3KooWSH6g27vijVrpzbzQRSysdstJsVtUbxrurDbQnvPPf3iu failed but context was not Done: identify failed to complete: context deadline exceeded"}
{"level":"info","ts":"2023-12-06T13:31:00.196+0800","logger":"bitswap_network","caller":"network/ipfs_impl.go:187","msg":"send message to 12D3KooWRT93YwjqzuXvKveAPxtb4y9TwfuTMmDotPRDioUGkgMx failed but context was not Done: identify failed to complete: context deadline exceeded"}
{"level":"info","ts":"2023-12-06T13:31:00.196+0800","logger":"bitswap_network","caller":"network/ipfs_impl.go:187","msg":"send message to 12D3KooWHFF74AMx9zdXcUa2MZxxBrG2ztSk6r3gEJ6rCAzmd362 failed but context was not Done: identify failed to complete: context deadline exceeded"}
{"level":"info","ts":"2023-12-06T13:31:00.196+0800","logger":"bitswap_network","caller":"network/ipfs_impl.go:187","msg":"send message to 12D3KooWQBiN8z9dmQCQBSjzy9X9J5ptW2ZBcCmxsfQWq8bffTYF failed but context was not Done: identify failed to complete: context deadline exceeded"}
{"level":"info","ts":"2023-12-06T13:31:00.196+0800","logger":"bitswap_network","caller":"network/ipfs_impl.go:187","msg":"send message to 12D3KooWBcw6LaCGPfm8nWmvwYEy4D8EsrWDiz9u1apwHrahQ1JR failed but context was not Done: identify failed to complete: context deadline exceeded"}
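To make the "received block with large delay from miner" warnings above easier to scan, a small helper can pull out just the per-block delay values from a saved daemon log. This is only a sketch; the log file name is illustrative, and it assumes the log lines keep the JSON shape shown above.

```shell
# Hypothetical helper: extract the delay values (in seconds) from
# "received block with large delay from miner" warnings in a saved log.
# The log file name passed as $1 is illustrative.
extract_block_delays() {
  grep 'received block with large delay from miner' "$1" \
    | sed -n 's/.*"delay":\([0-9][0-9]*\).*/\1/p'
}
```

Usage: `extract_block_delays lotus.log` would print one delay per matching line (e.g. 27, 7, 8, 18 for the excerpt above).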
@rjan90
Contributor

rjan90 commented Dec 6, 2023

Hey @Brian44913! Can you try to grab some goroutine dumps while the daemon is stuck, using this command: lotus pprof goroutines > goroutines.txt?
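For a stall like this it can help to compare several dumps taken a little while apart, not just one. A minimal sketch (not an official workflow) that wraps the command above; the count and interval are illustrative, and `lotus` must be on PATH for real use:

```shell
# Sketch: capture several goroutine dumps a fixed interval apart while
# the daemon appears stuck, so the dumps can be diffed afterwards.
# Skips gracefully (with a note on stderr) if `lotus` is not installed.
capture_goroutines() {
  count=$1
  interval=$2
  i=1
  while [ "$i" -le "$count" ]; do
    if command -v lotus >/dev/null 2>&1; then
      lotus pprof goroutines > "goroutines-$i.txt"
    else
      echo "lotus not found; skipping dump $i" >&2
    fi
    sleep "$interval"
    i=$((i + 1))
  done
}
```

Usage: `capture_goroutines 3 60` would take three dumps one minute apart.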

@rjan90 rjan90 removed the need/triage label Dec 6, 2023
@Brian44913
Author

goroutines.txt.zip

Based on lotus 1.25.1+mainnet+git.924af4294

@rjan90
Contributor

rjan90 commented Feb 6, 2024

Closing this issue, as this syncing bug has been fixed with:

Both of which have landed in the Lotus v1.25.2 release. Thank you so much for the report, and please let us know if you encounter any other syncing issues.

@rjan90 rjan90 closed this as completed Feb 6, 2024