Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stuck with syncing #3708

Closed
leafyoung opened this issue Mar 16, 2022 · 44 comments
Closed

Stuck with syncing #3708

leafyoung opened this issue Mar 16, 2022 · 44 comments

Comments

@leafyoung
Copy link

leafyoung commented Mar 16, 2022

System information

Erigon version: erigon version 2022.99.99-dev

OS & Version: Linux

Commit hash : 10aee02

Expected behaviour

I have finished the initial full sync and expected erigon will repeat the cycle of staged sync for new blocks.

Actual behaviour

But it has ever stuck with block=14378699. The screen output is

INFO[03-16|10:12:23.974] Build info                               git_branch=devel git_tag=v2022.03.01 git_commit=10aee02e2e9b7fdc160dae0712b0f0bba72a75dd
INFO[03-16|10:12:23.974] Starting Erigon on Ethereum mainnet...
INFO[03-16|10:12:23.975] Maximum peer count                       ETH=100 total=100
INFO[03-16|10:12:23.975] Set global gas cap                       cap=50000000
INFO[03-16|10:12:23.997] Opening Database                         label=chaindata path=/mnt/Erigon/eth/mainnet/chaindata
INFO[03-16|10:12:23.998] Initialised chain configuration          config="{ChainID: 1 Homestead: 1150000 DAO: 1920000 DAOSupport: true EIP150: 2463000 EIP155: 2675000 EIP158: 2675000 Byzantium: 4370000 Constantinople: 7280000 Petersburg: 7280000 Istanbul: 9069000, Muir Glacier: 9200000, Berlin: 12244000, London: 12965000, Arrow Glacier: 13773000, Terminal Total Difficulty: <nil>, Engine: ethash}" genesis=0xd4e56740f876aef8c010b86a40d5f56745a118d0906a34e69aec8c0db1cb8fa3
INFO[03-16|10:12:23.998] Disk storage enabled for ethash DAGs     dir=/mnt/Erigon/eth/mainnet/ethash-dags count=2
INFO[03-16|10:12:23.998] Initialising Ethereum protocol           network=1
INFO[03-16|10:12:23.998] Effective                                prune_flags="--prune=" snapshot_flags=
INFO[03-16|10:12:24.236] Starting private RPC server              on=0.0.0.0:9090
INFO[03-16|10:12:24.243] Started P2P networking                   version=66 self=enode://c52224f9072bae0c07c9b0464379ece6d8f647ae61dbb0fabcd12accd931c9cce6200058e5f87e160a89ce308dded0d550efc488e5403ade31272667596d8b83@127.0.0.1:30303 name=erigon/v2022.99.99-dev-10aee02e/linux-amd64/go1.16.15
INFO[03-16|10:12:24.248] [1/16 Headers] Waiting for headers...    from=14378699
INFO[03-16|10:12:24.273] Mapped network port                      proto=tcp extport=30303 intport=30303 interface=NAT-PMP(192.168.68.1)
INFO[03-16|10:12:24.293] Mapped network port                      proto=udp extport=30303 intport=30303 interface=NAT-PMP(192.168.68.1)
INFO[03-16|10:12:54.248] [1/16 Headers] Wrote block headers       number=14378699 blk/second=0.000 alloc="71.54 MiB" sys="141.46 MiB"
INFO[03-16|10:13:24.248] [1/16 Headers] Wrote block headers       number=14378699 blk/second=0.000 alloc="99.85 MiB" sys="142.05 MiB"
INFO[03-16|10:13:54.249] [1/16 Headers] Wrote block headers       number=14378699 blk/second=0.000 alloc="83.76 MiB" sys="142.61 MiB"
...

I noticed that mbox file was last modified a few days ago. It was not updated.

-rwxrwx--- 1 erigon yangye 2063731785728 Mar 14 15:02 mdbx.dat
-rwxrwx--- 1 erigon yangye       1028096 Mar 16 10:14 mdbx.lck

Steps to reproduce the behaviour

I would like to provide more information as requested. Thanks.

Backtrace

[backtrace]
@AskAlexSharov
Copy link
Collaborator

AskAlexSharov commented Mar 16, 2022

  1. Do you see 0 "GoodPeers" in logs? Probably not
  2. then it means - by some reason we have peers but have no incoming headers... not sure what it means... we will investigate
  3. as the hotfix you can do:
make integration
./build/bin/integration state_stages --datadir=<your_datadir> --unwind=100 --integrity.fast=false
./build/bin/integration stage_bodies --datadir=<your_datadir> --unwind=100 
./build/bin/integration stage_headers --datadir=<your_datadir> --unwind=100
# start Erigon as usually

@mcseemk
Copy link

mcseemk commented Mar 16, 2022

Same issue:

INFO[03-16|07:47:00.869] Build info                               git_branch=devel git_tag=v2022.03.01 git_commit=2785a57d1cad9581f2a76982595baad3c71c4a21
INFO[03-16|07:47:00.870] Starting Erigon on Ethereum mainnet... 
INFO[03-16|07:47:00.870] Maximum peer count                       ETH=100 total=100
INFO[03-16|07:47:00.870] Set global gas cap                       cap=50000000
INFO[03-16|07:47:00.909] Opening Database                         label=chaindata path=/home/mcs/.local/share/erigon/chaindata
INFO[03-16|07:47:00.911] Initialised chain configuration          config="{ChainID: 1 Homestead: 1150000 DAO: 1920000 DAOSupport: true EIP150: 2463000 EIP155: 2675000 EIP158: 2675000 Byzantium: 4370000 Constantinople: 7280000 Petersburg: 7280000 Istanbul: 9069000, Muir Glacier: 9200000, Berlin: 12244000, London: 12965000, Arrow Glacier: 13773000, Terminal Total Difficulty: <nil>, Engine: ethash}" genesis=0xd4e56740f876aef8c010b86a40d5f56745a118d0906a34e69aec8c0db1cb8fa3
INFO[03-16|07:47:00.911] Disk storage enabled for ethash DAGs     dir=/home/mcs/.local/share/erigon/ethash-dags count=2
INFO[03-16|07:47:00.911] Initialising Ethereum protocol           network=1
INFO[03-16|07:47:00.911] Effective                                prune_flags="--prune=" snapshot_flags=
INFO[03-16|07:47:01.307] Starting private RPC server              on=127.0.0.1:9090
INFO[03-16|07:47:01.310] Started P2P networking                   version=66 self=enode://12ac8f731adf2e3db95663d29d52b872573f923d066191b752f55e2e0a4680283cd5312bb56d7d2519740d3036b569458669b2a6aee9f1ffb86427bc4c15301f@127.0.0.1:30303 name=erigon/v2022.99.99-dev-2785a57d/linux-amd64/go1.18
INFO[03-16|07:47:01.318] [1/16 Headers] Waiting for headers...    from=14394401
INFO[03-16|07:47:31.318] [1/16 Headers] Wrote block headers       number=14394401 blk/second=0.000 alloc="49.93 MiB" sys="73.39 MiB"
INFO[03-16|07:48:01.319] [1/16 Headers] Wrote block headers       number=14394401 blk/second=0.000 alloc="193.89 MiB" sys="233.72 MiB"
INFO[03-16|07:48:31.319] [1/16 Headers] Wrote block headers       number=14394401 blk/second=0.000 alloc="202.50 MiB" sys="233.72 MiB"
INFO[03-16|07:49:00.912] [p2p] GoodPeers                          eth66=4
INFO[03-16|07:49:01.318] [1/16 Headers] Wrote block headers       number=14394401 blk/second=0.000 alloc="212.55 MiB" sys="237.72 MiB"
INFO[03-16|07:49:31.318] [1/16 Headers] Wrote block headers       number=14394401 blk/second=0.000 alloc="224.73 MiB" sys="249.72 MiB"
INFO[03-16|07:50:01.318] [1/16 Headers] Wrote block headers       number=14394401 blk/second=0.000 alloc="199.80 MiB" sys="261.72 MiB"
INFO[03-16|07:50:31.318] [1/16 Headers] Wrote block headers       number=14394401 blk/second=0.000 alloc="222.36 MiB" sys="261.97 MiB"
INFO[03-16|07:51:00.912] [p2p] GoodPeers                          eth66=14
INFO[03-16|07:51:01.318] [1/16 Headers] Wrote block headers       number=14394401 blk/second=0.000 alloc="211.48 MiB" sys="357.25 MiB"
INFO[03-16|07:51:31.319] [1/16 Headers] Wrote block headers       number=14394401 blk/second=0.000 alloc="267.42 MiB" sys="357.38 MiB"

@mcseemk
Copy link

mcseemk commented Mar 16, 2022

./build/bin/integration state_stages --datadir=~/.local/share/erigon --unwind=100 --integrity.fast=false

INFO[03-16|07:54:18.532] Re-Opening DB in exclusive mode to apply DB migrations 
INFO[03-16|07:54:18.532] database closed                          label=chaindata
INFO[03-16|07:54:18.533] Apply migration                          name=db_schema_version5
INFO[03-16|07:54:18.533] Applied migration                        name=db_schema_version5
INFO[03-16|07:54:18.533] Apply migration                          name=txs_begin_end
INFO[03-16|07:54:18.534] Applied migration                        name=txs_begin_end
INFO[03-16|07:54:18.534] Updated DB schema to                     version=6.0.0
INFO[03-16|07:54:18.534] database closed                          label=chaindata
INFO[03-16|07:54:18.704] Writing custom genesis block             hash=0xd4e56740f876aef8c010b86a40d5f56745a118d0906a34e69aec8c0db1cb8fa3
INFO[03-16|07:54:18.705] Initialised chain configuration          config="{ChainID: 1 Homestead: 1150000 DAO: 1920000 DAOSupport: true EIP150: 2463000 EIP155: 2675000 EIP158: 2675000 Byzantium: 4370000 Constantinople: 7280000 Petersburg: 7280000 Istanbul: 9069000, Muir Glacier: 9200000, Berlin: 12244000, London: 12965000, Arrow Glacier: 13773000, Terminal Total Difficulty: <nil>, Engine: ethash}"
INFO[03-16|07:54:18.743] database closed                          label=chaindata
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x48 pc=0x11ab91d]

goroutine 1 [running]:
github.com/ledgerwatch/erigon/turbo/snapshotsync.(*RoSnapshots).Cfg(...)
	github.com/ledgerwatch/erigon/turbo/snapshotsync/block_snapshots.go:349
github.com/ledgerwatch/erigon/cmd/integration/commands.newSync({0x231cd40, 0xc000760c40}, {0x2320220, 0xc00077dcc0}, 0xc00061fbe8)
	github.com/ledgerwatch/erigon/cmd/integration/commands/stages.go:1130 +0xb9d
github.com/ledgerwatch/erigon/cmd/integration/commands.syncBySmallSteps({0x2320220, 0xc00077dcc0}, {0x0, 0x0, 0x0, {0x0, 0x0, 0x0, 0x0, 0x0, ...}, ...}, ...)
	github.com/ledgerwatch/erigon/cmd/integration/commands/state_stages.go:150 +0xab
github.com/ledgerwatch/erigon/cmd/integration/commands.glob..func23(0x2c0bfa0?, {0x14db1b3?, 0x3?, 0x3?})
	github.com/ledgerwatch/erigon/cmd/integration/commands/state_stages.go:65 +0x208
github.com/spf13/cobra.(*Command).execute(0x2c0bfa0, {0xc000776660, 0x3, 0x3})
	github.com/spf13/cobra@v1.4.0/command.go:856 +0x67c
github.com/spf13/cobra.(*Command).ExecuteC(0x2c0d3a0)
	github.com/spf13/cobra@v1.4.0/command.go:974 +0x3b4
github.com/spf13/cobra.(*Command).Execute(...)
	github.com/spf13/cobra@v1.4.0/command.go:902
github.com/spf13/cobra.(*Command).ExecuteContext(...)
	github.com/spf13/cobra@v1.4.0/command.go:895
main.main()
	github.com/ledgerwatch/erigon/cmd/integration/main.go:15 +0x112

@mcseemk
Copy link

mcseemk commented Mar 16, 2022

./build/bin/integration stage_bodies --datadir=~/.local/share/erigon --unwind=100

INFO[03-16|07:56:04.385] Initialised chain configuration          config="{ChainID: 1 Homestead: 1150000 DAO: 1920000 DAOSupport: true EIP150: 2463000 EIP155: 2675000 EIP158: 2675000 Byzantium: 4370000 Constantinople: 7280000 Petersburg: 7280000 Istanbul: 9069000, Muir Glacier: 9200000, Berlin: 12244000, London: 12965000, Arrow Glacier: 13773000, Terminal Total Difficulty: <nil>, Engine: ethash}"
INFO[03-16|07:56:04.425] database closed                          label=chaindata
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x48 pc=0x11ab91d]

goroutine 1 [running]:
github.com/ledgerwatch/erigon/turbo/snapshotsync.(*RoSnapshots).Cfg(...)
	github.com/ledgerwatch/erigon/turbo/snapshotsync/block_snapshots.go:349
github.com/ledgerwatch/erigon/cmd/integration/commands.newSync({0x231cd40, 0xc000978c00}, {0x2320220, 0xc0009f3360}, 0x0)
	github.com/ledgerwatch/erigon/cmd/integration/commands/stages.go:1130 +0xb9d
github.com/ledgerwatch/erigon/cmd/integration/commands.stageBodies({0x2320220?, 0xc0009f3360}, {0x231cd40?, 0xc000978c00})
	github.com/ledgerwatch/erigon/cmd/integration/commands/stages.go:467 +0x6f
github.com/ledgerwatch/erigon/cmd/integration/commands.glob..func9(0x2c0a6a0?, {0x14db1b3?, 0x2?, 0x2?})
	github.com/ledgerwatch/erigon/cmd/integration/commands/stages.go:70 +0xd8
github.com/spf13/cobra.(*Command).execute(0x2c0a6a0, {0xc000483fc0, 0x2, 0x2})
	github.com/spf13/cobra@v1.4.0/command.go:856 +0x67c
github.com/spf13/cobra.(*Command).ExecuteC(0x2c0d3a0)
	github.com/spf13/cobra@v1.4.0/command.go:974 +0x3b4
github.com/spf13/cobra.(*Command).Execute(...)
	github.com/spf13/cobra@v1.4.0/command.go:902
github.com/spf13/cobra.(*Command).ExecuteContext(...)
	github.com/spf13/cobra@v1.4.0/command.go:895
main.main()
	github.com/ledgerwatch/erigon/cmd/integration/main.go:15 +0x112

@mcseemk
Copy link

mcseemk commented Mar 16, 2022

./build/bin/integration stage_headers --datadir=~/.local/share/erigon --unwind=100

EROR[03-16|07:58:10.271] Error                                    err="cannot unwind past 0"
INFO[03-16|07:58:10.271] database closed                          label=chaindata
Error: cannot unwind past 0
Usage:
  integration stage_headers [flags]

Flags:
      --bor.heimdall string      URL of Heimdall service (default "http://localhost:1317")
      --chain string             pick a chain to assume (mainnet, ropsten, etc.)
      --chaindata string         path to the db
      --database.verbosity int   Enabling internal db logs. Very high verbosity levels may require recompile db. Default: 2, means warning (default 2)
      --datadir string           data directory for temporary ELT files (default "/home/mcs/.local/share/erigon")
      --experimental.snapshot    
  -h, --help                     help for stage_headers
      --unwind uint              how much blocks unwind on each iteration

Global Flags:
      --log.json                  Format logs with JSON
      --metrics                   Enable metrics collection and reporting
      --metrics.addr string       Enable stand-alone metrics HTTP server listening interface (default "127.0.0.1")
      --metrics.expensive         Enable expensive metrics collection and reporting
      --metrics.port int          Metrics HTTP server listening port (default 6060)
      --pprof                     Enable the pprof HTTP server
      --pprof.addr string         pprof HTTP server listening interface (default "127.0.0.1")
      --pprof.cpuprofile string   Write CPU profile to the given file
      --pprof.port int            pprof HTTP server listening port (default 6060)
      --trace string              Write execution trace to the given file
      --verbosity int             Logging verbosity: 0=silent, 1=error, 2=warn, 3=info, 4=debug, 5=detail (default 3)

cannot unwind past 0

@AskAlexSharov
Copy link
Collaborator

Will check first error today

@IanBarnesZa
Copy link

Hi.

I am experiencing the same exact problem running the same version and the same symptoms and build up to the stuck indexing. I have run the commands suggested and have the same output as above apart from the second one:

INFO[03-16|18:45:22.240] Initialised chain configuration          config="{ChainID: 1 Homestead: 1150000 DAO: 1920000 DAOSupport: true EIP150: 2463000 EIP155: 2675000 EIP158: 2675000 Byzantium: 4370000 Constantinople: 7280000 Petersburg: 7280000 Istanbul: 9069000, Muir Glacier: 9200000, Berlin: 12244000, London: 12965000, Arrow Glacier: 13773000, Terminal Total Difficulty: <nil>, Engine: ethash}"
EROR[03-16|18:45:22.300] Error                                    err="cannot unwind past 0"
INFO[03-16|18:45:22.301] database closed                          label=chaindata
Error: cannot unwind past 0
Usage:
  integration stage_bodies [flags]

@AskAlexSharov
Copy link
Collaborator

First nil pointer dereference fixed by #3723

@AskAlexSharov
Copy link
Collaborator

Also, if you have list of 3 commands and first one failed - I don't advise you run second. It may lead to unexpected result.

@AskAlexSharov
Copy link
Collaborator

@IanBarnesZa show
./build/bin/integration print_stages --datadir=<your_datadir>

@mcseemk
Copy link

mcseemk commented Mar 17, 2022

./build/bin/integration state_stages --datadir=~/.local/share/erigon

INFO[03-17|03:22:02.143] Initialised chain configuration          config="{ChainID: 1 Homestead: 1150000 DAO: 1920000 DAOSupport: true EIP150: 2463000 EIP155: 2675000 EIP158: 2675000 Byzantium: 4370000 Constantinople: 7280000 Petersburg: 7280000 Istanbul: 9069000, Muir Glacier: 9200000, Berlin: 12244000, London: 12965000, Arrow Glacier: 13773000, Terminal Total Difficulty: <nil>, Engine: ethash}"
INFO[03-17|03:22:02.184] database closed                          label=chaindata

@mcseemk
Copy link

mcseemk commented Mar 17, 2022

./build/bin/integration state_stages --datadir=~/.local/share/erigon --unwind=100 --integrity.fast=false

INFO[03-17|03:23:52.137] Initialised chain configuration          config="{ChainID: 1 Homestead: 1150000 DAO: 1920000 DAOSupport: true EIP150: 2463000 EIP155: 2675000 EIP158: 2675000 Byzantium: 4370000 Constantinople: 7280000 Petersburg: 7280000 Istanbul: 9069000, Muir Glacier: 9200000, Berlin: 12244000, London: 12965000, Arrow Glacier: 13773000, Terminal Total Difficulty: <nil>, Engine: ethash}"
INFO[03-17|03:23:52.178] database closed                          label=chaindata

@mcseemk
Copy link

mcseemk commented Mar 17, 2022

./build/bin/integration stage_bodies --datadir=~/.local/share/erigon --unwind=100

INFO[03-17|03:24:05.867] Initialised chain configuration          config="{ChainID: 1 Homestead: 1150000 DAO: 1920000 DAOSupport: true EIP150: 2463000 EIP155: 2675000 EIP158: 2675000 Byzantium: 4370000 Constantinople: 7280000 Petersburg: 7280000 Istanbul: 9069000, Muir Glacier: 9200000, Berlin: 12244000, London: 12965000, Arrow Glacier: 13773000, Terminal Total Difficulty: <nil>, Engine: ethash}"
EROR[03-17|03:24:05.911] Error                                    err="cannot unwind past 0"
INFO[03-17|03:24:05.912] database closed                          label=chaindata
Error: cannot unwind past 0
Usage:
  integration stage_bodies [flags]
...

@AskAlexSharov
Copy link
Collaborator

@mcseemk print_stages please

@mcseemk
Copy link

mcseemk commented Mar 17, 2022

./build/bin/integration print_stages --datadir=~/.local/share/erigon

Note: prune_at doesn't mean 'all data before were deleted' - it just mean stage.Prune function were run to this block. Because 1 stage may prune multiple data types to different prune distance.

 			 stage_at 	 prune_at
Headers 		 0 		 0
BlockHashes 		 0 		 0
Bodies 			 0 		 0
Senders 		 0 		 0
Execution 		 0 		 0
Translation 		 0 		 0
HashState 		 0 		 0
IntermediateHashes 	 0 		 0
AccountHistoryIndex 	 0 		 0
StorageHistoryIndex 	 0 		 0
LogIndex 		 0 		 0
CallTraces 		 0 		 0
TxLookup 		 0 		 0
Finish 			 0 		 0
--
prune distance: --prune=

INFO[03-17|04:17:34.412] database closed                          label=chaindata

@AskAlexSharov
Copy link
Collaborator

@mcseemk - you have no data. or your datadir is wrong - try manually resolve ~. or nothing were synced - show erigon logs.

@mcseemk
Copy link

mcseemk commented Mar 17, 2022

The database was fully synced before upgrade from erigon 2022.02.04-beta.

The datadir is correct.

$ cd ~/.local/share/erigon/
$ pwd
/home/mcs/.local/share/erigon

Erigon logs:

INFO[03-17|05:44:57.994] Build info                               git_branch=devel git_tag=v2022.03.01 git_commit=426d97235603097b1810978abf9ed9ace382e750
INFO[03-17|05:44:57.994] Starting Erigon on Ethereum mainnet... 
INFO[03-17|05:44:57.995] Maximum peer count                       ETH=100 total=100
INFO[03-17|05:44:57.995] Set global gas cap                       cap=50000000
INFO[03-17|05:44:58.034] Opening Database                         label=chaindata path=/home/mcs/.local/share/erigon/chaindata
INFO[03-17|05:44:58.035] Initialised chain configuration          config="{ChainID: 1 Homestead: 1150000 DAO: 1920000 DAOSupport: true EIP150: 2463000 EIP155: 2675000 EIP158: 2675000 Byzantium: 4370000 Constantinople: 7280000 Petersburg: 7280000 Istanbul: 9069000, Muir Glacier: 9200000, Berlin: 12244000, London: 12965000, Arrow Glacier: 13773000, Terminal Total Difficulty: <nil>, Engine: ethash}" genesis=0xd4e56740f876aef8c010b86a40d5f56745a118d0906a34e69aec8c0db1cb8fa3
INFO[03-17|05:44:58.035] Disk storage enabled for ethash DAGs     dir=/home/mcs/.local/share/erigon/ethash-dags count=2
INFO[03-17|05:44:58.035] Initialising Ethereum protocol           network=1
INFO[03-17|05:44:58.035] Effective                                prune_flags="--prune=" snapshot_flags=
INFO[03-17|05:44:58.429] Starting private RPC server              on=127.0.0.1:9090
INFO[03-17|05:44:58.432] Started P2P networking                   version=66 self=enode://a757aeb175240040f3ed89a37fe727f4becada434d7fc67943afb481169f7a96f73cb1d5e51bc5b58286e2fc351501e7294183d9a62527bf3ddbb5376c1af6b8@127.0.0.1:30303 name=erigon/v2022.99.99-dev-426d9723/linux-amd64/go1.18
INFO[03-17|05:44:58.440] [1/16 Headers] Waiting for headers...    from=14394401
INFO[03-17|05:45:28.440] [1/16 Headers] Wrote block headers       number=14394401 blk/second=0.000 alloc="54.17 MiB" sys="73.20 MiB"
INFO[03-17|05:45:58.440] [1/16 Headers] Wrote block headers       number=14394401 blk/second=0.000 alloc="43.86 MiB" sys="73.45 MiB"
INFO[03-17|05:46:28.440] [1/16 Headers] Wrote block headers       number=14394401 blk/second=0.000 alloc="207.60 MiB" sys="239.65 MiB"
INFO[03-17|05:46:58.035] [p2p] GoodPeers                          eth66=2
INFO[03-17|05:46:58.440] [1/16 Headers] Wrote block headers       number=14394401 blk/second=0.000 alloc="206.85 MiB" sys="239.65 MiB"
INFO[03-17|05:47:28.440] [1/16 Headers] Wrote block headers       number=14394401 blk/second=0.000 alloc="253.15 MiB" sys="277.97 MiB"
INFO[03-17|05:47:58.440] [1/16 Headers] Wrote block headers       number=14394401 blk/second=0.000 alloc="278.33 MiB" sys="307.00 MiB"
INFO[03-17|05:48:28.440] [1/16 Headers] Wrote block headers       number=14394401 blk/second=0.000 alloc="327.87 MiB" sys="361.17 MiB"
INFO[03-17|05:48:58.035] [p2p] GoodPeers                          eth66=12
INFO[03-17|05:48:58.440] [1/16 Headers] Wrote block headers       number=14394401 blk/second=0.000 alloc="258.16 MiB" sys="369.55 MiB"
INFO[03-17|05:49:28.440] [1/16 Headers] Wrote block headers       number=14394401 blk/second=0.000 alloc="312.14 MiB" sys="369.92 MiB"

@leafyoung
Copy link
Author

leafyoung commented Mar 17, 2022

  1. Do you see 0 "GoodPeers" in logs? Probably not

    1. then it means - by some reason we have peers but have no incoming headers... not sure what it means... we will investigate

    2. as the hotfix you can do:

make integration
./build/bin/integration state_stages --datadir=<your_datadir> --unwind=100 --integrity.fast=false
./build/bin/integration stage_bodies --datadir=<your_datadir> --unwind=100 
./build/bin/integration stage_headers --datadir=<your_datadir> --unwind=100
# start Erigon as usually

Thanks. This fix worked partially.

  1. The unwinding of the bodies didn't work but Erigon has resumed syncing after unwinding the stages and headers.

The unwind bodies output is shown below. It was stuck with 14378699. The stages and headers unwinding had shown it went back 100 blocks earlier to 14378599.

./integration stage_bodies --datadir=/mnt/Erigon/eth/mainnet --unwind=200
INFO[03-17|15:14:45.415] Initialised chain configuration          config="{ChainID: 1 Homestead: 1150000 DAO: 1920000 DAOSupport: true EIP150: 2463000 EIP155: 2675000 EIP158: 2675000 Byzantium: 4370000 Constantinople: 7280000 Petersburg: 7280000 Istanbul: 9069000, Muir Glacier: 9200000, Berlin: 12244000, London: 12965000, Arrow Glacier: 13773000, Terminal Total Difficulty: <nil>, Engine: ethash}"
INFO[03-17|15:14:45.628] Progress                                 bodies=14378699
INFO[03-17|15:14:45.634] database closed                          label=chaindata
  1. After resuming, I got a few errors with mdb
EROR[03-17|15:34:33.173] Staged Sync                              err="[4/16 Bodies] writing block body: failed to WriteRawTransactions: bucket: BlockTransaction, mdbx_cursor_put: MDBX_EKEYMISMATCH: The given key value is mismatched to the current cursor position"
INFO[03-17|15:34:33.358] [4/16 Bodies] Processing bodies...       from=14378499 to=14402693
INFO[03-17|15:34:34.539] [p2p] GoodPeers                          eth66=8
EROR[03-17|15:34:40.532] Staged Sync                              err="[4/16 Bodies] writing block body: failed to WriteRawTransactions: bucket: BlockTransaction, mdbx_cursor_put: MDBX_EKEYMISMATCH: The given key value is mismatched to the current cursor position"
INFO[03-17|15:34:40.719] [4/16 Bodies] Processing bodies...       from=14378499 to=14402693
EROR[03-17|15:34:52.396] Staged Sync                              err="[4/16 Bodies] writing block body: failed to WriteRawTransactions: bucket: BlockTransaction, mdbx_cursor_put: MDBX_EKEYMISMATCH: The given key value is mismatched to the current cursor position"
INFO[03-17|15:34:52.576] [4/16 Bodies] Processing bodies...       from=14378499 to=14402693
EROR[03-17|15:34:59.662] Staged Sync                              err="[4/16 Bodies] writing block body: failed to WriteRawTransactions: bucket: BlockTransaction, mdbx_cursor_put: MDBX_EKEYMISMATCH: The given key value is mismatched to the current cursor position"

@leafyoung leafyoung reopened this Mar 17, 2022
@AskAlexSharov
Copy link
Collaborator

@mcseemk "try manually resolve ~": `./build/bin/integration print_stages --datadir=/home/mcs/.local/share/erigon"

@mcseemk
Copy link

mcseemk commented Mar 17, 2022

./build/bin/integration print_stages --datadir=/home/mcs/.local/share/erigon

mdbx_setup_dxb:15225 filesize mismatch (expect 2120749670400b/517761150p, have 2121713844224b/517996544p)
Note: prune_at doesn't mean 'all data before were deleted' - it just mean stage.Prune function were run to this block. Because 1 stage may prune multiple data types to different prune distance.

 			 stage_at 	 prune_at
Headers 		 14394401 	 0
BlockHashes 		 14394401 	 0
Bodies 			 14394401 	 0
Senders 		 14394401 	 0
Execution 		 14394401 	 14394401
Translation 		 0 		 0
HashState 		 14394401 	 0
IntermediateHashes 	 14394401 	 14394401
AccountHistoryIndex 	 14394401 	 0
StorageHistoryIndex 	 14394401 	 0
LogIndex 		 14394401 	 0
CallTraces 		 14394401 	 14394401
TxLookup 		 14394401 	 0
Finish 			 14394401 	 0
--
prune distance: --prune=

INFO[03-17|08:10:37.945] database closed                          label=chaindata

@AskAlexSharov
Copy link
Collaborator

@mcseemk ok, now you know what to do

@mcseemk
Copy link

mcseemk commented Mar 17, 2022

After re-running integration erigon behaves slightly differently, but still seems to be stuck. There are lots of 'Invalidating anchor' messages in logs.

INFO[03-17|09:52:12.663] [p2p] GoodPeers                          eth66=18
WARN[03-17|09:52:19.976] Invalidating anchor                      height=14403123 hash=0x28dc4696343e81e57ae5342f878d6fa63fd636be48d0aa6864899e4a751d2c40 reason="suspected unavailability"
INFO[03-17|09:52:39.971] [1/16 Headers] Wrote block headers       number=14394201 blk/second=0.000 alloc="381.20 MiB" sys="1.62 GiB"
INFO[03-17|09:53:09.971] [1/16 Headers] Wrote block headers       number=14394201 blk/second=0.000 alloc="401.82 MiB" sys="1.62 GiB"
INFO[03-17|09:53:39.971] [1/16 Headers] Wrote block headers       number=14394201 blk/second=0.000 alloc="429.27 MiB" sys="1.62 GiB"
INFO[03-17|09:54:09.971] [1/16 Headers] Wrote block headers       number=14394201 blk/second=0.000 alloc="312.63 MiB" sys="1.62 GiB"
WARN[03-17|09:54:11.972] Invalidating anchor                      height=14403128 hash=0x1b2f39b801224a15b91a9ee70ee4417b39b261b274da388f09538a15f77bad41 reason="suspected unavailability"
INFO[03-17|09:54:12.664] [p2p] GoodPeers                          eth66=19
INFO[03-17|09:54:39.971] [1/16 Headers] Wrote block headers       number=14394201 blk/second=0.000 alloc="336.39 MiB" sys="1.62 GiB"
INFO[03-17|09:55:09.971] [1/16 Headers] Wrote block headers       number=14394201 blk/second=0.000 alloc="353.21 MiB" sys="1.62 GiB"
INFO[03-17|09:55:39.971] [1/16 Headers] Wrote block headers       number=14394201 blk/second=0.000 alloc="374.34 MiB" sys="1.62 GiB"
WARN[03-17|09:55:41.972] Invalidating anchor                      height=14403122 hash=0x66bc77dc8c19bbbe9fdb517d2b00f5f1182e26563a6743648858fa751757087c reason="suspected unavailability"
INFO[03-17|09:56:09.971] [1/16 Headers] Wrote block headers       number=14394201 blk/second=0.000 alloc="316.21 MiB" sys="1.62 GiB"
INFO[03-17|09:56:12.663] [p2p] GoodPeers                          eth66=18
INFO[03-17|09:56:39.971] [1/16 Headers] Wrote block headers       number=14394201 blk/second=0.000 alloc="342.37 MiB" sys="1.62 GiB"
WARN[03-17|09:56:53.154] Invalidating anchor                      height=14403141 hash=0x74d20e386b0d59057e15fd7acac622606905f510dd5fcdee7c367629245fdca8 reason="suspected unavailability"
INFO[03-17|09:57:09.971] [1/16 Headers] Wrote block headers       number=14394201 blk/second=0.000 alloc="368.11 MiB" sys="1.62 GiB"
INFO[03-17|09:57:39.971] [1/16 Headers] Wrote block headers       number=14394201 blk/second=0.000 alloc="398.62 MiB" sys="1.62 GiB"
INFO[03-17|09:58:09.971] [1/16 Headers] Wrote block headers       number=14394201 blk/second=0.000 alloc="309.99 MiB" sys="1.62 GiB"
INFO[03-17|09:58:12.663] [p2p] GoodPeers                          eth66=18
INFO[03-17|09:58:39.971] [1/16 Headers] Wrote block headers       number=14394201 blk/second=0.000 alloc="332.49 MiB" sys="1.62 GiB"
WARN[03-17|09:59:00.059] Invalidating anchor                      height=14403149 hash=0x484a21d07a032e8c1c8b71545a7e865f5f0d4b6c7c98ce350c086975aced9857 reason="suspected unavailability"
INFO[03-17|09:59:09.971] [1/16 Headers] Wrote block headers       number=14394201 blk/second=0.000 alloc="356.12 MiB" sys="1.62 GiB"
INFO[03-17|09:59:39.971] [1/16 Headers] Wrote block headers       number=14394201 blk/second=0.000 alloc="417.77 MiB" sys="1.62 GiB"
INFO[03-17|10:00:09.971] [1/16 Headers] Wrote block headers       number=14394201 blk/second=0.000 alloc="310.81 MiB" sys="1.62 GiB"
INFO[03-17|10:00:12.663] [p2p] GoodPeers                          eth66=16
INFO[03-17|10:00:39.971] [1/16 Headers] Wrote block headers       number=14394201 blk/second=0.000 alloc="329.61 MiB" sys="1.62 GiB"
WARN[03-17|10:00:50.649] Invalidating anchor                      height=14403158 hash=0x04c1db8c00880a4014622f8f964de20bbc579b816a64716f71d7839768a699c5 reason="suspected unavailability"
INFO[03-17|10:01:09.971] [1/16 Headers] Wrote block headers       number=14394201 blk/second=0.000 alloc="352.30 MiB" sys="1.62 GiB"
INFO[03-17|10:01:39.971] [1/16 Headers] Wrote block headers       number=14394201 blk/second=0.000 alloc="375.18 MiB" sys="1.62 GiB"
INFO[03-17|10:02:09.971] [1/16 Headers] Wrote block headers       number=14394201 blk/second=0.000 alloc="304.72 MiB" sys="1.62 GiB"
INFO[03-17|10:02:12.663] [p2p] GoodPeers                          eth66=16
INFO[03-17|10:02:39.971] [1/16 Headers] Wrote block headers       number=14394201 blk/second=0.000 alloc="348.41 MiB" sys="1.62 GiB"
WARN[03-17|10:02:58.981] Invalidating anchor                      height=14403167 hash=0xbc07632ca1fbf3a2e5919d9bca6a3b7cec726c846a78a774d3b22e142f1b2928 reason="suspected unavailability"
INFO[03-17|10:03:09.971] [1/16 Headers] Wrote block headers       number=14394201 blk/second=0.000 alloc="379.57 MiB" sys="1.62 GiB"
INFO[03-17|10:03:39.971] [1/16 Headers] Wrote block headers       number=14394201 blk/second=0.000 alloc="402.53 MiB" sys="1.62 GiB"
INFO[03-17|10:04:09.971] [1/16 Headers] Wrote block headers       number=14394201 blk/second=0.000 alloc="311.93 MiB" sys="1.62 GiB"
INFO[03-17|10:04:12.664] [p2p] GoodPeers                          eth66=18

@garyng2000
Copy link

I am having these stuck issue too when testing kiln. on the same machine, I have both geth and erigon running(and lighthouse/prysm too) so hardware or network should not be a factor. geth sync properly throughout but erigon are very iffy, sometimes it works for a while then it stuck which resulting the beacon node(doesn't matter if it is lighthouse/prysm) starts to complain. If I switch them back to geth, everything works fine

@IanBarnesZa
Copy link

I am having exactly the same behaviour as mcseemk, after resolving the ~ into a full path all three commands worked, and upon startup it went through many blocks, but is now stuck on the same block it was previously with the same Invalidating Anchor error.

@AskAlexSharov
Copy link
Collaborator

@IanBarnesZa we are investigating. you can try unwind 1000 block instead of 100 (but don't go over 90K)

@garyng2000
Copy link

I tried the 'unwind' but it gives me incompatible genesis, I am running the kiln testnet though

@AskAlexSharov
Copy link
Collaborator

set --chain parameter

@garyng2000
Copy link

--chain kiln give me segfault

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x118 pc=0x13d6c68]

goroutine 1 [running]:
github.com/ledgerwatch/erigon/cmd/integration/commands.newSync(0x2598ff8, 0xc000a46dc0, 0x25a2368, 0xc000a62b40, 0xc0009efbf0, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
github.com/ledgerwatch/erigon/cmd/integration/commands/stages.go:1070 +0x2a8
github.com/ledgerwatch/erigon/cmd/integration/commands.syncBySmallSteps(0x25a2368, 0xc000a62b40, 0x0, 0x0, 0x0, 0x0, 0x3155eb8, 0x0, 0x0, 0x3155eb8, ...)
github.com/ledgerwatch/erigon/cmd/integration/commands/state_stages.go:150 +0xbd
github.com/ledgerwatch/erigon/cmd/integration/commands.glob..func23(0x2e62e80, 0xc0000b4280, 0x0, 0x5, 0x0, 0x0)
github.com/ledgerwatch/erigon/cmd/integration/commands/state_stages.go:65 +0x278
github.com/spf13/cobra.(*Command).execute(0x2e62e80, 0xc0000b4230, 0x5, 0x5, 0x2e62e80, 0xc0000b4230)
github.com/spf13/cobra@v1.2.1/command.go:856 +0x472
github.com/spf13/cobra.(*Command).ExecuteC(0x2e64280, 0xc000a46c80, 0xc000a40d90, 0xe)
github.com/spf13/cobra@v1.2.1/command.go:974 +0x375
github.com/spf13/cobra.(*Command).Execute(...)
github.com/spf13/cobra@v1.2.1/command.go:902
github.com/spf13/cobra.(*Command).ExecuteContext(...)
github.com/spf13/cobra@v1.2.1/command.go:895
main.main()
github.com/ledgerwatch/erigon/cmd/integration/main.go:15 +0x129

@AskAlexSharov
Copy link
Collaborator

about Klin better ask in #3451

@mandrigin
Copy link
Collaborator

@garyng2000 for Kiln, make sure that you run from devel branch and at least 3dbd6bef1f0c6524ba8addcb34a584a0d9f6e81c or later (I have my validator on Kiln setup with 3dbd6bef1f0c6524ba8addcb34a584a0d9f6e81c and it works just fine).

My command-line for Kiln is

./build/bin/erigon --datadir /mnt/ssd1/kiln-devnet  --bootnodes="enode://c354db99124f0faf677ff0e75c3cbbd568b2febc186af664e0c51ac435609badedc67a18a63adb64dacc1780a28dcefebfc29b83fd1a3f4aa3c0eb161364cf94@164.92.130.5:30303,enode://d41af1662434cad0a88fe3c7c92375ec5719f4516ab6d8cb9695e0e2e815382c767038e72c224e04040885157da47422f756c040a9072676c6e35c5b1a383cce@138.68.66.103:30303" --networkid=1337802 --http --http.api=engine,net,eth --maxpeers=400

see this guide: https://hackmd.io/@4_PBxu6jQtO7qASCOn0H6w/ByrxApj-9

@garyng2000
Copy link

@mandrigin many thanks

3dbd6be seems to be the culprit as I pull/build again which contains this commit and now seems to be working fine(without any change to other params as it was sort of working but just stuck after a while)

@mandrigin
Copy link
Collaborator

@garyng2000 yep, it fixes a root hash miscalculation at one of the post-merge blocks and similar ones. Without this fix it is impossible to go beyond that block, Erigon goes into a loop.

@tjayrush
Copy link
Contributor

tjayrush commented Mar 20, 2022

I'm seeing this as well. I get repeated

INFO[03-20|09:45:30.934] [1/16 Headers] Wrote block headers       number=14418230 blk/second=0.000 alloc="202.42 MiB" sys="271.15 MiB"
INFO[03-20|09:46:00.933] [1/16 Headers] Wrote block headers       number=14418230 blk/second=0.000 alloc="220.02 MiB" sys="271.40 MiB"
INFO[03-20|09:46:30.933] [1/16 Headers] Wrote block headers       number=14418230 blk/second=0.000 alloc="200.78 MiB" sys="271.40 MiB"

I did rewind a couple of times

./build/bin/integration state_stages --datadir=/mnt/md0/erigon --unwind=2000 --integrity.fast=false

(rewound 100, 100, 1000, and 2000 blocks)

And now get this:

jrush@linux:~/D/erigon|devel⚡?➤ ./build/bin/integration print_stages --datadir /mnt/md0/erigon
Note: prune_at doesn't mean 'all data before were deleted' - it just mean stage.Prune function were run to this block. Because 1 stage may prune multiple data types to different prune distance.

 			 stage_at 	 prune_at
Headers 		 14418230 	 0
BlockHashes 		 14418230 	 0
Bodies 			 14418230 	 0
Senders 		 14418230 	 0
Execution 		 14415030 	 14415030
Translation 		 0 		 0
HashState 		 14415030 	 0
IntermediateHashes 	 14415030 	 14415030
AccountHistoryIndex 	 14415030 	 0
StorageHistoryIndex 	 14415030 	 0
LogIndex 		 14415030 	 0
CallTraces 		 14415030 	 14415030
TxLookup 		 14415030 	 0
Finish 			 14418230 	 0
--
prune distance: --prune=

INFO[03-20|09:46:48.415] database closed                          label=chaindata

I thought it was a little odd that all the stages after Execution were set back, but Finish wasn't.

@MaximusDecimus
Copy link

Hey

Having the same issue here on mainnet. Unwind 100 and 1000 didnt help

Mar 22 14:33:57 Ubuntu-1804-bionic-64-minimal erigon[893562]: [INFO] [03-22|14:33:57.093] Opening Database                         label=chaindata path=/root/.local/share/erigon/chaindata
Mar 22 14:33:57 Ubuntu-1804-bionic-64-minimal erigon[893562]: [INFO] [03-22|14:33:57.094] Initialised chain configuration          config="{ChainID: 1 Homestead: 1150000 DAO: 1920000 DAOSupport: true EIP150: 2463000 EIP155: 2675000 EIP158: 2675000 Byzantium: 4370000 Constantinople: 7280000 Petersburg: 7280000 Istanbul: 9069000, Muir Glacier: 9200000, Berlin: 12244000, London: 12965000, Arrow Glacier: 13773000, Terminal Total Difficulty: <nil>, Engine: ethash}" genesis=0xd4e56740f876aef8c010b86a40d5f56745a118d0906a34e69aec8c0db1cb8fa3
Mar 22 14:33:57 Ubuntu-1804-bionic-64-minimal erigon[893562]: [INFO] [03-22|14:33:57.094] Disk storage enabled for ethash DAGs     dir=/root/.local/share/erigon/ethash-dags count=2
Mar 22 14:33:57 Ubuntu-1804-bionic-64-minimal erigon[893562]: [INFO] [03-22|14:33:57.094] Initialising Ethereum protocol           network=1
Mar 22 14:33:57 Ubuntu-1804-bionic-64-minimal erigon[893562]: [INFO] [03-22|14:33:57.094] Effective                                prune_flags="--prune=" snapshot_flags=
Mar 22 14:33:57 Ubuntu-1804-bionic-64-minimal erigon[893562]: [INFO] [03-22|14:33:57.431] Starting private RPC server              on=localhost:9090
Mar 22 14:33:57 Ubuntu-1804-bionic-64-minimal erigon[893562]: [INFO] [03-22|14:33:57.437] Started P2P networking                   version=66 self=enode://78a45bf7716b7135d155e9d6fe7be7a1b918b2137c0730d4fcda72176e9bb3afceb5ba789b397b158f952b40e314feded051258aa3bc46ea2b0f6854e07dc92f@127.0.0.1:30303 name=erigon/v2022.99.99-dev-4cf4c1ba/linux-amd64/go1.17.5
Mar 22 14:33:57 Ubuntu-1804-bionic-64-minimal erigon[893562]: [INFO] [03-22|14:33:57.442] [1/16 Headers] Waiting for headers...    from=14430200
Mar 22 14:34:08 Ubuntu-1804-bionic-64-minimal erigon[893562]: [INFO] [03-22|14:34:08.210] new subscription to newHeaders established
Mar 22 14:34:08 Ubuntu-1804-bionic-64-minimal erigon[893562]: [INFO] [03-22|14:34:08.210] New txs subscriber joined
Mar 22 14:34:27 Ubuntu-1804-bionic-64-minimal erigon[893562]: [INFO] [03-22|14:34:27.443] [1/16 Headers] Wrote block headers       number=14436386 blk/second=206.200 alloc="299.51 MiB" sys="332.36 MiB"
Mar 22 14:34:33 Ubuntu-1804-bionic-64-minimal erigon[893562]: [INFO] [03-22|14:34:33.919] [1/16 Headers] Processed                 highest inserted=14436387 age=42s
Mar 22 14:34:34 Ubuntu-1804-bionic-64-minimal erigon[893562]: [INFO] [03-22|14:34:34.035] [4/16 Bodies] Processing bodies...       from=14431200 to=14436387
Mar 22 14:34:51 Ubuntu-1804-bionic-64-minimal erigon[893562]: [INFO] [03-22|14:34:51.855] [4/16 Bodies] Processed                  highest=14436387
Mar 22 14:34:51 Ubuntu-1804-bionic-64-minimal erigon[893562]: [INFO] [03-22|14:34:51.855] [5/16 Senders] Started                   from=14430200 to=14436387
Mar 22 14:34:51 Ubuntu-1804-bionic-64-minimal erigon[893562]: [EROR] [03-22|14:34:51.902] [5/16 Senders] Error recovering senders for block 14430211 e0fb8e1d205311456ab660f12256bce53909eb3c74a9fa38262a4d2dd3cde308): 5/16 Senders: error recovering sender for tx=77da1edb1aadb006a4cdd17d8e6eb6c38a52e96725c773116d9e1bcd5d6f3e9e, invalid transaction v, r, s values
Mar 22 14:34:51 Ubuntu-1804-bionic-64-minimal erigon[893562]: [INFO] [03-22|14:34:51.902] UnwindTo                                 block=14430210 bad_block_hash=0xe0fb8e1d205311456ab660f12256bce53909eb3c74a9fa38262a4d2dd3cde308
Mar 22 14:35:00 Ubuntu-1804-bionic-64-minimal erigon[893562]: [EROR] [03-22|14:35:00.354] [5/16 Senders] Error recovering senders for block 14430204 fd3bcaf52dfd166cbfdffcc23c57534ef159f8c372af162b215b89a897041fdc): 5/16 Senders: error recovering sender for tx=eed266a786459b9c2d72faac7038b078f92def1e423b27fe257ed8f2fe059b7c, invalid transaction v, r, s values
Mar 22 14:35:00 Ubuntu-1804-bionic-64-minimal erigon[893562]: [INFO] [03-22|14:35:00.354] UnwindTo                                 block=14430203 bad_block_hash=0xfd3bcaf52dfd166cbfdffcc23c57534ef159f8c372af162b215b89a897041fdc
Mar 22 14:35:00 Ubuntu-1804-bionic-64-minimal erigon[893562]: [EROR] [03-22|14:35:00.426] [6/16 Execution] Execution failed        block=14430201 hash=0x066d9952f380e9b9d24f73c3064c7f92e3cf497b9db8d2e7b434ec953eea4ac5 err="could not apply tx 0 from block 14430201 [0xed783100fa25f8f1ef36fd01a05c59b86657101b89807428630de91038673ddf]: nonce too low: address 0x13Ba6D703C312b7E00765c7FDb3328482C4F7f7E, tx: 78 state: 121"
Mar 22 14:35:00 Ubuntu-1804-bionic-64-minimal erigon[893562]: [INFO] [03-22|14:35:00.426] UnwindTo                                 block=14430200 bad_block_hash=0x066d9952f380e9b9d24f73c3064c7f92e3cf497b9db8d2e7b434ec953eea4ac5
Mar 22 14:35:00 Ubuntu-1804-bionic-64-minimal erigon[893562]: [INFO] [03-22|14:35:00.426] [6/16 Execution] Completed on            block=14430200
Mar 22 14:35:00 Ubuntu-1804-bionic-64-minimal erigon[893562]: [INFO] [03-22|14:35:00.957] Timings (slower than 10ms)               Headers=36.477s BlockHashes=114ms Bodies=17.82s Senders=47ms Unwind Bodies=8.394s Unwind Headers=33ms Senders=21ms Unwind Headers=30ms Senders=31ms Unwind Bodies=496ms Unwind Headers=29ms
Mar 22 14:35:00 Ubuntu-1804-bionic-64-minimal erigon[893562]: [INFO] [03-22|14:35:00.959] RPC Daemon notified of new headers       from=14430200 to=14436387
Mar 22 14:35:00 Ubuntu-1804-bionic-64-minimal erigon[893562]: [INFO] [03-22|14:35:00.959] [1/16 Headers] Waiting for headers...    from=14430200
Mar 22 14:35:03 Ubuntu-1804-bionic-64-minimal erigon[893562]: [INFO] [03-22|14:35:03.219] [txpool] Started
Mar 22 14:35:30 Ubuntu-1804-bionic-64-minimal erigon[893562]: [INFO] [03-22|14:35:30.960] [1/16 Headers] Wrote block headers       number=14430200 blk/second=0.000 alloc="421.45 MiB" sys="1.22 GiB"
Mar 22 14:35:57 Ubuntu-1804-bionic-64-minimal erigon[893562]: [INFO] [03-22|14:35:57.094] [p2p] GoodPeers                          eth66=7
Mar 22 14:35:57 Ubuntu-1804-bionic-64-minimal erigon[893562]: [INFO] [03-22|14:35:57.440] [txpool] stat                            block=14430201 pending=692 baseFee=30000 queued=16254 alloc_mb=354 sys_mb=1249
Mar 22 14:36:00 Ubuntu-1804-bionic-64-minimal erigon[893562]: [INFO] [03-22|14:36:00.960] [1/16 Headers] Wrote block headers       number=14430200 blk/second=0.000 alloc="451.83 MiB" sys="1.22 GiB"
Mar 22 14:36:13 Ubuntu-1804-bionic-64-minimal erigon[893562]: [WARN] [03-22|14:36:13.956] Invalidating anchor                      height=14436388 hash=0x6b60e18581dbd13f16f8c0181df9f7a8b8578ae40aca5a630f606395e4f5e496 reason="suspected unavailability"
Mar 22 14:36:30 Ubuntu-1804-bionic-64-minimal erigon[893562]: [INFO] [03-22|14:36:30.961] [1/16 Headers] Wrote block headers       number=14430200 blk/second=0.000 alloc="318.36 MiB" sys="1.22 GiB"
Mar 22 14:36:57 Ubuntu-1804-bionic-64-minimal erigon[893562]: [INFO] [03-22|14:36:57.432] [txpool] stat                            block=14430201 pending=773 baseFee=30000 queued=16562 alloc_mb=372 sys_mb=1249
Mar 22 14:37:00 Ubuntu-1804-bionic-64-minimal erigon[893562]: [INFO] [03-22|14:37:00.961] [1/16 Headers] Wrote block headers       number=14430200 blk/second=0.000 alloc="388.44 MiB" sys="1.22 GiB"
Mar 22 14:37:30 Ubuntu-1804-bionic-64-minimal erigon[893562]: [INFO] [03-22|14:37:30.961] [1/16 Headers] Wrote block headers       number=14430200 blk/second=0.000 alloc="460.37 MiB" sys="1.22 GiB"
Mar 22 14:37:57 Ubuntu-1804-bionic-64-minimal erigon[893562]: [INFO] [03-22|14:37:57.094] [p2p] GoodPeers                          eth66=9
Mar 22 14:37:57 Ubuntu-1804-bionic-64-minimal erigon[893562]: [INFO] [03-22|14:37:57.440] [txpool] stat                            block=14430201 pending=901 baseFee=30000 queued=16929 alloc_mb=283 sys_mb=1249
Mar 22 14:38:00 Ubuntu-1804-bionic-64-minimal erigon[893562]: [INFO] [03-22|14:38:00.961] [1/16 Headers] Wrote block headers       number=14430200 blk/second=0.000 alloc="339.00 MiB" sys="1.22 GiB"
Mar 22 14:38:06 Ubuntu-1804-bionic-64-minimal erigon[893562]: [WARN] [03-22|14:38:06.965] Invalidating anchor                      height=14436391 hash=0x23b021f4659187e7e36a1c8212c0a0358ad1634ade993b5c2069d56c14406a30 reason="suspected unavailability"
Mar 22 14:38:30 Ubuntu-1804-bionic-64-minimal erigon[893562]: [INFO] [03-22|14:38:30.960] [1/16 Headers] Wrote block headers       number=14430200 blk/second=0.000 alloc="363.56 MiB" sys="1.22 GiB"
Mar 22 14:38:51 Ubuntu-1804-bionic-64-minimal erigon[893562]: [WARN] [03-22|14:38:51.187] Invalidating anchor                      height=14436397 hash=0x84db666578c530039ac0aed6cdd99cfe4a6d378446713331396fecea6cc4f820 reason="suspected unavailability"
Mar 22 14:39:00 Ubuntu-1804-bionic-64-minimal erigon[893562]: [INFO] [03-22|14:39:00.961] [1/16 Headers] Wrote block headers       number=14430200 blk/second=0.000 alloc="401.47 MiB" sys="1.22 GiB"

@leafyoung
Copy link
Author

To round up my story with stuck with sync, I did another sync for 3 days. I have a working chain data again.

@MaximusDecimus
Copy link

@leafyoung from the start?

@leafyoung
Copy link
Author

@leafyoung from the start?

Yes, from the start. I reported earlier that unwinding of bodies was not successful. If this could work, I would guess that re-syncing could be avoided.

The unwind bodies output is shown below. It was stuck with 14378699. The stages and headers unwinding had shown it went back 100 blocks earlier to 14378599.

@data-dude92
Copy link

also having the same issue on mainnet. reverting back 1000 or 5000 blocks does not work.

is resyncing the entire chain the only option?

@MaximusDecimus
Copy link

Looks like it is for the moment being

@AskAlexSharov
Copy link
Collaborator

You also may try to increase --maxpeers flag (it helped some people on BSC network)

@iFA88
Copy link
Contributor

iFA88 commented Mar 25, 2022

Sadly have the same issue for ropsten with erigon 2022.03.1-beta:
image
The top is an openethereum ropsten node which has my erigon as reserved peer.

@AskAlexSharov
Copy link
Collaborator

try new release, thanks

@brockelmore
Copy link

just hit this on latest devel. A simple stop, integration print_stages & restart of the process seemed to have fixed it.

Peers were > 0,

[brock@main erigon]$ ./build/bin/integration print_stages --datadir=../erigon_db
Note: prune_at doesn't mean 'all data before were deleted' - it just mean stage.Prune function were run to this block. Because 1 stage may prune multiple data types to different prune distance.

                         stage_at        prune_at
Headers                  463872          0
BlockHashes              0               0
Bodies                   0               0
Senders                  0               0
Execution                0               0
Translation              0               0
HashState                0               0
IntermediateHashes       0               0
AccountHistoryIndex      0               0
StorageHistoryIndex      0               0
LogIndex                 0               0
CallTraces               0               0
TxLookup                 0               0
Finish                   0               0

Just kept looping like:

INFO[05-25|10:47:59.941] [1/16 Headers] Wrote block headers       number=463872 blk/second=0.000 alloc=736.0Mb sys=1.2Gb
INFO[05-25|10:47:59.941] Req/resp stats                           req=0 reqMin=0 reqMax=0 skel=16 skelMin=465408 skelMax=760128 resp=14 respMin=465408 respMax=758784
INFO[05-25|10:47:59.949] {  464064-465408 links=1345 (464064-465408)} => 8a8ca133ee04f032ff775c20e538a6c604525c08a3af4befa144d29167c55749
INFO[05-25|10:47:59.949] {  466560-592896 links=126337 (466560-592896)} => 31eebb4ec4a3d7a66f54dd6d0307a693ff86dfc99967d422b0a86edbfb63279a
INFO[05-25|10:47:59.949] {  593280-619008 links=25729 (593280-619008)} => 49949a12f44a423f2bc04586cd270b5403ea523b406fdfe3dadc248e0998309c
...

@AskAlexSharov
Copy link
Collaborator

good to hear

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests