Syncing stuck due to lack of peers #1794

Closed
sirgarfieldc opened this issue Aug 2, 2023 · 6 comments
Labels
X-nodesync task filter for node sync issue: full, snap, light...

Comments

@sirgarfieldc

sirgarfieldc commented Aug 2, 2023

I am syncing my BSC node.

Cmd

./build/bin/geth --config ./config.toml --datadir ./node  --cache 8000 --rpc.allow-unprotected-txs --txlookuplimit 0

I was able to download all the blocks from genesis up until about 5 days ago.

Then the sync went into this loop without making further progress:

Very low peer count drops to 0
-> Rewinding blockchain
-> Synchronisation failed
-> Roll back chain segment
-> Looking for peers (with 0 peers)
-> Finds some peers and downloads a batch of blocks & receipts

I am not able to sync blocks that are less than 5 days old.

Example console printout:
[Screenshot: 2023-08-02 at 9:52:15 PM]

Questions:

  1. What is wrong? Why can't I sync blocks from the last 5 days?
  2. How can I obtain a list of high-quality (healthy and fully synced) peers to put in the static node configuration (the [Node.P2P] section of config.toml)?
@sirgarfieldc sirgarfieldc changed the title How to obtain a high quality list of peers Syncing stuck due to lack of peers Aug 2, 2023
@deepcrazy

Hi there, which geth version are you using?
I would recommend using the latest geth version: https://github.com/bnb-chain/bsc/releases/tag/v1.2.9

Secondly, you can add the static nodes from here: https://api.binance.org/v1/discovery/peers to your config.toml file.
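
As a rough sketch, assuming that endpoint returns enode URLs, the resulting entry in config.toml would look something like this (the node IDs, IPs and port below are placeholders, not real peers):

[Node.P2P]
StaticNodes = [
  "enode://<node-id-1>@<peer-ip-1>:30311",
  "enode://<node-id-2>@<peer-ip-2>:30311"
]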

In addition to this:

  • Can you provide the output of net.peerCount? If it is low, try increasing the peer limit by adding --maxpeers 200 to your geth startup command and let your node run for a while (see the sketch after this list).
  • Can you provide your hardware specs? (Note: they should meet https://docs.bnbchain.org/docs/validator/fullnode#suggested-requirements if you are running the node from a snapshot.)
  • Can you confirm whether you're using a snapshot and, if so, which snapshot you used?
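
A quick way to check the peer count, assuming the default IPC endpoint under the datadir from the startup command above:

# Attach to the running node over IPC and print the current peer count
./build/bin/geth attach ./node/geth.ipc --exec 'net.peerCount'

# If the count stays low, restart with a higher peer limit, e.g.:
./build/bin/geth --config ./config.toml --datadir ./node --cache 8000 \
  --rpc.allow-unprotected-txs --txlookuplimit 0 --maxpeers 200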

@sirgarfieldc
Author

sirgarfieldc commented Aug 2, 2023

Using 1.2.9.

"you can add the static nodes from here: https://api.binance.org/v1/discovery/peers to your config.toml file."
Tried it; it didn't help.

Not using a snapshot.

Hardware specs:

  • 16 cores
  • 128 GB RAM
  • 3.5 TB NVMe SSD

root@Ubuntu-2204-jammy-amd64-base ~/disk1/workspace/bsc # hdparm -tT /dev/nvme1n1

/dev/nvme1n1:
 Timing cached reads:   45976 MB in  2.00 seconds = 23021.61 MB/sec
 Timing buffered disk reads: 10384 MB in  3.00 seconds = 3460.90 MB/sec
root@Ubuntu-2204-jammy-amd64-base ~/disk1/workspace/bsc # fio --randrepeat=1 --ioengine=posixaio --direct=1 --gtod_reduce=1 --name=test --filename=random_read_write.fio --bs=4k --iodepth=64 --size=4G --readwrite=randrw --rwmixread=75
test: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=posixaio, iodepth=64
fio-3.33
Starting 1 process
test: Laying out IO file (1 file / 4096MiB)
Jobs: 1 (f=1): [m(1)][100.0%][r=52.5MiB/s,w=17.3MiB/s][r=13.4k,w=4423 IOPS][eta 00m:00s]
test: (groupid=0, jobs=1): err= 0: pid=485132: Wed Aug  2 22:00:50 2023
  read: IOPS=12.9k, BW=50.6MiB/s (53.0MB/s)(3070MiB/60731msec)
   bw (  KiB/s): min=42680, max=55544, per=100.00%, avg=51783.21, stdev=3114.91, samples=121
   iops        : min=10670, max=13886, avg=12945.80, stdev=778.73, samples=121
  write: IOPS=4324, BW=16.9MiB/s (17.7MB/s)(1026MiB/60731msec); 0 zone resets
   bw (  KiB/s): min=14672, max=19552, per=100.00%, avg=17307.77, stdev=1130.63, samples=121
   iops        : min= 3668, max= 4888, avg=4326.94, stdev=282.66, samples=121
  cpu          : usr=2.74%, sys=0.39%, ctx=126198, majf=0, minf=22
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.7%, 16=12.7%, 32=73.9%, >=64=12.7%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=95.8%, 8=2.2%, 16=0.6%, 32=0.1%, 64=1.3%, >=64=0.0%
     issued rwts: total=785920,262656,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=64

Run status group 0 (all jobs):
   READ: bw=50.6MiB/s (53.0MB/s), 50.6MiB/s-50.6MiB/s (53.0MB/s-53.0MB/s), io=3070MiB (3219MB), run=60731-60731msec
  WRITE: bw=16.9MiB/s (17.7MB/s), 16.9MiB/s-16.9MiB/s (17.7MB/s-17.7MB/s), io=1026MiB (1076MB), run=60731-60731msec

Disk stats (read/write):
  nvme1n1: ios=784695/262287, merge=0/41, ticks=53682/2889, in_queue=56571, util=99.89%

Does this mean the disk must sync from a snapshot? Is it not fast enough to sync from genesis?

If the disk is the issue, how does that explain the pattern I noticed above? (The peer count stays low and keeps dropping to 0, and the chain keeps rewinding.)

@deepcrazy

Hmm, it seems like you have limited hardware resources. If you refer to this doc: https://docs.bnbchain.org/docs/validator/fullnode#sync-from-genesis-block-not-recommended, syncing from genesis ideally needs heavier hardware and more than 40K IOPS (my recommendation: 60K IOPS).
The hardware specs you have are suitable for running a full node from a snapshot, so I would recommend running the node from a snapshot from here: https://github.com/bnb-chain/bsc-snapshots#endpoint

If the disk specs are limited, it will be hard to catch up to the current block, because the sync speed will not keep up with the rate at which new blocks are produced.

JFYR: you can also refer to this post: ethereum/go-ethereum#16796 (comment)

@sirgarfieldc
Author

I am trying to download the snapshot at
https://pub-c0627345c16f47ab858c9469133073a8.r2.dev/geth-20230719.tar.lz4
with a 3.8TB disk.

The process is not straightforward without further instructions.

If I use wget and the download errors out, I can use the -c flag to continue the download.

The problem is that after I download the archive, there isn't enough space to hold both the archive and the uncompressed data, so it is not possible to decompress it.

So I tried piping the downloaded content directly into the decompression process:

wget -O - https://pub-c0627345c16f47ab858c9469133073a8.r2.dev/geth-20230719.tar.lz4 | lz4 -dc | tar xvf - -C ./node

but once the download errors out it is not possible to continue where it left off, so I am never able to finish: the download takes too long and will error out at least once.

@0x090909

0x090909 commented Aug 11, 2023

@sirgarfieldc you can try the bnb48 snapshots now; they provide one with local trie verification, which is the same as the one you're downloading but without the ancient data.

With your disk you can download it with aria2c and then extract it without problems (see the sketch below).

Here it is:
https://github.com/48Club/bsc-snapshots

[Screenshot: 2023-08-11 at 09:19:05]
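
A sketch of that approach (the snapshot URL is a placeholder; use the current link from the 48Club repo):

# Segmented, resumable download: -x/-s set parallel connections, -c continues a partial file
aria2c -x 8 -s 8 -c -o geth.fullnode.tar.lz4 "<snapshot-url-from-48Club-repo>"

# Extract in place once the download completes
lz4 -dc geth.fullnode.tar.lz4 | tar xvf - -C ./node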

If you still want to download with wget, you have to use the --tries=0 parameter so wget is not capped at the default maximum of 20 tries but retries network errors indefinitely.
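
Downloading to disk, that would look something like this; combined with -c, an interrupted download can also be resumed later:

# Unlimited retries (--tries=0) plus resume of a partial download (-c); URL is a placeholder
wget --tries=0 -c "<snapshot-url-from-48Club-repo>"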

@CocoStarZ
Collaborator

No further follow-up; closing for now.

@weiihann weiihann added the X-nodesync task filter for node sync issue: full, snap, light... label Dec 4, 2023