eth/protocols/snap: make better use of delivered data #44

Merged
merged 11 commits into from
Apr 27, 2021

Conversation

@holiman holiman commented Apr 21, 2021

This is a bit of a rough implementation; I haven't verified that it works entirely correctly.
Currently, when we retrieve a contract storage which is too large to fit in a single response, but otherwise "pretty small", then we discard data.

The received data might be [0x0.., 0x1...., 0x4..]. We divide the 'space' into 16 chunks, and immediately fill the first chunk with the data. Then we trim the edge, and thus keep [0x0..] but throw away [0x1...., 0x4..], which will instead be fetched separately in later chunks.
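To make the discarded data concrete, here is a minimal sketch of the trimming step. It uses single-byte keys for readability; `trimToChunk` and the sample values are hypothetical and are not taken from the PR's code.

```go
package main

import (
	"bytes"
	"fmt"
	"sort"
)

// trimToChunk returns the prefix of the sorted keys that lies at or below the
// chunk's last key. Everything after that prefix is what gets thrown away.
// (Hypothetical helper, for illustration only.)
func trimToChunk(keys [][]byte, last []byte) [][]byte {
	n := sort.Search(len(keys), func(i int) bool {
		return bytes.Compare(keys[i], last) > 0
	})
	return keys[:n]
}

func main() {
	// One delivered response, sorted: roughly [0x0.., 0x1.., 0x4..].
	delivered := [][]byte{{0x05}, {0x1a}, {0x4c}}

	// With 16 chunks over a one-byte space, the first chunk is 0x00..0x0f.
	kept := trimToChunk(delivered, []byte{0x0f})

	fmt.Printf("kept=%d discarded=%d\n", len(kept), len(delivered)-len(kept))
}
```

In this toy case one key is kept and two are dropped, even though all three were already delivered and verified.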

This PR also adds the (so far not used) method estimateRemainingSlots, which can be used to further tune the level of parallelism we choose.

@holiman holiman requested a review from karalabe as a code owner April 21, 2021 12:25

holiman commented Apr 21, 2021

Added a commit to make use of the storage estimator. Tested it out on goerli:

[user@work go-ethereum]$ cat gethlog.txt | grep orage 
INFO [04-21|15:12:16.137] Resuming state snapshot generation       root=5d6cde..8b3008 accounts=0 slots=0 storage=0.00B elapsed="285.66µs"
INFO [04-21|15:12:16.161] Generated state snapshot                 accounts=260 slots=0 storage=9.69KiB elapsed=24.464ms
INFO [04-21|15:16:09.217] Storage estimation                       delivered=10311 remaining=4775 parallelism=0
INFO [04-21|15:16:09.217] Created (large) storage sync task        account=1135f4..9e491e root=58f340..4bd365 from=000000..000000 last=ffffff..ffffff parallelism=0
INFO [04-21|15:16:13.611] Storage estimation                       delivered=8409  remaining=7405 parallelism=0
INFO [04-21|15:16:13.611] Created (large) storage sync task        account=04d07a..1ba51a root=0a6d2a..9584d7 from=000000..000000 last=ffffff..ffffff parallelism=0
INFO [04-21|15:16:19.035] Storage estimation                       delivered=11154 remaining=96061 parallelism=12
INFO [04-21|15:16:19.035] Created (large) storage sync task        account=127f3d..474d23 root=c02a4c..bdfac9 from=000000..000000 last=1aa1f2..f87861 parallelism=12
INFO [04-21|15:16:19.035] Created (large) storage sync task        account=127f3d..474d23 root=c02a4c..bdfac9 from=1aa1f2..f87862 last=2dbf1e..63c3af parallelism=12
INFO [04-21|15:16:19.035] Created (large) storage sync task        account=127f3d..474d23 root=c02a4c..bdfac9 from=2dbf1e..63c3b0 last=40dc4a..cf0efd parallelism=12
INFO [04-21|15:16:19.035] Created (large) storage sync task        account=127f3d..474d23 root=c02a4c..bdfac9 from=40dc4a..cf0efe last=53f975..3a5a4b parallelism=12
INFO [04-21|15:16:19.035] Created (large) storage sync task        account=127f3d..474d23 root=c02a4c..bdfac9 from=53f975..3a5a4c last=6716a1..a5a599 parallelism=12
INFO [04-21|15:16:19.035] Created (large) storage sync task        account=127f3d..474d23 root=c02a4c..bdfac9 from=6716a1..a5a59a last=7a33cd..10f0e7 parallelism=12
INFO [04-21|15:16:19.035] Created (large) storage sync task        account=127f3d..474d23 root=c02a4c..bdfac9 from=7a33cd..10f0e8 last=8d50f9..7c3c35 parallelism=12
INFO [04-21|15:16:19.035] Created (large) storage sync task        account=127f3d..474d23 root=c02a4c..bdfac9 from=8d50f9..7c3c36 last=a06e25..e78783 parallelism=12
INFO [04-21|15:16:19.035] Created (large) storage sync task        account=127f3d..474d23 root=c02a4c..bdfac9 from=a06e25..e78784 last=b38b50..52d2d1 parallelism=12
INFO [04-21|15:16:19.035] Created (large) storage sync task        account=127f3d..474d23 root=c02a4c..bdfac9 from=b38b50..52d2d2 last=c6a87c..be1e1f parallelism=12
INFO [04-21|15:16:19.035] Created (large) storage sync task        account=127f3d..474d23 root=c02a4c..bdfac9 from=c6a87c..be1e20 last=d9c5a8..29696d parallelism=12
INFO [04-21|15:16:19.035] Created (large) storage sync task        account=127f3d..474d23 root=c02a4c..bdfac9 from=d9c5a8..29696e last=ece2d4..94b4bb parallelism=12
INFO [04-21|15:16:19.035] Created (large) storage sync task        account=127f3d..474d23 root=c02a4c..bdfac9 from=ece2d4..94b4bc last=ffffff..ffffff parallelism=12
INFO [04-21|15:16:22.216] Storage estimation                       delivered=5347  remaining=10056 parallelism=1
INFO [04-21|15:16:22.216] Created (large) storage sync task        account=07f155..7a4bed root=13316a..47488a from=000000..000000 last=58dd13..97d5ae parallelism=1
INFO [04-21|15:16:22.216] Created (large) storage sync task        account=07f155..7a4bed root=13316a..47488a from=58dd13..97d5af last=ffffff..ffffff parallelism=1
INFO [04-21|15:16:25.197] Storage estimation                       delivered=8320  remaining=759   parallelism=0
INFO [04-21|15:16:25.197] Created (large) storage sync task        account=089a49..9b1acb root=a34bdf..86fc60 from=000000..000000 last=ffffff..ffffff parallelism=0
INFO [04-21|15:16:41.984] Storage estimation                       delivered=7164  remaining=1095  parallelism=0
INFO [04-21|15:16:41.984] Created (large) storage sync task        account=0a7cd7..714b5e root=524b7c..3bb854 from=000000..000000 last=ffffff..ffffff parallelism=0
INFO [04-21|15:16:43.973] Storage estimation                       delivered=8661  remaining=11954 parallelism=1
INFO [04-21|15:16:43.973] Created (large) storage sync task        account=0b2eb1..376216 root=787004..be0f82 from=000000..000000 last=6b8d27..14e298 parallelism=1
INFO [04-21|15:16:43.973] Created (large) storage sync task        account=0b2eb1..376216 root=787004..be0f82 from=6b8d27..14e299 last=ffffff..ffffff parallelism=1
INFO [04-21|15:16:45.736] Storage estimation                       delivered=6292  remaining=3680  parallelism=0
INFO [04-21|15:16:45.736] Created (large) storage sync task        account=0c9613..a599f3 root=2051a3..e500dc from=000000..000000 last=ffffff..ffffff parallelism=0
INFO [04-21|15:16:48.839] Storage estimation                       delivered=15102 remaining=31279 parallelism=3
INFO [04-21|15:16:48.839] Created (large) storage sync task        account=0def03..72c5ef root=c1c771..736669 from=000000..000000 last=535af8..627555 parallelism=3
INFO [04-21|15:16:48.839] Created (large) storage sync task        account=0def03..72c5ef root=c1c771..736669 from=535af8..627556 last=8ce750..ec4e39 parallelism=3
INFO [04-21|15:16:48.839] Created (large) storage sync task        account=0def03..72c5ef root=c1c771..736669 from=8ce750..ec4e3a last=c673a8..76271d parallelism=3
INFO [04-21|15:16:48.839] Created (large) storage sync task        account=0def03..72c5ef root=c1c771..736669 from=c673a8..76271e last=ffffff..ffffff parallelism=3
INFO [04-21|15:16:50.537] Storage estimation                       delivered=11210 remaining=7302  parallelism=0
INFO [04-21|15:16:50.537] Created (large) storage sync task        account=0e5471..ec3ecc root=79435f..7c5f1a from=000000..000000 last=ffffff..ffffff parallelism=0
INFO [04-21|15:16:50.832] Storage estimation                       delivered=13805 remaining=575,882 parallelism=16
INFO [04-21|15:16:50.832] Created (large) storage sync task        account=0e6140..d195c9 root=4f0730..f366ff from=000000..000000 last=05fe3e..c74f7e parallelism=16
INFO [04-21|15:16:50.833] Created (large) storage sync task        account=0e6140..d195c9 root=4f0730..f366ff from=05fe3e..c74f7f last=159e5a..2ada87 parallelism=16
INFO [04-21|15:16:50.833] Created (large) storage sync task        account=0e6140..d195c9 root=4f0730..f366ff from=159e5a..2ada88 last=253e76..8e6590 parallelism=16
INFO [04-21|15:16:50.833] Created (large) storage sync task        account=0e6140..d195c9 root=4f0730..f366ff from=253e76..8e6591 last=34de92..f1f099 parallelism=16
INFO [04-21|15:16:50.833] Created (large) storage sync task        account=0e6140..d195c9 root=4f0730..f366ff from=34de92..f1f09a last=447eaf..557ba2 parallelism=16
INFO [04-21|15:16:50.833] Created (large) storage sync task        account=0e6140..d195c9 root=4f0730..f366ff from=447eaf..557ba3 last=541ecb..b906ab parallelism=16
INFO [04-21|15:16:50.833] Created (large) storage sync task        account=0e6140..d195c9 root=4f0730..f366ff from=541ecb..b906ac last=63bee7..1c91b4 parallelism=16
INFO [04-21|15:16:50.833] Created (large) storage sync task        account=0e6140..d195c9 root=4f0730..f366ff from=63bee7..1c91b5 last=735f03..801cbd parallelism=16
INFO [04-21|15:16:50.833] Created (large) storage sync task        account=0e6140..d195c9 root=4f0730..f366ff from=735f03..801cbe last=82ff1f..e3a7c6 parallelism=16
INFO [04-21|15:16:50.833] Created (large) storage sync task        account=0e6140..d195c9 root=4f0730..f366ff from=82ff1f..e3a7c7 last=929f3b..4732cf parallelism=16
INFO [04-21|15:16:50.833] Created (large) storage sync task        account=0e6140..d195c9 root=4f0730..f366ff from=929f3b..4732d0 last=a23f57..aabdd8 parallelism=16
INFO [04-21|15:16:50.833] Created (large) storage sync task        account=0e6140..d195c9 root=4f0730..f366ff from=a23f57..aabdd9 last=b1df73..0e48e1 parallelism=16
INFO [04-21|15:16:50.833] Created (large) storage sync task        account=0e6140..d195c9 root=4f0730..f366ff from=b1df73..0e48e2 last=c17f8f..71d3ea parallelism=16
INFO [04-21|15:16:50.833] Created (large) storage sync task        account=0e6140..d195c9 root=4f0730..f366ff from=c17f8f..71d3eb last=d11fab..d55ef3 parallelism=16
INFO [04-21|15:16:50.833] Created (large) storage sync task        account=0e6140..d195c9 root=4f0730..f366ff from=d11fab..d55ef4 last=e0bfc7..38e9fc parallelism=16
INFO [04-21|15:16:50.833] Created (large) storage sync task        account=0e6140..d195c9 root=4f0730..f366ff from=e0bfc7..38e9fd last=f05fe3..9c7505 parallelism=16
INFO [04-21|15:16:50.833] Created (large) storage sync task        account=0e6140..d195c9 root=4f0730..f366ff from=f05fe3..9c7506 last=ffffff..ffffff parallelism=16
INFO [04-21|15:17:13.047] Storage estimation                       delivered=8454  remaining=7605    parallelism=0
INFO [04-21|15:17:13.047] Created (large) storage sync task        account=0f3152..e609c3 root=2aa9f9..d3273e from=000000..000000 last=ffffff..ffffff parallelism=0
INFO [04-21|15:17:13.411] Storage estimation                       delivered=8279  remaining=985,124 parallelism=16
INFO [04-21|15:17:13.412] Created (large) storage sync task        account=0ef61d..dbca40 root=e9076c..ec9a4f from=000000..000000 last=02222c..5f4e39 parallelism=16
INFO [04-21|15:17:13.412] Created (large) storage sync task        account=0ef61d..dbca40 root=e9076c..ec9a4f from=02222c..5f4e3a last=12000a..f95956 parallelism=16
INFO [04-21|15:17:13.412] Created (large) storage sync task        account=0ef61d..dbca40 root=e9076c..ec9a4f from=12000a..f95957 last=21dde7..936473 parallelism=16

EDIT: updated


holiman commented Apr 21, 2021

These are interesting:

Storage estimation                       delivered=8320  remaining=759   parallelism=0
Created (large) storage sync task        account=089a49..9b1acb root=a34bdf..86fc60 from=000000..000000 last=ffffff..ffffff parallelism=0
Storage estimation                       delivered=7164  remaining=1095  parallelism=0
Created (large) storage sync task        account=0a7cd7..714b5e root=524b7c..3bb854 from=000000..000000 last=ffffff..ffffff parallelism=0
Storage estimation                       delivered=8661  remaining=11954 parallelism=1
Created (large) storage sync task        account=0b2eb1..376216 root=787004..be0f82 from=000000..000000 last=6b8d27..14e298 parallelism=1
Created (large) storage sync task        account=0b2eb1..376216 root=787004..be0f82 from=6b8d27..14e299 last=ffffff..ffffff parallelism=1
Storage estimation                       delivered=6292  remaining=3680  parallelism=0
Created (large) storage sync task        account=0c9613..a599f3 root=2051a3..e500dc from=000000..000000 last=ffffff..ffffff parallelism=0

In many cases, this PR avoids the chunking entirely (no healing required), and in other cases it heavily reduces the healing required. Plus, in cases like delivered=8320 remaining=759, we would otherwise have chunked up ~9000 nodes into 16 segments, stored the first segment, and thrown away the rest, only to refetch it in tiny slices from 16 peers.


holiman commented Apr 21, 2021

Running this on mainnet now. Some preliminary data:

$ cat gethlog.txt | grep "Storage estimation" | head -n100
INFO [04-21|22:14:06.106] Storage estimation                       delivered=13540 remaining=80924 chunks=5
INFO [04-21|22:14:06.467] Storage estimation                       delivered=10192 remaining=6541  chunks=1
INFO [04-21|22:14:09.170] Storage estimation                       delivered=10264 remaining=9562  chunks=1
INFO [04-21|22:14:10.134] Storage estimation                       delivered=16202 remaining=107,036 chunks=6
INFO [04-21|22:14:11.409] Storage estimation                       delivered=4432  remaining=1330    chunks=1
INFO [04-21|22:14:13.322] Storage estimation                       delivered=13386 remaining=6139    chunks=1
INFO [04-21|22:14:13.918] Storage estimation                       delivered=13550 remaining=40964   chunks=3
INFO [04-21|22:14:15.812] Storage estimation                       delivered=13106 remaining=74040   chunks=4
INFO [04-21|22:14:16.625] Storage estimation                       delivered=12988 remaining=6954    chunks=1
INFO [04-21|22:14:17.040] Storage estimation                       delivered=12221 remaining=93026   chunks=5
INFO [04-21|22:14:17.976] Storage estimation                       delivered=6599  remaining=22354   chunks=2
INFO [04-21|22:14:19.938] Storage estimation                       delivered=5820  remaining=5174    chunks=1
INFO [04-21|22:14:22.685] Storage estimation                       delivered=12453 remaining=545,759 chunks=16
INFO [04-21|22:14:22.875] Storage estimation                       delivered=12823 remaining=4827    chunks=1
INFO [04-21|22:14:23.451] Storage estimation                       delivered=7880  remaining=21325   chunks=2
INFO [04-21|22:14:24.340] Storage estimation                       delivered=13402 remaining=6840    chunks=1
INFO [04-21|22:14:29.338] Storage estimation                       delivered=17162 remaining=32508   chunks=2
INFO [04-21|22:14:30.246] Storage estimation                       delivered=8684  remaining=37647   chunks=2
INFO [04-21|22:14:31.441] Storage estimation                       delivered=8809  remaining=16817   chunks=1
INFO [04-21|22:14:35.003] Storage estimation                       delivered=8369  remaining=5636    chunks=1
INFO [04-21|22:14:35.210] Storage estimation                       delivered=13678 remaining=743,827 chunks=16
INFO [04-21|22:14:36.554] Storage estimation                       delivered=10577 remaining=70602   chunks=4
INFO [04-21|22:14:39.150] Storage estimation                       delivered=13491 remaining=1280    chunks=1
INFO [04-21|22:14:46.440] Storage estimation                       delivered=9764  remaining=137,366 chunks=7
INFO [04-21|22:14:52.793] Storage estimation                       delivered=2902  remaining=2788    chunks=1
INFO [04-21|22:14:53.016] Storage estimation                       delivered=12137 remaining=4,925,201 chunks=16
INFO [04-21|22:14:54.069] Storage estimation                       delivered=11931 remaining=765       chunks=1
INFO [04-21|22:14:54.361] Storage estimation                       delivered=6986  remaining=6595      chunks=1
INFO [04-21|22:14:55.242] Storage estimation                       delivered=10965 remaining=2146      chunks=1
INFO [04-21|22:14:58.928] Storage estimation                       delivered=7673  remaining=18341     chunks=1
INFO [04-21|22:15:00.124] Storage estimation                       delivered=12025 remaining=5804      chunks=1
INFO [04-21|22:15:00.460] Storage estimation                       delivered=12700 remaining=1923      chunks=1
INFO [04-21|22:15:01.822] Storage estimation                       delivered=6567  remaining=5677      chunks=1
INFO [04-21|22:15:02.978] Storage estimation                       delivered=5318  remaining=6852      chunks=1
INFO [04-21|22:15:03.074] Storage estimation                       delivered=7880  remaining=2166      chunks=1
INFO [04-21|22:15:04.688] Storage estimation                       delivered=11899 remaining=89020     chunks=5
INFO [04-21|22:15:06.576] Storage estimation                       delivered=13343 remaining=31417     chunks=2
INFO [04-21|22:15:08.235] Storage estimation                       delivered=9762  remaining=13635     chunks=1
INFO [04-21|22:15:08.730] Storage estimation                       delivered=8293  remaining=9160      chunks=1
INFO [04-21|22:15:11.184] Storage estimation                       delivered=12959 remaining=6107      chunks=1
INFO [04-21|22:15:11.415] Storage estimation                       delivered=12444 remaining=643       chunks=1
INFO [04-21|22:15:11.962] Storage estimation                       delivered=9152  remaining=167,030   chunks=9
INFO [04-21|22:15:13.094] Storage estimation                       delivered=7538  remaining=16499     chunks=1
INFO [04-21|22:15:13.559] Storage estimation                       delivered=14085 remaining=56971     chunks=3
INFO [04-21|22:15:15.073] Storage estimation                       delivered=9847  remaining=961       chunks=1
INFO [04-21|22:15:15.483] Storage estimation                       delivered=12424 remaining=30924     chunks=2
INFO [04-21|22:15:18.217] Storage estimation                       delivered=13534 remaining=6233      chunks=1
INFO [04-21|22:15:22.907] Storage estimation                       delivered=14539 remaining=10197     chunks=1
INFO [04-21|22:15:23.176] Storage estimation                       delivered=4909  remaining=657       chunks=1
INFO [04-21|22:15:23.781] Storage estimation                       delivered=5113  remaining=4464      chunks=1
INFO [04-21|22:15:24.331] Storage estimation                       delivered=11886 remaining=9621      chunks=1
INFO [04-21|22:15:24.607] Storage estimation                       delivered=9773  remaining=7721      chunks=1
INFO [04-21|22:15:25.741] Storage estimation                       delivered=7493  remaining=22234     chunks=2
INFO [04-21|22:15:29.677] Storage estimation                       delivered=11643 remaining=20797     chunks=2
INFO [04-21|22:15:31.630] Storage estimation                       delivered=7704  remaining=8889      chunks=1
INFO [04-21|22:15:32.803] Storage estimation                       delivered=1507  remaining=16153     chunks=1
INFO [04-21|22:15:35.064] Storage estimation                       delivered=10114 remaining=47974     chunks=3
INFO [04-21|22:15:37.670] Storage estimation                       delivered=8865  remaining=2372      chunks=1
INFO [04-21|22:15:40.232] Storage estimation                       delivered=10684 remaining=170,745   chunks=9
INFO [04-21|22:15:43.097] Storage estimation                       delivered=11081 remaining=1016      chunks=1
INFO [04-21|22:15:45.119] Storage estimation                       delivered=8354  remaining=14071     chunks=1
INFO [04-21|22:15:46.918] Storage estimation                       delivered=11416 remaining=47291     chunks=3
INFO [04-21|22:15:48.825] Storage estimation                       delivered=7056  remaining=6384      chunks=1
INFO [04-21|22:15:51.648] Storage estimation                       delivered=12461 remaining=2464      chunks=1
INFO [04-21|22:15:52.077] Storage estimation                       delivered=10973 remaining=23716     chunks=2
INFO [04-21|22:15:52.896] Storage estimation                       delivered=13596 remaining=4521      chunks=1
INFO [04-21|22:15:53.552] Storage estimation                       delivered=7220  remaining=5,994,121 chunks=16
INFO [04-21|22:16:02.747] Storage estimation                       delivered=8363  remaining=8598      chunks=1
INFO [04-21|22:16:03.995] Storage estimation                       delivered=10582 remaining=14400     chunks=1
INFO [04-21|22:16:12.072] Storage estimation                       delivered=7739  remaining=26643     chunks=2
INFO [04-21|22:16:17.807] Storage estimation                       delivered=7126  remaining=413,866   chunks=16
INFO [04-21|22:16:30.014] Storage estimation                       delivered=12111 remaining=293,145   chunks=15
INFO [04-21|22:17:02.921] Storage estimation                       delivered=8952  remaining=69747     chunks=4
INFO [04-21|22:17:05.911] Storage estimation                       delivered=10820 remaining=121,029   chunks=7
INFO [04-21|22:17:26.719] Storage estimation                       delivered=10881 remaining=8318      chunks=1
INFO [04-21|22:17:33.641] Storage estimation                       delivered=11468 remaining=5142      chunks=1
INFO [04-21|22:17:38.126] Storage estimation                       delivered=11775 remaining=81817     chunks=5
INFO [04-21|22:17:41.261] Storage estimation                       delivered=11544 remaining=14765     chunks=1
INFO [04-21|22:17:47.486] Storage estimation                       delivered=14029 remaining=10737     chunks=1
INFO [04-21|22:17:48.882] Storage estimation                       delivered=7935  remaining=18351     chunks=1
INFO [04-21|22:17:50.359] Storage estimation                       delivered=8818  remaining=9310      chunks=1
INFO [04-21|22:17:50.804] Storage estimation                       delivered=14262 remaining=167,070   chunks=9
INFO [04-21|22:17:58.755] Storage estimation                       delivered=10667 remaining=27570     chunks=2
INFO [04-21|22:18:02.728] Storage estimation                       delivered=9794  remaining=1143      chunks=1
INFO [04-21|22:18:14.769] Storage estimation                       delivered=13400 remaining=7690      chunks=1
INFO [04-21|22:18:19.600] Storage estimation                       delivered=2868  remaining=52045     chunks=3
INFO [04-21|22:18:25.174] Storage estimation                       delivered=16051 remaining=1,043,694 chunks=16
INFO [04-21|22:18:30.846] Storage estimation                       delivered=7815  remaining=40422     chunks=3
INFO [04-21|22:18:47.228] Storage estimation                       delivered=10438 remaining=3777      chunks=1
INFO [04-21|22:18:54.220] Storage estimation                       delivered=8207  remaining=7817      chunks=1
INFO [04-21|22:19:02.931] Storage estimation                       delivered=11316 remaining=17222     chunks=1
INFO [04-21|22:19:11.715] Storage estimation                       delivered=10761 remaining=79069     chunks=4
INFO [04-21|22:19:14.395] Storage estimation                       delivered=5922  remaining=20633     chunks=2
INFO [04-21|22:19:14.468] Storage estimation                       delivered=13912 remaining=49042     chunks=3
INFO [04-21|22:19:27.460] Storage estimation                       delivered=11764 remaining=3758      chunks=1
INFO [04-21|22:19:34.096] Storage estimation                       delivered=3411  remaining=4693      chunks=1
INFO [04-21|22:19:34.375] Storage estimation                       delivered=14874 remaining=35307     chunks=2
INFO [04-21|22:19:35.401] Storage estimation                       delivered=4476  remaining=12507     chunks=1
INFO [04-21|22:19:39.894] Storage estimation                       delivered=13126 remaining=9710      chunks=1
INFO [04-21|22:19:40.665] Storage estimation                       delivered=14851 remaining=40064     chunks=3

Out of 100 "large" storage tries, only 6 were deemed to actually require the full 16-way parallelism.
58 were deemed not to need any chunking at all, so that's over 50% fewer heals.

The largest partial delivery was delivered=17298 remaining=95701 chunks=5, which is interesting: I hadn't thought that 17K slots would fit into a 500KB packet.


holiman commented Apr 22, 2021

From my mainnet run on a NUC:

INFO [04-21|22:14:03.256] State sync in progress                   synced=+Inf% state=622.52KiB accounts=0@0.00B slots=2818@622.52KiB codes=0@0.00B eta=-808.455ms
...
INFO [04-22|04:33:11.545] State sync in progress                   synced=99.97% state=77.95GiB   accounts=131,332,785@6.88GiB  slots=423,216,502@69.23GiB codes=375,805@1.84GiB    eta=-59.331s
...
INFO [04-22|04:33:30.361] State heal in progress                   accounts=0@0.00B              slots=0@0.00B              codes=0@0.00B            nodes=785@407.83KiB pending=10800
...
INFO [04-22|05:54:22.644] State heal in progress                   accounts=220,702@10.49MiB     slots=242,763@18.30MiB     codes=68@475.02KiB       nodes=1,623,581@428.17MiB pending=0
INFO [04-22|05:54:22.681] Rebuilding state snapshot 
...
INFO [04-22|09:01:52.809] Resuming state snapshot generation       root=03104b..ff3e94 in=5b4552..818b95 at=792db9..f6ad8b accounts=45,716,538           slots=151,451,600          storage=12.99GiB   elapsed=3h7m30.127s  eta=5h38m24.685s

So about 6h for the regular snap sync phase, then 1.5h for the healing, and an estimated 8.5h for the snapshot generation.


holiman commented Apr 22, 2021

17452 being the largest seen partial delivery.

This is how the parallelism was performed:

   2367 1
    535 2
    306 3
    168 4
    116 5
    156 16
     74 6
     60 7
     37 8
     37 9
     25 10
     23 11
     14 12
     14 13
     15 14
      6 15

I.e., 2367 storage tries were fetched in one contiguous chunk, with no healing needed. 535 were split into two chunks, etc. Only 156 were large enough to warrant the full 16-way chunking.

@@ -1803,30 +1804,46 @@ func (s *Syncer) processStorageResponse(res *storageResponse) {
// the subtasks for it within the main account task
if tasks, ok := res.mainTask.SubTasks[account]; !ok {
var (
next common.Hash
keys = res.hashes[i]
lastKey = keys[len(keys)-1]
Owner

Out of curiosity, could someone send us an empty list of keys as the last batch and crash?

Author

Good question. Looks like it (but I wasn't able to trivially trigger it in a test). I'll fix it somehow
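For illustration, one way to guard the `keys[len(keys)-1]` access is to check the length first and let the caller treat an empty batch as "nothing delivered". This is a sketch only; the `hash` type and the `lastKey` helper are hypothetical stand-ins, and this is not necessarily how the fix was done in the PR.

```go
package main

import "fmt"

// hash stands in for a 32-byte key type (hypothetical, for illustration).
type hash [32]byte

// lastKey returns the final key of a delivered batch. The boolean is false
// for an empty batch, so callers never index keys[len(keys)-1] on an empty
// slice, which would panic.
func lastKey(keys []hash) (hash, bool) {
	if len(keys) == 0 {
		return hash{}, false
	}
	return keys[len(keys)-1], true
}

func main() {
	if _, ok := lastKey(nil); !ok {
		fmt.Println("empty batch: skipping chunking instead of crashing")
	}
	k, ok := lastKey([]hash{{0x01}, {0x02}})
	fmt.Println(k[0], ok)
}
```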

// Somewhere on the order of 10K slots fits into a packet. We'd rather not chunk if
// each chunk is going to be only one single packet, so we use a factor of 2 to
// avoid chunking if we expect the remaining data to be filled with ~2 packets.
if n := estimate / (2 * 10000); n+1 < chunks {
Owner

This hard-coded 10K will blow up the moment we change anything about the packet sizes. Perhaps let's use maxRequestSize / 64? That seems a bit saner.
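The heuristic in the quoted snippet can be isolated as a small function. The sketch below keeps the 10K figure from the snippet as a named constant so it is easy to swap for something derived from the request size, as suggested above; `chunkCount`, `maxChunks` and `slotsPerPacket` are names chosen for this illustration.

```go
package main

import "fmt"

const (
	maxChunks      = 16    // upper bound on parallel chunks
	slotsPerPacket = 10000 // rough number of slots per response (assumption from the snippet)
)

// chunkCount picks how many chunks to split the remaining storage into.
// It avoids chunking when each chunk would be filled by about one packet,
// hence the factor of 2.
func chunkCount(estimate uint64) uint64 {
	chunks := uint64(maxChunks)
	if n := estimate / (2 * slotsPerPacket); n+1 < chunks {
		chunks = n + 1
	}
	return chunks
}

func main() {
	for _, remaining := range []uint64{6541, 80924, 545759} {
		fmt.Printf("remaining=%d chunks=%d\n", remaining, chunkCount(remaining))
	}
}
```

For the remaining values 6541, 80924 and 545759 from the mainnet log earlier in the thread, this gives 1, 5 and 16 chunks, which agrees with the logged chunks values.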

}

// newHashRange creates a new hashRange, initiated at the start position,
// and with the step set to fill the desired 'num' chunks
func newHashRange(start common.Hash, num uint64) *hashRange {
i := uint256.NewInt()
i.SetBytes32(start[:])
left := new(big.Int).Sub(new(big.Int).Add(math.MaxBig256, common.Big1), start.Big())
Author

I don't get this.
new(big.Int).Add(math.MaxBig256, common.Big1) == 0 ?
Which makes this left := 0 - start ??
So step = (-start / num) which maybe works out in the end, but still I don't see how this is more correct than what was there previously?

Owner

I used big ints, not uint256 for this calculation, so there's no overflow. The number of items in the hash range is 2^256, which cannot be represented by uint256. I needed the +1 for the correct calculation.


holiman commented Apr 27, 2021

With the latest change, where we only stop the chunking at overflow, it sometimes gets into a position where the steps don't quite reach the end on the correct chunk, forcing one more tiny chunk to be created:

 Apr 26 18:53:23 bench03.ethdevops.io geth INFO [04-26|16:53:23.754] Created storage sync task account=900e6b..24c213 root=1903da..f5de54 from=e9f685..dfd0fd last=ffffff..fffffb
Apr 26 18:53:23 bench03.ethdevops.io geth INFO [04-26|16:53:23.754] Created storage sync task account=900e6b..24c213 root=1903da..f5de54 from=ffffff..fffffc last=ffffff..ffffff

Otherwise LGTM, but I'm not sure why you swapped to use big.Int in the ctor, instead of using uint256

@karalabe
Owner

The bug was that I rounded the step down, not up, so the last chunk ended up a bit short of covering the full range. Fixed now.


holiman commented Apr 27, 2021

LGTM!

@karalabe karalabe merged this pull request into karalabe:dirty-snap Apr 27, 2021
karalabe added a commit that referenced this pull request Apr 27, 2021
* eth/protocols/snap: make better use of delivered data

* squashme

* eth/protocols/snap: reduce chunking

* squashme

* eth/protocols/snap: reduce chunking further

* eth/protocols/snap: break out hash range calculations

* eth/protocols/snap: use sort.Search instead of looping

* eth/protocols/snap: prevent crash on storage response with no keys

* eth/protocols/snap: nitpicks all around

* eth/protocols/snap: clear heal need on 1-chunk storage completion

* eth/protocols/snap: fix range chunker, add tests

Co-authored-by: Péter Szilágyi <peterke@gmail.com>
karalabe added a commit that referenced this pull request Apr 27, 2021
…thereum#22668)

* eth/protocols/snap: generate storage trie from full dirty snap data

* eth/protocols/snap: get rid of some more dead code

* eth/protocols/snap: less frequent logs, also log during trie generation

* eth/protocols/snap: implement dirty account range stack-hashing

* eth/protocols/snap: don't loop on account trie generation

* eth/protocols/snap: fix account format in trie

* core, eth, ethdb: glue snap packets together, but not chunks

* eth/protocols/snap: print completion log for snap phase

* eth/protocols/snap: extended tests

* eth/protocols/snap: make testcase pass

* eth/protocols/snap: fix account stacktrie commit without defer

* ethdb: fix key counts on reset

* eth/protocols: fix typos

* eth/protocols/snap: make better use of delivered data (#44)

* eth/protocols/snap: make better use of delivered data

* squashme

* eth/protocols/snap: reduce chunking

* squashme

* eth/protocols/snap: reduce chunking further

* eth/protocols/snap: break out hash range calculations

* eth/protocols/snap: use sort.Search instead of looping

* eth/protocols/snap: prevent crash on storage response with no keys

* eth/protocols/snap: nitpicks all around

* eth/protocols/snap: clear heal need on 1-chunk storage completion

* eth/protocols/snap: fix range chunker, add tests

Co-authored-by: Péter Szilágyi <peterke@gmail.com>

* trie: fix test API error

* eth/protocols/snap: fix some further liter issues

* eth/protocols/snap: fix accidental batch reuse

Co-authored-by: Martin Holst Swende <martin@swende.se>