core/rawdb: implement sequential reads in freezer_table #23117

Merged Aug 13, 2021 (6 commits)

@holiman holiman commented Jun 28, 2021

This PR is a bit of a successor to #20308, which implemented sequential access to freezer data via an iterator interface.
This PR instead returns a slice of data. It does not defer reads until Next(); instead, it does all the reads while holding the read lock.
It also switches the implementation of Retrieve to be just a shortcut for doing a sequential read with max items set to 1. This PR doesn't really add any uses of the new method (except Retrieve), but it does improve one thing as-is: previously, two syscalls were performed to read the two index items from the index file. This PR instead reads both index items using only one syscall.

Later on, this can be used to serve eth64 data such as headers directly from ancients with a minimum number of syscalls. If that is done, we can get by with 2 syscalls per "192-header-delivery" instead of 576.
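
To make the syscall arithmetic concrete, here is a minimal, self-contained sketch of the idea, not the code from this PR. Everything in it is an assumption for illustration: the 8-byte index entries, the names `readRange`, `retrieve`, `buildTable` and `countingReaderAt`, and the simplification that all items live in a single data file. Each `ReadAt` call stands in for one read syscall.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
	"io"
)

// entrySize is an assumed index-entry width for this sketch: one big-endian
// uint64 offset per entry. The real freezer index layout differs.
const entrySize = 8

// countingReaderAt counts ReadAt calls; each call stands in for one syscall.
type countingReaderAt struct {
	r     io.ReaderAt
	calls int
}

func (c *countingReaderAt) ReadAt(p []byte, off int64) (int, error) {
	c.calls++
	return c.r.ReadAt(p, off)
}

// buildTable packs items into a data blob plus an index of cumulative offsets.
// Index entry i is the start of item i; entry len(items) is the total size.
func buildTable(items ...string) (index, data *countingReaderAt) {
	var idx, dat bytes.Buffer
	var entry [entrySize]byte
	idx.Write(entry[:]) // the first item starts at offset 0
	for _, it := range items {
		dat.WriteString(it)
		binary.BigEndian.PutUint64(entry[:], uint64(dat.Len()))
		idx.Write(entry[:])
	}
	return &countingReaderAt{r: bytes.NewReader(idx.Bytes())},
		&countingReaderAt{r: bytes.NewReader(dat.Bytes())}
}

// readRange returns count items starting at item start, using exactly one
// index read and one data read regardless of count.
func readRange(index, data io.ReaderAt, start, count uint64) ([][]byte, error) {
	// One read fetches all count+1 index entries bounding the range.
	buf := make([]byte, (count+1)*entrySize)
	if _, err := index.ReadAt(buf, int64(start*entrySize)); err != nil {
		return nil, err
	}
	offsets := make([]uint64, count+1)
	for i := range offsets {
		offsets[i] = binary.BigEndian.Uint64(buf[i*entrySize:])
	}
	// One read fetches the contiguous bytes covering every requested item.
	base := offsets[0]
	blob := make([]byte, offsets[count]-base)
	if _, err := data.ReadAt(blob, int64(base)); err != nil {
		return nil, err
	}
	// Slice the blob into individual items without further reads.
	items := make([][]byte, count)
	for i := range items {
		items[i] = blob[offsets[i]-base : offsets[i+1]-base]
	}
	return items, nil
}

// retrieve is a single-item lookup expressed as a range read of one item.
func retrieve(index, data io.ReaderAt, n uint64) ([]byte, error) {
	items, err := readRange(index, data, n, 1)
	if err != nil {
		return nil, err
	}
	return items[0], nil
}

func main() {
	headers := make([]string, 192)
	for i := range headers {
		headers[i] = fmt.Sprintf("header-%03d", i)
	}

	idx, dat := buildTable(headers...)
	for i := range headers {
		if _, err := retrieve(idx, dat, uint64(i)); err != nil {
			panic(err)
		}
	}
	fmt.Println("one item at a time:", idx.calls+dat.calls, "reads")

	idx, dat = buildTable(headers...)
	if _, err := readRange(idx, dat, 0, uint64(len(headers))); err != nil {
		panic(err)
	}
	fmt.Println("batched:", idx.calls+dat.calls, "reads")
}
```

In this sketch, fetching 192 items one at a time costs 384 reads (two per item, matching the single index read described above), while the batched call costs 2. The 576 figure corresponds to the older path with three reads per item (192 × 3). A range that spans more than one data file would need additional reads, which the sketch ignores.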

@holiman holiman requested a review from fjl June 28, 2021 11:50
holiman commented Jun 29, 2021

I made a little benchmark tool for this, in this branch.

Using the old reads, where each header is read separately at a cost of 3 syscalls per header (so 576 syscalls per 192 headers):

192 headers in 962.083µs: 199.57 headers/ms
688128 headers in 5.000379735s: 137.62 headers/ms
1617984 headers in 10.00078688s: 161.79 headers/ms
2851968 headers in 14.999695984s: 190.14 headers/ms
4238400 headers in 19.998605178s: 211.93 headers/ms
5996928 headers in 24.997846647s: 239.90 headers/ms
8137536 headers in 29.996676305s: 271.28 headers/ms
10647360 headers in 34.995323103s: 304.25 headers/ms
13380480 headers in 39.994063183s: 334.56 headers/ms
16335360 headers in 44.993105166s: 363.06 headers/ms
19403136 headers in 49.991620151s: 388.13 headers/ms
22560576 headers in 54.990467372s: 410.26 headers/ms
25768320 headers in 59.988958187s: 429.55 headers/ms
28992768 headers in 1m4.987618054s: 446.13 headers/ms
32201472 headers in 1m9.986038652s: 460.11 headers/ms
35439360 headers in 1m14.984864442s: 472.62 headers/ms
38681856 headers in 1m19.983580689s: 483.62 headers/ms
41926272 headers in 1m24.98204837s: 493.35 headers/ms
45177216 headers in 1m29.980422998s: 502.08 headers/ms
48415680 headers in 1m34.978883429s: 509.75 headers/ms
51676800 headers in 1m39.97733937s: 516.89 headers/ms
54936768 headers in 1m44.975900812s: 523.33 headers/ms
58201344 headers in 1m49.974443941s: 529.23 headers/ms
61437888 headers in 1m54.973013666s: 534.37 headers/ms
64694400 headers in 1m59.971581851s: 539.25 headers/ms
67878528 headers in 2m4.970055655s: 543.16 headers/ms
71136576 headers in 2m9.968586182s: 547.34 headers/ms
74400192 headers in 2m14.96715173s: 551.25 headers/ms
77652096 headers in 2m19.965662809s: 554.79 headers/ms
80911488 headers in 2m24.964044114s: 558.15 headers/ms
84173184 headers in 2m29.962889512s: 561.29 headers/ms
87436416 headers in 2m34.961447644s: 564.25 headers/ms
90681792 headers in 2m39.959987835s: 566.90 headers/ms
93945792 headers in 2m44.958553317s: 569.51 headers/ms
97208064 headers in 2m49.957255953s: 571.96 headers/ms
100463424 headers in 2m54.955884838s: 574.22 headers/ms
103725120 headers in 2m59.954570697s: 576.40 headers/ms
106978176 headers in 3m4.953132496s: 578.41 headers/ms
110243712 headers in 3m9.951801007s: 580.38 headers/ms
113503296 headers in 3m14.950283968s: 582.22 headers/ms
116766528 headers in 3m19.948830709s: 583.98 headers/ms
120026304 headers in 3m24.947243624s: 585.64 headers/ms

It speeds up continuously (it's on a NUC with pretty hefty RAM, so the disk reads get better cached over time), but after three and a half minutes it has only reached 585 headers per millisecond.

If the sequential-read mechanism in this PR is used instead, which does 2 (instead of 576) syscalls per 192 headers, we get:

./build/bin/geth db bench2 10000000
INFO [06-29|14:28:05.085] Maximum peer count                       ETH=50 LES=0 total=50
INFO [06-29|14:28:05.085] Smartcard socket not found, disabling    err="stat /run/pcscd/pcscd.comm: no such file or directory"
INFO [06-29|14:28:05.085] Set global gas cap                       cap=50,000,000
INFO [06-29|14:28:05.085] Opening freezer                          location=/home/martin/.ethereum/geth/chaindata/ancient name=headers
192 headers in 125.972µs: 1524.15 headers/ms
9017856 headers in 4.994948262s: 1805.40 headers/ms
18025920 headers in 9.990446257s: 1804.32 headers/ms
27043200 headers in 14.98572444s: 1804.60 headers/ms
36030528 headers in 19.980900692s: 1803.25 headers/ms
45035328 headers in 24.976188053s: 1803.13 headers/ms
54063360 headers in 29.97161218s: 1803.82 headers/ms
63072576 headers in 34.967231597s: 1803.76 headers/ms
72077952 headers in 39.962521288s: 1803.64 headers/ms
81045120 headers in 44.957913329s: 1802.69 headers/ms
90033600 headers in 49.953242546s: 1802.36 headers/ms
99019392 headers in 54.948543408s: 1802.04 headers/ms
108010944 headers in 59.943880611s: 1801.87 headers/ms
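
For anyone wanting to check the headers/ms column, the following sketch recomputes the rate from the final line of each run above. The `rate` helper is a hypothetical name introduced here; the inputs are copied from the two logs. Comparing only the last lines is a rough measure, since the two runs have different durations and the old-read run was still speeding up when it ended.

```go
package main

import (
	"fmt"
	"time"
)

// rate returns headers per millisecond, given a header count and an elapsed
// time written as a Go duration string (the format used in the logs).
func rate(headers float64, elapsed string) (float64, error) {
	d, err := time.ParseDuration(elapsed)
	if err != nil {
		return 0, err
	}
	return headers / (float64(d) / float64(time.Millisecond)), nil
}

func main() {
	// Final line of the old per-header run.
	old, err := rate(120026304, "3m24.947243624s")
	if err != nil {
		panic(err)
	}
	// Final line of the sequential-read run.
	batched, err := rate(108010944, "59.943880611s")
	if err != nil {
		panic(err)
	}
	fmt.Printf("old reads:        %.2f headers/ms\n", old)
	fmt.Printf("sequential reads: %.2f headers/ms\n", batched)
	fmt.Printf("ratio:            %.2fx\n", batched/old)
}
```

On these two data points the sequential read comes out at roughly three times the throughput of the old path, and it gets there from the first few seconds rather than after minutes of cache warm-up.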

@karalabe karalabe self-assigned this Aug 3, 2021
@karalabe karalabe added this to the 1.10.7 milestone Aug 3, 2021

@karalabe karalabe left a comment

Mostly nitpicks, but there's one corner case that I believe is broken.

Resolved review threads:
core/rawdb/freezer.go (2)
ethdb/database.go (2)
core/rawdb/table.go (1)
core/rawdb/freezer_table.go (3)
core/rawdb/freezer_table_test.go (2)
@holiman holiman force-pushed the sequential_freezer_reads branch from 8f1453f to 821c696 Compare August 7, 2021 18:15
@karalabe karalabe removed this from the 1.10.7 milestone Aug 10, 2021

@karalabe karalabe left a comment

SGTM

@karalabe karalabe added this to the 1.10.8 milestone Aug 10, 2021
@karalabe karalabe merged commit 5f98020 into ethereum:master Aug 13, 2021
atif-konasl pushed a commit to frozeman/pandora-execution-engine that referenced this pull request Oct 15, 2021
* core/rawdb: implement sequential reads in freezer_table

* core/rawdb, ethdb: add sequential reader to db interface

* core/rawdb: lint nitpicks

* core/rawdb: fix some nitpicks

* core/rawdb: fix flaw with deferred reads not being performed

* core/rawdb: better documentation