rlp, trie: faster trie node encoding (#24126) #606

Open
wants to merge 1 commit into
base: path-base-implementing

Conversation

minh-bq
Contributor

@minh-bq minh-bq commented Oct 18, 2024

commit ethereum/go-ethereum@65ed1a6.

This change speeds up trie hashing and all other activities that require RLP encoding of trie nodes by approximately 20%. The speedup is achieved by avoiding reflection overhead during node encoding.

The interface type trie.node now contains a method 'encode' that works with rlp.EncoderBuffer. Management of EncoderBuffers is left to calling code. trie.hasher, which is pooled to avoid allocations, now maintains an EncoderBuffer. This means memory resources related to trie node encoding are tied to the hasher pool.

goos: linux
goarch: amd64
cpu: 11th Gen Intel(R) Core(TM) i7-1165G7 @ 2.80GHz
                          │   old.txt    │               new.txt                │
                          │    sec/op    │    sec/op     vs base                │
DeriveSha200/std_trie-8     725.1µ ± 31%   613.8µ ± 37%        ~ (p=0.481 n=10)
DeriveSha200/stack_trie-8   572.3µ ± 10%   493.1µ ± 13%  -13.85% (p=0.005 n=10)
geomean                     644.2µ         550.1µ        -14.61%

                          │   old.txt    │               new.txt                │
                          │     B/op     │     B/op      vs base                │
DeriveSha200/std_trie-8     287.4Ki ± 0%   283.0Ki ± 0%   -1.53% (p=0.000 n=10)
DeriveSha200/stack_trie-8   56.34Ki ± 0%   42.43Ki ± 0%  -24.69% (p=0.000 n=10)
geomean                     127.2Ki        109.6Ki       -13.88%

                          │   old.txt   │               new.txt               │
                          │  allocs/op  │  allocs/op   vs base                │
DeriveSha200/std_trie-8     2.931k ± 0%   2.917k ± 0%   -0.46% (p=0.000 n=10)
DeriveSha200/stack_trie-8   1.462k ± 0%   1.246k ± 0%  -14.77% (p=0.000 n=10)
geomean                     2.070k        1.907k        -7.90%

                         │   old.txt    │               new.txt                │
                         │    sec/op    │    sec/op     vs base                │
Prove-8                    664.0µ ± 21%   450.2µ ± 27%  -32.20% (p=0.000 n=10)
VerifyProof-8              8.643µ ± 18%   9.009µ ± 33%        ~ (p=0.684 n=10)
VerifyRangeProof10-8       99.18µ ± 25%   67.60µ ± 67%        ~ (p=0.089 n=10)
VerifyRangeProof100-8      496.3µ ± 20%   487.0µ ± 33%        ~ (p=0.739 n=10)
VerifyRangeProof1000-8     5.149m ± 32%   4.095m ± 49%        ~ (p=0.971 n=10)
VerifyRangeProof5000-8     19.79m ± 60%   19.16m ± 28%        ~ (p=0.631 n=10)
VerifyRangeNoProof10-8     499.0µ ± 15%   422.8µ ± 29%  -15.25% (p=0.035 n=10)
VerifyRangeNoProof500-8    1.747m ± 30%   1.417m ± 24%  -18.91% (p=0.023 n=10)
VerifyRangeNoProof1000-8   3.025m ± 29%   2.239m ± 33%  -25.98% (p=0.009 n=10)
geomean                    750.9µ         622.6µ        -17.09%

                     │    old.txt    │               new.txt                │
                     │    sec/op     │    sec/op     vs base                │
HashFixedSize/10-8      60.30µ ± 19%   44.84µ ± 17%  -25.64% (p=0.000 n=10)
HashFixedSize/100-8     205.9µ ± 32%   145.2µ ± 19%  -29.48% (p=0.000 n=10)
HashFixedSize/1K-8     1326.5µ ± 23%   939.2µ ± 25%  -29.20% (p=0.002 n=10)
HashFixedSize/10K-8     14.77m ± 25%   12.74m ± 19%        ~ (p=0.075 n=10)
HashFixedSize/100K-8    135.2m ± 19%   104.1m ± 18%  -23.03% (p=0.003 n=10)
geomean                 2.011m         1.520m        -24.43%

                     │    old.txt    │               new.txt                │
                     │     B/op      │     B/op      vs base                │
HashFixedSize/10-8     11.729Ki ± 0%   9.752Ki ± 0%  -16.85% (p=0.000 n=10)
HashFixedSize/100-8     58.56Ki ± 0%   49.23Ki ± 0%  -15.93% (p=0.000 n=10)
HashFixedSize/1K-8      578.1Ki ± 0%   481.5Ki ± 0%  -16.72% (p=0.000 n=10)
HashFixedSize/10K-8     6.019Mi ± 0%   4.985Mi ± 0%  -17.18% (p=0.000 n=10)
HashFixedSize/100K-8    59.53Mi ± 0%   49.29Mi ± 0%  -17.20% (p=0.000 n=10)
geomean                 683.5Ki        568.8Ki       -16.78%

                     │   old.txt   │              new.txt               │
                     │  allocs/op  │  allocs/op   vs base               │
HashFixedSize/10-8      149.0 ± 0%    142.0 ± 0%  -4.70% (p=0.000 n=10)
HashFixedSize/100-8     772.0 ± 0%    739.0 ± 0%  -4.27% (p=0.000 n=10)
HashFixedSize/1K-8     7.443k ± 0%   7.099k ± 0%  -4.62% (p=0.000 n=10)
HashFixedSize/10K-8    77.09k ± 0%   73.32k ± 0%  -4.89% (p=0.000 n=10)
HashFixedSize/100K-8   767.8k ± 0%   730.5k ± 0%  -4.86% (p=0.000 n=10)
geomean                8.729k        8.321k       -4.67%

Co-authored-by: Felix Lange <fjl@twurst.com>