chore: decompose AttesterStatus #6945

Merged: 12 commits, Jul 16, 2024

@wemeetagain wemeetagain commented Jul 11, 2024

Motivation

  • slow epoch transition, code investigation

Description

  • decompose AttesterStatus[] into separate arrays in EpochTransitionCache
  • these separate arrays are filled ahead of time rather than created one-by-one
  • these separate arrays are also preallocated and reused instead of recreated / gc'd each epoch transition
  • running on feat3 (since 2024-07-11 15:30 UTC); initial metrics look good (faster epoch transition). Will return with more metrics after it has run for a while
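
The decomposition described in the bullets above can be sketched roughly as follows. This is an illustration under assumptions, not the code from this PR: the field names (`flags`, `proposerIndices`, `inclusionDelays`), the use of typed arrays, and the functions `buildStatusObjects` / `buildDecomposedStatuses` are all hypothetical, chosen only to contrast one object per validator with one array per field.

```typescript
// Hypothetical per-validator status; field names are illustrative only.
interface StatusSketch {
  flags: number;
  proposerIndex: number;
  inclusionDelay: number;
}

// Before: an array of objects, one allocation per validator on every call.
function buildStatusObjects(validatorCount: number): StatusSketch[] {
  const statuses: StatusSketch[] = [];
  for (let i = 0; i < validatorCount; i++) {
    statuses.push({flags: 0, proposerIndex: -1, inclusionDelay: 0});
  }
  return statuses;
}

// After: one array per field, allocated and filled in bulk.
interface DecomposedStatusSketch {
  flags: Uint8Array;
  proposerIndices: Int32Array;
  inclusionDelays: Uint32Array;
}

function buildDecomposedStatuses(validatorCount: number): DecomposedStatusSketch {
  return {
    flags: new Uint8Array(validatorCount), // zero-filled by construction
    proposerIndices: new Int32Array(validatorCount).fill(-1), // -1 = no proposer yet
    inclusionDelays: new Uint32Array(validatorCount),
  };
}

// Reading or writing validator i means indexing each field array at i.
const decomposed = buildDecomposedStatuses(4);
decomposed.flags[2] = 0b101;
```

The per-field layout replaces `validatorCount` small object allocations with a fixed number of bulk allocations, which is one plausible reason the `beforeProcessEpoch` step would put less pressure on the garbage collector.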

@wemeetagain wemeetagain requested a review from a team as a code owner July 11, 2024 17:24

codecov bot commented Jul 11, 2024

Codecov Report

Attention: Patch coverage is 34.39153% with 124 lines in your changes missing coverage. Please review.

Project coverage is 62.49%. Comparing base (6ab2697) to head (21e3728).
Report is 2 commits behind head on unstable.

Additional details and impacted files
@@             Coverage Diff              @@
##           unstable    #6945      +/-   ##
============================================
- Coverage     62.51%   62.49%   -0.02%     
============================================
  Files           575      575              
  Lines         61015    61079      +64     
  Branches       2120     2129       +9     
============================================
+ Hits          38141    38173      +32     
- Misses        22835    22867      +32     
  Partials         39       39              

github-actions bot commented Jul 11, 2024

Performance Report

✔️ no performance regression detected

Full benchmark results
Benchmark suite Current: eb9ac68 Previous: 6ab2697 Ratio
getPubkeys - index2pubkey - req 1000 vs - 250000 vc 725.77 us/op 518.91 us/op 1.40
getPubkeys - validatorsArr - req 1000 vs - 250000 vc 48.943 us/op 54.965 us/op 0.89
BLS verify - blst-native 1.2082 ms/op 1.2238 ms/op 0.99
BLS verifyMultipleSignatures 3 - blst-native 2.5721 ms/op 2.6068 ms/op 0.99
BLS verifyMultipleSignatures 8 - blst-native 5.6715 ms/op 5.7385 ms/op 0.99
BLS verifyMultipleSignatures 32 - blst-native 20.810 ms/op 21.080 ms/op 0.99
BLS verifyMultipleSignatures 64 - blst-native 40.959 ms/op 41.548 ms/op 0.99
BLS verifyMultipleSignatures 128 - blst-native 81.343 ms/op 82.586 ms/op 0.98
BLS deserializing 10000 signatures 850.56 ms/op 885.36 ms/op 0.96
BLS deserializing 100000 signatures 8.4453 s/op 8.5756 s/op 0.98
BLS verifyMultipleSignatures - same message - 3 - blst-native 1.2178 ms/op 1.2246 ms/op 0.99
BLS verifyMultipleSignatures - same message - 8 - blst-native 1.3749 ms/op 1.4011 ms/op 0.98
BLS verifyMultipleSignatures - same message - 32 - blst-native 2.1509 ms/op 2.1983 ms/op 0.98
BLS verifyMultipleSignatures - same message - 64 - blst-native 3.6500 ms/op 3.3846 ms/op 1.08
BLS verifyMultipleSignatures - same message - 128 - blst-native 5.4056 ms/op 5.9969 ms/op 0.90
BLS aggregatePubkeys 32 - blst-native 24.694 us/op 24.774 us/op 1.00
BLS aggregatePubkeys 128 - blst-native 97.099 us/op 97.276 us/op 1.00
notSeenSlots=1 numMissedVotes=1 numBadVotes=10 79.688 ms/op 55.820 ms/op 1.43
notSeenSlots=1 numMissedVotes=0 numBadVotes=4 54.513 ms/op 49.741 ms/op 1.10
notSeenSlots=2 numMissedVotes=1 numBadVotes=10 36.657 ms/op 31.733 ms/op 1.16
getSlashingsAndExits - default max 90.633 us/op 92.110 us/op 0.98
getSlashingsAndExits - 2k 277.71 us/op 289.99 us/op 0.96
proposeBlockBody type=full, size=empty 5.7623 ms/op 5.7864 ms/op 1.00
isKnown best case - 1 super set check 272.00 ns/op 303.00 ns/op 0.90
isKnown normal case - 2 super set checks 270.00 ns/op 283.00 ns/op 0.95
isKnown worse case - 16 super set checks 258.00 ns/op 279.00 ns/op 0.92
InMemoryCheckpointStateCache - add get delete 4.9290 us/op 5.0600 us/op 0.97
validate api signedAggregateAndProof - struct 2.5995 ms/op 2.6356 ms/op 0.99
validate gossip signedAggregateAndProof - struct 2.6004 ms/op 2.6518 ms/op 0.98
validate gossip attestation - vc 640000 1.2349 ms/op 1.2797 ms/op 0.96
batch validate gossip attestation - vc 640000 - chunk 32 147.80 us/op 150.81 us/op 0.98
batch validate gossip attestation - vc 640000 - chunk 64 129.33 us/op 134.79 us/op 0.96
batch validate gossip attestation - vc 640000 - chunk 128 119.84 us/op 122.20 us/op 0.98
batch validate gossip attestation - vc 640000 - chunk 256 116.82 us/op 120.67 us/op 0.97
pickEth1Vote - no votes 1.0209 ms/op 1.0999 ms/op 0.93
pickEth1Vote - max votes 9.2567 ms/op 8.4193 ms/op 1.10
pickEth1Vote - Eth1Data hashTreeRoot value x2048 15.353 ms/op 12.868 ms/op 1.19
pickEth1Vote - Eth1Data hashTreeRoot tree x2048 20.258 ms/op 17.388 ms/op 1.17
pickEth1Vote - Eth1Data fastSerialize value x2048 544.26 us/op 532.40 us/op 1.02
pickEth1Vote - Eth1Data fastSerialize tree x2048 4.9967 ms/op 3.7685 ms/op 1.33
bytes32 toHexString 445.00 ns/op 469.00 ns/op 0.95
bytes32 Buffer.toString(hex) 260.00 ns/op 260.00 ns/op 1.00
bytes32 Buffer.toString(hex) from Uint8Array 392.00 ns/op 368.00 ns/op 1.07
bytes32 Buffer.toString(hex) + 0x 260.00 ns/op 258.00 ns/op 1.01
Object access 1 prop 0.14200 ns/op 0.13500 ns/op 1.05
Map access 1 prop 0.13600 ns/op 0.13800 ns/op 0.99
Object get x1000 5.9090 ns/op 6.0670 ns/op 0.97
Map get x1000 6.5570 ns/op 6.5280 ns/op 1.00
Object set x1000 34.619 ns/op 35.641 ns/op 0.97
Map set x1000 23.782 ns/op 26.327 ns/op 0.90
Return object 10000 times 0.29600 ns/op 0.30100 ns/op 0.98
Throw Error 10000 times 3.3629 us/op 3.4539 us/op 0.97
fastMsgIdFn sha256 / 200 bytes 2.1950 us/op 2.3120 us/op 0.95
fastMsgIdFn h32 xxhash / 200 bytes 247.00 ns/op 256.00 ns/op 0.96
fastMsgIdFn h64 xxhash / 200 bytes 279.00 ns/op 286.00 ns/op 0.98
fastMsgIdFn sha256 / 1000 bytes 7.5460 us/op 7.7180 us/op 0.98
fastMsgIdFn h32 xxhash / 1000 bytes 381.00 ns/op 422.00 ns/op 0.90
fastMsgIdFn h64 xxhash / 1000 bytes 358.00 ns/op 382.00 ns/op 0.94
fastMsgIdFn sha256 / 10000 bytes 66.163 us/op 64.852 us/op 1.02
fastMsgIdFn h32 xxhash / 10000 bytes 1.9550 us/op 1.9180 us/op 1.02
fastMsgIdFn h64 xxhash / 10000 bytes 1.2330 us/op 1.2570 us/op 0.98
send data - 1000 256B messages 11.393 ms/op 12.429 ms/op 0.92
send data - 1000 512B messages 16.026 ms/op 16.949 ms/op 0.95
send data - 1000 1024B messages 25.178 ms/op 28.131 ms/op 0.90
send data - 1000 1200B messages 26.734 ms/op 26.033 ms/op 1.03
send data - 1000 2048B messages 32.237 ms/op 33.181 ms/op 0.97
send data - 1000 4096B messages 32.360 ms/op 32.264 ms/op 1.00
send data - 1000 16384B messages 69.920 ms/op 74.794 ms/op 0.93
send data - 1000 65536B messages 221.91 ms/op 214.29 ms/op 1.04
enrSubnets - fastDeserialize 64 bits 1.0740 us/op 1.0770 us/op 1.00
enrSubnets - ssz BitVector 64 bits 350.00 ns/op 358.00 ns/op 0.98
enrSubnets - fastDeserialize 4 bits 142.00 ns/op 161.00 ns/op 0.88
enrSubnets - ssz BitVector 4 bits 346.00 ns/op 362.00 ns/op 0.96
prioritizePeers score -10:0 att 32-0.1 sync 2-0 140.90 us/op 148.56 us/op 0.95
prioritizePeers score 0:0 att 32-0.25 sync 2-0.25 176.11 us/op 165.16 us/op 1.07
prioritizePeers score 0:0 att 32-0.5 sync 2-0.5 351.23 us/op 236.93 us/op 1.48
prioritizePeers score 0:0 att 64-0.75 sync 4-0.75 678.86 us/op 409.70 us/op 1.66
prioritizePeers score 0:0 att 64-1 sync 4-1 1.0614 ms/op 585.00 us/op 1.81
array of 16000 items push then shift 1.6262 us/op 1.6668 us/op 0.98
LinkedList of 16000 items push then shift 7.0250 ns/op 7.9150 ns/op 0.89
array of 16000 items push then pop 98.676 ns/op 125.54 ns/op 0.79
LinkedList of 16000 items push then pop 6.9000 ns/op 7.5740 ns/op 0.91
array of 24000 items push then shift 2.4035 us/op 2.4831 us/op 0.97
LinkedList of 24000 items push then shift 6.9960 ns/op 7.1990 ns/op 0.97
array of 24000 items push then pop 138.31 ns/op 141.14 ns/op 0.98
LinkedList of 24000 items push then pop 6.7370 ns/op 7.0670 ns/op 0.95
intersect bitArray bitLen 8 6.1970 ns/op 9.8180 ns/op 0.63
intersect array and set length 8 45.650 ns/op 45.301 ns/op 1.01
intersect bitArray bitLen 128 29.201 ns/op 32.617 ns/op 0.90
intersect array and set length 128 663.57 ns/op 767.35 ns/op 0.86
bitArray.getTrueBitIndexes() bitLen 128 2.3700 us/op 1.9520 us/op 1.21
bitArray.getTrueBitIndexes() bitLen 248 3.6260 us/op 3.5120 us/op 1.03
bitArray.getTrueBitIndexes() bitLen 512 8.3750 us/op 9.7140 us/op 0.86
Buffer.concat 32 items 932.00 ns/op 1.0670 us/op 0.87
Uint8Array.set 32 items 1.6390 us/op 1.8270 us/op 0.90
Buffer.copy 2.3710 us/op 1.7100 us/op 1.39
Uint8Array.set - with subarray 2.9210 us/op 2.8090 us/op 1.04
Uint8Array.set - without subarray 1.5480 us/op 1.3750 us/op 1.13
getUint32 - dataview 235.00 ns/op 277.00 ns/op 0.85
getUint32 - manual 155.00 ns/op 203.00 ns/op 0.76
Set add up to 64 items then delete first 2.1324 us/op 2.7639 us/op 0.77
OrderedSet add up to 64 items then delete first 3.2146 us/op 4.0149 us/op 0.80
Set add up to 64 items then delete last 2.4452 us/op 3.0501 us/op 0.80
OrderedSet add up to 64 items then delete last 3.7949 us/op 4.8644 us/op 0.78
Set add up to 64 items then delete middle 2.4660 us/op 3.8259 us/op 0.64
OrderedSet add up to 64 items then delete middle 5.1582 us/op 7.4127 us/op 0.70
Set add up to 128 items then delete first 4.8282 us/op 7.2724 us/op 0.66
OrderedSet add up to 128 items then delete first 7.1397 us/op 10.860 us/op 0.66
Set add up to 128 items then delete last 4.9333 us/op 6.2012 us/op 0.80
OrderedSet add up to 128 items then delete last 7.5315 us/op 10.818 us/op 0.70
Set add up to 128 items then delete middle 4.7731 us/op 8.0648 us/op 0.59
OrderedSet add up to 128 items then delete middle 13.251 us/op 18.096 us/op 0.73
Set add up to 256 items then delete first 10.177 us/op 13.567 us/op 0.75
OrderedSet add up to 256 items then delete first 14.512 us/op 20.642 us/op 0.70
Set add up to 256 items then delete last 9.7418 us/op 11.637 us/op 0.84
OrderedSet add up to 256 items then delete last 15.372 us/op 19.221 us/op 0.80
Set add up to 256 items then delete middle 9.6933 us/op 12.614 us/op 0.77
OrderedSet add up to 256 items then delete middle 39.585 us/op 50.380 us/op 0.79
transfer serialized Status (84 B) 1.2890 us/op 1.4990 us/op 0.86
copy serialized Status (84 B) 1.0870 us/op 1.4060 us/op 0.77
transfer serialized SignedVoluntaryExit (112 B) 1.3720 us/op 1.8010 us/op 0.76
copy serialized SignedVoluntaryExit (112 B) 1.1020 us/op 1.4670 us/op 0.75
transfer serialized ProposerSlashing (416 B) 1.5820 us/op 2.2540 us/op 0.70
copy serialized ProposerSlashing (416 B) 2.5250 us/op 1.9810 us/op 1.27
transfer serialized Attestation (485 B) 2.8680 us/op 2.7350 us/op 1.05
copy serialized Attestation (485 B) 2.7450 us/op 2.9920 us/op 0.92
transfer serialized AttesterSlashing (33232 B) 2.9430 us/op 3.3000 us/op 0.89
copy serialized AttesterSlashing (33232 B) 6.7340 us/op 12.267 us/op 0.55
transfer serialized Small SignedBeaconBlock (128000 B) 3.7980 us/op 4.4260 us/op 0.86
copy serialized Small SignedBeaconBlock (128000 B) 15.151 us/op 30.308 us/op 0.50
transfer serialized Avg SignedBeaconBlock (200000 B) 3.6040 us/op 4.1160 us/op 0.88
copy serialized Avg SignedBeaconBlock (200000 B) 22.076 us/op 38.166 us/op 0.58
transfer serialized BlobsSidecar (524380 B) 2.7800 us/op 3.9580 us/op 0.70
copy serialized BlobsSidecar (524380 B) 111.64 us/op 132.94 us/op 0.84
transfer serialized Big SignedBeaconBlock (1000000 B) 3.5320 us/op 3.8840 us/op 0.91
copy serialized Big SignedBeaconBlock (1000000 B) 153.52 us/op 277.03 us/op 0.55
pass gossip attestations to forkchoice per slot 3.2651 ms/op 3.4718 ms/op 0.94
forkChoice updateHead vc 100000 bc 64 eq 0 593.94 us/op 501.49 us/op 1.18
forkChoice updateHead vc 600000 bc 64 eq 0 3.0364 ms/op 3.4180 ms/op 0.89
forkChoice updateHead vc 1000000 bc 64 eq 0 5.2007 ms/op 5.7096 ms/op 0.91
forkChoice updateHead vc 600000 bc 320 eq 0 3.0934 ms/op 3.2031 ms/op 0.97
forkChoice updateHead vc 600000 bc 1200 eq 0 3.1682 ms/op 3.2316 ms/op 0.98
forkChoice updateHead vc 600000 bc 7200 eq 0 4.0891 ms/op 3.8087 ms/op 1.07
forkChoice updateHead vc 600000 bc 64 eq 1000 10.582 ms/op 10.881 ms/op 0.97
forkChoice updateHead vc 600000 bc 64 eq 10000 10.597 ms/op 10.899 ms/op 0.97
forkChoice updateHead vc 600000 bc 64 eq 300000 14.828 ms/op 16.617 ms/op 0.89
computeDeltas 500000 validators 300 proto nodes 3.5566 ms/op 3.8475 ms/op 0.92
computeDeltas 500000 validators 1200 proto nodes 3.6592 ms/op 4.3051 ms/op 0.85
computeDeltas 500000 validators 7200 proto nodes 3.7737 ms/op 4.3346 ms/op 0.87
computeDeltas 750000 validators 300 proto nodes 5.5941 ms/op 5.3784 ms/op 1.04
computeDeltas 750000 validators 1200 proto nodes 5.5127 ms/op 5.1419 ms/op 1.07
computeDeltas 750000 validators 7200 proto nodes 5.4477 ms/op 5.0829 ms/op 1.07
computeDeltas 1400000 validators 300 proto nodes 10.253 ms/op 9.8479 ms/op 1.04
computeDeltas 1400000 validators 1200 proto nodes 9.7365 ms/op 9.4734 ms/op 1.03
computeDeltas 1400000 validators 7200 proto nodes 9.9877 ms/op 9.8064 ms/op 1.02
computeDeltas 2100000 validators 300 proto nodes 14.972 ms/op 15.664 ms/op 0.96
computeDeltas 2100000 validators 1200 proto nodes 14.960 ms/op 15.715 ms/op 0.95
computeDeltas 2100000 validators 7200 proto nodes 14.975 ms/op 16.380 ms/op 0.91
altair processAttestation - 250000 vs - 7PWei normalcase 1.7124 ms/op 2.1560 ms/op 0.79
altair processAttestation - 250000 vs - 7PWei worstcase 2.5011 ms/op 3.0342 ms/op 0.82
altair processAttestation - setStatus - 1/6 committees join 86.477 us/op 105.28 us/op 0.82
altair processAttestation - setStatus - 1/3 committees join 176.26 us/op 204.94 us/op 0.86
altair processAttestation - setStatus - 1/2 committees join 248.06 us/op 273.37 us/op 0.91
altair processAttestation - setStatus - 2/3 committees join 318.46 us/op 350.41 us/op 0.91
altair processAttestation - setStatus - 4/5 committees join 483.86 us/op 510.23 us/op 0.95
altair processAttestation - setStatus - 100% committees join 577.96 us/op 623.04 us/op 0.93
altair processBlock - 250000 vs - 7PWei normalcase 4.6050 ms/op 4.7356 ms/op 0.97
altair processBlock - 250000 vs - 7PWei normalcase hashState 29.154 ms/op 25.252 ms/op 1.15
altair processBlock - 250000 vs - 7PWei worstcase 43.538 ms/op 42.758 ms/op 1.02
altair processBlock - 250000 vs - 7PWei worstcase hashState 81.119 ms/op 75.708 ms/op 1.07
phase0 processBlock - 250000 vs - 7PWei normalcase 2.2923 ms/op 2.1613 ms/op 1.06
phase0 processBlock - 250000 vs - 7PWei worstcase 30.688 ms/op 29.612 ms/op 1.04
altair processEth1Data - 250000 vs - 7PWei normalcase 343.19 us/op 427.18 us/op 0.80
getExpectedWithdrawals 250000 eb:1,eth1:1,we:0,wn:0,smpl:15 6.4470 us/op 6.6730 us/op 0.97
getExpectedWithdrawals 250000 eb:0.95,eth1:0.1,we:0.05,wn:0,smpl:219 32.058 us/op 33.012 us/op 0.97
getExpectedWithdrawals 250000 eb:0.95,eth1:0.3,we:0.05,wn:0,smpl:42 9.8430 us/op 9.5980 us/op 1.03
getExpectedWithdrawals 250000 eb:0.95,eth1:0.7,we:0.05,wn:0,smpl:18 6.7230 us/op 6.2720 us/op 1.07
getExpectedWithdrawals 250000 eb:0.1,eth1:0.1,we:0,wn:0,smpl:1020 127.83 us/op 117.82 us/op 1.08
getExpectedWithdrawals 250000 eb:0.03,eth1:0.03,we:0,wn:0,smpl:11777 959.15 us/op 747.56 us/op 1.28
getExpectedWithdrawals 250000 eb:0.01,eth1:0.01,we:0,wn:0,smpl:16384 1.1692 ms/op 917.62 us/op 1.27
getExpectedWithdrawals 250000 eb:0,eth1:0,we:0,wn:0,smpl:16384 1.1604 ms/op 944.30 us/op 1.23
getExpectedWithdrawals 250000 eb:0,eth1:0,we:0,wn:0,nocache,smpl:16384 2.6846 ms/op 2.6030 ms/op 1.03
getExpectedWithdrawals 250000 eb:0,eth1:1,we:0,wn:0,smpl:16384 1.8623 ms/op 1.6317 ms/op 1.14
getExpectedWithdrawals 250000 eb:0,eth1:1,we:0,wn:0,nocache,smpl:16384 4.1726 ms/op 3.9186 ms/op 1.06
Tree 40 250000 create 230.48 ms/op 222.55 ms/op 1.04
Tree 40 250000 get(125000) 149.61 ns/op 149.44 ns/op 1.00
Tree 40 250000 set(125000) 664.84 ns/op 636.50 ns/op 1.04
Tree 40 250000 toArray() 15.053 ms/op 19.160 ms/op 0.79
Tree 40 250000 iterate all - toArray() + loop 15.257 ms/op 18.434 ms/op 0.83
Tree 40 250000 iterate all - get(i) 54.638 ms/op 54.716 ms/op 1.00
MutableVector 250000 create 13.391 ms/op 7.9821 ms/op 1.68
MutableVector 250000 get(125000) 6.4950 ns/op 6.2720 ns/op 1.04
MutableVector 250000 set(125000) 224.32 ns/op 194.93 ns/op 1.15
MutableVector 250000 toArray() 4.4329 ms/op 3.4138 ms/op 1.30
MutableVector 250000 iterate all - toArray() + loop 4.4859 ms/op 3.3611 ms/op 1.33
MutableVector 250000 iterate all - get(i) 1.7453 ms/op 1.5728 ms/op 1.11
Array 250000 create 3.9194 ms/op 2.9976 ms/op 1.31
Array 250000 clone - spread 1.5697 ms/op 1.4375 ms/op 1.09
Array 250000 get(125000) 0.43300 ns/op 0.42700 ns/op 1.01
Array 250000 set(125000) 0.45600 ns/op 0.44700 ns/op 1.02
Array 250000 iterate all - loop 113.36 us/op 92.220 us/op 1.23
effectiveBalanceIncrements clone Uint8Array 300000 36.194 us/op 26.729 us/op 1.35
effectiveBalanceIncrements clone MutableVector 300000 130.00 ns/op 131.00 ns/op 0.99
effectiveBalanceIncrements rw all Uint8Array 300000 203.49 us/op 200.41 us/op 1.02
effectiveBalanceIncrements rw all MutableVector 300000 66.637 ms/op 63.890 ms/op 1.04
phase0 afterProcessEpoch - 250000 vs - 7PWei 85.568 ms/op 86.740 ms/op 0.99
Array.fill - length 1000000 3.5759 ms/op
Array push - length 1000000 17.083 ms/op
Array.get 0.28260 ns/op
Uint8Array.get 0.43998 ns/op
phase0 beforeProcessEpoch - 250000 vs - 7PWei 22.026 ms/op 41.110 ms/op 0.54
altair processEpoch - mainnet_e81889 351.27 ms/op 375.20 ms/op 0.94
mainnet_e81889 - altair beforeProcessEpoch 29.149 ms/op 66.430 ms/op 0.44
mainnet_e81889 - altair processJustificationAndFinalization 14.786 us/op 19.044 us/op 0.78
mainnet_e81889 - altair processInactivityUpdates 5.7054 ms/op 7.1404 ms/op 0.80
mainnet_e81889 - altair processRewardsAndPenalties 48.014 ms/op 38.237 ms/op 1.26
mainnet_e81889 - altair processRegistryUpdates 2.0640 us/op 2.6490 us/op 0.78
mainnet_e81889 - altair processSlashings 392.00 ns/op 354.00 ns/op 1.11
mainnet_e81889 - altair processEth1DataReset 364.00 ns/op 631.00 ns/op 0.58
mainnet_e81889 - altair processEffectiveBalanceUpdates 3.1774 ms/op 1.1853 ms/op 2.68
mainnet_e81889 - altair processSlashingsReset 4.4850 us/op 3.4800 us/op 1.29
mainnet_e81889 - altair processRandaoMixesReset 3.7600 us/op 5.5550 us/op 0.68
mainnet_e81889 - altair processHistoricalRootsUpdate 1.2590 us/op 618.00 ns/op 2.04
mainnet_e81889 - altair processParticipationFlagUpdates 2.3160 us/op 1.8210 us/op 1.27
mainnet_e81889 - altair processSyncCommitteeUpdates 626.00 ns/op 411.00 ns/op 1.52
mainnet_e81889 - altair afterProcessEpoch 91.818 ms/op 93.571 ms/op 0.98
capella processEpoch - mainnet_e217614 1.1457 s/op 1.2878 s/op 0.89
mainnet_e217614 - capella beforeProcessEpoch 124.68 ms/op 275.82 ms/op 0.45
mainnet_e217614 - capella processJustificationAndFinalization 13.029 us/op 14.360 us/op 0.91
mainnet_e217614 - capella processInactivityUpdates 16.519 ms/op 18.515 ms/op 0.89
mainnet_e217614 - capella processRewardsAndPenalties 248.34 ms/op 242.18 ms/op 1.03
mainnet_e217614 - capella processRegistryUpdates 14.005 us/op 17.576 us/op 0.80
mainnet_e217614 - capella processSlashings 674.00 ns/op 513.00 ns/op 1.31
mainnet_e217614 - capella processEth1DataReset 570.00 ns/op 433.00 ns/op 1.32
mainnet_e217614 - capella processEffectiveBalanceUpdates 5.0730 ms/op 16.891 ms/op 0.30
mainnet_e217614 - capella processSlashingsReset 5.4870 us/op 3.9440 us/op 1.39
mainnet_e217614 - capella processRandaoMixesReset 5.8110 us/op 4.3050 us/op 1.35
mainnet_e217614 - capella processHistoricalRootsUpdate 535.00 ns/op 1.2000 us/op 0.45
mainnet_e217614 - capella processParticipationFlagUpdates 2.7580 us/op 1.9940 us/op 1.38
mainnet_e217614 - capella afterProcessEpoch 274.94 ms/op 286.32 ms/op 0.96
phase0 processEpoch - mainnet_e58758 323.14 ms/op 409.84 ms/op 0.79
mainnet_e58758 - phase0 beforeProcessEpoch 79.400 ms/op 123.02 ms/op 0.65
mainnet_e58758 - phase0 processJustificationAndFinalization 12.519 us/op 15.710 us/op 0.80
mainnet_e58758 - phase0 processRewardsAndPenalties 36.022 ms/op 22.266 ms/op 1.62
mainnet_e58758 - phase0 processRegistryUpdates 6.5300 us/op 9.0780 us/op 0.72
mainnet_e58758 - phase0 processSlashings 449.00 ns/op 492.00 ns/op 0.91
mainnet_e58758 - phase0 processEth1DataReset 495.00 ns/op 427.00 ns/op 1.16
mainnet_e58758 - phase0 processEffectiveBalanceUpdates 1.2703 ms/op 977.78 us/op 1.30
mainnet_e58758 - phase0 processSlashingsReset 6.6490 us/op 4.1170 us/op 1.62
mainnet_e58758 - phase0 processRandaoMixesReset 5.9560 us/op 6.6170 us/op 0.90
mainnet_e58758 - phase0 processHistoricalRootsUpdate 804.00 ns/op 492.00 ns/op 1.63
mainnet_e58758 - phase0 processParticipationRecordUpdates 4.7110 us/op 2.8050 us/op 1.68
mainnet_e58758 - phase0 afterProcessEpoch 76.583 ms/op 80.045 ms/op 0.96
phase0 processEffectiveBalanceUpdates - 250000 normalcase 1.0731 ms/op 1.2360 ms/op 0.87
phase0 processEffectiveBalanceUpdates - 250000 worstcase 0.5 2.4681 ms/op 2.3864 ms/op 1.03
altair processInactivityUpdates - 250000 normalcase 21.339 ms/op 20.073 ms/op 1.06
altair processInactivityUpdates - 250000 worstcase 20.728 ms/op 18.647 ms/op 1.11
phase0 processRegistryUpdates - 250000 normalcase 8.9580 us/op 6.5920 us/op 1.36
phase0 processRegistryUpdates - 250000 badcase_full_deposits 452.06 us/op 282.00 us/op 1.60
phase0 processRegistryUpdates - 250000 worstcase 0.5 134.57 ms/op 116.69 ms/op 1.15
altair processRewardsAndPenalties - 250000 normalcase 49.624 ms/op 43.376 ms/op 1.14
altair processRewardsAndPenalties - 250000 worstcase 47.481 ms/op 42.948 ms/op 1.11
phase0 getAttestationDeltas - 250000 normalcase 7.4356 ms/op 7.9645 ms/op 0.93
phase0 getAttestationDeltas - 250000 worstcase 7.6116 ms/op 8.8043 ms/op 0.86
phase0 processSlashings - 250000 worstcase 122.56 us/op 106.11 us/op 1.16
altair processSyncCommitteeUpdates - 250000 128.14 ms/op 136.89 ms/op 0.94
BeaconState.hashTreeRoot - No change 260.00 ns/op 425.00 ns/op 0.61
BeaconState.hashTreeRoot - 1 full validator 133.31 us/op 122.61 us/op 1.09
BeaconState.hashTreeRoot - 32 full validator 1.2501 ms/op 1.1437 ms/op 1.09
BeaconState.hashTreeRoot - 512 full validator 12.350 ms/op 12.707 ms/op 0.97
BeaconState.hashTreeRoot - 1 validator.effectiveBalance 163.90 us/op 124.68 us/op 1.31
BeaconState.hashTreeRoot - 32 validator.effectiveBalance 2.0008 ms/op 2.3156 ms/op 0.86
BeaconState.hashTreeRoot - 512 validator.effectiveBalance 24.899 ms/op 31.595 ms/op 0.79
BeaconState.hashTreeRoot - 1 balances 102.33 us/op 122.28 us/op 0.84
BeaconState.hashTreeRoot - 32 balances 971.74 us/op 1.3544 ms/op 0.72
BeaconState.hashTreeRoot - 512 balances 9.9536 ms/op 14.037 ms/op 0.71
BeaconState.hashTreeRoot - 250000 balances 155.83 ms/op 189.25 ms/op 0.82
aggregationBits - 2048 els - zipIndexesInBitList 24.525 us/op 38.120 us/op 0.64
byteArrayEquals 32 54.992 ns/op 57.394 ns/op 0.96
Buffer.compare 32 17.494 ns/op 18.855 ns/op 0.93
byteArrayEquals 1024 1.6278 us/op 1.6299 us/op 1.00
Buffer.compare 1024 29.099 ns/op 27.372 ns/op 1.06
byteArrayEquals 16384 25.927 us/op 26.258 us/op 0.99
Buffer.compare 16384 199.62 ns/op 203.25 ns/op 0.98
byteArrayEquals 123687377 190.81 ms/op 194.16 ms/op 0.98
Buffer.compare 123687377 6.1432 ms/op 8.1771 ms/op 0.75
byteArrayEquals 32 - diff last byte 51.383 ns/op 52.391 ns/op 0.98
Buffer.compare 32 - diff last byte 16.877 ns/op 17.750 ns/op 0.95
byteArrayEquals 1024 - diff last byte 1.5539 us/op 1.5766 us/op 0.99
Buffer.compare 1024 - diff last byte 24.816 ns/op 26.493 ns/op 0.94
byteArrayEquals 16384 - diff last byte 24.807 us/op 25.081 us/op 0.99
Buffer.compare 16384 - diff last byte 178.51 ns/op 190.36 ns/op 0.94
byteArrayEquals 123687377 - diff last byte 191.04 ms/op 192.55 ms/op 0.99
Buffer.compare 123687377 - diff last byte 6.8937 ms/op 6.3036 ms/op 1.09
byteArrayEquals 32 - random bytes 5.1860 ns/op 5.1430 ns/op 1.01
Buffer.compare 32 - random bytes 17.426 ns/op 17.781 ns/op 0.98
byteArrayEquals 1024 - random bytes 5.2130 ns/op 5.1250 ns/op 1.02
Buffer.compare 1024 - random bytes 17.449 ns/op 17.675 ns/op 0.99
byteArrayEquals 16384 - random bytes 5.1780 ns/op 5.1190 ns/op 1.01
Buffer.compare 16384 - random bytes 17.261 ns/op 17.691 ns/op 0.98
byteArrayEquals 123687377 - random bytes 6.5200 ns/op 6.5300 ns/op 1.00
Buffer.compare 123687377 - random bytes 19.000 ns/op 19.120 ns/op 0.99
regular array get 100000 times 32.819 us/op 33.213 us/op 0.99
wrappedArray get 100000 times 42.766 us/op 32.709 us/op 1.31
arrayWithProxy get 100000 times 14.293 ms/op 13.189 ms/op 1.08
ssz.Root.equals 46.046 ns/op 45.640 ns/op 1.01
byteArrayEquals 45.460 ns/op 45.031 ns/op 1.01
Buffer.compare 10.494 ns/op 10.425 ns/op 1.01
shuffle list - 16384 els 6.2701 ms/op 6.2550 ms/op 1.00
shuffle list - 250000 els 92.773 ms/op 91.230 ms/op 1.02
processSlot - 1 slots 12.550 us/op 12.869 us/op 0.98
processSlot - 32 slots 2.9722 ms/op 2.8433 ms/op 1.05
getEffectiveBalanceIncrementsZeroInactive - 250000 vs - 7PWei 40.779 ms/op 36.244 ms/op 1.13
getCommitteeAssignments - req 1 vs - 250000 vc 2.1737 ms/op 2.1460 ms/op 1.01
getCommitteeAssignments - req 100 vs - 250000 vc 4.1616 ms/op 4.2215 ms/op 0.99
getCommitteeAssignments - req 1000 vs - 250000 vc 4.4446 ms/op 4.5269 ms/op 0.98
findModifiedValidators - 10000 modified validators 247.96 ms/op 251.79 ms/op 0.98
findModifiedValidators - 1000 modified validators 171.37 ms/op 196.34 ms/op 0.87
findModifiedValidators - 100 modified validators 183.05 ms/op 171.40 ms/op 1.07
findModifiedValidators - 10 modified validators 172.24 ms/op 202.65 ms/op 0.85
findModifiedValidators - 1 modified validators 159.94 ms/op 179.11 ms/op 0.89
findModifiedValidators - no difference 148.23 ms/op 167.62 ms/op 0.88
compare ViewDUs 3.1377 s/op 3.0496 s/op 1.03
compare each validator Uint8Array 1.8195 s/op 996.62 ms/op 1.83
compare ViewDU to Uint8Array 954.91 ms/op 1.0236 s/op 0.93
migrate state 1000000 validators, 24 modified, 0 new 621.62 ms/op 579.08 ms/op 1.07
migrate state 1000000 validators, 1700 modified, 1000 new 895.20 ms/op 835.89 ms/op 1.07
migrate state 1000000 validators, 3400 modified, 2000 new 1.0956 s/op 995.84 ms/op 1.10
migrate state 1500000 validators, 24 modified, 0 new 628.40 ms/op 566.09 ms/op 1.11
migrate state 1500000 validators, 1700 modified, 1000 new 898.15 ms/op 771.57 ms/op 1.16
migrate state 1500000 validators, 3400 modified, 2000 new 1.1069 s/op 952.64 ms/op 1.16
RootCache.getBlockRootAtSlot - 250000 vs - 7PWei 4.3000 ns/op 4.7700 ns/op 0.90
state getBlockRootAtSlot - 250000 vs - 7PWei 950.45 ns/op 640.17 ns/op 1.48
computeProposers - vc 250000 6.5020 ms/op 8.2538 ms/op 0.79
computeEpochShuffling - vc 250000 90.543 ms/op 94.202 ms/op 0.96
getNextSyncCommittee - vc 250000 126.94 ms/op 137.29 ms/op 0.92
computeSigningRoot for AttestationData 23.995 us/op 22.041 us/op 1.09
hash AttestationData serialized data then Buffer.toString(base64) 1.4833 us/op 1.5252 us/op 0.97
toHexString serialized data 882.07 ns/op 891.44 ns/op 0.99
Buffer.toString(base64) 171.81 ns/op 198.65 ns/op 0.86

by benchmarkbot/action

twoeths commented Jul 12, 2024

@wemeetagain could you verify through beforeProcessEpoch benchmarks first? it'll save us a lot of time

twoeths commented Jul 12, 2024

also @wemeetagain this is on stable-mainnet test node

Screenshot 2024-07-12 at 15 17 58

on a real mainnet node

Screenshot 2024-07-12 at 15 18 22

it's 3x slower, which could be the GC. If we allocate the arrays once at the file level (and grow them if needed, similar to the BufferPool), there's a chance we can cut down a lot of GC time. But for each array we need to track its length inside EpochTransitionCache, because we won't always use the full preallocated length
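
A minimal sketch of the "allocate once at the file level, grow if needed, track the used length" idea from the comment above. The names `reusableFlags` and `getReusableFlags` and the growth policy (one eighth of headroom) are assumptions made for illustration; this is not the BufferPool or EpochTransitionCache implementation.

```typescript
// Hypothetical module-level buffer, reused across epoch transitions.
let reusableFlags = new Uint8Array(0);

interface ReusedFlags {
  flags: Uint8Array; // may be longer than `length`
  length: number; // number of entries valid for this epoch
}

function getReusableFlags(validatorCount: number): ReusedFlags {
  if (reusableFlags.length < validatorCount) {
    // Grow with some headroom so small validator set increases do not reallocate.
    const capacity = validatorCount + Math.floor(validatorCount / 8);
    reusableFlags = new Uint8Array(capacity); // new buffer is already zeroed
  } else {
    // Reuse the existing buffer: reset only the part that will be used.
    reusableFlags.fill(0, 0, validatorCount);
  }
  return {flags: reusableFlags, length: validatorCount};
}

// Callers must loop up to `length`, not `flags.length`.
function countNonZero({flags, length}: ReusedFlags): number {
  let count = 0;
  for (let i = 0; i < length; i++) {
    if (flags[i] !== 0) count++;
  }
  return count;
}
```

Because the buffer is shared, a result from one call is only valid until the next call, which is why the used length has to travel with the array rather than being read from `flags.length`.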

twoeths commented Jul 13, 2024

feat3-lg1k shows an amazing result @wemeetagain 🚀, it saved ~620ms on beforeProcessEpoch over a 6h rate interval. I hope this saves ~500ms or more on a real mainnet node

twoeths commented Jul 16, 2024

persisting some great metrics from this branch (6h rate interval)

  • epoch transition on holesky 1k node (~2.5s on stable)
Screenshot 2024-07-16 at 08 37 47
  • beforeProcessEpoch ( ~900ms on stable)
Screenshot 2024-07-16 at 08 39 09

@twoeths twoeths left a comment

this is a great improvement on state transition ❤️

@twoeths twoeths merged commit 79380f0 into unstable Jul 16, 2024
25 of 26 checks passed
@twoeths twoeths deleted the cayman/epoch-transition-cache-flags branch July 16, 2024 01:46

🎉 This PR is included in v1.21.0 🎉
