feat: use rust shuffle #7120

Merged
merged 17 commits into from
Oct 11, 2024

matthewkeil
Member

Motivation

Moves the shuffling computation to native code to enable an async/multithreaded implementation

codecov bot commented Oct 1, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 49.13%. Comparing base (cbc7c90) to head (9f2db05).
Report is 2 commits behind head on unstable.

Additional details and impacted files
@@             Coverage Diff              @@
##           unstable    #7120      +/-   ##
============================================
+ Coverage     49.12%   49.13%   +0.01%     
============================================
  Files           597      598       +1     
  Lines         39721    39753      +32     
  Branches       2075     2075              
============================================
+ Hits          19511    19533      +22     
- Misses        20169    20179      +10     
  Partials         41       41              

Contributor

github-actions bot commented Oct 1, 2024

⚠️ Performance Alert ⚠️

Possible performance regression was detected for some benchmarks.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold.

| Benchmark suite | Current: cafe868 | Previous: ac6edd3 | Ratio |
|---|---|---|---|
| send data - 1000 4096B messages | 95.552 ms/op | 31.093 ms/op | 3.07 |
| forkChoice updateHead vc 600000 bc 64 eq 300000 | 45.724 ms/op | 12.221 ms/op | 3.74 |
Full benchmark results
Benchmark suite Current: cafe868 Previous: ac6edd3 Ratio
getPubkeys - index2pubkey - req 1000 vs - 250000 vc 2.0811 ms/op 1.9204 ms/op 1.08
getPubkeys - validatorsArr - req 1000 vs - 250000 vc 54.907 us/op 39.560 us/op 1.39
BLS verify - blst 765.38 us/op 898.92 us/op 0.85
BLS verifyMultipleSignatures 3 - blst 1.2398 ms/op 1.3742 ms/op 0.90
BLS verifyMultipleSignatures 8 - blst 1.8437 ms/op 2.1552 ms/op 0.86
BLS verifyMultipleSignatures 32 - blst 5.2762 ms/op 4.5677 ms/op 1.16
BLS verifyMultipleSignatures 64 - blst 9.1603 ms/op 8.5810 ms/op 1.07
BLS verifyMultipleSignatures 128 - blst 17.335 ms/op 16.269 ms/op 1.07
BLS deserializing 10000 signatures 726.71 ms/op 616.99 ms/op 1.18
BLS deserializing 100000 signatures 7.1361 s/op 6.2932 s/op 1.13
BLS verifyMultipleSignatures - same message - 3 - blst 1.0093 ms/op 989.40 us/op 1.02
BLS verifyMultipleSignatures - same message - 8 - blst 1.1833 ms/op 1.0942 ms/op 1.08
BLS verifyMultipleSignatures - same message - 32 - blst 1.8373 ms/op 1.7047 ms/op 1.08
BLS verifyMultipleSignatures - same message - 64 - blst 2.6581 ms/op 2.6104 ms/op 1.02
BLS verifyMultipleSignatures - same message - 128 - blst 4.4426 ms/op 4.2510 ms/op 1.05
BLS aggregatePubkeys 32 - blst 19.938 us/op 18.701 us/op 1.07
BLS aggregatePubkeys 128 - blst 71.552 us/op 64.932 us/op 1.10
notSeenSlots=1 numMissedVotes=1 numBadVotes=10 78.361 ms/op 66.626 ms/op 1.18
notSeenSlots=1 numMissedVotes=0 numBadVotes=4 58.615 ms/op 47.974 ms/op 1.22
notSeenSlots=2 numMissedVotes=1 numBadVotes=10 47.262 ms/op 31.813 ms/op 1.49
getSlashingsAndExits - default max 96.570 us/op 82.961 us/op 1.16
getSlashingsAndExits - 2k 318.96 us/op 304.25 us/op 1.05
proposeBlockBody type=full, size=empty 6.9267 ms/op 4.7030 ms/op 1.47
isKnown best case - 1 super set check 308.00 ns/op 548.00 ns/op 0.56
isKnown normal case - 2 super set checks 285.00 ns/op 481.00 ns/op 0.59
isKnown worse case - 16 super set checks 294.00 ns/op 475.00 ns/op 0.62
InMemoryCheckpointStateCache - add get delete 3.0540 us/op 2.9970 us/op 1.02
updateUnfinalizedPubkeys - updating 10 pubkeys 1.2077 ms/op 871.76 us/op 1.39
updateUnfinalizedPubkeys - updating 100 pubkeys 3.7018 ms/op 2.8367 ms/op 1.30
updateUnfinalizedPubkeys - updating 1000 pubkeys 52.598 ms/op 38.542 ms/op 1.36
validate api signedAggregateAndProof - struct 1.4403 ms/op 1.5673 ms/op 0.92
validate gossip signedAggregateAndProof - struct 1.3672 ms/op 1.5267 ms/op 0.90
batch validate gossip attestation - vc 640000 - chunk 32 127.67 us/op 124.93 us/op 1.02
batch validate gossip attestation - vc 640000 - chunk 64 136.45 us/op 114.94 us/op 1.19
batch validate gossip attestation - vc 640000 - chunk 128 133.64 us/op 109.35 us/op 1.22
batch validate gossip attestation - vc 640000 - chunk 256 133.09 us/op 105.38 us/op 1.26
pickEth1Vote - no votes 1.6119 ms/op 893.66 us/op 1.80
pickEth1Vote - max votes 11.354 ms/op 4.9052 ms/op 2.31
pickEth1Vote - Eth1Data hashTreeRoot value x2048 19.671 ms/op 13.985 ms/op 1.41
pickEth1Vote - Eth1Data hashTreeRoot tree x2048 24.796 ms/op 22.188 ms/op 1.12
pickEth1Vote - Eth1Data fastSerialize value x2048 588.05 us/op 373.22 us/op 1.58
pickEth1Vote - Eth1Data fastSerialize tree x2048 3.2052 ms/op 3.1453 ms/op 1.02
bytes32 toHexString 663.00 ns/op 896.00 ns/op 0.74
bytes32 Buffer.toString(hex) 274.00 ns/op 510.00 ns/op 0.54
bytes32 Buffer.toString(hex) from Uint8Array 533.00 ns/op 712.00 ns/op 0.75
bytes32 Buffer.toString(hex) + 0x 292.00 ns/op 484.00 ns/op 0.60
Object access 1 prop 0.20100 ns/op 0.38900 ns/op 0.52
Map access 1 prop 0.14800 ns/op 0.34400 ns/op 0.43
Object get x1000 6.3210 ns/op 5.2860 ns/op 1.20
Map get x1000 6.9020 ns/op 6.3410 ns/op 1.09
Object set x1000 58.261 ns/op 25.085 ns/op 2.32
Map set x1000 39.720 ns/op 22.085 ns/op 1.80
Return object 10000 times 0.34680 ns/op 0.31840 ns/op 1.09
Throw Error 10000 times 3.8099 us/op 3.1949 us/op 1.19
toHex 187.51 ns/op 133.55 ns/op 1.40
Buffer.from 183.47 ns/op 112.24 ns/op 1.63
shared Buffer 114.84 ns/op 92.703 ns/op 1.24
fastMsgIdFn sha256 / 200 bytes 2.8410 us/op 2.3970 us/op 1.19
fastMsgIdFn h32 xxhash / 200 bytes 378.00 ns/op 514.00 ns/op 0.74
fastMsgIdFn h64 xxhash / 200 bytes 359.00 ns/op 468.00 ns/op 0.77
fastMsgIdFn sha256 / 1000 bytes 9.0680 us/op 6.2330 us/op 1.45
fastMsgIdFn h32 xxhash / 1000 bytes 512.00 ns/op 595.00 ns/op 0.86
fastMsgIdFn h64 xxhash / 1000 bytes 400.00 ns/op 581.00 ns/op 0.69
fastMsgIdFn sha256 / 10000 bytes 72.954 us/op 59.864 us/op 1.22
fastMsgIdFn h32 xxhash / 10000 bytes 2.1660 us/op 2.1390 us/op 1.01
fastMsgIdFn h64 xxhash / 10000 bytes 1.4590 us/op 1.4100 us/op 1.03
send data - 1000 256B messages 23.758 ms/op 10.938 ms/op 2.17
send data - 1000 512B messages 24.757 ms/op 16.919 ms/op 1.46
send data - 1000 1024B messages 39.100 ms/op 26.077 ms/op 1.50
send data - 1000 1200B messages 40.897 ms/op 25.485 ms/op 1.60
send data - 1000 2048B messages 51.674 ms/op 35.649 ms/op 1.45
send data - 1000 4096B messages 95.552 ms/op 31.093 ms/op 3.07
send data - 1000 16384B messages 187.66 ms/op 71.227 ms/op 2.63
send data - 1000 65536B messages 457.89 ms/op 541.30 ms/op 0.85
enrSubnets - fastDeserialize 64 bits 1.9940 us/op 1.7220 us/op 1.16
enrSubnets - ssz BitVector 64 bits 574.00 ns/op 847.00 ns/op 0.68
enrSubnets - fastDeserialize 4 bits 256.00 ns/op 495.00 ns/op 0.52
enrSubnets - ssz BitVector 4 bits 661.00 ns/op 799.00 ns/op 0.83
prioritizePeers score -10:0 att 32-0.1 sync 2-0 285.89 us/op 256.68 us/op 1.11
prioritizePeers score 0:0 att 32-0.25 sync 2-0.25 285.48 us/op 305.57 us/op 0.93
prioritizePeers score 0:0 att 32-0.5 sync 2-0.5 547.18 us/op 534.34 us/op 1.02
prioritizePeers score 0:0 att 64-0.75 sync 4-0.75 545.86 us/op 591.33 us/op 0.92
prioritizePeers score 0:0 att 64-1 sync 4-1 1.1264 ms/op 1.1677 ms/op 0.96
array of 16000 items push then shift 2.0054 us/op 1.6760 us/op 1.20
LinkedList of 16000 items push then shift 12.756 ns/op 10.528 ns/op 1.21
array of 16000 items push then pop 178.73 ns/op 187.12 ns/op 0.96
LinkedList of 16000 items push then pop 10.230 ns/op 8.3810 ns/op 1.22
array of 24000 items push then shift 3.1963 us/op 2.1895 us/op 1.46
LinkedList of 24000 items push then shift 12.018 ns/op 11.874 ns/op 1.01
array of 24000 items push then pop 263.08 ns/op 217.78 ns/op 1.21
LinkedList of 24000 items push then pop 12.670 ns/op 11.271 ns/op 1.12
intersect bitArray bitLen 8 8.1770 ns/op 6.3000 ns/op 1.30
intersect array and set length 8 114.87 ns/op 94.322 ns/op 1.22
intersect bitArray bitLen 128 37.085 ns/op 32.119 ns/op 1.15
intersect array and set length 128 1.2875 us/op 1.0127 us/op 1.27
bitArray.getTrueBitIndexes() bitLen 128 3.3820 us/op 3.0460 us/op 1.11
bitArray.getTrueBitIndexes() bitLen 248 7.6940 us/op 3.8340 us/op 2.01
bitArray.getTrueBitIndexes() bitLen 512 11.935 us/op 9.3330 us/op 1.28
Buffer.concat 32 items 1.7270 us/op 1.2100 us/op 1.43
Uint8Array.set 32 items 3.4530 us/op 1.7760 us/op 1.94
Buffer.copy 3.8630 us/op 2.0720 us/op 1.86
Uint8Array.set - with subarray 4.9190 us/op 2.2560 us/op 2.18
Uint8Array.set - without subarray 2.2980 us/op 1.9770 us/op 1.16
getUint32 - dataview 413.00 ns/op 532.00 ns/op 0.78
getUint32 - manual 322.00 ns/op 522.00 ns/op 0.62
Set add up to 64 items then delete first 3.9352 us/op 2.2781 us/op 1.73
OrderedSet add up to 64 items then delete first 5.4416 us/op 3.0991 us/op 1.76
Set add up to 64 items then delete last 4.0883 us/op 4.4103 us/op 0.93
OrderedSet add up to 64 items then delete last 6.1608 us/op 5.2723 us/op 1.17
Set add up to 64 items then delete middle 4.2452 us/op 3.2397 us/op 1.31
OrderedSet add up to 64 items then delete middle 8.3846 us/op 6.9798 us/op 1.20
Set add up to 128 items then delete first 9.0222 us/op 6.3323 us/op 1.42
OrderedSet add up to 128 items then delete first 14.442 us/op 7.5891 us/op 1.90
Set add up to 128 items then delete last 8.2310 us/op 6.1522 us/op 1.34
OrderedSet add up to 128 items then delete last 12.644 us/op 9.2507 us/op 1.37
Set add up to 128 items then delete middle 7.5340 us/op 5.4761 us/op 1.38
OrderedSet add up to 128 items then delete middle 20.803 us/op 15.739 us/op 1.32
Set add up to 256 items then delete first 18.062 us/op 12.492 us/op 1.45
OrderedSet add up to 256 items then delete first 24.649 us/op 18.626 us/op 1.32
Set add up to 256 items then delete last 18.400 us/op 10.199 us/op 1.80
OrderedSet add up to 256 items then delete last 24.227 us/op 15.434 us/op 1.57
Set add up to 256 items then delete middle 16.088 us/op 8.7193 us/op 1.85
OrderedSet add up to 256 items then delete middle 61.124 us/op 40.521 us/op 1.51
transfer serialized Status (84 B) 2.4990 us/op 1.4910 us/op 1.68
copy serialized Status (84 B) 2.1400 us/op 1.3910 us/op 1.54
transfer serialized SignedVoluntaryExit (112 B) 1.9240 us/op 1.5540 us/op 1.24
copy serialized SignedVoluntaryExit (112 B) 1.7900 us/op 1.5800 us/op 1.13
transfer serialized ProposerSlashing (416 B) 2.7080 us/op 2.3130 us/op 1.17
copy serialized ProposerSlashing (416 B) 2.6640 us/op 2.5450 us/op 1.05
transfer serialized Attestation (485 B) 2.5400 us/op 2.0620 us/op 1.23
copy serialized Attestation (485 B) 3.0510 us/op 2.1810 us/op 1.40
transfer serialized AttesterSlashing (33232 B) 4.2790 us/op 2.5530 us/op 1.68
copy serialized AttesterSlashing (33232 B) 11.925 us/op 4.8200 us/op 2.47
transfer serialized Small SignedBeaconBlock (128000 B) 4.5020 us/op 2.6460 us/op 1.70
copy serialized Small SignedBeaconBlock (128000 B) 33.253 us/op 13.227 us/op 2.51
transfer serialized Avg SignedBeaconBlock (200000 B) 5.2040 us/op 2.8010 us/op 1.86
copy serialized Avg SignedBeaconBlock (200000 B) 48.775 us/op 12.476 us/op 3.91
transfer serialized BlobsSidecar (524380 B) 6.1730 us/op 2.6340 us/op 2.34
copy serialized BlobsSidecar (524380 B) 161.15 us/op 113.04 us/op 1.43
transfer serialized Big SignedBeaconBlock (1000000 B) 8.2860 us/op 3.3680 us/op 2.46
copy serialized Big SignedBeaconBlock (1000000 B) 314.32 us/op 141.36 us/op 2.22
pass gossip attestations to forkchoice per slot 4.5241 ms/op 2.6964 ms/op 1.68
forkChoice updateHead vc 100000 bc 64 eq 0 780.72 us/op 355.03 us/op 2.20
forkChoice updateHead vc 600000 bc 64 eq 0 5.0034 ms/op 3.8167 ms/op 1.31
forkChoice updateHead vc 1000000 bc 64 eq 0 10.047 ms/op 4.0543 ms/op 2.48
forkChoice updateHead vc 600000 bc 320 eq 0 6.0208 ms/op 2.4958 ms/op 2.41
forkChoice updateHead vc 600000 bc 1200 eq 0 4.6927 ms/op 2.6482 ms/op 1.77
forkChoice updateHead vc 600000 bc 7200 eq 0 5.7486 ms/op 2.9965 ms/op 1.92
forkChoice updateHead vc 600000 bc 64 eq 1000 13.437 ms/op 10.668 ms/op 1.26
forkChoice updateHead vc 600000 bc 64 eq 10000 14.661 ms/op 9.7946 ms/op 1.50
forkChoice updateHead vc 600000 bc 64 eq 300000 45.724 ms/op 12.221 ms/op 3.74
computeDeltas 500000 validators 300 proto nodes 5.9161 ms/op 3.6387 ms/op 1.63
computeDeltas 500000 validators 1200 proto nodes 6.0686 ms/op 3.7342 ms/op 1.63
computeDeltas 500000 validators 7200 proto nodes 4.9576 ms/op 3.4967 ms/op 1.42
computeDeltas 750000 validators 300 proto nodes 6.7886 ms/op 5.1593 ms/op 1.32
computeDeltas 750000 validators 1200 proto nodes 6.5015 ms/op 5.2520 ms/op 1.24
computeDeltas 750000 validators 7200 proto nodes 6.2964 ms/op 5.3245 ms/op 1.18
computeDeltas 1400000 validators 300 proto nodes 11.721 ms/op 9.8863 ms/op 1.19
computeDeltas 1400000 validators 1200 proto nodes 12.108 ms/op 9.6558 ms/op 1.25
computeDeltas 1400000 validators 7200 proto nodes 11.700 ms/op 9.7000 ms/op 1.21
computeDeltas 2100000 validators 300 proto nodes 17.060 ms/op 16.990 ms/op 1.00
computeDeltas 2100000 validators 1200 proto nodes 17.058 ms/op 18.979 ms/op 0.90
computeDeltas 2100000 validators 7200 proto nodes 16.968 ms/op 15.660 ms/op 1.08
altair processAttestation - 250000 vs - 7PWei normalcase 1.8956 ms/op 2.3777 ms/op 0.80
altair processAttestation - 250000 vs - 7PWei worstcase 2.6893 ms/op 3.5025 ms/op 0.77
altair processAttestation - setStatus - 1/6 committees join 122.36 us/op 72.341 us/op 1.69
altair processAttestation - setStatus - 1/3 committees join 242.42 us/op 149.83 us/op 1.62
altair processAttestation - setStatus - 1/2 committees join 319.83 us/op 237.31 us/op 1.35
altair processAttestation - setStatus - 2/3 committees join 447.39 us/op 317.13 us/op 1.41
altair processAttestation - setStatus - 4/5 committees join 631.93 us/op 449.53 us/op 1.41
altair processAttestation - setStatus - 100% committees join 732.55 us/op 545.13 us/op 1.34
altair processBlock - 250000 vs - 7PWei normalcase 5.0794 ms/op 7.9496 ms/op 0.64
altair processBlock - 250000 vs - 7PWei normalcase hashState 31.635 ms/op 27.443 ms/op 1.15
altair processBlock - 250000 vs - 7PWei worstcase 40.787 ms/op 37.072 ms/op 1.10
altair processBlock - 250000 vs - 7PWei worstcase hashState 76.701 ms/op 75.007 ms/op 1.02
phase0 processBlock - 250000 vs - 7PWei normalcase 2.3636 ms/op 2.3181 ms/op 1.02
phase0 processBlock - 250000 vs - 7PWei worstcase 25.899 ms/op 23.530 ms/op 1.10
altair processEth1Data - 250000 vs - 7PWei normalcase 400.72 us/op 249.48 us/op 1.61
getExpectedWithdrawals 250000 eb:1,eth1:1,we:0,wn:0,smpl:15 6.7030 us/op 5.2270 us/op 1.28
getExpectedWithdrawals 250000 eb:0.95,eth1:0.1,we:0.05,wn:0,smpl:219 45.832 us/op 32.433 us/op 1.41
getExpectedWithdrawals 250000 eb:0.95,eth1:0.3,we:0.05,wn:0,smpl:42 12.100 us/op 9.8310 us/op 1.23
getExpectedWithdrawals 250000 eb:0.95,eth1:0.7,we:0.05,wn:0,smpl:18 7.6020 us/op 6.5550 us/op 1.16
getExpectedWithdrawals 250000 eb:0.1,eth1:0.1,we:0,wn:0,smpl:1020 174.42 us/op 129.12 us/op 1.35
getExpectedWithdrawals 250000 eb:0.03,eth1:0.03,we:0,wn:0,smpl:11777 1.2157 ms/op 1.5190 ms/op 0.80
getExpectedWithdrawals 250000 eb:0.01,eth1:0.01,we:0,wn:0,smpl:16384 1.5940 ms/op 1.2299 ms/op 1.30
getExpectedWithdrawals 250000 eb:0,eth1:0,we:0,wn:0,smpl:16384 1.5889 ms/op 1.2067 ms/op 1.32
getExpectedWithdrawals 250000 eb:0,eth1:0,we:0,wn:0,nocache,smpl:16384 4.1662 ms/op 2.9728 ms/op 1.40
getExpectedWithdrawals 250000 eb:0,eth1:1,we:0,wn:0,smpl:16384 1.6880 ms/op 1.3222 ms/op 1.28
getExpectedWithdrawals 250000 eb:0,eth1:1,we:0,wn:0,nocache,smpl:16384 3.9469 ms/op 3.4650 ms/op 1.14
Tree 40 250000 create 259.82 ms/op 222.80 ms/op 1.17
Tree 40 250000 get(125000) 164.72 ns/op 124.08 ns/op 1.33
Tree 40 250000 set(125000) 743.74 ns/op 573.02 ns/op 1.30
Tree 40 250000 toArray() 21.612 ms/op 19.261 ms/op 1.12
Tree 40 250000 iterate all - toArray() + loop 21.835 ms/op 21.185 ms/op 1.03
Tree 40 250000 iterate all - get(i) 59.290 ms/op 52.360 ms/op 1.13
Array 250000 create 3.7773 ms/op 3.2951 ms/op 1.15
Array 250000 clone - spread 1.5242 ms/op 1.3187 ms/op 1.16
Array 250000 get(125000) 0.43800 ns/op 0.62900 ns/op 0.70
Array 250000 set(125000) 0.46000 ns/op 0.60400 ns/op 0.76
Array 250000 iterate all - loop 88.419 us/op 77.227 us/op 1.14
phase0 afterProcessEpoch - 250000 vs - 7PWei 53.404 ms/op 79.980 ms/op 0.67
Array.fill - length 1000000 3.7408 ms/op 3.7940 ms/op 0.99
Array push - length 1000000 18.276 ms/op 23.446 ms/op 0.78
Array.get 0.28933 ns/op 0.27727 ns/op 1.04
Uint8Array.get 0.43460 ns/op 0.36063 ns/op 1.21
phase0 beforeProcessEpoch - 250000 vs - 7PWei 18.880 ms/op 22.445 ms/op 0.84
altair processEpoch - mainnet_e81889 298.60 ms/op 306.13 ms/op 0.98
mainnet_e81889 - altair beforeProcessEpoch 20.469 ms/op 17.391 ms/op 1.18
mainnet_e81889 - altair processJustificationAndFinalization 13.871 us/op 14.043 us/op 0.99
mainnet_e81889 - altair processInactivityUpdates 6.4604 ms/op 4.7967 ms/op 1.35
mainnet_e81889 - altair processRewardsAndPenalties 41.072 ms/op 55.773 ms/op 0.74
mainnet_e81889 - altair processRegistryUpdates 1.7400 us/op 2.3400 us/op 0.74
mainnet_e81889 - altair processSlashings 460.00 ns/op 956.00 ns/op 0.48
mainnet_e81889 - altair processEth1DataReset 380.00 ns/op 791.00 ns/op 0.48
mainnet_e81889 - altair processEffectiveBalanceUpdates 1.9267 ms/op 1.0995 ms/op 1.75
mainnet_e81889 - altair processSlashingsReset 3.5000 us/op 2.9970 us/op 1.17
mainnet_e81889 - altair processRandaoMixesReset 4.9840 us/op 3.9390 us/op 1.27
mainnet_e81889 - altair processHistoricalRootsUpdate 632.00 ns/op 789.00 ns/op 0.80
mainnet_e81889 - altair processParticipationFlagUpdates 2.9360 us/op 2.4860 us/op 1.18
mainnet_e81889 - altair processSyncCommitteeUpdates 900.00 ns/op 676.00 ns/op 1.33
mainnet_e81889 - altair afterProcessEpoch 54.241 ms/op 79.409 ms/op 0.68
capella processEpoch - mainnet_e217614 1.2244 s/op 1.1434 s/op 1.07
mainnet_e217614 - capella beforeProcessEpoch 79.338 ms/op 66.945 ms/op 1.19
mainnet_e217614 - capella processJustificationAndFinalization 17.202 us/op 12.042 us/op 1.43
mainnet_e217614 - capella processInactivityUpdates 19.565 ms/op 12.975 ms/op 1.51
mainnet_e217614 - capella processRewardsAndPenalties 231.30 ms/op 246.79 ms/op 0.94
mainnet_e217614 - capella processRegistryUpdates 14.316 us/op 10.753 us/op 1.33
mainnet_e217614 - capella processSlashings 406.00 ns/op 781.00 ns/op 0.52
mainnet_e217614 - capella processEth1DataReset 258.00 ns/op 660.00 ns/op 0.39
mainnet_e217614 - capella processEffectiveBalanceUpdates 19.359 ms/op 5.7367 ms/op 3.37
mainnet_e217614 - capella processSlashingsReset 3.6040 us/op 2.5470 us/op 1.41
mainnet_e217614 - capella processRandaoMixesReset 6.6120 us/op 3.4900 us/op 1.89
mainnet_e217614 - capella processHistoricalRootsUpdate 509.00 ns/op 718.00 ns/op 0.71
mainnet_e217614 - capella processParticipationFlagUpdates 1.8620 us/op 1.7030 us/op 1.09
mainnet_e217614 - capella afterProcessEpoch 125.62 ms/op 198.29 ms/op 0.63
phase0 processEpoch - mainnet_e58758 328.65 ms/op 337.65 ms/op 0.97
mainnet_e58758 - phase0 beforeProcessEpoch 81.440 ms/op 77.098 ms/op 1.06
mainnet_e58758 - phase0 processJustificationAndFinalization 15.774 us/op 10.025 us/op 1.57
mainnet_e58758 - phase0 processRewardsAndPenalties 34.109 ms/op 34.291 ms/op 0.99
mainnet_e58758 - phase0 processRegistryUpdates 7.5790 us/op 5.8330 us/op 1.30
mainnet_e58758 - phase0 processSlashings 345.00 ns/op 806.00 ns/op 0.43
mainnet_e58758 - phase0 processEth1DataReset 321.00 ns/op 721.00 ns/op 0.45
mainnet_e58758 - phase0 processEffectiveBalanceUpdates 1.2283 ms/op 1.3683 ms/op 0.90
mainnet_e58758 - phase0 processSlashingsReset 3.0950 us/op 2.0320 us/op 1.52
mainnet_e58758 - phase0 processRandaoMixesReset 3.5860 us/op 3.5750 us/op 1.00
mainnet_e58758 - phase0 processHistoricalRootsUpdate 318.00 ns/op 766.00 ns/op 0.42
mainnet_e58758 - phase0 processParticipationRecordUpdates 2.9880 us/op 3.1920 us/op 0.94
mainnet_e58758 - phase0 afterProcessEpoch 43.155 ms/op 61.709 ms/op 0.70
phase0 processEffectiveBalanceUpdates - 250000 normalcase 1.9324 ms/op 988.74 us/op 1.95
phase0 processEffectiveBalanceUpdates - 250000 worstcase 0.5 2.4203 ms/op 1.5894 ms/op 1.52
altair processInactivityUpdates - 250000 normalcase 16.905 ms/op 17.099 ms/op 0.99
altair processInactivityUpdates - 250000 worstcase 16.138 ms/op 17.587 ms/op 0.92
phase0 processRegistryUpdates - 250000 normalcase 9.6140 us/op 7.1050 us/op 1.35
phase0 processRegistryUpdates - 250000 badcase_full_deposits 313.15 us/op 302.31 us/op 1.04
phase0 processRegistryUpdates - 250000 worstcase 0.5 105.45 ms/op 107.00 ms/op 0.99
altair processRewardsAndPenalties - 250000 normalcase 35.249 ms/op 45.341 ms/op 0.78
altair processRewardsAndPenalties - 250000 worstcase 37.463 ms/op 37.840 ms/op 0.99
phase0 getAttestationDeltas - 250000 normalcase 7.6762 ms/op 5.9599 ms/op 1.29
phase0 getAttestationDeltas - 250000 worstcase 7.7569 ms/op 6.0685 ms/op 1.28
phase0 processSlashings - 250000 worstcase 106.42 us/op 83.104 us/op 1.28
altair processSyncCommitteeUpdates - 250000 130.52 ms/op 97.052 ms/op 1.34
BeaconState.hashTreeRoot - No change 235.00 ns/op 458.00 ns/op 0.51
BeaconState.hashTreeRoot - 1 full validator 138.51 us/op 87.623 us/op 1.58
BeaconState.hashTreeRoot - 32 full validator 1.4596 ms/op 1.0850 ms/op 1.35
BeaconState.hashTreeRoot - 512 full validator 11.127 ms/op 10.523 ms/op 1.06
BeaconState.hashTreeRoot - 1 validator.effectiveBalance 131.87 us/op 125.69 us/op 1.05
BeaconState.hashTreeRoot - 32 validator.effectiveBalance 1.4937 ms/op 1.5622 ms/op 0.96
BeaconState.hashTreeRoot - 512 validator.effectiveBalance 18.374 ms/op 27.840 ms/op 0.66
BeaconState.hashTreeRoot - 1 balances 99.076 us/op 82.283 us/op 1.20
BeaconState.hashTreeRoot - 32 balances 775.46 us/op 1.0130 ms/op 0.77
BeaconState.hashTreeRoot - 512 balances 7.2966 ms/op 10.494 ms/op 0.70
BeaconState.hashTreeRoot - 250000 balances 147.60 ms/op 178.60 ms/op 0.83
aggregationBits - 2048 els - zipIndexesInBitList 25.112 us/op 20.546 us/op 1.22
byteArrayEquals 32 55.071 ns/op 49.003 ns/op 1.12
Buffer.compare 32 17.440 ns/op 16.161 ns/op 1.08
byteArrayEquals 1024 1.6326 us/op 1.2888 us/op 1.27
Buffer.compare 1024 26.058 ns/op 24.238 ns/op 1.08
byteArrayEquals 16384 25.978 us/op 20.385 us/op 1.27
Buffer.compare 16384 211.88 ns/op 185.87 ns/op 1.14
byteArrayEquals 123687377 192.95 ms/op 152.20 ms/op 1.27
Buffer.compare 123687377 7.1662 ms/op 4.2070 ms/op 1.70
byteArrayEquals 32 - diff last byte 52.231 ns/op 48.243 ns/op 1.08
Buffer.compare 32 - diff last byte 17.257 ns/op 16.944 ns/op 1.02
byteArrayEquals 1024 - diff last byte 1.5903 us/op 1.2880 us/op 1.23
Buffer.compare 1024 - diff last byte 25.182 ns/op 23.946 ns/op 1.05
byteArrayEquals 16384 - diff last byte 25.455 us/op 20.477 us/op 1.24
Buffer.compare 16384 - diff last byte 196.01 ns/op 215.31 ns/op 0.91
byteArrayEquals 123687377 - diff last byte 192.44 ms/op 151.04 ms/op 1.27
Buffer.compare 123687377 - diff last byte 7.2938 ms/op 4.9965 ms/op 1.46
byteArrayEquals 32 - random bytes 6.9180 ns/op 5.0290 ns/op 1.38
Buffer.compare 32 - random bytes 17.390 ns/op 16.653 ns/op 1.04
byteArrayEquals 1024 - random bytes 5.1550 ns/op 4.9890 ns/op 1.03
Buffer.compare 1024 - random bytes 17.113 ns/op 16.974 ns/op 1.01
byteArrayEquals 16384 - random bytes 5.1380 ns/op 5.0290 ns/op 1.02
Buffer.compare 16384 - random bytes 17.099 ns/op 16.573 ns/op 1.03
byteArrayEquals 123687377 - random bytes 6.4300 ns/op 8.0200 ns/op 0.80
Buffer.compare 123687377 - random bytes 18.350 ns/op 20.010 ns/op 0.92
regular array get 100000 times 41.954 us/op 31.522 us/op 1.33
wrappedArray get 100000 times 33.092 us/op 31.526 us/op 1.05
arrayWithProxy get 100000 times 13.793 ms/op 11.120 ms/op 1.24
ssz.Root.equals 46.503 ns/op 44.719 ns/op 1.04
byteArrayEquals 45.399 ns/op 44.238 ns/op 1.03
Buffer.compare 10.430 ns/op 9.9170 ns/op 1.05
processSlot - 1 slots 12.391 us/op 15.230 us/op 0.81
processSlot - 32 slots 3.3096 ms/op 3.1004 ms/op 1.07
getEffectiveBalanceIncrementsZeroInactive - 250000 vs - 7PWei 36.297 ms/op 38.122 ms/op 0.95
getCommitteeAssignments - req 1 vs - 250000 vc 2.1716 ms/op 1.7527 ms/op 1.24
getCommitteeAssignments - req 100 vs - 250000 vc 4.1962 ms/op 3.4437 ms/op 1.22
getCommitteeAssignments - req 1000 vs - 250000 vc 4.5441 ms/op 3.6663 ms/op 1.24
findModifiedValidators - 10000 modified validators 244.21 ms/op 238.22 ms/op 1.03
findModifiedValidators - 1000 modified validators 162.65 ms/op 156.00 ms/op 1.04
findModifiedValidators - 100 modified validators 166.64 ms/op 138.06 ms/op 1.21
findModifiedValidators - 10 modified validators 146.14 ms/op 136.05 ms/op 1.07
findModifiedValidators - 1 modified validators 178.42 ms/op 145.25 ms/op 1.23
findModifiedValidators - no difference 165.21 ms/op 148.01 ms/op 1.12
compare ViewDUs 3.0520 s/op 3.3983 s/op 0.90
compare each validator Uint8Array 1.2181 s/op 799.17 ms/op 1.52
compare ViewDU to Uint8Array 1.0944 s/op 705.81 ms/op 1.55
migrate state 1000000 validators, 24 modified, 0 new 743.03 ms/op 797.34 ms/op 0.93
migrate state 1000000 validators, 1700 modified, 1000 new 946.76 ms/op 1.0750 s/op 0.88
migrate state 1000000 validators, 3400 modified, 2000 new 1.2929 s/op 1.1947 s/op 1.08
migrate state 1500000 validators, 24 modified, 0 new 849.99 ms/op 817.64 ms/op 1.04
migrate state 1500000 validators, 1700 modified, 1000 new 1.0542 s/op 1.0937 s/op 0.96
migrate state 1500000 validators, 3400 modified, 2000 new 1.2641 s/op 1.3017 s/op 0.97
RootCache.getBlockRootAtSlot - 250000 vs - 7PWei 4.7500 ns/op 6.5700 ns/op 0.72
state getBlockRootAtSlot - 250000 vs - 7PWei 953.69 ns/op 993.34 ns/op 0.96
computeProposers - vc 250000 6.7839 ms/op 7.4911 ms/op 0.91
computeEpochShuffling - vc 250000 41.859 ms/op 83.022 ms/op 0.50
getNextSyncCommittee - vc 250000 128.73 ms/op 145.01 ms/op 0.89
computeSigningRoot for AttestationData 23.689 us/op 19.660 us/op 1.20
hash AttestationData serialized data then Buffer.toString(base64) 1.6354 us/op 1.2527 us/op 1.31
toHexString serialized data 980.96 ns/op 891.23 ns/op 1.10
Buffer.toString(base64) 189.11 ns/op 166.14 ns/op 1.14
nodejs block root to RootHex using toHex 171.64 ns/op 123.24 ns/op 1.39
nodejs block root to RootHex using toRootHex 102.55 ns/op 78.949 ns/op 1.30
browser block root to RootHex using the deprecated toHexString 255.38 ns/op 224.22 ns/op 1.14
browser block root to RootHex using toHex 199.97 ns/op 184.90 ns/op 1.08
browser block root to RootHex using toRootHex 168.00 ns/op 157.75 ns/op 1.07

by benchmarkbot/action
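
For reading the benchmark table above: the Ratio column is consistent with Current divided by Previous once both cells are converted to the same unit, so values above 1 mean the commit measured slower and values below 1 mean it measured faster. The following TypeScript sketch reproduces a few of the reported ratios. The helper names `parseNsPerOp` and `ratio` are hypothetical and are not part of the benchmark action.

```typescript
// Multipliers from each unit used in the table to nanoseconds.
const NS_PER_UNIT: Record<string, number> = {ns: 1, us: 1e3, ms: 1e6, s: 1e9};

// Parse a cell such as "95.552 ms/op" into nanoseconds per operation.
function parseNsPerOp(cell: string): number {
  const match = /^([\d.]+) (ns|us|ms|s)\/op$/.exec(cell.trim());
  if (match === null) throw new Error(`unrecognized benchmark cell: ${cell}`);
  return Number(match[1]) * NS_PER_UNIT[match[2]];
}

// Ratio as printed in the table: current / previous, rounded to two decimals.
function ratio(current: string, previous: string): string {
  return (parseNsPerOp(current) / parseNsPerOp(previous)).toFixed(2);
}

// Rows taken from the table above.
console.log(ratio("95.552 ms/op", "31.093 ms/op")); // send data - 1000 4096B messages -> "3.07"
console.log(ratio("1.9324 ms/op", "988.74 us/op")); // mixed units, normalcase effective balance row -> "1.95"
console.log(ratio("41.859 ms/op", "83.022 ms/op")); // computeEpochShuffling - vc 250000 -> "0.50"
```

The last row is the one this PR targets: computeEpochShuffling at 250000 validators measured at roughly half the previous time in this run.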

@matthewkeil matthewkeil marked this pull request as ready for review October 3, 2024 12:39
@matthewkeil matthewkeil requested a review from a team as a code owner October 3, 2024 12:39
@matthewkeil matthewkeil requested review from twoeths and nflaig October 3, 2024 12:40
@twoeths
Contributor

twoeths commented Oct 4, 2024

Right now in EpochCache we call the Node computeEpochShuffling version as a fallback while using the napi-rs version for the epoch transition. Should both of these use the napi-rs version so that we can get rid of the Node version completely? The benefit is that the napi-rs version would then be exercised by the spec tests, which would give us a lot of confidence.

I feel it's unsafe to let the spec tests exercise the Node computeEpochShuffling while we use the napi-rs version most of the time when running a node.

@matthewkeil
Member Author

> Right now in EpochCache we call the Node computeEpochShuffling version as a fallback while using the napi-rs version for the epoch transition. Should both of these use the napi-rs version so that we can get rid of the Node version completely? The benefit is that the napi-rs version would then be exercised by the spec tests, which would give us a lot of confidence.
>
> I feel it's unsafe to let the spec tests exercise the Node computeEpochShuffling while we use the napi-rs version most of the time when running a node.

Both the sync and async exported functions use the same Rust implementation under the hood, so we can be sure the spec behavior is the same for both. 🙂

The old JS implementation was deleted from lodestar and now lives only in the swap-or-not-shuffle repo, where it serves as the reference impl for unit testing and for performance comparison. Other than that it's not used. All lodestar code paths use the Rust one now.
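
To illustrate what a JS-side reference for the shuffle can look like, here is a minimal TypeScript sketch that follows the consensus spec's `compute_shuffled_index` (swap-or-not, 90 rounds on the mainnet preset). It is not the code from the swap-or-not-shuffle repo or from lodestar. The names `shuffledPositionsSync` and `shuffledPositionsAsync` are hypothetical and only show the idea of two entry points sharing one core function.

```typescript
import {createHash} from "node:crypto";

// Mainnet preset value of SHUFFLE_ROUND_COUNT in the consensus spec.
const SHUFFLE_ROUND_COUNT = 90;

// SHA-256 over the concatenation of the given byte arrays.
function sha256(...parts: Uint8Array[]): Buffer {
  const hasher = createHash("sha256");
  for (const part of parts) hasher.update(part);
  return hasher.digest();
}

// Follows the spec's compute_shuffled_index: returns where `index` lands after shuffling.
function computeShuffledIndex(index: number, indexCount: number, seed: Uint8Array): number {
  if (!(index >= 0 && index < indexCount)) throw new Error("index out of range");
  let current = index;
  for (let round = 0; round < SHUFFLE_ROUND_COUNT; round++) {
    const roundByte = Uint8Array.of(round);
    // pivot = first 8 bytes of hash(seed || round) as little-endian uint64, mod indexCount
    const pivot = Number(sha256(seed, roundByte).readBigUInt64LE(0) % BigInt(indexCount));
    const flip = (pivot + indexCount - current) % indexCount;
    const position = Math.max(current, flip);
    // source = hash(seed || round || uint32_le(position / 256))
    const positionBytes = Buffer.alloc(4);
    positionBytes.writeUInt32LE(Math.floor(position / 256), 0);
    const source = sha256(seed, roundByte, positionBytes);
    const byte = source[Math.floor((position % 256) / 8)];
    const bit = (byte >> (position % 8)) & 1;
    if (bit === 1) current = flip;
  }
  return current;
}

// Hypothetical sync entry point: shuffled position of every index in [0, indexCount).
function shuffledPositionsSync(indexCount: number, seed: Uint8Array): number[] {
  return Array.from({length: indexCount}, (_, i) => computeShuffledIndex(i, indexCount, seed));
}

// Hypothetical async entry point. It reuses the same core, so both paths agree by construction.
// Note: this wrapper does not move work off the main thread; a native addon would do that part.
async function shuffledPositionsAsync(indexCount: number, seed: Uint8Array): Promise<number[]> {
  return shuffledPositionsSync(indexCount, seed);
}
```

A strict-equality check between the two entry points, for example `assert.deepStrictEqual(await shuffledPositionsAsync(n, seed), shuffledPositionsSync(n, seed))`, is then mostly a guard against the wrappers drifting apart rather than a test of the shuffle itself.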

@matthewkeil matthewkeil added the status-do-not-merge label ("Merging this issue will break the build. Do not merge!") Oct 4, 2024
@matthewkeil
Member Author

There is a persistent segfault in the state-transition unit test on CI that has not shown up locally for me. I'm investigating, so I added the do-not-merge label until that is resolved. @philknows @wemeetagain

@matthewkeil matthewkeil removed the status-do-not-merge label ("Merging this issue will break the build. Do not merge!") Oct 8, 2024
@matthewkeil matthewkeil requested review from wemeetagain and twoeths and removed request for twoeths October 8, 2024 06:25
@wemeetagain wemeetagain merged commit 911a3f5 into unstable Oct 11, 2024
18 of 19 checks passed
@wemeetagain wemeetagain deleted the mkeil/use-rust-shuffle branch October 11, 2024 20:26
@twoeths
Contributor

twoeths commented Oct 13, 2024

this saves ~850 ms on the unstable lg1k node (6h rate interval):

[Screenshot 2024-10-13 at 17 45 02]

which is the same as the afterProcessEpoch time on the stable lg1k node (same rate interval):

[Screenshot 2024-10-13 at 17 47 03]

note that the stable lg1k node has higher GC, so its afterProcessEpoch is not as good as on unstable

philknows pushed a commit that referenced this pull request Oct 18, 2024
* feat: add temp-deps to test on feat group

* feat: add build_temp_deps.sh process

* feat: use async for shuffling in epoch transition

* feat: use v0.0.1 instead of temp-deps

* chore: lint and check-types

* fix: log error context

* refactor: use toHex

* fix: address computeEpochShuffling refactor comments

* test: remove perf tests that were moved to swp-or-not-shuffle package

* test: add strict equal check for sync/async computeEpochShuffling impls

* Revert "refactor: use toHex"

This reverts commit 9d64b67.

* fix: EpochShuffling ssz type issue

* feat: upgrade swap-or-not to 0.0.2

* refactor: buildCommitteesFromShuffling

* docs: add TODO about removing state from shuffling computation

* docs: add TODO about removing state from shuffling computation
@wemeetagain
Member

🎉 This PR is included in v1.23.0 🎉
