feat: batch hash tree root #378
Draft: matthewkeil wants to merge 118 commits into master from te/batch_hash_tree_root
github-actions bot added the as-sha256, persistent-merkle-tree, simpleserialize.com and ssz labels on Jun 16, 2024
Ran perf tests on several machines. In all instances I stopped the major services checked with
Benchmark suite | Current: 178a6c8 | Previous: b4beed2 | Ratio |
---|---|---|---|
batchHash 250000 nodes | 314.05 ms/op | 88.547 ms/op | 3.55 |
get root 250000 nodes | 816.77 ms/op | 118.56 ms/op | 6.89 |
batchHash 500000 nodes | 564.12 ms/op | 142.17 ms/op | 3.97 |
get root 500000 nodes | 1.5257 s/op | 238.10 ms/op | 6.41 |
batchHash 1000000 nodes | 1.1228 s/op | 328.90 ms/op | 3.41 |
get root 1000000 nodes | 3.0909 s/op | 477.38 ms/op | 6.47 |
250000 validators root getter | 829.31 ms/op | 117.93 ms/op | 7.03 |
250000 validators batchHash() | 287.17 ms/op | 81.978 ms/op | 3.50 |
BeaconState ViewDU hashTreeRoot() vc=200000 | 738.93 ms/op | 101.07 ms/op | 7.31 |
🚀🚀 Significant benchmark improvement detected
Benchmark suite | Current: 178a6c8 | Previous: b4beed2 | Ratio |
---|---|---|---|
List(Validator-NS) 100000 struct -> binary | 7.4331 ms/op | 27.056 ms/op | 0.27 |
Full benchmark results
Benchmark suite | Current: 178a6c8 | Previous: b4beed2 | Ratio |
---|---|---|---|
digestTwoHashObjects 50023 times | 47.887 ms/op | 47.563 ms/op | 1.01 |
digest64 50023 times | 54.142 ms/op | 53.231 ms/op | 1.02 |
digest 50023 times | 54.710 ms/op | 54.726 ms/op | 1.00 |
input length 32 | 1.2420 us/op | 1.2230 us/op | 1.02 |
input length 64 | 1.3520 us/op | 1.3700 us/op | 0.99 |
input length 128 | 2.3310 us/op | 2.2950 us/op | 1.02 |
input length 256 | 3.4570 us/op | 3.4350 us/op | 1.01 |
input length 512 | 5.6200 us/op | 5.5950 us/op | 1.00 |
input length 1024 | 10.759 us/op | 10.736 us/op | 1.00 |
digest 1000000 times | 882.12 ms/op | 920.41 ms/op | 0.96 |
hashObjectToByteArray 50023 times | 1.4314 ms/op | 1.4270 ms/op | 1.00 |
byteArrayToHashObject 50023 times | 2.5494 ms/op | 2.5095 ms/op | 1.02 |
digest64 200092 times | 217.56 ms/op | 219.30 ms/op | 0.99 |
hash 200092 times using batchHash4UintArray64s | 238.10 ms/op | 237.94 ms/op | 1.00 |
digest64HashObjects 200092 times | 192.82 ms/op | 191.47 ms/op | 1.01 |
hash 200092 times using batchHash4HashObjectInputs | 202.43 ms/op | 199.39 ms/op | 1.02 |
getGindicesAtDepth | 4.4580 us/op | 4.3020 us/op | 1.04 |
iterateAtDepth | 7.7400 us/op | 7.6230 us/op | 1.02 |
getGindexBits | 465.00 ns/op | 478.00 ns/op | 0.97 |
gindexIterator | 1.0940 us/op | 1.0660 us/op | 1.03 |
HashComputationLevel.push then loop | 24.947 ms/op | 25.273 ms/op | 0.99 |
HashComputation[] push then loop | 49.330 ms/op | 47.331 ms/op | 1.04 |
hash 2 Uint8Array 500000 times - as-sha256 | 542.80 ms/op | 538.91 ms/op | 1.01 |
hashTwoObjects 500000 times - as-sha256 | 514.18 ms/op | 496.24 ms/op | 1.04 |
executeHashComputations - as-sha256 | 48.294 ms/op | 46.232 ms/op | 1.04 |
hash 2 Uint8Array 500000 times - noble | 1.0735 s/op | 1.0572 s/op | 1.02 |
hashTwoObjects 500000 times - noble | 1.4991 s/op | 1.4922 s/op | 1.00 |
executeHashComputations - noble | 40.414 ms/op | 40.539 ms/op | 1.00 |
hash 2 Uint8Array 500000 times - hashtree | 224.10 ms/op | 224.17 ms/op | 1.00 |
hashTwoObjects 500000 times - hashtree | 218.22 ms/op | 218.75 ms/op | 1.00 |
executeHashComputations - hashtree | 10.857 ms/op | 10.730 ms/op | 1.01 |
getNodeH() x7812.5 avg hindex | 12.621 us/op | 12.511 us/op | 1.01 |
getNodeH() x7812.5 index 0 | 6.3370 us/op | 6.2320 us/op | 1.02 |
getNodeH() x7812.5 index 7 | 6.2730 us/op | 6.3240 us/op | 0.99 |
getNodeH() x7812.5 index 7 with key array | 6.2510 us/op | 6.2210 us/op | 1.00 |
new LeafNode() x7812.5 | 14.740 us/op | 14.723 us/op | 1.00 |
getHashComputations 250000 nodes | 15.312 ms/op | 19.422 ms/op | 0.79 |
batchHash 250000 nodes | 314.05 ms/op | 88.547 ms/op | 3.55 |
get root 250000 nodes | 816.77 ms/op | 118.56 ms/op | 6.89 |
getHashComputations 500000 nodes | 30.479 ms/op | 28.728 ms/op | 1.06 |
batchHash 500000 nodes | 564.12 ms/op | 142.17 ms/op | 3.97 |
get root 500000 nodes | 1.5257 s/op | 238.10 ms/op | 6.41 |
getHashComputations 1000000 nodes | 51.810 ms/op | 70.132 ms/op | 0.74 |
batchHash 1000000 nodes | 1.1228 s/op | 328.90 ms/op | 3.41 |
get root 1000000 nodes | 3.0909 s/op | 477.38 ms/op | 6.47 |
multiproof - depth 15, 1 requested leaves | 8.0800 us/op | 8.1730 us/op | 0.99 |
tree offset multiproof - depth 15, 1 requested leaves | 18.295 us/op | 18.112 us/op | 1.01 |
compact multiproof - depth 15, 1 requested leaves | 3.4280 us/op | 3.4020 us/op | 1.01 |
multiproof - depth 15, 2 requested leaves | 11.948 us/op | 11.860 us/op | 1.01 |
tree offset multiproof - depth 15, 2 requested leaves | 21.608 us/op | 21.715 us/op | 1.00 |
compact multiproof - depth 15, 2 requested leaves | 3.4820 us/op | 3.4780 us/op | 1.00 |
multiproof - depth 15, 3 requested leaves | 16.444 us/op | 16.901 us/op | 0.97 |
tree offset multiproof - depth 15, 3 requested leaves | 27.441 us/op | 27.903 us/op | 0.98 |
compact multiproof - depth 15, 3 requested leaves | 4.1650 us/op | 4.2200 us/op | 0.99 |
multiproof - depth 15, 4 requested leaves | 21.696 us/op | 21.786 us/op | 1.00 |
tree offset multiproof - depth 15, 4 requested leaves | 33.847 us/op | 35.400 us/op | 0.96 |
compact multiproof - depth 15, 4 requested leaves | 4.8280 us/op | 5.0270 us/op | 0.96 |
packedRootsBytesToLeafNodes bytes 4000 offset 0 | 1.8860 us/op | 2.0580 us/op | 0.92 |
packedRootsBytesToLeafNodes bytes 4000 offset 1 | 1.8700 us/op | 2.0460 us/op | 0.91 |
packedRootsBytesToLeafNodes bytes 4000 offset 2 | 1.8620 us/op | 2.0460 us/op | 0.91 |
packedRootsBytesToLeafNodes bytes 4000 offset 3 | 1.9060 us/op | 2.0170 us/op | 0.94 |
subtreeFillToContents depth 40 count 250000 | 42.318 ms/op | 39.406 ms/op | 1.07 |
setRoot - gindexBitstring | 9.2668 ms/op | 9.4713 ms/op | 0.98 |
setRoot - gindex | 9.4822 ms/op | 9.7799 ms/op | 0.97 |
getRoot - gindexBitstring | 2.4363 ms/op | 2.3727 ms/op | 1.03 |
getRoot - gindex | 3.1066 ms/op | 3.2045 ms/op | 0.97 |
getHashObject then setHashObject | 9.8422 ms/op | 10.076 ms/op | 0.98 |
setNodeWithFn | 7.6997 ms/op | 7.6212 ms/op | 1.01 |
getNodeAtDepth depth 0 x100000 | 1.1140 ms/op | 1.1147 ms/op | 1.00 |
setNodeAtDepth depth 0 x100000 | 2.4540 ms/op | 2.4581 ms/op | 1.00 |
getNodesAtDepth depth 0 x100000 | 1.0530 ms/op | 1.0597 ms/op | 0.99 |
setNodesAtDepth depth 0 x100000 | 1.5167 ms/op | 1.5213 ms/op | 1.00 |
getNodeAtDepth depth 1 x100000 | 1.1805 ms/op | 1.1774 ms/op | 1.00 |
setNodeAtDepth depth 1 x100000 | 5.1819 ms/op | 5.1523 ms/op | 1.01 |
getNodesAtDepth depth 1 x100000 | 1.1777 ms/op | 1.1761 ms/op | 1.00 |
setNodesAtDepth depth 1 x100000 | 4.3257 ms/op | 4.3912 ms/op | 0.99 |
getNodeAtDepth depth 2 x100000 | 1.4553 ms/op | 1.4585 ms/op | 1.00 |
setNodeAtDepth depth 2 x100000 | 8.9919 ms/op | 8.9137 ms/op | 1.01 |
getNodesAtDepth depth 2 x100000 | 18.677 ms/op | 17.966 ms/op | 1.04 |
setNodesAtDepth depth 2 x100000 | 12.947 ms/op | 12.902 ms/op | 1.00 |
tree.getNodesAtDepth - gindexes | 7.3524 ms/op | 7.2954 ms/op | 1.01 |
tree.getNodesAtDepth - push all nodes | 1.8070 ms/op | 1.8712 ms/op | 0.97 |
tree.getNodesAtDepth - navigation | 235.82 us/op | 233.54 us/op | 1.01 |
tree.setNodesAtDepth - indexes | 393.33 us/op | 389.71 us/op | 1.01 |
set at depth 8 | 451.00 ns/op | 455.00 ns/op | 0.99 |
set at depth 16 | 617.00 ns/op | 584.00 ns/op | 1.06 |
set at depth 32 | 915.00 ns/op | 931.00 ns/op | 0.98 |
iterateNodesAtDepth 8 256 | 13.036 us/op | 13.033 us/op | 1.00 |
getNodesAtDepth 8 256 | 3.3830 us/op | 3.3670 us/op | 1.00 |
iterateNodesAtDepth 16 65536 | 4.2376 ms/op | 4.2538 ms/op | 1.00 |
getNodesAtDepth 16 65536 | 1.4871 ms/op | 1.5348 ms/op | 0.97 |
iterateNodesAtDepth 32 250000 | 15.015 ms/op | 15.465 ms/op | 0.97 |
getNodesAtDepth 32 250000 | 4.2326 ms/op | 4.2977 ms/op | 0.98 |
iterateNodesAtDepth 40 250000 | 15.273 ms/op | 15.075 ms/op | 1.01 |
getNodesAtDepth 40 250000 | 4.2321 ms/op | 4.2544 ms/op | 0.99 |
250000 validators root getter | 829.31 ms/op | 117.93 ms/op | 7.03 |
250000 validators batchHash() | 287.17 ms/op | 81.978 ms/op | 3.50 |
250000 validators hashComputations | 16.674 ms/op | 17.881 ms/op | 0.93 |
bitlist bytes to struct (120,90) | 701.00 ns/op | 821.00 ns/op | 0.85 |
bitlist bytes to tree (120,90) | 2.7660 us/op | 3.1870 us/op | 0.87 |
bitlist bytes to struct (2048,2048) | 1.0780 us/op | 1.1530 us/op | 0.93 |
bitlist bytes to tree (2048,2048) | 4.0390 us/op | 3.9570 us/op | 1.02 |
ByteListType - deserialize | 7.7894 ms/op | 7.1776 ms/op | 1.09 |
BasicListType - deserialize | 17.966 ms/op | 15.106 ms/op | 1.19 |
ByteListType - serialize | 7.8533 ms/op | 7.3920 ms/op | 1.06 |
BasicListType - serialize | 10.540 ms/op | 10.336 ms/op | 1.02 |
BasicListType - tree_convertToStruct | 29.413 ms/op | 26.430 ms/op | 1.11 |
List[uint8, 68719476736] len 300000 ViewDU.getAll() + iterate | 4.7271 ms/op | 4.3451 ms/op | 1.09 |
List[uint8, 68719476736] len 300000 ViewDU.get(i) | 4.0231 ms/op | 4.0176 ms/op | 1.00 |
Array.push len 300000 empty Array - number | 6.1254 ms/op | 5.7115 ms/op | 1.07 |
Array.set len 300000 from new Array - number | 2.1033 ms/op | 1.8534 ms/op | 1.13 |
Array.set len 300000 - number | 6.0065 ms/op | 5.6512 ms/op | 1.06 |
Uint8Array.set len 300000 | 372.80 us/op | 367.16 us/op | 1.02 |
Uint32Array.set len 300000 | 437.91 us/op | 420.72 us/op | 1.04 |
Container({a: uint8, b: uint8}) getViewDU x300000 | 26.554 ms/op | 45.041 ms/op | 0.59 |
ContainerNodeStruct({a: uint8, b: uint8}) getViewDU x300000 | 11.231 ms/op | 10.702 ms/op | 1.05 |
List(Container) len 300000 ViewDU.getAllReadonly() + iterate | 211.05 ms/op | 198.38 ms/op | 1.06 |
List(Container) len 300000 ViewDU.getAllReadonlyValues() + iterate | 237.85 ms/op | 230.48 ms/op | 1.03 |
List(Container) len 300000 ViewDU.get(i) | 6.1234 ms/op | 6.3351 ms/op | 0.97 |
List(Container) len 300000 ViewDU.getReadonly(i) | 6.0767 ms/op | 6.2013 ms/op | 0.98 |
List(ContainerNodeStruct) len 300000 ViewDU.getAllReadonly() + iterate | 40.154 ms/op | 40.398 ms/op | 0.99 |
List(ContainerNodeStruct) len 300000 ViewDU.getAllReadonlyValues() + iterate | 5.3753 ms/op | 5.0758 ms/op | 1.06 |
List(ContainerNodeStruct) len 300000 ViewDU.get(i) | 5.9325 ms/op | 5.9316 ms/op | 1.00 |
List(ContainerNodeStruct) len 300000 ViewDU.getReadonly(i) | 5.8328 ms/op | 5.8203 ms/op | 1.00 |
Array.push len 300000 empty Array - object | 5.8677 ms/op | 5.8913 ms/op | 1.00 |
Array.set len 300000 from new Array - object | 2.0802 ms/op | 2.0034 ms/op | 1.04 |
Array.set len 300000 - object | 5.6109 ms/op | 6.2420 ms/op | 0.90 |
ListUintNum64Type.toViewDU 1900000 -> 2000000 | 204.12 ms/op | | |
ListUintNum64Type.toViewDU() | 177.74 ms/op | | |
cachePermanentRootStruct no cache | 3.5550 us/op | 5.1110 us/op | 0.70 |
cachePermanentRootStruct with cache | 203.00 ns/op | 211.00 ns/op | 0.96 |
epochParticipation len 250000 rws 7813 | 2.1959 ms/op | 2.1637 ms/op | 1.01 |
Deneb BeaconBlock.hashTreeRoot(), numTransaction=200 | 4.1054 ms/op | | |
BeaconState ViewDU batchHashTreeRoot vc=200000 | 218.73 ms/op | 89.382 ms/op | 2.45 |
BeaconState ViewDU batchHashTreeRoot - commit step vc=200000 | 191.73 ms/op | | |
BeaconState ViewDU batchHashTreeRoot - hash step vc=200000 | 48.397 ms/op | | |
BeaconState ViewDU hashTreeRoot() vc=200000 | 738.93 ms/op | 101.07 ms/op | 7.31 |
BeaconState ViewDU hashTreeRoot - commit step vc=200000 | 62.920 ms/op | 80.229 ms/op | 0.78 |
BeaconState ViewDU hashTreeRoot - validator tree creation vc=100000 | 227.56 ms/op | | |
deserialize Attestation - tree | 4.0080 us/op | 3.9630 us/op | 1.01 |
deserialize Attestation - struct | 1.8090 us/op | 1.7580 us/op | 1.03 |
deserialize SignedAggregateAndProof - tree | 3.7710 us/op | 3.6310 us/op | 1.04 |
deserialize SignedAggregateAndProof - struct | 2.9350 us/op | 2.9160 us/op | 1.01 |
deserialize SyncCommitteeMessage - tree | 1.1060 us/op | 1.0250 us/op | 1.08 |
deserialize SyncCommitteeMessage - struct | 1.0840 us/op | 1.0620 us/op | 1.02 |
deserialize SignedContributionAndProof - tree | 2.1710 us/op | 2.0080 us/op | 1.08 |
deserialize SignedContributionAndProof - struct | 2.3540 us/op | 2.1560 us/op | 1.09 |
deserialize SignedBeaconBlock - tree | 220.27 us/op | 207.08 us/op | 1.06 |
deserialize SignedBeaconBlock - struct | 115.14 us/op | 114.47 us/op | 1.01 |
BeaconState vc 300000 - deserialize tree | 616.33 ms/op | 585.06 ms/op | 1.05 |
BeaconState vc 300000 - serialize tree | 112.53 ms/op | 136.42 ms/op | 0.82 |
BeaconState.historicalRoots vc 300000 - deserialize tree | 823.00 ns/op | 691.00 ns/op | 1.19 |
BeaconState.historicalRoots vc 300000 - serialize tree | 695.00 ns/op | 650.00 ns/op | 1.07 |
BeaconState.validators vc 300000 - deserialize tree | 576.79 ms/op | 566.56 ms/op | 1.02 |
BeaconState.validators vc 300000 - serialize tree | 45.117 ms/op | 97.294 ms/op | 0.46 |
BeaconState.balances vc 300000 - deserialize tree | 18.977 ms/op | 23.164 ms/op | 0.82 |
BeaconState.balances vc 300000 - serialize tree | 3.2652 ms/op | 3.5235 ms/op | 0.93 |
BeaconState.previousEpochParticipation vc 300000 - deserialize tree | 339.03 us/op | 334.74 us/op | 1.01 |
BeaconState.previousEpochParticipation vc 300000 - serialize tree | 269.61 us/op | 271.56 us/op | 0.99 |
BeaconState.currentEpochParticipation vc 300000 - deserialize tree | 348.84 us/op | 335.55 us/op | 1.04 |
BeaconState.currentEpochParticipation vc 300000 - serialize tree | 266.74 us/op | 274.71 us/op | 0.97 |
BeaconState.inactivityScores vc 300000 - deserialize tree | 23.292 ms/op | 23.268 ms/op | 1.00 |
BeaconState.inactivityScores vc 300000 - serialize tree | 3.0084 ms/op | 3.3465 ms/op | 0.90 |
hashTreeRoot Attestation - struct | 12.917 us/op | 19.610 us/op | 0.66 |
hashTreeRoot Attestation - tree | 8.8670 us/op | 9.0690 us/op | 0.98 |
hashTreeRoot SignedAggregateAndProof - struct | 16.185 us/op | 24.121 us/op | 0.67 |
hashTreeRoot SignedAggregateAndProof - tree | 13.173 us/op | 12.891 us/op | 1.02 |
hashTreeRoot SyncCommitteeMessage - struct | 3.8730 us/op | 6.0570 us/op | 0.64 |
hashTreeRoot SyncCommitteeMessage - tree | 3.1990 us/op | 3.1200 us/op | 1.03 |
hashTreeRoot SignedContributionAndProof - struct | 9.8520 us/op | 16.609 us/op | 0.59 |
hashTreeRoot SignedContributionAndProof - tree | 9.0040 us/op | 8.8530 us/op | 1.02 |
hashTreeRoot SignedBeaconBlock - struct | 897.97 us/op | 1.2562 ms/op | 0.71 |
hashTreeRoot SignedBeaconBlock - tree | 783.51 us/op | 759.84 us/op | 1.03 |
hashTreeRoot Validator - struct | 4.7090 us/op | 7.5450 us/op | 0.62 |
hashTreeRoot Validator - tree | 6.3310 us/op | 6.3470 us/op | 1.00 |
BeaconState vc 300000 - hashTreeRoot tree | 2.0114 s/op | 2.0804 s/op | 0.97 |
BeaconState vc 300000 - batchHashTreeRoot tree | 3.2718 s/op | 3.2767 s/op | 1.00 |
BeaconState.historicalRoots vc 300000 - hashTreeRoot tree | 956.00 ns/op | 950.00 ns/op | 1.01 |
BeaconState.validators vc 300000 - hashTreeRoot tree | 2.0597 s/op | 2.0641 s/op | 1.00 |
BeaconState.balances vc 300000 - hashTreeRoot tree | 34.595 ms/op | 32.841 ms/op | 1.05 |
BeaconState.previousEpochParticipation vc 300000 - hashTreeRoot tree | 4.2911 ms/op | 4.2966 ms/op | 1.00 |
BeaconState.currentEpochParticipation vc 300000 - hashTreeRoot tree | 4.2921 ms/op | 4.1245 ms/op | 1.04 |
BeaconState.inactivityScores vc 300000 - hashTreeRoot tree | 37.102 ms/op | 33.347 ms/op | 1.11 |
hash64 x18 | 10.304 us/op | 9.0690 us/op | 1.14 |
hashTwoObjects x18 | 8.6080 us/op | 8.7140 us/op | 0.99 |
hash64 x1740 | 903.25 us/op | 820.40 us/op | 1.10 |
hashTwoObjects x1740 | 806.08 us/op | 822.67 us/op | 0.98 |
hash64 x2700000 | 1.3970 s/op | 1.2905 s/op | 1.08 |
hashTwoObjects x2700000 | 1.2221 s/op | 1.2732 s/op | 0.96 |
get_exitEpoch - ContainerType | 223.00 ns/op | 366.00 ns/op | 0.61 |
get_exitEpoch - ContainerNodeStructType | 224.00 ns/op | 363.00 ns/op | 0.62 |
set_exitEpoch - ContainerType | 236.00 ns/op | 381.00 ns/op | 0.62 |
set_exitEpoch - ContainerNodeStructType | 233.00 ns/op | 372.00 ns/op | 0.63 |
get_pubkey - ContainerType | 841.00 ns/op | 1.3880 us/op | 0.61 |
get_pubkey - ContainerNodeStructType | 216.00 ns/op | 361.00 ns/op | 0.60 |
hashTreeRoot - ContainerType | 386.00 ns/op | 614.00 ns/op | 0.63 |
hashTreeRoot - ContainerNodeStructType | 430.00 ns/op | 645.00 ns/op | 0.67 |
createProof - ContainerType | 3.9950 us/op | 6.3600 us/op | 0.63 |
createProof - ContainerNodeStructType | 20.072 us/op | 24.930 us/op | 0.81 |
serialize - ContainerType | 1.7630 us/op | 1.9060 us/op | 0.92 |
serialize - ContainerNodeStructType | 1.1180 us/op | 1.4160 us/op | 0.79 |
set_exitEpoch_and_hashTreeRoot - ContainerType | 3.1490 us/op | 2.7860 us/op | 1.13 |
set_exitEpoch_and_hashTreeRoot - ContainerNodeStructType | 6.7540 us/op | 7.1890 us/op | 0.94 |
ValidatorViewDU hashTreeRoot | 8.3440 us/op | | |
ContainerNodeStructViewDU hashTreeRoot | 23.857 us/op | | |
Array - for of | 5.4820 us/op | 6.5340 us/op | 0.84 |
Array - for(;;) | 5.4330 us/op | 6.3540 us/op | 0.86 |
basicListValue.readonlyValuesArray() | 3.9917 ms/op | 4.1882 ms/op | 0.95 |
basicListValue.readonlyValuesArray() + loop all | 4.1465 ms/op | 4.3806 ms/op | 0.95 |
compositeListValue.readonlyValuesArray() | 29.444 ms/op | 30.292 ms/op | 0.97 |
compositeListValue.readonlyValuesArray() + loop all | 28.950 ms/op | 29.651 ms/op | 0.98 |
Number64UintType - get balances list | 4.1746 ms/op | 5.5447 ms/op | 0.75 |
Number64UintType - set balances list | 10.035 ms/op | 10.067 ms/op | 1.00 |
Number64UintType - get and increase 10 then set | 40.161 ms/op | 39.157 ms/op | 1.03 |
Number64UintType - increase 10 using applyDelta | 15.742 ms/op | 16.529 ms/op | 0.95 |
Number64UintType - increase 10 using applyDeltaInBatch | 15.827 ms/op | 17.171 ms/op | 0.92 |
tree_newTreeFromUint64Deltas | 16.065 ms/op | 16.725 ms/op | 0.96 |
unsafeUint8ArrayToTree | 31.302 ms/op | 32.415 ms/op | 0.97 |
bitLength(50) | 224.00 ns/op | 222.00 ns/op | 1.01 |
bitLengthStr(50) | 211.00 ns/op | 210.00 ns/op | 1.00 |
bitLength(8000) | 214.00 ns/op | 218.00 ns/op | 0.98 |
bitLengthStr(8000) | 258.00 ns/op | 251.00 ns/op | 1.03 |
bitLength(250000) | 219.00 ns/op | 220.00 ns/op | 1.00 |
bitLengthStr(250000) | 294.00 ns/op | 289.00 ns/op | 1.02 |
merkleizeInto 4 chunks | 1.3550 us/op | | |
merkleize 4 chunks | 1.9230 us/op | | |
merkleizeInto 8 chunks | 1.9010 us/op | | |
merkleize 8 chunks | 4.0130 us/op | | |
merkleizeInto 16 chunks | 2.5520 us/op | | |
merkleize 16 chunks | 8.1090 us/op | | |
merkleizeInto 32 chunks | 3.5040 us/op | | |
merkleize 32 chunks | 16.208 us/op | | |
floor - Math.floor (53) | 1.2391 ns/op | 1.2430 ns/op | 1.00 |
floor - << 0 (53) | 1.2368 ns/op | 1.2366 ns/op | 1.00 |
floor - Math.floor (512) | 1.2402 ns/op | 1.2373 ns/op | 1.00 |
floor - << 0 (512) | 1.2393 ns/op | 1.2400 ns/op | 1.00 |
fnIf(0) | 1.5467 ns/op | 1.5538 ns/op | 1.00 |
fnSwitch(0) | 2.1649 ns/op | 2.1668 ns/op | 1.00 |
fnObj(0) | 1.5518 ns/op | 1.5568 ns/op | 1.00 |
fnArr(0) | 1.5471 ns/op | 1.5465 ns/op | 1.00 |
fnIf(4) | 2.1650 ns/op | 2.1743 ns/op | 1.00 |
fnSwitch(4) | 2.1653 ns/op | 2.1669 ns/op | 1.00 |
fnObj(4) | 1.5591 ns/op | 1.5487 ns/op | 1.01 |
fnArr(4) | 1.5479 ns/op | 1.5480 ns/op | 1.00 |
fnIf(9) | 3.0920 ns/op | 3.0924 ns/op | 1.00 |
fnSwitch(9) | 2.1649 ns/op | 2.1665 ns/op | 1.00 |
fnObj(9) | 1.5470 ns/op | 1.5475 ns/op | 1.00 |
fnArr(9) | 1.5505 ns/op | 1.5461 ns/op | 1.00 |
Container {a,b,vec} - as struct x100000 | 124.51 us/op | 123.84 us/op | 1.01 |
Container {a,b,vec} - as tree x100000 | 340.73 us/op | 340.09 us/op | 1.00 |
Container {a,vec,b} - as struct x100000 | 154.86 us/op | 154.32 us/op | 1.00 |
Container {a,vec,b} - as tree x100000 | 371.62 us/op | 371.18 us/op | 1.00 |
get 2 props x1000000 - rawObject | 310.52 us/op | 309.88 us/op | 1.00 |
get 2 props x1000000 - proxy | 73.639 ms/op | 72.741 ms/op | 1.01 |
get 2 props x1000000 - customObj | 309.30 us/op | 307.92 us/op | 1.00 |
Simple object binary -> struct | 549.00 ns/op | 567.00 ns/op | 0.97 |
Simple object binary -> tree_backed | 1.0490 us/op | 986.00 ns/op | 1.06 |
Simple object struct -> tree_backed | 1.5280 us/op | 1.5440 us/op | 0.99 |
Simple object tree_backed -> struct | 1.5170 us/op | 1.4670 us/op | 1.03 |
Simple object struct -> binary | 789.00 ns/op | 789.00 ns/op | 1.00 |
Simple object tree_backed -> binary | 1.2240 us/op | 1.2600 us/op | 0.97 |
aggregationBits binary -> struct | 445.00 ns/op | 444.00 ns/op | 1.00 |
aggregationBits binary -> tree_backed | 1.9620 us/op | 1.9550 us/op | 1.00 |
aggregationBits struct -> tree_backed | 2.2560 us/op | 2.3230 us/op | 0.97 |
aggregationBits tree_backed -> struct | 900.00 ns/op | 926.00 ns/op | 0.97 |
aggregationBits struct -> binary | 683.00 ns/op | 686.00 ns/op | 1.00 |
aggregationBits tree_backed -> binary | 860.00 ns/op | 871.00 ns/op | 0.99 |
List(uint8) 100000 binary -> struct | 1.5329 ms/op | 1.6812 ms/op | 0.91 |
List(uint8) 100000 binary -> tree_backed | 87.294 us/op | 93.101 us/op | 0.94 |
List(uint8) 100000 struct -> tree_backed | 1.1366 ms/op | 1.1024 ms/op | 1.03 |
List(uint8) 100000 tree_backed -> struct | 1.1154 ms/op | 1.0546 ms/op | 1.06 |
List(uint8) 100000 struct -> binary | 1.0284 ms/op | 989.83 us/op | 1.04 |
List(uint8) 100000 tree_backed -> binary | 88.758 us/op | 89.572 us/op | 0.99 |
List(uint64Number) 100000 binary -> struct | 1.1597 ms/op | 1.1648 ms/op | 1.00 |
List(uint64Number) 100000 binary -> tree_backed | 3.1350 ms/op | 2.5600 ms/op | 1.22 |
List(uint64Number) 100000 struct -> tree_backed | 4.6010 ms/op | 4.0852 ms/op | 1.13 |
List(uint64Number) 100000 tree_backed -> struct | 2.1656 ms/op | 2.0743 ms/op | 1.04 |
List(uint64Number) 100000 struct -> binary | 1.2460 ms/op | 1.3325 ms/op | 0.94 |
List(uint64Number) 100000 tree_backed -> binary | 826.27 us/op | 853.53 us/op | 0.97 |
List(Uint64Bigint) 100000 binary -> struct | 3.2557 ms/op | 3.5539 ms/op | 0.92 |
List(Uint64Bigint) 100000 binary -> tree_backed | 2.6327 ms/op | 3.2431 ms/op | 0.81 |
List(Uint64Bigint) 100000 struct -> tree_backed | 5.0095 ms/op | 5.5215 ms/op | 0.91 |
List(Uint64Bigint) 100000 tree_backed -> struct | 4.5832 ms/op | 4.5138 ms/op | 1.02 |
List(Uint64Bigint) 100000 struct -> binary | 2.0761 ms/op | 2.0407 ms/op | 1.02 |
List(Uint64Bigint) 100000 tree_backed -> binary | 875.32 us/op | 934.11 us/op | 0.94 |
Vector(Root) 100000 binary -> struct | 29.240 ms/op | 32.244 ms/op | 0.91 |
Vector(Root) 100000 binary -> tree_backed | 29.506 ms/op | 27.173 ms/op | 1.09 |
Vector(Root) 100000 struct -> tree_backed | 42.111 ms/op | 45.955 ms/op | 0.92 |
Vector(Root) 100000 tree_backed -> struct | 49.262 ms/op | 48.127 ms/op | 1.02 |
Vector(Root) 100000 struct -> binary | 2.8951 ms/op | 2.6239 ms/op | 1.10 |
Vector(Root) 100000 tree_backed -> binary | 9.5454 ms/op | 8.2375 ms/op | 1.16 |
List(Validator) 100000 binary -> struct | 105.58 ms/op | 104.93 ms/op | 1.01 |
List(Validator) 100000 binary -> tree_backed | 286.65 ms/op | 285.76 ms/op | 1.00 |
List(Validator) 100000 struct -> tree_backed | 314.90 ms/op | 309.56 ms/op | 1.02 |
List(Validator) 100000 tree_backed -> struct | 200.70 ms/op | 210.31 ms/op | 0.95 |
List(Validator) 100000 struct -> binary | 28.797 ms/op | 26.723 ms/op | 1.08 |
List(Validator) 100000 tree_backed -> binary | 105.99 ms/op | 115.26 ms/op | 0.92 |
List(Validator-NS) 100000 binary -> struct | 102.85 ms/op | 98.543 ms/op | 1.04 |
List(Validator-NS) 100000 binary -> tree_backed | 147.38 ms/op | 144.42 ms/op | 1.02 |
List(Validator-NS) 100000 struct -> tree_backed | 170.11 ms/op | 185.09 ms/op | 0.92 |
List(Validator-NS) 100000 tree_backed -> struct | 147.03 ms/op | 165.67 ms/op | 0.89 |
List(Validator-NS) 100000 struct -> binary | 7.4331 ms/op | 27.056 ms/op | 0.27 |
List(Validator-NS) 100000 tree_backed -> binary | 12.354 ms/op | 31.887 ms/op | 0.39 |
get epochStatuses - MutableVector | 112.82 us/op | 106.38 us/op | 1.06 |
get epochStatuses - ViewDU | 205.51 us/op | 203.15 us/op | 1.01 |
set epochStatuses - ListTreeView | 2.1111 ms/op | 2.3891 ms/op | 0.88 |
set epochStatuses - ListTreeView - set() | 446.11 us/op | 459.01 us/op | 0.97 |
set epochStatuses - ListTreeView - commit() | 1.0159 ms/op | 555.56 us/op | 1.83 |
bitstring | 648.28 ns/op | 641.14 ns/op | 1.01 |
bit mask | 13.923 ns/op | 13.521 ns/op | 1.03 |
struct - increase slot to 1000000 | 928.09 us/op | 927.98 us/op | 1.00 |
UintNumberType - increase slot to 1000000 | 21.407 ms/op | 21.987 ms/op | 0.97 |
UintBigintType - increase slot to 1000000 | 163.82 ms/op | 159.36 ms/op | 1.03 |
UintBigint8 x 100000 tree_deserialize | 4.3076 ms/op | 4.6560 ms/op | 0.93 |
UintBigint8 x 100000 tree_serialize | 1.0921 ms/op | 1.0943 ms/op | 1.00 |
UintBigint16 x 100000 tree_deserialize | 4.3735 ms/op | 4.7401 ms/op | 0.92 |
UintBigint16 x 100000 tree_serialize | 1.2139 ms/op | 1.2154 ms/op | 1.00 |
UintBigint32 x 100000 tree_deserialize | 4.7990 ms/op | 5.0010 ms/op | 0.96 |
UintBigint32 x 100000 tree_serialize | 1.2434 ms/op | 1.2239 ms/op | 1.02 |
UintBigint64 x 100000 tree_deserialize | 5.5438 ms/op | 5.2174 ms/op | 1.06 |
UintBigint64 x 100000 tree_serialize | 1.6034 ms/op | 1.5732 ms/op | 1.02 |
UintBigint8 x 100000 value_deserialize | 433.24 us/op | 521.80 us/op | 0.83 |
UintBigint8 x 100000 value_serialize | 708.54 us/op | 666.63 us/op | 1.06 |
UintBigint16 x 100000 value_deserialize | 464.97 us/op | 462.13 us/op | 1.01 |
UintBigint16 x 100000 value_serialize | 748.95 us/op | 724.99 us/op | 1.03 |
UintBigint32 x 100000 value_deserialize | 433.80 us/op | 433.07 us/op | 1.00 |
UintBigint32 x 100000 value_serialize | 731.81 us/op | 698.01 us/op | 1.05 |
UintBigint64 x 100000 value_deserialize | 496.92 us/op | 495.96 us/op | 1.00 |
UintBigint64 x 100000 value_serialize | 873.80 us/op | 874.42 us/op | 1.00 |
UintBigint8 x 100000 deserialize | 2.9569 ms/op | 2.9727 ms/op | 0.99 |
UintBigint8 x 100000 serialize | 1.7383 ms/op | 1.6040 ms/op | 1.08 |
UintBigint16 x 100000 deserialize | 3.0451 ms/op | 3.0156 ms/op | 1.01 |
UintBigint16 x 100000 serialize | 1.4740 ms/op | 1.6335 ms/op | 0.90 |
UintBigint32 x 100000 deserialize | 3.0718 ms/op | 3.0710 ms/op | 1.00 |
UintBigint32 x 100000 serialize | 2.6968 ms/op | 2.7979 ms/op | 0.96 |
UintBigint64 x 100000 deserialize | 3.7599 ms/op | 3.9455 ms/op | 0.95 |
UintBigint64 x 100000 serialize | 1.5457 ms/op | 1.5557 ms/op | 0.99 |
UintBigint128 x 100000 deserialize | 5.0731 ms/op | 5.4218 ms/op | 0.94 |
UintBigint128 x 100000 serialize | 14.872 ms/op | 13.879 ms/op | 1.07 |
UintBigint256 x 100000 deserialize | 8.2260 ms/op | 8.0535 ms/op | 1.02 |
UintBigint256 x 100000 serialize | 43.641 ms/op | 41.441 ms/op | 1.05 |
Slice from Uint8Array x25000 | 1.2849 ms/op | 1.2979 ms/op | 0.99 |
Slice from ArrayBuffer x25000 | 15.123 ms/op | 15.006 ms/op | 1.01 |
Slice from ArrayBuffer x25000 + new Uint8Array | 16.064 ms/op | 16.000 ms/op | 1.00 |
Copy Uint8Array 100000 iterate | 1.7022 ms/op | 1.6460 ms/op | 1.03 |
Copy Uint8Array 100000 slice | 111.88 us/op | 116.05 us/op | 0.96 |
Copy Uint8Array 100000 Uint8Array.prototype.slice.call | 112.07 us/op | 117.91 us/op | 0.95 |
Copy Buffer 100000 Uint8Array.prototype.slice.call | 113.68 us/op | 117.93 us/op | 0.96 |
Copy Uint8Array 100000 slice + set | 193.63 us/op | 178.57 us/op | 1.08 |
Copy Uint8Array 100000 subarray + set | 113.56 us/op | 117.59 us/op | 0.97 |
Copy Uint8Array 100000 slice arrayBuffer | 112.35 us/op | 116.92 us/op | 0.96 |
Uint64 deserialize 100000 - iterate Uint8Array | 1.9181 ms/op | 1.8024 ms/op | 1.06 |
Uint64 deserialize 100000 - by Uint32A | 2.1155 ms/op | 1.7828 ms/op | 1.19 |
Uint64 deserialize 100000 - by DataView.getUint32 x2 | 1.9633 ms/op | 1.7841 ms/op | 1.10 |
Uint64 deserialize 100000 - by DataView.getBigUint64 | 4.7816 ms/op | 5.2553 ms/op | 0.91 |
Uint64 deserialize 100000 - by byte | 39.779 ms/op | 39.228 ms/op | 1.01 |
by benchmarkbot/action
This was referenced Jul 9, 2024
twoeths reviewed on Oct 10, 2024 (11 reviews) and on Oct 15, 2024
This reverts commit e0e3173.
* feat: implement merkleizeBlockArray
* fix: support padFor=1 for merkleizeBlockArray
* feat: add blockLimit param to merkleizeBlockArray() api
* feat: implement ByteListType.hashTreeRoot() using merkleizeBlockArray()
* fix: assign this.blocksBuffer in a more straightforward way
* chore: refactor chunkBytes to blockBytes
* fix: blockLimit usage in doMerkleizeBlockArray
* feat: implement ListComposite.hashTreeRoot() using merkleizeBlockArray api
Motivation
Created on behalf of @twoeths. Keeping this in draft for now, but using it to store some metrics from different iterations of the changes.
Update the system to allow batch hashing of trees. SIMD instructions allow several hashes to be processed in parallel on a single core. Add the idea of levels to the hashing process so that deeper sections of a tree are processed before higher levels, with hashes computed in order of tree depth.
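To illustrate the level idea, here is a minimal sketch rather than the code from this PR. `SketchNode`, `collect`, `executeByLevel` and `sha256Pair` are hypothetical names, and Node's built-in `crypto` stands in for a SIMD hasher. It assumes every leaf already has a root. Each unhashed branch is recorded as a `{left, right, dest}` entry grouped by depth, then the groups are executed deepest first so that both inputs exist before a parent is hashed.

```typescript
import {createHash} from "node:crypto";

// Hypothetical minimal node type; the real persistent-merkle-tree Node differs.
interface SketchNode {
  root: Uint8Array | null; // null means "not hashed yet"
  left?: SketchNode;
  right?: SketchNode;
}

// Shape taken from the PR description: hash(left, right) is written into dest.
interface HashComputation {
  left: SketchNode;
  right: SketchNode;
  dest: SketchNode;
}

// Stand-in for a batch/SIMD hasher: hashes one 64-byte pair at a time.
function sha256Pair(a: Uint8Array, b: Uint8Array): Uint8Array {
  return new Uint8Array(createHash("sha256").update(a).update(b).digest());
}

// Record one computation per unhashed branch node, grouped by depth (root = level 0).
function collect(node: SketchNode, level: number, byLevel: HashComputation[][]): void {
  if (node.root !== null || !node.left || !node.right) return;
  if (!byLevel[level]) byLevel[level] = [];
  byLevel[level].push({left: node.left, right: node.right, dest: node});
  collect(node.left, level + 1, byLevel);
  collect(node.right, level + 1, byLevel);
}

// Execute the deepest level first. Entries within one level are independent of
// each other, which is what would let a real hasher process them as one batch.
function executeByLevel(byLevel: HashComputation[][]): void {
  for (let level = byLevel.length - 1; level >= 0; level--) {
    for (const {left, right, dest} of byLevel[level] ?? []) {
      dest.root = sha256Pair(left.root!, right.root!);
    }
  }
}

// Usage: a four-leaf tree whose three branch nodes are not hashed yet.
const leaf = (b: number): SketchNode => ({root: new Uint8Array(32).fill(b)});
const branch = (left: SketchNode, right: SketchNode): SketchNode => ({root: null, left, right});
const tree = branch(branch(leaf(1), leaf(2)), branch(leaf(3), leaf(4)));

const levels: HashComputation[][] = [];
collect(tree, 0, levels);
executeByLevel(levels);
```

In this sketch the inner loop hashes pairs one by one. The point is only the ordering: all entries of a level could be handed to a batch hasher together, because none of them depends on another entry in the same level.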
Description
* New `hashtree` hasher for batch hash
* Minimal memory allocation:
  * For `type.hashTreeRoot()`:
    * `type.hashTreeRoot()` almost does not need to allocate any temporary Uint8Arrays
    * `merkleizeInto()` in hasher, this also does not allocate memory, and it uses batch hash
  * For ViewDU:
    * `HashComputation = {left: Node; right: Node; dest: Node}`
    * For BeaconState, it contains validators as ContainerNodeStructViewDUs
* Benchmark result on Mac M1
  * `BeaconBlock.hashTreeRoot()` call, data comes from a typical mainnet block with 200 transactions

TODOs
* TODO - batch `as-sha256`, `persistent-merkle-tree` and `ssz`

part of #355
closes #78
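The "merkleize into an existing buffer" idea from the description can be sketched as below. This is an illustration under stated assumptions, not the hasher's actual `merkleizeInto()`. The name `merkleizeIntoSketch` and its signature are hypothetical, the input is assumed to hold a power-of-two number of 32-byte chunks, and Node's `crypto` is used for SHA-256, which still allocates one digest per hash, unlike what the description reports for the real hasher. Each level's parents overwrite the front of the same buffer, so no per-level arrays are created, and the root is copied into `output` at `offset`.

```typescript
import {createHash} from "node:crypto";

const CHUNK = 32; // bytes per merkle chunk

// Hypothetical sketch: merkleize `data` in place and write the 32-byte root
// into `output` at `offset`. Note that `data` is overwritten during the process.
function merkleizeIntoSketch(data: Uint8Array, output: Uint8Array, offset: number): void {
  let count = data.length / CHUNK;
  if (!Number.isInteger(count) || count < 1 || (count & (count - 1)) !== 0) {
    throw new Error("data must hold a power-of-two number of 32-byte chunks");
  }
  while (count > 1) {
    for (let i = 0; i < count / 2; i++) {
      // Parent i reads the 64-byte pair starting at chunk 2*i ...
      const pair = data.subarray(2 * i * CHUNK, (2 * i + 2) * CHUNK);
      const digest = createHash("sha256").update(pair).digest();
      // ... and is written to chunk slot i, which has already been consumed.
      data.set(digest, i * CHUNK);
    }
    count /= 2;
  }
  output.set(data.subarray(0, CHUNK), offset);
}

// Usage: four chunks filled with 1, 2, 3 and 4, root written at offset 0.
const chunks = new Uint8Array(4 * CHUNK);
for (let i = 0; i < 4; i++) chunks.fill(i + 1, i * CHUNK, (i + 1) * CHUNK);
const out = new Uint8Array(CHUNK);
merkleizeIntoSketch(chunks, out, 0);
```

Writing parents into slots that were already read is what keeps this variant free of intermediate buffers. The cost is that the caller's input is destroyed, so a caller that still needs the chunks would have to copy them first.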