
[DO NOT MERGE] benchmarks for hashing single subtrees recursively #334

Closed
Wants to merge 4 commits

Conversation

@Qyriad (Contributor) commented Oct 16, 2024

This code is a benchmark for hashing a single, depth-8 subtree of a sparse Merkle tree by recursively establishing child hashes, adapted from work in progress code using this method to compute subtrees in parallel.

Raw results, with the number on the left indicating the number of new key-value pairs:

subtree8/32             time:   [72.467 µs 75.138 µs 77.577 µs]
subtree8/128            time:   [76.635 µs 78.540 µs 80.337 µs]
subtree8/512            time:   [104.26 µs 107.99 µs 111.24 µs]
subtree8/1024           time:   [137.81 µs 142.94 µs 148.87 µs]
subtree8/8192           time:   [629.44 µs 708.05 µs 797.79 µs]

The time it takes to hash a subtree increases linearly with the number of key-value pairs being added to the tree as a whole.

@bobbinth (Contributor)

Not a review yet, but looking briefly through the code I think I had something way simpler in mind. What I think we need as a basic building block of the algorithm is constructing a logical SMT (i.e., not our specific Smt) of depth 8 from a given set of nodes. This can probably be just a single function that looks something like this:

/// Builds a set of nodes for a Merkle tree of depth 8 from the specified set of leaves. The
/// nodes are appended to the `inner_nodes` map. The leaves are assumed to be located at the
/// specified depth.
pub fn build_subtree(
    leaves: impl IntoIterator<Item = (u64, Digest)>,
    leaf_depth: u8,
    inner_nodes: &mut BTreeMap<NodeIndex, InnerNode>,
)

Once we have this working (and assuming it is efficient), we can use it to build various levels of the actual SMTs. For example, for our Smt, the process could look like so:

  1. Compute and sort all the leaves (we should be able to do most of this in parallel).
  2. Call build_subtree() for each set of leaves forming a depth 8 subtree (here, leaf_depth = 64).
  3. Use the results of the previous step to get a new set of leaves and call build_subtree() on them again (now, leaf_depth = 48).
  4. Use the results of the previous step to get a new set of leaves and call build_subtree() on them again (now, leaf_depth = 32).
  5. Use the results of the previous step to get a new set of leaves and call build_subtree() on them again (now, leaf_depth = 16).
  6. Use the results of the previous step to get a new set of leaves and call build_subtree() on them again (now, leaf_depth = 8).
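
The layered process above can be sketched with simplified stand-in types: a toy `u64` Digest and a placeholder `merge` in place of the real RPO hash (none of this is the actual miden-crypto API), and the per-subtree calls collapsed into one pass per 8-level chunk for brevity:

```rust
use std::collections::BTreeMap;

// Toy stand-in for the crate's Digest type; the real tree hashes with RPO.
type Digest = u64;
const EMPTY: Digest = 0;

// Placeholder two-to-one hash (an assumption, not the real merge function).
fn merge(left: Digest, right: Digest) -> Digest {
    left.wrapping_mul(31).wrapping_add(right).wrapping_add(1)
}

/// Hashes a sparse set of (index, digest) nodes up by `levels` levels,
/// returning the surviving nodes `levels` closer to the root.
fn build_subtree_layer(nodes: &BTreeMap<u64, Digest>, levels: u32) -> BTreeMap<u64, Digest> {
    let mut current = nodes.clone();
    for _ in 0..levels {
        let mut next = BTreeMap::new();
        for (&idx, &hash) in &current {
            // A missing sibling stands in for an empty-subtree digest.
            let sibling = *current.get(&(idx ^ 1)).unwrap_or(&EMPTY);
            let (l, r) = if idx % 2 == 0 { (hash, sibling) } else { (sibling, hash) };
            // entry() avoids hashing the same parent twice when both
            // children are present.
            next.entry(idx / 2).or_insert_with(|| merge(l, r));
        }
        current = next;
    }
    current
}

fn main() {
    // Leaves at depth 64, hashed up in chunks of 8 levels (64 -> 48 -> ... -> 0).
    let mut nodes: BTreeMap<u64, Digest> = BTreeMap::from([(3u64, 10), (1 << 40, 20)]);
    let mut depth = 64;
    while depth > 0 {
        nodes = build_subtree_layer(&nodes, 8);
        depth -= 8;
    }
    assert_eq!(nodes.len(), 1); // a single root remains at index 0
}
```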

@bobbinth (Contributor)

/// Builds a set of nodes for a Merkle tree of depth 8 from the specified set of leaves. The
/// nodes are appended to the `inner_nodes` map. The leaves are assumed to be located at the
/// specified depth.
pub fn build_subtree(
    leaves: impl IntoIterator<Item = (u64, Digest)>,
    leaf_depth: u8,
    inner_nodes: &mut BTreeMap<NodeIndex, InnerNode>,
)

Actually, this may not be very parallelizable since BTreeMap cannot be mutated in parallel. An alternative could look something like this:

pub fn build_subtree(
    leaves: impl IntoIterator<Item = (u64, Digest)>,
    leaf_depth: u8,
) -> BTreeMap<NodeIndex, InnerNode>

And then we can merge the `BTreeMap`s on a single thread (assuming this is a relatively fast process).
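
A minimal sketch of that pattern using std threads, with toy stand-ins for the crate's `NodeIndex`/`InnerNode` types and a placeholder `build_subtree` body (assumptions for illustration only):

```rust
use std::collections::BTreeMap;
use std::thread;

// Toy stand-ins; the real code would use NodeIndex/InnerNode from the crate.
type NodeIndex = u64;
type InnerNode = u64;

// Placeholder for the proposed build_subtree: returning an owned map means
// no shared mutable state is needed across threads.
fn build_subtree(leaves: Vec<(u64, u64)>, _leaf_depth: u8) -> BTreeMap<NodeIndex, InnerNode> {
    leaves.into_iter().collect()
}

fn main() {
    // One chunk of leaves per depth-8 subtree.
    let chunks = vec![vec![(0u64, 1u64), (1, 2)], vec![(256, 3)]];

    // Build each subtree in parallel on its own thread.
    let handles: Vec<_> = chunks
        .into_iter()
        .map(|chunk| thread::spawn(move || build_subtree(chunk, 64)))
        .collect();

    // Merge the per-subtree maps on a single thread. Key ranges of
    // distinct subtrees are disjoint, so extend() never overwrites.
    let mut inner_nodes: BTreeMap<NodeIndex, InnerNode> = BTreeMap::new();
    for handle in handles {
        inner_nodes.extend(handle.join().unwrap());
    }
    assert_eq!(inner_nodes.len(), 3);
}
```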

@Qyriad (Contributor, author) commented Oct 24, 2024

Alright, I've pushed a simpler implementation much closer to what you suggested, along with micro-benchmarks for it. This implementation takes the leaves pre-sorted, since presumably we'll want to sort only once at the beginning. The benchmarks don't include the sort time, though I can easily change that. The benchmarks look like this:

subtree8-even/64        time:   [2.6322 ms 2.6329 ms 2.6338 ms]
Found 4 outliers among 60 measurements (6.67%)
  2 (3.33%) low mild
  2 (3.33%) high mild
subtree8-even/128       time:   [5.1247 ms 5.1529 ms 5.1772 ms]
Found 10 outliers among 60 measurements (16.67%)
  10 (16.67%) high severe
subtree8-even/192       time:   [7.8698 ms 7.8721 ms 7.8750 ms]
subtree8-even/256       time:   [10.500 ms 10.504 ms 10.507 ms]

subtree8-rand/64        time:   [1.0604 ms 1.0683 ms 1.0765 ms]
Found 9 outliers among 60 measurements (15.00%)
  6 (10.00%) low severe
  2 (3.33%) high mild
  1 (1.67%) high severe
subtree8-rand/128       time:   [2.1299 ms 2.1355 ms 2.1397 ms]
subtree8-rand/192       time:   [3.4609 ms 3.4727 ms 3.4815 ms]
subtree8-rand/256       time:   [5.0838 ms 5.0890 ms 5.0942 ms]
Found 2 outliers among 60 measurements (3.33%)
  1 (1.67%) low mild
  1 (1.67%) high mild

There always seem to be several outliers, no matter how quiet I make my system. Here's the output without the outlier diagnostic noise, for easier reading:

subtree8-even/64        time:   [2.6322 ms 2.6329 ms 2.6338 ms]
subtree8-even/128       time:   [5.1247 ms 5.1529 ms 5.1772 ms]
subtree8-even/192       time:   [7.8698 ms 7.8721 ms 7.8750 ms]
subtree8-even/256       time:   [10.500 ms 10.504 ms 10.507 ms]

subtree8-rand/64        time:   [1.0604 ms 1.0683 ms 1.0765 ms]
subtree8-rand/128       time:   [2.1299 ms 2.1355 ms 2.1397 ms]
subtree8-rand/192       time:   [3.4609 ms 3.4727 ms 3.4815 ms]
subtree8-rand/256       time:   [5.0838 ms 5.0890 ms 5.0942 ms]

It also turns out that I did the math incorrectly for the roughly-evenly-distributed leaves in the benchmarks this PR had originally. I made the same mistake in this new benchmark at first, and was astonished to see the performance jump from microsecond figures to millisecond figures when going from supposedly evenly distributed data to random data. I was accidentally generating far too many leaves with the same index, which were then getting de-duplicated. After fixing that, the even benchmarks are in the same order of magnitude as the random ones.
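
For illustration, one way to generate n roughly evenly spaced keys without index collisions is to step through the key space with a fixed stride rather than rounding random values to a shared index (a hypothetical sketch, not the benchmark's actual generator):

```rust
// Hypothetical sketch: spacing keys by a fixed stride guarantees the
// generated indices are distinct, so nothing gets de-duplicated.
fn even_keys(n: u64) -> Vec<u64> {
    let stride = u64::MAX / n;
    (0..n).map(|i| i * stride).collect()
}

fn main() {
    let keys = even_keys(256);
    let mut dedup = keys.clone();
    dedup.dedup(); // keys are already sorted and strictly increasing
    assert_eq!(dedup.len(), 256); // no duplicates to remove
}
```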

@bobbinth (Contributor)

Thank you! A couple of follow-up questions:

  1. How much time does it take to build a tree for a single leaf? The reason I'm asking is that the vast majority of the time we'd be building trees that have just one leaf in them. For example, assuming the leaves are randomly distributed, if we have 100M leaves, the subtrees up until depth 24 are very likely to be just single-leaf trees.

  2. How does the timing for building a tree from 256 leaves compare to the timing for building a fully balanced MerkleTree with 256 leaves? I'm curious because the fully-balanced case should give us the lower bound on performance, as most of the time there should be spent hashing.
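
As a back-of-the-envelope check on that lower bound: a fully populated binary tree over n leaves performs n - 1 two-to-one hashes no matter how the levels are chunked, so (assuming the 256-leaf depth-8 subtree is fully populated) any gap between the two timings reflects bookkeeping overhead rather than extra hashing. A trivial sketch:

```rust
// A full binary tree with n leaves has n - 1 internal nodes, i.e.
// n - 1 two-to-one hash invocations, regardless of chunking.
fn merge_count(leaves: u64) -> u64 {
    leaves - 1
}

fn main() {
    // 256 leaves -> 255 merges, whether built as one balanced MerkleTree
    // or as a single depth-8 subtree (128 + 64 + ... + 2 + 1 parents).
    assert_eq!(merge_count(256), 255);
    assert_eq!((0..8).map(|lvl| 256u64 >> (lvl + 1)).sum::<u64>(), 255);
}
```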

@Qyriad (Contributor, author) commented Oct 25, 2024

Good questions! I'll find out!

@Qyriad (Contributor, author) commented Oct 25, 2024

And here are the results:

balanced-merkle-even    time:   [1.2906 ms 1.2907 ms 1.2909 ms]
balanced-merkle-rand    time:   [1.2834 ms 1.2859 ms 1.2877 ms]

subtree8-even/1         time:   [40.945 µs 40.948 µs 40.951 µs]
subtree8-even/64        time:   [2.6020 ms 2.6023 ms 2.6026 ms]
subtree8-even/128       time:   [5.1623 ms 5.1783 ms 5.1898 ms]
subtree8-even/192       time:   [7.7192 ms 7.7501 ms 7.7729 ms]
subtree8-even/256       time:   [10.127 ms 10.180 ms 10.233 ms]

subtree8-rand/1         time:   [40.610 µs 40.733 µs 40.820 µs]
subtree8-rand/64        time:   [1.0627 ms 1.0637 ms 1.0647 ms]
subtree8-rand/128       time:   [2.1227 ms 2.1256 ms 2.1283 ms]
subtree8-rand/192       time:   [3.4584 ms 3.4629 ms 3.4672 ms]
subtree8-rand/256       time:   [5.0221 ms 5.0341 ms 5.0430 ms]

@bobbinth (Contributor)

41 microseconds for a single-leaf case is pretty good!

A bit surprising though that hashing a fully-balanced 256-leaf tree is about 4x more efficient than building a subtree with 256 leaves (I was thinking it'd be closer to 2x). I think this is fine for now and we can definitely optimize this more in the future (let's create an issue for this).

The next step would be to use this method as a building block for building a full tree.
