-
Notifications
You must be signed in to change notification settings - Fork 45
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implements concurrent Smt::compute_mutations
#365
base: next
Are you sure you want to change the base?
Implements concurrent Smt::compute_mutations
#365
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great to me! I think the logic itself looks good. My comments are mostly about naming, docs and deduplication. I might have to take another look anyway, since I first had to understand how the Smt is implemented in sequential code 😅, so I'll just comment for now.
In general, I think adding comments to code parts that are not easy to understand would improve readability and understandability.
Regarding the approach, please correct me if I have misunderstandings, but my understanding of the approach is the following.
Assuming a tree of depth 64 with subtrees of depth 8 and mutations of just two (for example's sake) leaves at indices 0
and 65536
, compute_mutations
would do this, on a high-level and making some simple assumptions about how rayon assigns threads:
- Compute subtrees that were modified. This happens in
sorted_pairs_to_mutated_leaves
. This would yield two subtrees, covering the column ranges0..256
and65536..65792
. - Then in
build_subtree_mutations
, the subtrees are updated in parallel.- 1st iteration:
- Thread 0: Compute updates for leaves with indices
0..256
at depth 64. Then updates for leaves at depth 63 within this subtree, and so on, until it eventually results in new root at depth 56, column 0. - Thread 1: Compute updates for leaves with indices
65536..65792
at depth 64. Then updates for leaves at depth 63 within this subtree, and so on, until it eventually results in new root at depth 56, column 256 (= 65536 >> 8).
- Thread 0: Compute updates for leaves with indices
- 2nd iteration:
- Thread 0: Compute updates for leaves with indices
0..256
at depth 56 (only root 0 has changed). Eventually this results in a new root at depth 48, column 0. - Thread 1: Compute updates for leaves with indices
256..512
at depth 56 (only root 256 has changed). Eventually this results in a new root at depth 48, column 1.
- Thread 0: Compute updates for leaves with indices
- 3rd iteration:
- Thread 0: Compute updates for leaves with indices
0..256
at depth 48 (only root 0 has changed). Eventually this results in a new root at depth 40, column 0.
- Thread 0: Compute updates for leaves with indices
- More iterations like the 3rd until the root at depth 0 has been reached.
- 1st iteration:
Is this accurate? Would it make sense to add something like this as a doc comment to compute_mutations_subtree
(with corrections if it's inaccurate)?
10M entries tree. batch insertions (10k inserts): batch updates (10k updates): |
Co-authored-by: Philipp Gackstatter <PhilippGackstatter@users.noreply.github.com>
Hey @krushimir, quick question: Is this still Work-In-Progress or can it be marked as ready for review? |
Hi @PhilippGackstatter, I'll push some more changes today and then I'll mark it ready. |
Smt::compute_mutations
Smt::compute_mutations
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me!
Co-authored-by: Philipp Gackstatter <PhilippGackstatter@users.noreply.github.com>
9242cff
to
e89daa9
Compare
11ad605
to
17b03a8
Compare
src/merkle/smt/mod.rs
Outdated
|
||
// Collect and sort key-value pairs by their corresponding leaf index | ||
let mut sorted_kv_pairs: Vec<_> = kv_pairs.into_iter().collect(); | ||
sorted_kv_pairs.sort_unstable_by_key(|(key, _)| Self::key_to_leaf_index(key).value()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we use parallel sorting here? par_sort_unstable_by_key
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the tip. I did some benchmarking on EPYC 7662 (128 threads) and M1 Pro (10 threads), and for a batch of 10K elements, the benefits are unclear. Nevertheless, I pushed the change.
Also, I've implemented the same improvement into parallel tree construction - currently
tracked here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good! Thank you! I left a couple of comments inline. The main one is about code organization - i.e., potentially moving the parallel mutation functions to the Smt
struct.
fn compute_mutations_concurrent( | ||
&self, | ||
kv_pairs: impl IntoIterator<Item = (Self::Key, Self::Value)>, | ||
) -> MutationSet<DEPTH, Self::Key, Self::Value> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One thing I'm wondering (and maybe this is not a good idea): should we move all the concurrent mutation logic into Smt
struct? The thinking is that it applies primarily to Smt
use case and while we do support it for SimpleSmt
s with depth of multiples of 8, arguably, such support is more confusing than helpful.
On the other hand, if we do move the logic to the Smt
struct, it will make code comprehension a bit easier. Specifically, compute_mutations_concurrent()
, sorted_pairs_to_mutated_subtree_leaves()
, build_subtree_mutations()
, and fetch_sibling_pair()
would move there. The compute_mutations()
method in this module would look like:
fn compute_mutations(
&self,
kv_pairs: impl IntoIterator<Item = (Self::Key, Self::Value)>,
) -> MutationSet<DEPTH, Self::Key, Self::Value>
where
Self: Sized + Sync,
Self::Key: Send + Sync,
Self::Value: Send + Sync,
{
self.compute_mutations_sequential(kv_pairs)
}
And then in the Smt
struct we'd have the conditional compilation based on the feature flag. We could maybe encapsulate all methods related to concurrency in a separate impl
block - e.g.,:
#[cfg(feature = "concurrent")]
impl Smt {
fn compute_mutations_concurrent(
&self,
kv_pairs: impl IntoIterator<Item = (Self::Key, Self::Value)>,
) -> MutationSet<DEPTH, Self::Key, Self::Value>
where
Self: Sized + Sync,
Self::Key: Send + Sync,
Self::Value: Send + Sync,
{
...
}
fn sorted_pairs_to_mutated_subtree_leaves(
...
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with your suggestion. Moving the concurrent mutation logic to the Smt
struct makes sense if we're okay with the 8-depth limitation for now.
Would you like me to do the same for parallel construction methods, or leave that for another PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If it's simple enough, we could move them in this PR as well.
Quality Gate passedIssues Measures |
This PR introduces a concurrent implementation of
Smt::compute_mutations
, leveraging an approach similar to the existing parallel construction logic.Benchmark results were collected on a 64-core (128-thread) AMD EPYC 7662 processor, with Rayon’s thread pool explicitly limited to the specified thread counts.
For context, construction benchmarks are also included for performance comparison.
1. Construction Benchmark
10k key-value pairs
2. Batched Insertion Benchmark
10k key-value pairs
3. Batched Update Benchmark
10k key-value pairs