[WIP] smt: implement parallel mutation computations #336
Describe your changes
This is a draft providing a cursory implementation of recomputing SMT nodes in parallel, by splitting the tree into depth-8 subtree tasks which each recursively process their nodes.
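As a rough illustration of the depth-8 split described above (not the code in this PR), the sketch below groups modified leaf indices by the root of the subtree that contains them, then runs one task per subtree that recurses over its nodes. Everything here is an assumption made for the example: a tree depth of 64, `u64` leaf indices, the names `subtree_tasks`, `recompute_subtree`, and `process_layer`, and the use of one scoped thread per task. Hashing is replaced by a counter of visited nodes, and a node is treated as needing recomputation simply when its slice of modified leaves is non-empty, which differs from the dirty check the PR actually uses.

```rust
use std::collections::BTreeMap;
use std::thread;

// Assumed tree shape for this sketch only.
const TREE_DEPTH: u8 = 64;
const SUBTREE_DEPTH: u8 = 8;

/// Groups sorted leaf indices by the value of their ancestor at `root_depth`.
/// Each map entry is one independent subtree task.
fn subtree_tasks(sorted_leaves: &[u64], root_depth: u8) -> BTreeMap<u64, Vec<u64>> {
    let shift = u32::from(TREE_DEPTH - root_depth);
    let mut tasks: BTreeMap<u64, Vec<u64>> = BTreeMap::new();
    for &leaf in sorted_leaves {
        // A shift by 64 (root_depth == 0) would overflow, so map it to 0.
        let root = leaf.checked_shr(shift).unwrap_or(0);
        tasks.entry(root).or_default().push(leaf);
    }
    tasks
}

/// Recursively visits every node between `depth` and `bottom_depth` that has
/// a modified leaf beneath it, and returns how many nodes were visited.
/// A real implementation would hash the two child digests at each visit.
fn recompute_subtree(depth: u8, bottom_depth: u8, leaves: &[u64]) -> usize {
    if leaves.is_empty() || depth == bottom_depth {
        return 0;
    }
    // The bit that selects the left (0) or right (1) child at this depth.
    let bit = TREE_DEPTH - depth - 1;
    // The leaves are sorted and share a prefix, so left-child leaves come first.
    let split = leaves.partition_point(|&leaf| (leaf >> bit) & 1 == 0);
    let (left, right) = leaves.split_at(split);
    1 + recompute_subtree(depth + 1, bottom_depth, left)
        + recompute_subtree(depth + 1, bottom_depth, right)
}

/// Runs one scoped thread per subtree rooted at `root_depth` and returns
/// `(subtree_root_value, nodes_visited)` pairs ordered by root value.
fn process_layer(mut leaves: Vec<u64>, root_depth: u8) -> Vec<(u64, usize)> {
    leaves.sort_unstable();
    leaves.dedup();
    let tasks = subtree_tasks(&leaves, root_depth);
    let bottom_depth = root_depth + SUBTREE_DEPTH;
    thread::scope(|s| {
        let handles: Vec<_> = tasks
            .iter()
            .map(|(&root, task_leaves)| {
                s.spawn(move || (root, recompute_subtree(root_depth, bottom_depth, task_leaves)))
            })
            .collect();
        handles.into_iter().map(|h| h.join().unwrap()).collect()
    })
}
```

Calling `process_layer(vec![255, 0, 256], 56)` would produce two tasks for the bottom layer, one for the subtree containing leaves 0 and 255 and one for the subtree containing leaf 256. Spawning a thread per task is only for brevity; with many subtrees a thread pool would be the more likely choice.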
Some benchmark results on my Ryzen 7950X:
The total time increases exponentially as the number of key-value pairs being inserted increases. With some added `eprintln!()`s, we can also see that each individual task's time increases as we move up the tree:

A profile indicates that we're spending almost half our time just determining whether a node needs to be recomputed, in `is_index_dirty()`. I'm not sure how to mitigate this and still compute fixed subtrees. We could walk up from each modified leaf and mark each node in the path as dirty, upfront. However, a similar up-front computation to identify node indices of note was a considerable bottleneck in the previous approach to this parallelization. There may also be some heuristic we could apply to node indices to quickly estimate whether they are ancestors of a modified index, accepting a few duplicate calculations when the estimate fails.
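For concreteness, here is a minimal sketch of the two ideas mentioned above: marking every ancestor of a modified leaf as dirty up front, and testing ancestry directly from the indices. It is not the PR's implementation, and it makes no claim about whether either would be fast enough. The `Node` type, the depth of 64, and the function names `mark_dirty` and `is_ancestor_of_leaf` are all hypothetical. The walk stops as soon as it reaches a node that is already marked, since that node's ancestors were marked by an earlier walk.

```rust
use std::collections::HashSet;

// Assumed tree depth for this sketch only.
const TREE_DEPTH: u8 = 64;

/// Hypothetical node index: `depth` 0 is the root, `TREE_DEPTH` is the leaf level.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
struct Node {
    depth: u8,
    value: u64,
}

impl Node {
    /// Returns the parent index, or `None` for the root.
    fn parent(self) -> Option<Node> {
        if self.depth == 0 {
            None
        } else {
            Some(Node { depth: self.depth - 1, value: self.value >> 1 })
        }
    }
}

/// Walks up from each modified leaf and records every node on the path,
/// including the leaf itself and the root.
fn mark_dirty(modified_leaves: &[u64]) -> HashSet<Node> {
    let mut dirty = HashSet::new();
    for &leaf in modified_leaves {
        let mut current = Some(Node { depth: TREE_DEPTH, value: leaf });
        while let Some(node) = current {
            // `insert` returns false if the node was already marked, in which
            // case the rest of the path to the root is marked as well.
            if !dirty.insert(node) {
                break;
            }
            current = node.parent();
        }
    }
    dirty
}

/// Index-only ancestry test: `node` is an ancestor of `leaf` (or the leaf
/// itself) when the leaf index shifted up to the node's depth equals its value.
fn is_ancestor_of_leaf(node: Node, leaf: u64) -> bool {
    let shift = u32::from(TREE_DEPTH - node.depth);
    // A shift by 64 (the root) would overflow, so map it to 0.
    leaf.checked_shr(shift).unwrap_or(0) == node.value
}
```

With this layout a dirty check becomes a single `HashSet` lookup, at the cost of building the set before any subtree task starts, which is the kind of up-front work that was a bottleneck in the earlier approach.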