feat: Sequential insertion in indexed trees #10111
Conversation
sirasistant
commented
Nov 21, 2024
(edited)
- Adds sequential (no subtree) insertion in indexed trees
- If witnesses are requested, the response includes 2*N witnesses, N for the low leaves and N for the insertions
- If no witness is requested, it directly does sparse_batch_update
- Updating an item multiple times in the same call is allowed
- Uses sequential insertion with witnesses to avoid doing N batch updates of 1 item for the base rollup
- Uses sequential insertion without witnesses for syncing the public data tree, to avoid doing 1 batch insertion per TX in the block
…c-packages into arv/insert_sequential
@@ -27,24 +27,79 @@ const size_t MAX_BATCH_SIZE = 64;
template <typename TreeType> void add_values(TreeType& tree, const std::vector<NullifierLeafValue>& values)
{
    Signal signal(1);
    typename TreeType::AddCompletionCallback completion = [&](const auto&) -> void { signal.signal_level(0); };
    bool success = true;
How does the sequential insertion compare with the batch insertion?
The multi-threaded batch insertion doesn't perform as well as I had hoped. It was much better when we used Pedersen, as a much greater portion of the time was spent hashing.
With witnesses, it's worse. That's expected, since you're fetching more witnesses.
However, without witnesses, it's better single-threaded (yay) but worse multi-threaded. That one is odd, since I expected sparse_batch_update (which is the only thing sequential insertion without witnesses does) to perform better than sparse_batch_update followed by an append. Do you know why this could be the case? Should I profile it?
Here are the benchmark results I got on mainframe (not the best place for benchmarking, but good enough, I hope):
single_thread_indexed_tree_with_witness_bench<Poseidon2, BATCH>/2/iterations:1000 1.84 ms 0.028 ms 1000
single_thread_indexed_tree_with_witness_bench<Poseidon2, BATCH>/4/iterations:1000 2.99 ms 0.031 ms 1000
single_thread_indexed_tree_with_witness_bench<Poseidon2, BATCH>/8/iterations:1000 5.48 ms 0.029 ms 1000
single_thread_indexed_tree_with_witness_bench<Poseidon2, BATCH>/16/iterations:1000 10.2 ms 0.027 ms 1000
single_thread_indexed_tree_with_witness_bench<Poseidon2, BATCH>/32/iterations:1000 20.2 ms 0.034 ms 1000
single_thread_indexed_tree_with_witness_bench<Poseidon2, BATCH>/64/iterations:1000 39.4 ms 0.041 ms 1000
single_thread_indexed_tree_with_witness_bench<Poseidon2, BATCH>/512/iterations:10 303 ms 0.081 ms 10
single_thread_indexed_tree_with_witness_bench<Poseidon2, BATCH>/1024/iterations:10 647 ms 0.093 ms 10
single_thread_indexed_tree_with_witness_bench<Poseidon2, BATCH>/2048/iterations:10 1271 ms 0.125 ms 10
single_thread_indexed_tree_with_witness_bench<Poseidon2, BATCH>/4096/iterations:10 2471 ms 0.237 ms 10
single_thread_indexed_tree_with_witness_bench<Poseidon2, BATCH>/8192/iterations:10 4966 ms 0.388 ms 10
single_thread_indexed_tree_with_witness_bench<Poseidon2, SEQUENTIAL>/2/iterations:1000 2.22 ms 0.018 ms 1000
single_thread_indexed_tree_with_witness_bench<Poseidon2, SEQUENTIAL>/4/iterations:1000 4.92 ms 0.029 ms 1000
single_thread_indexed_tree_with_witness_bench<Poseidon2, SEQUENTIAL>/8/iterations:1000 9.24 ms 0.029 ms 1000
single_thread_indexed_tree_with_witness_bench<Poseidon2, SEQUENTIAL>/16/iterations:1000 18.2 ms 0.028 ms 1000
single_thread_indexed_tree_with_witness_bench<Poseidon2, SEQUENTIAL>/32/iterations:1000 36.2 ms 0.033 ms 1000
single_thread_indexed_tree_with_witness_bench<Poseidon2, SEQUENTIAL>/64/iterations:1000 73.7 ms 0.044 ms 1000
single_thread_indexed_tree_with_witness_bench<Poseidon2, SEQUENTIAL>/512/iterations:10 575 ms 0.076 ms 10
single_thread_indexed_tree_with_witness_bench<Poseidon2, SEQUENTIAL>/1024/iterations:10 1187 ms 0.091 ms 10
single_thread_indexed_tree_with_witness_bench<Poseidon2, SEQUENTIAL>/2048/iterations:10 2406 ms 0.106 ms 10
single_thread_indexed_tree_with_witness_bench<Poseidon2, SEQUENTIAL>/4096/iterations:10 5038 ms 0.148 ms 10
single_thread_indexed_tree_with_witness_bench<Poseidon2, SEQUENTIAL>/8192/iterations:10 11375 ms 0.942 ms 10
multi_thread_indexed_tree_with_witness_bench<Poseidon2, BATCH>/2/iterations:1000 1.75 ms 0.022 ms 1000
multi_thread_indexed_tree_with_witness_bench<Poseidon2, BATCH>/4/iterations:1000 1.69 ms 0.027 ms 1000
multi_thread_indexed_tree_with_witness_bench<Poseidon2, BATCH>/8/iterations:1000 2.32 ms 0.023 ms 1000
multi_thread_indexed_tree_with_witness_bench<Poseidon2, BATCH>/16/iterations:1000 4.16 ms 0.033 ms 1000
multi_thread_indexed_tree_with_witness_bench<Poseidon2, BATCH>/32/iterations:1000 6.99 ms 0.029 ms 1000
multi_thread_indexed_tree_with_witness_bench<Poseidon2, BATCH>/64/iterations:1000 13.8 ms 0.030 ms 1000
multi_thread_indexed_tree_with_witness_bench<Poseidon2, BATCH>/512/iterations:10 82.9 ms 0.057 ms 10
multi_thread_indexed_tree_with_witness_bench<Poseidon2, BATCH>/1024/iterations:10 143 ms 0.083 ms 10
multi_thread_indexed_tree_with_witness_bench<Poseidon2, BATCH>/2048/iterations:10 341 ms 0.115 ms 10
multi_thread_indexed_tree_with_witness_bench<Poseidon2, BATCH>/4096/iterations:10 663 ms 0.174 ms 10
multi_thread_indexed_tree_with_witness_bench<Poseidon2, BATCH>/8192/iterations:10 1489 ms 0.226 ms 10
multi_thread_indexed_tree_with_witness_bench<Poseidon2, SEQUENTIAL>/2/iterations:1000 1.03 ms 0.020 ms 1000
multi_thread_indexed_tree_with_witness_bench<Poseidon2, SEQUENTIAL>/4/iterations:1000 1.43 ms 0.019 ms 1000
multi_thread_indexed_tree_with_witness_bench<Poseidon2, SEQUENTIAL>/8/iterations:1000 2.74 ms 0.020 ms 1000
multi_thread_indexed_tree_with_witness_bench<Poseidon2, SEQUENTIAL>/16/iterations:1000 4.95 ms 0.022 ms 1000
multi_thread_indexed_tree_with_witness_bench<Poseidon2, SEQUENTIAL>/32/iterations:1000 10.0 ms 0.025 ms 1000
multi_thread_indexed_tree_with_witness_bench<Poseidon2, SEQUENTIAL>/64/iterations:1000 20.6 ms 0.024 ms 1000
multi_thread_indexed_tree_with_witness_bench<Poseidon2, SEQUENTIAL>/512/iterations:10 126 ms 0.053 ms 10
multi_thread_indexed_tree_with_witness_bench<Poseidon2, SEQUENTIAL>/1024/iterations:10 279 ms 0.072 ms 10
multi_thread_indexed_tree_with_witness_bench<Poseidon2, SEQUENTIAL>/2048/iterations:10 658 ms 0.104 ms 10
multi_thread_indexed_tree_with_witness_bench<Poseidon2, SEQUENTIAL>/4096/iterations:10 1641 ms 0.213 ms 10
multi_thread_indexed_tree_with_witness_bench<Poseidon2, SEQUENTIAL>/8192/iterations:10 4417 ms 0.286 ms 10
single_thread_indexed_tree_bench<Poseidon2, BATCH>/2/iterations:1000 1.32 ms 0.019 ms 1000
single_thread_indexed_tree_bench<Poseidon2, BATCH>/4/iterations:1000 1.80 ms 0.023 ms 1000
single_thread_indexed_tree_bench<Poseidon2, BATCH>/8/iterations:1000 2.54 ms 0.025 ms 1000
single_thread_indexed_tree_bench<Poseidon2, BATCH>/16/iterations:1000 4.04 ms 0.025 ms 1000
single_thread_indexed_tree_bench<Poseidon2, BATCH>/32/iterations:1000 7.11 ms 0.028 ms 1000
single_thread_indexed_tree_bench<Poseidon2, BATCH>/64/iterations:1000 13.4 ms 0.037 ms 1000
single_thread_indexed_tree_bench<Poseidon2, BATCH>/512/iterations:10 64.9 ms 0.077 ms 10
single_thread_indexed_tree_bench<Poseidon2, BATCH>/1024/iterations:10 120 ms 0.099 ms 10
single_thread_indexed_tree_bench<Poseidon2, BATCH>/2048/iterations:10 211 ms 0.089 ms 10
single_thread_indexed_tree_bench<Poseidon2, BATCH>/4096/iterations:10 389 ms 0.159 ms 10
single_thread_indexed_tree_bench<Poseidon2, BATCH>/8192/iterations:10 777 ms 0.696 ms 10
single_thread_indexed_tree_bench<Poseidon2, SEQUENTIAL>/2/iterations:1000 1.01 ms 0.018 ms 1000
single_thread_indexed_tree_bench<Poseidon2, SEQUENTIAL>/4/iterations:1000 1.47 ms 0.019 ms 1000
single_thread_indexed_tree_bench<Poseidon2, SEQUENTIAL>/8/iterations:1000 2.20 ms 0.024 ms 1000
single_thread_indexed_tree_bench<Poseidon2, SEQUENTIAL>/16/iterations:1000 3.84 ms 0.023 ms 1000
single_thread_indexed_tree_bench<Poseidon2, SEQUENTIAL>/32/iterations:1000 6.64 ms 0.030 ms 1000
single_thread_indexed_tree_bench<Poseidon2, SEQUENTIAL>/64/iterations:1000 12.5 ms 0.032 ms 1000
single_thread_indexed_tree_bench<Poseidon2, SEQUENTIAL>/512/iterations:10 63.7 ms 0.074 ms 10
single_thread_indexed_tree_bench<Poseidon2, SEQUENTIAL>/1024/iterations:10 112 ms 0.079 ms 10
single_thread_indexed_tree_bench<Poseidon2, SEQUENTIAL>/2048/iterations:10 205 ms 0.108 ms 10
single_thread_indexed_tree_bench<Poseidon2, SEQUENTIAL>/4096/iterations:10 374 ms 0.152 ms 10
single_thread_indexed_tree_bench<Poseidon2, SEQUENTIAL>/8192/iterations:10 730 ms 0.402 ms 10
multi_thread_indexed_tree_bench<Poseidon2, BATCH>/2/iterations:1000 1.43 ms 0.026 ms 1000
multi_thread_indexed_tree_bench<Poseidon2, BATCH>/4/iterations:1000 1.68 ms 0.033 ms 1000
multi_thread_indexed_tree_bench<Poseidon2, BATCH>/8/iterations:1000 1.98 ms 0.028 ms 1000
multi_thread_indexed_tree_bench<Poseidon2, BATCH>/16/iterations:1000 2.55 ms 0.029 ms 1000
multi_thread_indexed_tree_bench<Poseidon2, BATCH>/32/iterations:1000 3.75 ms 0.034 ms 1000
multi_thread_indexed_tree_bench<Poseidon2, BATCH>/64/iterations:1000 6.05 ms 0.034 ms 1000
multi_thread_indexed_tree_bench<Poseidon2, BATCH>/512/iterations:100 31.7 ms 0.060 ms 100
multi_thread_indexed_tree_bench<Poseidon2, BATCH>/1024/iterations:100 71.2 ms 0.095 ms 100
multi_thread_indexed_tree_bench<Poseidon2, BATCH>/2048/iterations:100 124 ms 0.106 ms 100
multi_thread_indexed_tree_bench<Poseidon2, BATCH>/4096/iterations:100 248 ms 0.153 ms 100
multi_thread_indexed_tree_bench<Poseidon2, BATCH>/8192/iterations:100 594 ms 0.245 ms 100
multi_thread_indexed_tree_bench<Poseidon2, SEQUENTIAL>/2/iterations:1000 0.859 ms 0.020 ms 1000
multi_thread_indexed_tree_bench<Poseidon2, SEQUENTIAL>/4/iterations:1000 0.987 ms 0.022 ms 1000
multi_thread_indexed_tree_bench<Poseidon2, SEQUENTIAL>/8/iterations:1000 1.30 ms 0.024 ms 1000
multi_thread_indexed_tree_bench<Poseidon2, SEQUENTIAL>/16/iterations:1000 2.46 ms 0.034 ms 1000
multi_thread_indexed_tree_bench<Poseidon2, SEQUENTIAL>/32/iterations:1000 3.68 ms 0.040 ms 1000
multi_thread_indexed_tree_bench<Poseidon2, SEQUENTIAL>/64/iterations:1000 5.84 ms 0.031 ms 1000
multi_thread_indexed_tree_bench<Poseidon2, SEQUENTIAL>/512/iterations:100 40.1 ms 0.069 ms 100
multi_thread_indexed_tree_bench<Poseidon2, SEQUENTIAL>/1024/iterations:100 62.7 ms 0.086 ms 100
multi_thread_indexed_tree_bench<Poseidon2, SEQUENTIAL>/2048/iterations:100 139 ms 0.115 ms 100
multi_thread_indexed_tree_bench<Poseidon2, SEQUENTIAL>/4096/iterations:100 246 ms 0.153 ms 100
multi_thread_indexed_tree_bench<Poseidon2, SEQUENTIAL>/8192/iterations:100 808 ms 0.301 ms 100
As discussed. Looks like the multi-threaded version performs a little better in most scenarios. I think we should wait until the complete use case of the world state has settled further and then optimise.
@@ -94,10 +157,14 @@ template <typename TreeType> void single_thread_indexed_tree_bench(State& state)
     const size_t initial_size = 1024 * 16;
     std::vector<NullifierLeafValue> initial_batch(initial_size);
-    for (size_t i = 0; i < batch_size; ++i) {
+    for (size_t i = 0; i < initial_size; ++i) {
That looks like a bug, nice catch.
// block inside this scope in order to keep current_batch/completion alive until the end of all operations
signal.wait_for_level();
PublicDataTree::AddCompletionCallback completion = [&](const auto&) -> void { signal.signal_decrement(); };
wrapper.tree->add_or_update_values_sequentially(public_writes, completion);
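The pattern in this hunk — each asynchronous tree operation firing a completion callback that decrements a shared signal, with the caller blocking until the count hits zero — can be sketched with standard C++ primitives. `MiniSignal`, `fake_async_add`, and `run_batches` below are hypothetical stand-ins, not the barretenberg `Signal` or tree API:

```cpp
#include <condition_variable>
#include <functional>
#include <mutex>
#include <thread>

// Hypothetical analogue of barretenberg's Signal: initialise with the number
// of in-flight operations; each completion decrements; the caller waits for 0.
class MiniSignal {
    std::mutex m_;
    std::condition_variable cv_;
    int count_;

  public:
    explicit MiniSignal(int count) : count_(count) {}
    void signal_decrement() {
        std::lock_guard<std::mutex> lock(m_);
        if (--count_ == 0) {
            cv_.notify_all();
        }
    }
    void wait_for_zero() {
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [this] { return count_ == 0; });
    }
};

// Stand-in for an async tree call such as add_or_update_values_sequentially:
// it runs the completion callback on another thread.
void fake_async_add(const std::function<void()>& completion) {
    std::thread(completion).detach();
}

int run_batches(int num_batches) {
    MiniSignal signal(num_batches);
    for (int i = 0; i < num_batches; ++i) {
        // The callback (and anything it captures) must outlive the async call,
        // which is why we wait below before signal goes out of scope.
        fake_async_add([&signal] { signal.signal_decrement(); });
    }
    signal.wait_for_zero(); // block until every batch has signalled completion
    return num_batches;
}
```

This is also why the original code keeps `completion` alive inside the enclosing scope until `wait_for_level()` returns: the callback must not be destroyed while operations are still in flight.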
Nice, so we don't have to do the sync in batches.
Force-pushed from 8a442fd to f4586ca
This should be ready for review now
@@ -1144,7 +1195,7 @@ void ContentAddressedIndexedTree<Store, HashingPolicy>::update_leaf_and_hash_to_
     // 3. Write the new node value
     index_t index = leaf_index;
     uint32_t level = depth_;
-    fr new_hash = HashingPolicy::hash(leaf.get_hash_inputs());
+    fr new_hash = leaf.value.is_empty() ? 0 : HashingPolicy::hash(leaf.get_hash_inputs());
Should we use fr::zero() instead of 0?
I think there was recently a group of PRs changing fr::zero() to 0. The rationale is that the fr constructor is constexpr, so the cost is the same (it's resolved at compile time), and any readability difference is subjective. I have no preference personally.
Me neither, but since fr::zero() is used in other places in this file, I don't mind making it consistent.
Just the one comment
🤖 I have created a release *beep* *boop*

---

<details><summary>aztec-package: 0.65.2</summary>

## [0.65.2](aztec-package-v0.65.1...aztec-package-v0.65.2) (2024-11-28)

### Features

* New proving broker ([#10174](#10174)) ([6fd5fc1](6fd5fc1))
</details>

<details><summary>barretenberg.js: 0.65.2</summary>

## [0.65.2](barretenberg.js-v0.65.1...barretenberg.js-v0.65.2) (2024-11-28)

### Miscellaneous

* **barretenberg.js:** Synchronize aztec-packages versions
</details>

<details><summary>aztec-packages: 0.65.2</summary>

## [0.65.2](aztec-packages-v0.65.1...aztec-packages-v0.65.2) (2024-11-28)

### Features

* Fee foresight support ([#10262](#10262)) ([9e19244](9e19244))
* New proving broker ([#10174](#10174)) ([6fd5fc1](6fd5fc1))
* Sequential insertion in indexed trees ([#10111](#10111)) ([bfd9fa6](bfd9fa6))
* Swap polys to facilitate dynamic trace overflow ([#9976](#9976)) ([b7b282c](b7b282c))

### Bug Fixes

* Don't store indices of zero leaves. ([#10270](#10270)) ([c22be8b](c22be8b))
* Expect proper duplicate nullifier error patterns in e2e tests ([#10256](#10256)) ([4ee8344](4ee8344))

### Miscellaneous

* Check artifact consistency ([#10271](#10271)) ([6a49405](6a49405))
* Dont import things that themselves import jest in imported functions ([#10260](#10260)) ([9440c1c](9440c1c))
* Fix bad merge in integration l1 publisher ([#10272](#10272)) ([b5a6aa4](b5a6aa4))
* Fixing sol warnings ([#10276](#10276)) ([3d113b2](3d113b2))
* Pull out sync changes ([#10274](#10274)) ([391a6b7](391a6b7))
* Pull value merger code from sync ([#10080](#10080)) ([3392629](3392629))
* Remove default gas settings ([#10163](#10163)) ([c9a4d88](c9a4d88))
* Replace relative paths to noir-protocol-circuits ([654d801](654d801))
* Teardown context in prover coordination test ([#10257](#10257)) ([7ea3888](7ea3888))
</details>

<details><summary>barretenberg: 0.65.2</summary>

## [0.65.2](barretenberg-v0.65.1...barretenberg-v0.65.2) (2024-11-28)

### Features

* Sequential insertion in indexed trees ([#10111](#10111)) ([bfd9fa6](bfd9fa6))
* Swap polys to facilitate dynamic trace overflow ([#9976](#9976)) ([b7b282c](b7b282c))

### Bug Fixes

* Don't store indices of zero leaves. ([#10270](#10270)) ([c22be8b](c22be8b))

### Miscellaneous

* Pull value merger code from sync ([#10080](#10080)) ([3392629](3392629))
</details>

---

This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).