feat(avm): calldatacopy gadget #7367

Closed
wants to merge 7 commits into from

Conversation

jeanmon
Contributor

@jeanmon jeanmon commented Jul 5, 2024

Resolves #7211

@jeanmon jeanmon force-pushed the jm/7211-calldatacopy-gadget branch from bb81850 to 3d84fae Compare July 5, 2024 15:44
@jeanmon jeanmon marked this pull request as ready for review July 5, 2024 15:45
@AztecBot
Collaborator

AztecBot commented Jul 5, 2024

Benchmark results

Metrics with a significant change:

  • proof_construction_time_sha256_100_ms (16): 6,473 (+19%)
  • avm_simulation_time_ms (Token:transfer_public): 27.8 (-41%)
  • avm_simulation_time_ms (Token:_increase_public_balance): 14.9 (-77%)
  • avm_simulation_time_ms (FPC:pay_refund_with_shielded_rebate): 163 (+46%)
Detailed results

All benchmarks are run on txs on the Benchmarking contract on the repository. Each tx consists of a batch call to create_note and increment_balance, which guarantees that each tx has a private call, a nested private call, a public call, and a nested public call, as well as an emitted private note, an unencrypted log, and public storage read and write.

This benchmark source data is available in JSON format on S3 here.

Proof generation

Each column represents the number of threads used in proof generation.

| Metric | 1 threads | 4 threads | 16 threads | 32 threads | 64 threads |
| - | - | - | - | - | - |
| proof_construction_time_sha256_ms | 5,709 (-1%) | 1,546 (-1%) | 704 | 752 | 764 (-1%) |
| proof_construction_time_sha256_30_ms | 11,675 (-1%) | 3,141 (-1%) | 1,407 | 1,437 (-1%) | 1,457 (-1%) |
| proof_construction_time_sha256_100_ms | 43,667 (-1%) | 11,762 | ⚠️ 6,473 (+19%) | 5,402 | 5,352 (-1%) |
| proof_construction_time_poseidon_hash_ms | 78.0 | 34.0 | 34.0 | 59.0 (+2%) | 87.0 (-1%) |
| proof_construction_time_poseidon_hash_30_ms | 1,515 (-1%) | 416 | 201 (+1%) | 223 (-1%) | 266 (-1%) |
| proof_construction_time_poseidon_hash_100_ms | 5,614 (-1%) | 1,515 (-1%) | 676 | 734 | 741 (-1%) |

L2 block published to L1

Each column represents the number of txs on an L2 block published to L1.

| Metric | 4 txs | 8 txs | 16 txs |
| - | - | - | - |
| l1_rollup_calldata_size_in_bytes | 1,412 | 1,412 | 1,412 |
| l1_rollup_calldata_gas | 9,476 | 9,474 | 9,464 |
| l1_rollup_execution_gas | 613,504 | 613,653 | 613,794 |
| l2_block_processing_time_in_ms | 755 (+1%) | 1,418 | 2,719 (+1%) |
| l2_block_building_time_in_ms | 13,071 (-1%) | 26,537 (+3%) | 50,530 |
| l2_block_rollup_simulation_time_in_ms | 13,071 (-1%) | 26,537 (+3%) | 50,530 |
| l2_block_public_tx_process_time_in_ms | 11,012 (-1%) | 23,939 (+2%) | 48,285 |

L2 chain processing

Each column represents the number of blocks on the L2 chain where each block has 8 txs.

| Metric | 3 blocks | 5 blocks |
| - | - | - |
| node_history_sync_time_in_ms | 7,026 (-1%) | 9,951 |
| node_database_size_in_bytes | 12,439,632 | 16,207,952 |
| pxe_database_size_in_bytes | 16,254 | 26,813 |

Circuits stats

Stats on running time and I/O sizes collected for every kernel circuit run across all benchmarks.

| Circuit | simulation_time_in_ms | witness_generation_time_in_ms | proving_time_in_ms | input_size_in_bytes | output_size_in_bytes | proof_size_in_bytes | num_public_inputs | size_in_gates |
| - | - | - | - | - | - | - | - | - |
| private-kernel-init | 110 (+1%) | 426 (-2%) | 13,939 | 20,002 | 55,022 | 76,256 | 2,316 | 524,288 |
| private-kernel-inner | 227 | 766 | 46,440 (-1%) | 82,134 | 55,022 | 76,256 | 2,316 | 2,097,152 |
| private-kernel-tail | 1,087 (+2%) | 2,620 (+3%) | 45,000 (+1%) | 62,409 | 62,089 | 14,944 | 400 | 2,097,152 |
| base-parity | 6.55 (+2%) | 897 | 2,424 (-2%) | 160 | 96.0 | 2,240 | 3.00 | 131,072 |
| root-parity | 74.1 (+4%) | 62.5 (+2%) | 34,071 (+1%) | 27,868 | 96.0 | 2,752 | 19.0 | 2,097,152 |
| base-rollup | 4,208 (+1%) | 5,051 (-1%) | 78,504 | 172,028 | 632 | 3,552 | 44.0 | 4,194,304 |
| root-rollup | 128 (-1%) | 74.5 (-3%) | 19,889 (+2%) | 25,053 | 652 | 3,488 | 42.0 | 1,048,576 |
| public-kernel-setup | 194 (+1%) | 2,337 | 36,582 | 103,911 | 80,310 | 108,992 | 3,339 | 2,097,152 |
| public-kernel-app-logic | 152 (+2%) | 3,302 (-1%) | 37,643 | 103,911 | 80,310 | 108,992 | 3,339 | 2,097,152 |
| public-kernel-tail | 914 (+2%) | 22,625 (-5%) | 155,964 | 400,808 | 10,046 | 14,944 | 400 | 8,388,608 |
| private-kernel-reset-small | 300 (+2%) | 1,206 (+2%) | 27,610 (-2%) | 79,273 | 55,022 | 76,256 | 2,316 | 1,048,576 |
| public-kernel-teardown | 138 (+1%) | 3,311 (+1%) | 36,866 (+1%) | 103,911 | 80,310 | 108,992 | 3,339 | 2,097,152 |
| merge-rollup | 40.2 | N/A | N/A | 16,094 | 632 | N/A | N/A | N/A |
| private-kernel-tail-to-public | N/A | 8,777 (+1%) | 48,713 | N/A | N/A | 108,992 | 3,339 | 2,097,152 |

Stats on running time collected for app circuits

| Function | input_size_in_bytes | output_size_in_bytes | witness_generation_time_in_ms | proof_size_in_bytes | proving_time_in_ms | size_in_gates | num_public_inputs |
| - | - | - | - | - | - | - | - |
| ContractClassRegisterer:register | 1,312 | 9,344 | 413 (+4%) | N/A | N/A | N/A | N/A |
| ContractInstanceDeployer:deploy | 1,376 | 9,344 | 26.1 (+2%) | N/A | N/A | N/A | N/A |
| MultiCallEntrypoint:entrypoint | 1,888 | 9,344 | 661 (+5%) | N/A | N/A | N/A | N/A |
| GasToken:deploy | 1,344 | 9,344 | 567 | N/A | N/A | N/A | N/A |
| SchnorrAccount:constructor | 1,280 | 9,344 | 437 | N/A | N/A | N/A | N/A |
| SchnorrAccount:entrypoint | 2,272 | 9,344 | 777 | 16,352 | 4,671 (-7%) | 131,072 | 444 |
| Token:privately_mint_private_note | 1,248 | 9,344 | 522 (-1%) | N/A | N/A | N/A | N/A |
| FPC:fee_entrypoint_public | 1,312 | 9,344 | 105 (-4%) | 16,352 | 1,874 (-1%) | 65,536 | 444 |
| Token:transfer | 1,280 | 9,344 | 1,612 | 16,352 | 10,686 | 524,288 | 444 |
| AuthRegistry:set_authorized (avm) | 19,222 | N/A | N/A | 94,048 (+3%) | 1,813 (+4%) | N/A | N/A |
| FPC:prepare_fee (avm) | 26,664 | N/A | N/A | 94,112 (+3%) | 3,361 (+13%) | N/A | N/A |
| Token:transfer_public (avm) | 42,914 | N/A | N/A | 94,112 (+3%) | 4,063 (-5%) | N/A | N/A |
| AuthRegistry:consume (avm) | 33,100 | N/A | N/A | 94,048 (+3%) | 3,185 (+6%) | N/A | N/A |
| FPC:pay_refund (avm) | 36,829 | N/A | N/A | 94,080 (+3%) | 19,391 (+8%) | N/A | N/A |
| Benchmarking:create_note | 1,312 | 9,344 | 424 | N/A | N/A | N/A | N/A |
| SchnorrAccount:verify_private_authwit | 1,248 | 9,344 | 41.5 | N/A | N/A | N/A | N/A |
| Token:unshield | 1,344 | 9,344 | 1,328 (+1%) | N/A | N/A | N/A | N/A |
| FPC:fee_entrypoint_private | 1,344 | 9,344 | 1,699 (+1%) | N/A | N/A | N/A | N/A |

AVM Simulation

Time to simulate various public functions in the AVM.

| Function | time_ms | bytecode_size_in_bytes |
| - | - | - |
| GasToken:_increase_public_balance | 78.8 (+1%) | 13,790 |
| GasToken:set_portal | 13.1 (+16%) | 3,339 |
| Token:constructor | 102 | 23,692 |
| FPC:constructor | 73.3 (+2%) | 13,592 |
| GasToken:mint_public | 62.5 (-6%) | 10,158 |
| Token:mint_public | 342 (-6%) | 19,034 |
| Token:assert_minter_and_mint | 40.8 (+1%) | 12,925 |
| AuthRegistry:set_authorized | 35.1 (-12%) | 7,812 |
| FPC:prepare_fee | 99.5 (+1%) | 15,062 |
| Token:transfer_public | ⚠️ 27.8 (-41%) | 31,218 |
| FPC:pay_refund | 130 (+1%) | 25,260 |
| Benchmarking:increment_balance | 1,335 | 15,267 |
| Token:_increase_public_balance | ⚠️ 14.9 (-77%) | 15,006 |
| FPC:pay_refund_with_shielded_rebate | ⚠️ 163 (+46%) | 26,347 |

Public DB Access

Time to access various public DBs.

| Function | time_ms |
| - | - |
| get-nullifier-index | 0.148 (-3%) |

Tree insertion stats

The duration to insert a fixed batch of leaves into each tree type.

| Metric | 1 leaves | 16 leaves | 64 leaves | 128 leaves | 256 leaves | 512 leaves | 1024 leaves |
| - | - | - | - | - | - | - | - |
| batch_insert_into_append_only_tree_16_depth_ms | 10.4 | 16.8 | N/A | N/A | N/A | N/A | N/A |
| batch_insert_into_append_only_tree_16_depth_hash_count | 16.8 | 31.7 | N/A | N/A | N/A | N/A | N/A |
| batch_insert_into_append_only_tree_16_depth_hash_ms | 0.601 | 0.517 | N/A | N/A | N/A | N/A | N/A |
| batch_insert_into_append_only_tree_32_depth_ms | N/A | N/A | 48.8 (+1%) | 76.4 (+1%) | 131 | 245 (-1%) | 472 (+1%) |
| batch_insert_into_append_only_tree_32_depth_hash_count | N/A | N/A | 95.9 | 159 | 287 | 543 | 1,055 |
| batch_insert_into_append_only_tree_32_depth_hash_ms | N/A | N/A | 0.499 (+1%) | 0.470 (+1%) | 0.450 | 0.444 | 0.440 |
| batch_insert_into_indexed_tree_20_depth_ms | N/A | N/A | 60.4 (+1%) | 111 (-1%) | 182 | 353 (-1%) | 691 |
| batch_insert_into_indexed_tree_20_depth_hash_count | N/A | N/A | 109 | 207 | 355 | 691 | 1,363 |
| batch_insert_into_indexed_tree_20_depth_hash_ms | N/A | N/A | 0.511 (+1%) | 0.499 (-1%) | 0.483 | 0.479 | 0.474 |
| batch_insert_into_indexed_tree_40_depth_ms | N/A | N/A | 73.5 | N/A | N/A | N/A | N/A |
| batch_insert_into_indexed_tree_40_depth_hash_count | N/A | N/A | 133 | N/A | N/A | N/A | N/A |
| batch_insert_into_indexed_tree_40_depth_hash_ms | N/A | N/A | 0.523 | N/A | N/A | N/A | N/A |

Miscellaneous

Transaction sizes based on how many contract classes are registered in the tx.

| Metric | 0 registered classes | 1 registered classes |
| - | - | - |
| tx_size_in_bytes | 74,105 | 667,868 |

Transaction size based on fee payment method

| Metric | |
| - | |

Contributor

@fcarreiro fcarreiro left a comment

LGTM modulo some naming/conceptual changes.

Otherwise, I'd really like to get my stack of PRs in first if possible (since they are all green): https://app.graphite.dev/github/pr/AztecProtocol/aztec-packages/7357/chore-avm-smaller-transcript

struct SliceTraceEntry {
uint32_t clk = 0;
uint8_t space_id = 0;
FF addr_ff = 0; // Should normally be uint32_t but the last witness addr of a calldatacopy operation row might
Contributor

// Should normally be uint32_t but the last witness addr of a calldatacopy operation row might
// be FF(2^32).

What does this mean? IIUC our memory goes from 0 to 2^32-1.

Contributor Author

The relations that prove a correct calldatacopy trace require an extra row in which addr may be equal to 2^32. This happens when calldatacopy writes data to the highest memory address. I have added some explanations in the .cpp.
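
To make the point about the extra row concrete, here is a minimal sketch. It is not the code from this PR: `AddrWide`, `slice_row_addresses` and `last_row_overflows_u32` are hypothetical names, a 64-bit integer stands in for `FF`, and it assumes the extra row carries the address one past the last written word.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-in for FF: any type wide enough to hold 2^32.
using AddrWide = uint64_t;

// Addresses of the rows for a copy of `size` words starting at `dst_offset`,
// plus one trailing row (assumption: its address is last written address + 1).
std::vector<AddrWide> slice_row_addresses(uint32_t dst_offset, uint32_t size)
{
    std::vector<AddrWide> addrs;
    for (uint64_t i = 0; i <= size; ++i) { // size + 1 rows in total
        addrs.push_back(static_cast<AddrWide>(dst_offset) + i);
    }
    return addrs;
}

// True when the trailing row's address no longer fits in a uint32_t.
bool last_row_overflows_u32(uint32_t dst_offset, uint32_t size)
{
    return slice_row_addresses(dst_offset, size).back() > UINT32_MAX;
}
```

Under these assumptions, a copy of 2 words to 0xFFFFFFFE touches 0xFFFFFFFE and 0xFFFFFFFF, and the trailing row holds 2^32, which a `uint32_t` field would wrap to 0.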

@@ -0,0 +1,39 @@
include "../main.pil";

namespace slice(256);
Contributor

I think I get what this does, my only comment is naming:
calling it slice (or better, mem_slice) makes it look like it's generic but then even the core relations like "addresses should be decreasing" and "cnt management" do use calldatacopy-specific selectors. Should this just be called the calldata gadget?

(I do understand that even if the gadget was a generic slice read/write gadget then there would be some mention of calldata, just like it happens with normal memory... we need at least some selectors for lookups/perms)

Contributor Author

What will certainly happen is that the return opcode will be handled in the same trace. So there will be specific selectors for the calldatacopy and return opcodes. Other columns, such as cnt and addr, can be reused. That is why I named it slice. I chose a short prefix because the namespace is prepended to every field of this trace.

#[CD_OFFSET_INCREMENT]
sel_cd_cpy * (cd_offset + 1 - cd_offset') = 0;

#[LOOKUP_CD_VALUE]
Contributor

I wonder if this one shouldn't be in main.pil? It really depends. If this is a calldata-specific gadget, then no. If this is a generic slice gadget, maybe yes? (very optional, I'm ok if things stay as they are)

Contributor Author

This is very calldata-specific. Since the calldata column is defined in the main trace, we could have defined this permutation there as well. Given how specific it is, I thought we should keep it here.
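
For readers less familiar with PIL, the `CD_OFFSET_INCREMENT` relation quoted above can be read as a check over consecutive rows. The sketch below is illustrative only: `SliceRow` and `cd_offset_increment_holds` are hypothetical names, plain integers stand in for field elements, and it ignores any wrap-around from the last row back to the first.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, simplified row: only the two columns used by the relation.
struct SliceRow {
    bool sel_cd_cpy;    // selector: row is part of a calldatacopy
    uint64_t cd_offset; // calldata offset on this row
};

// Checks sel_cd_cpy * (cd_offset + 1 - cd_offset') = 0 for each pair of
// consecutive rows, where cd_offset' is the value on the next row.
bool cd_offset_increment_holds(const std::vector<SliceRow>& rows)
{
    for (std::size_t i = 0; i + 1 < rows.size(); ++i) {
        if (rows[i].sel_cd_cpy && rows[i].cd_offset + 1 != rows[i + 1].cd_offset) {
            return false;
        }
    }
    return true;
}
```

When the selector is off, the relation places no constraint on the next row's `cd_offset`, which is why rows outside a calldatacopy can hold arbitrary values in that column.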

barretenberg/cpp/pil/avm/mem.pil (outdated review thread, resolved)
@jeanmon jeanmon force-pushed the jm/7211-calldatacopy-gadget branch from 5161869 to ff0b167 Compare July 8, 2024 11:13
@jeanmon jeanmon force-pushed the jm/7211-calldatacopy-gadget branch from ff0b167 to 29a06fa Compare July 10, 2024 11:01
@jeanmon
Contributor Author

jeanmon commented Jul 10, 2024

Obsoleted by the following PR: #7415

@jeanmon jeanmon closed this Jul 10, 2024
@jeanmon jeanmon deleted the jm/7211-calldatacopy-gadget branch July 10, 2024 15:26
Development

Successfully merging this pull request may close these issues.

AVM: CALLDATACOPY gadget
3 participants