From a9537fb40db7ed20b32cc7f90b36174eed52e782 Mon Sep 17 00:00:00 2001 From: David Banks <47112877+dbanks12@users.noreply.github.com> Date: Thu, 11 Jan 2024 15:26:45 -0500 Subject: [PATCH] docs(yellowpaper): update AVM spec for with "daGasLeft", some cleanup (#3956) --- yellow-paper/docs/public-vm/avm.md | 108 ++++++++++++++++------------- 1 file changed, 60 insertions(+), 48 deletions(-) diff --git a/yellow-paper/docs/public-vm/avm.md b/yellow-paper/docs/public-vm/avm.md index a116ed5f803..0426a3ff979 100644 --- a/yellow-paper/docs/public-vm/avm.md +++ b/yellow-paper/docs/public-vm/avm.md @@ -42,7 +42,7 @@ Many terms and definitions here are borrowed from the [Ethereum Yellow Paper](ht An **execution context** includes the information necessary to initiate AVM execution along with the state maintained by the AVM throughout execution: ``` -AVMContext { +AvmContext { environment: ExecutionEnvironment, machineState: MachineState, worldState: WorldState, @@ -64,15 +64,16 @@ ExecutionEnvironment { origin, l1GasPrice, l2GasPrice, - calldata: [], + daGasPrice, sender, portal, - bytecode: [], blockHeader: BlockHeader, globalVariables: PublicGlobalVariables, messageCallDepth, isStaticCall, isDelegateCall, + calldata: [], + bytecode: [], } ``` @@ -82,8 +83,10 @@ ExecutionEnvironment { MachineState { l1GasLeft, l2GasLeft, + daGasLeft, pc, - memory: offset => value, // uninitialized at start + internalCallStack, + memory: offset => value // all entries are zero at start } ``` @@ -91,24 +94,24 @@ MachineState { ``` WorldState { - publicStorage: (address, slot) => value, // read/write - noteHashes: (address, index) => noteHash, // read & append only - nullifiers: (address, index) => nullifier, // read & append only - l1l2messageHashes: (address, key) => messageHash, // read only - contracts: (address) => {bytecode, portalAddress}, // read only + publicStorage: (address, slot) => value, // read/write + contracts: (address) => {bytecode, portalAddress}, // read-only + newNoteHashes: [], // append-only + newNullifiers: [], // append-only + l1l2messageHashes: (address, key) => messageHash, // read-only } ``` > Note: the notation `key => value` describes a mapping from `key` to `value`. -> Note: each member of the world state is implemented as an independent merkle tree with different properties. +> Note: `contracts` is read-only because contract deployments are not handled by the AVM The **accrued substate**, as coined in the [Ethereum Yellow Paper](https://ethereum.github.io/yellowpaper/paper), contains information that is accrued throughout transaction execution to be "acted upon immediately following the transaction." These are append-only arrays containing state that is not relevant to other calls or transactions. Similar to world state, if a message call succeeds, its substate is appended to its calling context, but if it fails its substate is dropped by its caller. ``` AccruedSubstate { - logs: [], // append-only - l2toL1Messages: [], // append-only + unencryptedLogs: [], + l2ToL1Messages: [], } ``` @@ -127,7 +130,7 @@ This section outlines AVM context initialization specifically for a **public exe When AVM execution is initiated for a public execution request, the AVM context is initialized as follows: ``` -context = AVMContext { +context = AvmContext { environment: INITIAL_EXECUTION_ENVIRONMENT, machineState: INITIAL_MACHINE_STATE, accruedSubstate: empty, @@ -147,23 +150,25 @@ INITIAL_EXECUTION_ENVIRONMENT = ExecutionEnvironment { origin: TxRequest.origin, l1GasPrice: TxRequest.l1GasPrice, l2GasPrice: TxRequest.l2GasPrice, - calldata: PublicCallRequest.args, + daGasPrice: TxRequest.daGasPrice, sender: PublicCallRequest.CallContext.msgSender, portal: PublicCallRequest.CallContext.portalContractAddress, - bytecode: worldState.contracts[PublicCallRequest.contractAddress], blockHeader: , globalVariables: messageCallDepth: 0, isStaticCall: PublicCallRequest.CallContext.isStaticCall, isDelegateCall: PublicCallRequest.CallContext.isDelegateCall, + calldata: PublicCallRequest.args, + bytecode: worldState.contracts[PublicCallRequest.contractAddress], } INITIAL_MACHINE_STATE = MachineState { l1GasLeft: TxRequest.l1GasLimit, l2GasLeft: TxRequest.l2GasLimit, + daGasLeft: TxRequest.daGasLimit, pc: 0, - memory: uninitialized, internalCallStack: empty, + memory: empty, } INITIAL_MESSAGE_CALL_RESULTS = MessageCallResults { @@ -172,8 +177,6 @@ INITIAL_MESSAGE_CALL_RESULTS = MessageCallResults { } ``` -> Note: unlike memory in the Ethereum Virtual Machine, uninitialized memory in the AVM is not readable! A memory cell must be written (and therefore [type-tagged](./state-model#types-and-tagged-memory)) before it can be read. - ## Execution With an initialized context (and therefore an initial program counter of 0), the AVM can execute a message call starting with the very first instruction in its bytecode. @@ -189,12 +192,13 @@ The `INTERNALCALL` instruction jumps to the destination specified by its input ( > Jump destinations can only be constants from the contract bytecode, or destinations popped from `machineState.internalCallStack`. A jump destination will never originate from main memory. ### Gas limits and tracking +> Note: see ["Gas and Fees"](../gas-and-fees) for a deeper dive into Aztec's gas model and for definitions of each type of gas. -Each instruction has an associated `l1GasCost` and `l2GasCost`. Before an instruction is executed, the VM enforces that there is sufficient gas remaining via the following assertions: - +Each instruction has an associated `l1GasCost`, `l2GasCost`, and `daGasCost`. Before an instruction is executed, the VM enforces that there is sufficient gas remaining via the following assertions: ``` assert machineState.l1GasLeft - instr.l1GasCost > 0 assert machineState.l2GasLeft - instr.l2GasCost > 0 +assert machineState.daGasLeft - instr.daGasCost > 0 ``` > Note: many instructions (like arithmetic operations) have 0 `l1GasCost`. Instructions only incur an L1 cost if they modify world state or accrued substate. @@ -204,6 +208,7 @@ If these assertions pass, the machine state's gas left is decreased prior to the ``` machineState.l1GasLeft -= instr.l1GasCost machineState.l2GasLeft -= instr.l2GasCost +machineState.daGasLeft -= instr.daGasCost ``` If either of these assertions _fail_ for an instruction, this triggers an exceptional halt. The gas left is set to 0 and execution reverts. @@ -211,6 +216,7 @@ If either of these assertions _fail_ for an instruction, this triggers an except ``` machineState.l1GasLeft = 0 machineState.l2GasLeft = 0 +machineState.daGasLeft = 0 ``` > Reverting and exceptional halts will be covered in more detail [in a later section](#halting). @@ -218,11 +224,10 @@ machineState.l2GasLeft = 0 ### Gas cost notes and examples A instruction's gas cost is loosely derived from its complexity. Execution complexity of some instructions changes based on inputs. Here are some examples and important notes: - -- [`JUMP`](./instruction-set/#isa-section-jump) is an example of an instruction with constant gas cost. Regardless of its inputs, the instruction always incurs the same `l1GasCost` and `l2GasCost`. +- [`JUMP`](./instruction-set/#isa-section-jump) is an example of an instruction with constant gas cost. Regardless of its inputs, the instruction always incurs the same `l1GasCost`, `l2GasCost`, and `daGasCost`. - The [`SET`](./instruction-set/#isa-section-set) instruction operates on a different sized constant (based on its `dst-type`). Therefore, this instruction's gas cost increases with the size of its input. -- Instructions that operate on a data range of a specified "size" scale in cost with that size. An example of this is the [`CALLDATACOPY`](./instruction-set/#isa-section-calldatacopy) argument which copies `copySize` words from `environment.calldata` to memory. -- The [`CALL`](./instruction-set/#isa-section-call)/[`STATICCALL`](./instruction-set/#isa-section-call)/`DELEGATECALL` instruction's gas cost is determined by its `l*Gas` arguments, but any gas unused by the triggered message call is refunded after its completion ([more on this later](#updating-the-calling-context-after-nested-call-halts)). +- Instructions that operate on a data range of a specified "size" scale in cost with that size. An example of this is the [`CALLDATACOPY`](./instruction-set/#isa-section-calldatacopy) argument which copies `copySize` words from `environment.calldata` to `machineState.memory`. +- The [`CALL`](./instruction-set/#isa-section-call)/[`STATICCALL`](./instruction-set/#isa-section-call)/`DELEGATECALL` instruction's gas cost is determined by its `*Gas` arguments, but any gas unused by the triggered message call is refunded after its completion ([more on this later](#updating-the-calling-context-after-nested-call-halts)). - An instruction with "offset" arguments (like [`ADD`](./instruction-set/#isa-section-add) and many others), has increased cost for each offset argument that is flagged as "indirect". > Implementation detail: an instruction's gas cost will roughly align with the number of rows it corresponds to in the SNARK execution trace including rows in the sub-operation table, memory table, chiplet tables, etc. @@ -240,6 +245,7 @@ A normal halt occurs when the VM encounters an explicit halting instruction ([`R ``` machineState.l1GasLeft -= instr.l1GasCost machineState.l2GasLeft -= instr.l2GasCost +machineState.daGasLeft -= instr.daGasCost // results.reverted remains false results.output = machineState.memory[instr.args.retOffset:instr.args.retOffset+instr.args.retSize] ``` @@ -253,32 +259,34 @@ results.output = machineState.memory[instr.args.retOffset:instr.args.retOffset+i An exceptional halt is not explicitly triggered by an instruction but instead occurs when one of the following halting conditions is met: 1. **Insufficient gas** - ``` - assert machineState.l1GasLeft - instr.l1GasCost > 0 - assert machineState.l2GasLeft - instr.l2GasCost > 0 - ``` + ``` + assert machineState.l1GasLeft - instr.l1GasCost > 0 + assert machineState.l2GasLeft - instr.l2GasCost > 0 + assert machineState.daGasLeft - instr.l2GasCost > 0 + ``` 1. **Invalid instruction encountered** - ``` - assert environment.bytecode[machineState.pc].opcode <= MAX_AVM_OPCODE - ``` + ``` + assert environment.bytecode[machineState.pc].opcode <= MAX_AVM_OPCODE + ``` 1. **Failed memory tag check** - - Defined per-instruction in the [Instruction Set](./instruction-set) + - Defined per-instruction in the [Instruction Set](./instruction-set) 1. **Jump destination past end of bytecode** - ``` - assert machineState.pc >= environment.bytecode.length - ``` + ``` + assert machineState.pc >= environment.bytecode.length + ``` 1. **World state modification attempt during a static call** - ``` - assert !environment.isStaticCall - OR environment.bytecode[machineState.pc].opcode not in WS_MODIFYING_OPS - ``` - > Definition: `WS_MODIFYING_OPS` represents the list of all opcodes corresponding to instructions that modify world state. + ``` + assert !environment.isStaticCall + OR environment.bytecode[machineState.pc].opcode not in WS_MODIFYING_OPS + ``` + > Definition: `WS_MODIFYING_OPS` represents the list of all opcodes corresponding to instructions that modify world state. When an exceptional halt occurs, the context is flagged as consuming all off its allocated gas and marked as `reverted` with no output data, and then execution within the current context ends. ``` machineState.l1GasLeft = 0 machineState.l2GasLeft = 0 +machineState.daGasLeft = 0 results.reverted = true // results.output remains undefined ``` @@ -292,7 +300,7 @@ During a message call's execution, an instruction may be encountered that trigge Initiation of a nested call requires the creation of a new context (or **sub-context**). ``` -subContext = AVMContext { +subContext = AvmContext { environment: nestedExecutionEnvironment, // defined below machineState: nestedMachineState, // defined below worldState: callingContext.worldState, @@ -312,6 +320,8 @@ The environment and machine state for the new sub-context are initialized as sho isStaticCall = instr.opcode == STATICCALL_OP isDelegateCall = instr.opcode == DELEGATECALL_OP contract = callingContext.worldState.contracts[instr.args.addr] +calldataStart = instr.args.argsOffset +calldataEnd = calldataStart + instr.args.argsSize nestedExecutionEnvironment = ExecutionEnvironment { address: instr.args.addr, @@ -319,27 +329,28 @@ nestedExecutionEnvironment = ExecutionEnvironment { origin: callingContext.origin, l1GasPrice: callingContext.l1GasPrice, l2GasPrice: callingContext.l2GasPrice, - calldata: instr.args.calldata, + daGasPrice: callingContext.daGasPrice, sender: callingContext.address, portal: contract.portal, - bytecode: contract.bytecode, blockHeader: callingContext.blockHeader, globalVariables: callingContext.globalVariables, messageCallDepth: callingContext.messageCallDepth + 1, isStaticCall: isStaticCall, isDelegateCall: isDelegateCall, + calldata: callingContext.memory[calldataStart:calldataEnd], + bytecode: contract.bytecode, } nestedMachineState = MachineState { l1GasLeft: callingContext.machineState.memory[instr.args.gasOffset], l2GasLeft: callingContext.machineState.memory[instr.args.gasOffset+1], + daGasLeft: callingContext.machineState.memory[instr.args.gasOffset+2], pc: 0, - memory: uninitialized, internalCallStack: empty, + memory: empty, } ``` - -> Note: the sub-context machine state's `l*GasLeft` is initialized based on the call instruction's `gasOffset` argument. The caller allocates some amount of L1 and L2 gas to the nested call. It does so using the instruction's `gasOffset` argument. In particular, prior to the message call instruction, the caller populates `M[gasOffset]` with the sub-context's initial `l1GasLeft`. Likewise it populates `M[gasOffset+1]` with `l2GasLeft`. +> Note: the sub-context machine state's `*GasLeft` is initialized based on the call instruction's `gasOffset` argument. The caller allocates some amount of L1, L2, and DA gas to the nested call. It does so using the instruction's `gasOffset` argument. In particular, prior to the message call instruction, the caller populates `M[gasOffset]` with the sub-context's initial `l1GasLeft`. Likewise it populates `M[gasOffset+1]` with `l2GasLeft` and `M[gasOffset+2]` with `daGasLeft`. > Note: recall that `INITIAL_MESSAGE_CALL_RESULTS` is the same initial value used during [context initialization for a public execution request's initial message call](#context-initialization-for-initial-call). > `STATICCALL_OP` and `DELEGATECALL_OP` refer to the 8-bit opcode values for the `STATICCALL` and `DELEGATECALL` instructions respectively. @@ -354,15 +365,16 @@ The success or failure of the nested call is captured into memory at the offset context.machineState.memory[instr.args.successOffset] = !subContext.results.reverted ``` -Recall that a nested call is allocated some gas. In particular, the call instruction's `gasOffset` input points to an L1 and L2 gas allocation for the nested call. As shown in the [section above](#context-initialization-for-a-nested-call), a nested call's `subContext.machineState.l1GasLeft` is initialized to `context.machineState.memory[instr.args.gasOffset]`. Likewise, `l2GasLeft` is initialized from `gasOfffset+1`. +Recall that a nested call is allocated some gas. In particular, the call instruction's `gasOffset` input points to an L1, L2, and DA gas allocation for the nested call. As shown in the [section above](#context-initialization-for-a-nested-call), a nested call's `subContext.machineState.l1GasLeft` is initialized to `context.machineState.memory[instr.args.gasOffset]`. Likewise, `l2GasLeft` is initialized from `gasOffset+1` and `daGasLeft` from `gasOffset+2`. -As detailed in [the gas section above](#gas-cost-notes-and-examples), every instruction has an associated `instr.l1GasCost` and `instr.l2GasCost`. A nested call instruction's cost is the same as its initial `l*GasLeft` and `l2GasLeft`. Prior to the nested call's execution, this cost is subtracted from the calling context's remaining gas. +As detailed in [the gas section above](#gas-cost-notes-and-examples), every instruction has an associated `instr.l1GasCost`, `instr.l2GasCost`, and `instr.daGasCost`. A nested call instruction's cost is the same as its initial `*GasLeft`. Prior to the nested call's execution, this cost is subtracted from the calling context's remaining gas. When a nested call completes, any of its allocated gas that remains unused is refunded to the caller. ``` context.l1GasLeft += subContext.machineState.l1GasLeft context.l2GasLeft += subContext.machineState.l2GasLeft +context.daGasLeft += subContext.machineState.daGasLeft ``` If a nested call halts normally with a [`RETURN`](./instruction-set/#isa-section-return) or [`REVERT`](./instruction-set/#isa-section-revert), it may have some output data (`subContext.results.output`). The caller's `retOffset` and `retSize` arguments to the nested call instruction specify a region in memory to place output data when the nested call returns.