From a1ff047fab03b90da0022cdb1451c9fbbde5c99d Mon Sep 17 00:00:00 2001 From: Brooklyn Zelenka Date: Fri, 17 May 2019 09:31:59 -0700 Subject: [PATCH] Automatically merged updates to draft EIP(s) 615 (#2044) Hi, I'm a bot! This change was automatically merged because: - It only modifies existing Draft or Last Call EIP(s) - The PR was approved or written by at least one author of each modified EIP - The build is passing --- EIPS/eip-615.md | 28 ++++++++++++++-------------- 1 file changed, 14 insertions(+), 14 deletions(-) diff --git a/EIPS/eip-615.md b/EIPS/eip-615.md index 3db860cfbe3c6..b584498d8a025 100644 --- a/EIPS/eip-615.md +++ b/EIPS/eip-615.md @@ -69,9 +69,9 @@ Especially important is efficient translation to and from [eWasm](https://github These forms > *`INSTRUCTION`* > -> *`INSTRUCTION x`* +> *`INSTRUCTION x`* > -> *`INSTRUCTION x, y`* +> *`INSTRUCTION x, y`* name an *`INSTRUCTION`* with no, one and two arguments, respectively. An instruction is represented in the bytecode as a single-byte opcode. Any arguments are laid out as immediate data bytes following the opcode inline, interpreted as fixed length, MSB-first, two's-complement, two-byte positive integers. (Negative values are reserved for extensions.) @@ -102,7 +102,7 @@ To support subroutines, `BEGINSUB`, `JUMPSUB`, and `RETURNSUB` are provided. Br #### Switches, Callbacks, and Virtual Functions -Dynamic jumps are also used for `O(1)` indirection: an address to jump to is selected to push on the stack and be jumped to. So we also propose two more instructions to provide for constrained indirection. We support these with vectors of `JUMPDEST` or `BEGINSUB` offsets stored inline, which can be selected with an index on the stack. That constrains validation to a specified subset of all possible destinations. The danger of quadratic blow up is avoided because it takes as much space to store the jump vectors as it does to code the worst case exploit. +Dynamic jumps are also used for `O(1)` indirection: an address to jump to is selected to push on the stack and be jumped to. So we also propose two more instructions to provide for constrained indirection. We support these with vectors of `JUMPDEST` or `BEGINSUB` offsets stored inline, which can be selected with an index on the stack. That constrains validation to a specified subset of all possible destinations. The danger of quadratic blow up is avoided because it takes as much space to store the jump vectors as it does to code the worst case exploit. Dynamic jumps to a `JUMPDEST` are used to implement `O(1)` jumptables, which are useful for dense switch statements. Wasm and most CPUs provide similar instructions. @@ -193,7 +193,7 @@ frame | 21 ______|___________ 22 <- SP ``` -and after pushing two arguments and branching with `JUMPSUB` to a `BEGINSUB 2, 3` +and after pushing two arguments and branching with `JUMPSUB` to a `BEGINSUB 2, 3` ``` PUSH 10 PUSH 11 @@ -256,7 +256,7 @@ _Execution_ is as defined in the [Yellow Paper](https://ethereum.github.io/yello >**5** Invalid instruction -We propose to expand and extend the Yellow Paper conditions to handle the new instructions we propose. +We propose to expand and extend the Yellow Paper conditions to handle the new instructions we propose. To handle the return stack we expand the conditions on stack size: >**2a** The size of the data stack does not exceed 1024. @@ -290,7 +290,7 @@ All of the remaining conditions we validate statically. #### Costs & Codes -All of the instructions are `O(1)` with a small constant, requiring just a few machine operations each, whereas a `JUMP` or `JUMPI` must do an O(log n) binary search of an array of `JUMPDEST` offsets before every jump. With the cost of `JUMPI` being _high_ and the cost of `JUMP` being _mid_, we suggest the cost of `JUMPV` and `JUMPSUBV` should be _mid_, `JUMPSUB` and `JUMPIF` should be _low_, and`JUMPTO` and the rest should be _verylow_. Measurement will tell. +All of the instructions are `O(1)` with a small constant, requiring just a few machine operations each, whereas a `JUMP` or `JUMPI` must do an `O(log n)` binary search of an array of `JUMPDEST` offsets before every jump. With the cost of `JUMPI` being _high_ and the cost of `JUMP` being _mid_, we suggest the cost of `JUMPV` and `JUMPSUBV` should be _mid_, `JUMPSUB` and `JUMPIF` should be _low_, and`JUMPTO` and the rest should be _verylow_. Measurement will tell. We suggest the following opcodes: ``` @@ -315,7 +315,7 @@ These changes would need to be implemented in phases at decent intervals: If desired, the period of deprecation can be extended indefinitely by continuing to accept code not versioned as new—but without validation. That is, by delaying or canceling phase 2. -Regardless, we will need a versioning scheme like [EIP-1702](https://github.com/ethereum/EIPs/pull/1702) to allow current code and EIP-615 code to coexist on the same blockchain. +Regardless, we will need a versioning scheme like [EIP-1702](https://github.com/ethereum/EIPs/pull/1702) to allow current code and EIP-615 code to coexist on the same blockchain. ## Rationale @@ -325,7 +325,7 @@ As described above, the approach was simply to deprecate the problematic dynamic ## Implementation -Implementation of this proposal need not be difficult. At the least, interpreters can simply be extended with the new opcodes and run unchanged otherwise. The new opcodes require only stacks for the frame pointers and return offsets and the few pushes, pops, and assignments described above. The bulk of the effort is the validator, which in most languages can almost be transcribed from the pseudocode above. +Implementation of this proposal need not be difficult. At the least, interpreters can simply be extended with the new opcodes and run unchanged otherwise. The new opcodes require only stacks for the frame pointers and return offsets and the few pushes, pops, and assignments described above. The bulk of the effort is the validator, which in most languages can almost be transcribed from the pseudocode above. A lightly tested C++ reference implementation is available in [Greg Colvin's Aleth fork.](https://github.com/gcolvin/aleth/tree/master/libaleth-interpreter) This version required circa 110 lines of new interpreter code and a well-commented, 178-line validator. @@ -346,7 +346,7 @@ Validating that jumps are to valid addresses takes two sequential passes over th is_sub[code_size] // is there a BEGINSUB at PC? is_dest[code_size] // is there a JUMPDEST at PC? sub_for_pc[code_size] // which BEGINSUB is PC in? - + bool validate_jumps(PC) { current_sub = PC @@ -366,7 +366,7 @@ Validating that jumps are to valid addresses takes two sequential passes over th is_dest[PC] = true sub_for_pc[PC] = current_sub } - + // check that targets are in subroutine for (PC = 0; instruction = bytecode[PC]; PC = advance_pc(PC)) { @@ -390,7 +390,7 @@ Note that code like this is already run by EVMs to check dynamic jumps, includin #### Subroutine Validation -This function can be seen as a symbolic execution of a subroutine in the EVM code, where only the effect of the instructions on the state being validated is computed. Thus the structure of this function is very similar to an EVM interpreter. This function can also be seen as an acyclic traversal of the directed graph formed by taking instructions as vertexes and sequential and branching connections as edges, checking conditions along the way. The traversal is accomplished via recursion, and cycles are broken by returning when a vertex which has already been visited is reached. The time complexity of this traversal is `O(|E|+|V|): The sum of the number of edges and number of verticies in the graph. +This function can be seen as a symbolic execution of a subroutine in the EVM code, where only the effect of the instructions on the state being validated is computed. Thus the structure of this function is very similar to an EVM interpreter. This function can also be seen as an acyclic traversal of the directed graph formed by taking instructions as vertexes and sequential and branching connections as edges, checking conditions along the way. The traversal is accomplished via recursion, and cycles are broken by returning when a vertex which has already been visited is reached. The time complexity of this traversal is `O(|E|+|V|)`: The sum of the number of edges and number of verticies in the graph. The basic approach is to call `validate_subroutine(i, 0, 0)`, for `i` equal to the first instruction in the EVM code through each `BEGINDATA` offset. `validate_subroutine()` traverses instructions sequentially, recursing when `JUMP` and `JUMPI` instructions are encountered. When a destination is reached that has been visited before it returns, thus breaking cycles. It returns true if the subroutine is valid, false otherwise. @@ -440,7 +440,7 @@ The basic approach is to call `validate_subroutine(i, 0, 0)`, for `i` equal to t return false if instruction is STOP, RETURN, or SUICIDE - return true + return true // violates single entry if instruction is BEGINSUB @@ -463,10 +463,10 @@ The basic approach is to call `validate_subroutine(i, 0, 0)`, for `i` equal to t if instruction is JUMPTO { PC = jump_target(PC) - continue + continue } - // recurse to jump to code to validate + // recurse to jump to code to validate if instruction is JUMPIF { if not validate_subroutine(jump_target(PC), return_pc, SP)