Ready for last call. (ethereum#2189)
* Update eip-615.md

* Update discussion-to link.

* Update eip-615.md

* Update eip-615.md

* Update eip-615.md

* Update eip-615.md

* Update eip-615.md

Suggested review-period-end in 14 days

* Update eip-615.md
gcolvin authored and ilanolkies committed Nov 12, 2019
1 parent 67f2686 commit 0170c8c
Showing 1 changed file with 26 additions and 24 deletions.
50 changes: 26 additions & 24 deletions EIPS/eip-615.md
@@ -1,27 +1,28 @@
---
eip: 615
title: Subroutines and Static Jumps for the EVM
status: Draft
status: Last Call
review-period-end: 2019-07-29
type: Standards Track
category: Core
author: Greg Colvin <greg@colvin.org>, Brooklyn Zelenka (@expede), Paweł Bylica (@chfast), Christian Reitwiessner (@chriseth)
discussions-to: https://ethereum-magicians.org/t/eip-615-subroutines-and-static-jumps-for-the-evm/2728
discussions-to: https://ethereum-magicians.org/t/eip-615-subroutines-and-static-jumps-for-the-evm-last-call/3472
created: 2016-12-10
---

## Simple Summary

In the 21st century, on a blockchain circulating billions of ETH, formal specification and verification are an essential tool against loss. Yet the design of the EVM makes this unnecessarily difficult. Further, the design of the EVM makes low-gas-cost, high-performance execution difficult. We propose to move forward with proposals to resolve these problems by tightening the security guarantees and pushing the performance limits of the EVM.
In the 21st century, on a blockchain circulating billions of ETH, formal specification and verification are an essential tool against loss. Yet the design of the EVM makes this unnecessarily difficult. Further, the design of the EVM makes near-linear-time compilation to machine code difficult. We propose to move forward with proposals to resolve these problems by tightening EVM security guarantees and reducing barriers to performance.

## Abstract

EVM code is currently difficult to statically analyze, hobbling critical tools for preventing the many expensive bugs our blockchain has experienced. Further, none of the current implementations of the Ethereum Virtual Machine, including the compilers, are sufficiently performant to reduce the need for precompiles and otherwise meet the network's long-term demands. This proposal identifies dynamic jumps as a major reason for these issues, and proposes changes to the EVM specification to address the problem, making further efforts towards a safer and more performant EVM possible.

We also propose to validate, in linear time, that EVM contracts correctly use subroutines, avoid misuse of the stack, and meet other safety conditions _before_ placing them on the blockchain. Validated code precludes most runtime exceptions and the need to test for them. And well-behaved control flow and use of the stack make life easier for interpreters, compilers, formal analysis, and other tools.
We also propose to validate, in near-linear time, that EVM contracts correctly use subroutines, avoid misuse of the stack, and meet other safety conditions _before_ placing them on the blockchain. Validated code precludes most runtime exceptions and the need to test for them. And well-behaved control flow and use of the stack make life easier for interpreters, compilers, formal analysis, and other tools.

## Motivation

Currently the EVM supports only dynamic jumps, where the address to jump to is an argument on the stack. Worse, the EVM fails to provide ordinary, alternative control flow facilities like subroutines and switches provided by Wasm and most CPUs. So dynamic jumps cannot be avoided, yet they obscure the structure of the code and thus mostly inhibit control- and data-flow analysis. This puts the quality and speed of optimized compilation fundamentally at odds. Further, since many jumps can potentially be to any jump destination in the code, the number of possible paths through the code can go up as the product of the number of jumps by the number of destinations, as does the time complexity of static analysis. Many of these cases are undecidable at deployment time, further inhibiting static and formal analyses.
Currently the EVM supports only dynamic jumps, where the address to jump to is an argument on the stack. Worse, the EVM fails to provide ordinary, alternative control flow facilities like subroutines and switches provided by Wasm and most CPUs. So dynamic jumps cannot be avoided, yet they obscure the structure of the code and thus mostly inhibit control- and data-flow analysis. This puts the quality and speed of optimized compilation fundamentally at odds. Further, since many jumps can potentially be to any jump destination in the code, the number of possible paths through the code can go up as the product of the number of jumps by the number of destinations, as does the time complexity of static analysis. Many of these cases are undecidable at deployment time, further inhibiting static and formal analyses.

However, given Ethereum's security requirements, **near-linear** **`n log n`** **time complexity** is essential. Otherwise, contracts can be crafted or discovered with quadratic complexity to use as denial-of-service attack vectors against validations and optimizations.

@@ -36,33 +37,34 @@ And absent dynamic jumps, and with proper subroutines the EVM is a better target
The result is that all of the following validations and optimizations can be done at deployment time with near-linear `(n log n)` time complexity.
* The absence of most exceptional halting states can be validated.
* The maximum use of resources can sometimes be calculated.
* Bytecode can be compiled to machine code.
* Compilation can optimize use of smaller registers.
* Compilation can optimize injection of gas metering.
* Bytecode can be compiled to machine code in near-linear time.
* Compilation can more effectively optimize use of smaller registers.
* Compilation can more effectively optimize injection of gas metering.

## Specification

### Dependencies

> **[EIP-1702](https://github.com/ethereum/EIPs/pull/1702). Generalized Account Versioning Scheme.** This proposal needs a versioning scheme to allow for its bytecode (and eventually eWasm bytecode) to be deployed with existing bytecode on the same blockchain.
> **[EIP-1702](http://eips.ethereum.org/EIPS/eip-1702). Generalized Account Versioning Scheme.** This proposal needs a versioning scheme to allow for its bytecode (and eventually eWasm bytecode) to be deployed with existing bytecode on the same blockchain.
### Proposal

We propose to deprecate two existing instructions—`JUMP` and `JUMPI`—and propose new instructions to support their legitimate uses. In particular, it must remain possible to compile Solidity and Vyper code to EVM bytecode, with no significant loss of performance or increase in gas price.

Especially important is efficient translation to and from [eWasm](https://github.com/ewasm/design) and to machine code. To that end we maintain a close correspondence between [Wasm](https://webassembly.github.io/spec/core/_download/WebAssembly.pdf), [x86](https://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-instruction-set-reference-manual-325383.pdf), [ARM](https://static.docs.arm.com/100076/0100/arm_instruction_set_reference_guide_100076_0100_00_en.pdf) and proposed EVM instructions.

| EIP-615 | Wasm | x86 | ARM
| -------- | -------- | ----- | -----
| JUMPTO | br | JMP | B
| JUMPIF | br_if | JE | BNZ
| JUMPV | br_table | JMP | TBH
| JUMPSUB | call | CALL | BL
| JUMPSUBV | call_indirect | CALL | BLX
| RETURN | return | RET | RET
| GETLOCAL | local.get | PUSH | POP
| PUTLOCAL | local.put | PUSH | POP
| BEGINDATA | tables | .DATA | .DATA
| EIP-615 | Wasm | x86 | ARM |
| --------- | ------------- | ---- | ---- |
| JUMPTO | br | JMP | B |
| JUMPIF | br_if | JE | BEQ |
| JUMPV | br_table | JMP | TBH |
| JUMPSUB | call | CALL | BL |
| JUMPSUBV | call_indirect | CALL | BL |
| RETURN | return | RET | RET |
| GETLOCAL | local.get | POP | POP |
| PUTLOCAL | local.put | PUSH | PUSH |
| BEGINSUB | func | | |
| BEGINDATA | tables | | |

#### Preliminaries

@@ -140,7 +142,7 @@ There needs to be a way to place unreachable data into the bytecode that will be
#### Structure

Valid EIP-615 EVM bytecode begins with a valid header. This is the magic number '\evm' followed by the semantic versioning number '\1\5\0\0'. (For Wasm the header is '\0asm\1').
Valid EIP-615 EVM bytecode begins with a valid header. This is the magic number '\0evm' followed by the semantic versioning number '\1\5\0'. (For Wasm the header is '\0asm\1').

Following the header is the BEGINSUB opcode for the _main_ routine. It takes no arguments and returns no values. Other subroutines may follow the _main_ routine, and an optional BEGINDATA opcode may mark the start of a data section.
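
As a rough illustration of the header check, here is a minimal Python sketch based on the revised header in this commit. It assumes the escapes in the text denote raw bytes, so that '\0evm' is the four bytes `00 65 76 6d` and '\1\5\0' is the three bytes `01 05 00`. The names `has_valid_header` and `strip_header` are hypothetical and are not part of the EIP. The `BEGINSUB` opcode value is not shown in this excerpt, so the sketch stops at the header.

```python
# Assumed byte encodings of the header fields quoted in the text.
MAGIC = b"\x00evm"          # '\0evm'
VERSION = b"\x01\x05\x00"   # '\1\5\0'
HEADER = MAGIC + VERSION


def has_valid_header(code: bytes) -> bool:
    """Return True if the bytecode starts with the magic number and version."""
    return code.startswith(HEADER)


def strip_header(code: bytes) -> bytes:
    """Return the bytes after the header, or raise ValueError if it is absent."""
    if not has_valid_header(code):
        raise ValueError("missing or malformed EIP-615 header")
    return code[len(HEADER):]
```

A validator would presumably run a check like this first and only then look for the `BEGINSUB` of the _main_ routine in the remaining bytes.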

@@ -176,14 +178,14 @@ Execution of a subroutine begins with `JUMPSUB` or `JUMPSUBV`, which
* sets `PC` to the specified `BEGINSUB` address
* thus beginning execution of the new subroutine.

The _main_ routine is not addressable by `JUMPSUB` instructions. Execution of a subroutine is suspended during and resumed after execution of nested subroutines, and ends upon encountering a `RETURNSUB`, which
Execution of a subroutine is suspended during and resumed after execution of nested subroutines, and ends upon encountering a `RETURNSUB`, which

* sets `FP` to the top of the virtual frame stack and pops the stack,
* sets `SP` to `FP + n_results`,
* sets `PC` to the top of the return stack and pops the stack, and
* advances `PC` to the next instruction,

thus resuming execution of the enclosing subroutine or _main_ program. A `STOP` or `RETURN` also ends the execution of a subroutine.
thus resuming execution of the enclosing subroutine or _main_ routine. A `STOP` or `RETURN` also ends the execution of a subroutine.
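
The four `RETURNSUB` steps listed above can be sketched in Python as follows. This is only an illustration under stated assumptions: the `Machine` class, the `returnsub` function, and the `call_len` parameter (the encoded length of the calling instruction, used to advance `PC` past it) are hypothetical names, and `n_results` is passed in explicitly because this excerpt does not show where it is recorded.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Machine:
    """Hypothetical container for the registers and stacks named in the text."""
    pc: int = 0
    fp: int = 0
    sp: int = 0
    return_stack: List[int] = field(default_factory=list)
    frame_stack: List[int] = field(default_factory=list)


def returnsub(m: Machine, n_results: int, call_len: int = 1) -> None:
    """Apply the RETURNSUB steps in the order the text lists them."""
    m.fp = m.frame_stack.pop()    # FP <- top of the frame stack, then pop
    m.sp = m.fp + n_results       # SP <- FP + n_results
    m.pc = m.return_stack.pop()   # PC <- top of the return stack, then pop
    m.pc += call_len              # advance PC past the calling instruction (assumed length)
```

For instance, with a saved frame pointer of 3 on the frame stack, a saved `PC` of 7 on the return stack, `n_results = 2`, and an assumed `call_len` of 3, the call leaves `FP = 3`, `SP = 5`, and `PC = 10`.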

For example, starting from this stack,
```
@@ -290,7 +292,7 @@ All of the remaining conditions we validate statically.

#### Costs & Codes

All of the instructions are `O(1)` with a small constant, requiring just a few machine operations each, whereas a `JUMP` or `JUMPI` must do an `O(log n)` binary search of an array of `JUMPDEST` offsets before every jump. With the cost of `JUMPI` being _high_ and the cost of `JUMP` being _mid_, we suggest the cost of `JUMPV` and `JUMPSUBV` should be _mid_, `JUMPSUB` and `JUMPIF` should be _low_, and `JUMPTO` and the rest should be _verylow_. Measurement will tell.
All of the instructions are `O(1)` with a small constant, requiring just a few machine operations each, whereas a `JUMP` or `JUMPI` typically does an `O(log n)` binary search of an array of `JUMPDEST` offsets before every jump. With the cost of `JUMPI` being _high_ and the cost of `JUMP` being _mid_, we suggest the cost of `JUMPV` and `JUMPSUBV` should be _mid_, `JUMPSUB` and `JUMPIF` should be _low_, and `JUMPTO` and the rest should be _verylow_. Measurement will tell.

We suggest the following opcodes:
```