Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

update #10

Merged
merged 5 commits into from
May 18, 2019
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
5 changes: 4 additions & 1 deletion EIPS/eip-1679.md
Original file line number Diff line number Diff line change
Expand Up @@ -28,22 +28,25 @@ This meta-EIP specifies the changes included in the Ethereum hardfork named Ista

### Proposed EIPs

- [EIP-615](https://eips.ethereum.org/EIPS/eip-615): Subroutines and Static Jumps for the EVM
- [EIP-1057](https://eips.ethereum.org/EIPS/eip-1057): ProgPoW, a Programmatic
Proof-of-Work
- There is a
[pending audit](https://medium.com/ethereum-cat-herders/progpow-audit-goals-expectations-75bb902a1f01),
above and beyond standard security considerations, that should be evaluated
prior to inclusion.
- [EIP-1108](https://eips.ethereum.org/EIPS/eip-1108): Reduce alt_bn128 precompile gas costs
- [EIP-1108](https://eips.ethereum.org/EIPS/eip-1108): Reduce alt_bn128 precompile gas costs
- [EIP-1283](https://eips.ethereum.org/EIPS/eip-1283): Net gas metering for SSTORE without dirty maps
- [EIP-1344](https://eips.ethereum.org/EIPS/eip-1344): Add ChainID opcode
- [EIP-1352](https://eips.ethereum.org/EIPS/eip-1352): Specify restricted address range for precompiles/system contracts
- [EIP-1380](https://eips.ethereum.org/EIPS/eip-1380): Reduced gas cost for call to self
- [EIP-1559](https://eips.ethereum.org/EIPS/eip-1559): Fee market change for ETH 1.0 chain
- [EIP-1702](https://eips.ethereum.org/EIPS/eip-1702): Generalized account versioning scheme
- [EIP-1706](https://eips.ethereum.org/EIPS/eip-1706): Disable SSTORE with gasleft lower than call stipend
- [EIP-1803](https://eips.ethereum.org/EIPS/eip-1803): Rename opcodes for clarity
- [EIP-1829](https://eips.ethereum.org/EIPS/eip-1829): Precompile for Elliptic Curve Linear Combinations
- [EIP-1884](https://eips.ethereum.org/EIPS/eip-1884): Repricing for trie-size-dependent opcodes
- [EIP-2028](https://eips.ethereum.org/EIPS/eip-2028): Calldata gas cost reduction

## Timeline

Expand Down
75 changes: 75 additions & 0 deletions EIPS/eip-2028.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,75 @@
---
eip: 2028
title: Calldata gas cost reduction
author: Alexey Akhunov(@AlexeyAkhunov), Eli Ben Sasson (eli@starkware.co), Tom Brand (tom@starkware.co), Avihu Levy (avihu@starkware.co)
discussions-to: https://ethereum-magicians.org/t/eip-2028-calldata-gas-cost-reduction/3280
status: Draft
type: Standards Track
category: Core
created: 2019-05-03
---

## Simple Summary
We propose to reduce the gas cost of Calldata (`GTXDATANONZERO`) from its current value of 68 gas per byte to a lower cost, to be backed by mathematical modeling and empirical estimates. The mathematical model is the one used in the works of Sompolinsky and Zohar [1] and Pass, Seeman and Shelat [2], which relates network security to network delay. We shall (1) evaluate the theoretical impact of lower Calldata gas cost on network delay using this model, (2) validate the model empirically, and (3) base the proposed gas cost on our findings.

## Motivation
There are a couple of main benefits to accepting this proposal and lowering gas cost of Calldata
On-Chain Scalability: Generally speaking, higher bandwidth of Calldata improves scalability, as more data can fit within a single block.
* Layer two scalability: Layer two scaling solutions can improve scalability by moving storage and computation off-chain, but often introduce data transmission instead.
- Proof systems such as STARKs and SNARKs use a single proof that attests to the computational integrity of a large computation, say, one that processes a large batch of transactions.
- Some solutions use fraud proofs which requires a transmission of merkle proofs.
- Moreover, one optional data availability solution to layer two is to place data on the main chain, via Calldata.
* Stateless clients: The same model will be used to determine the price of the state access for the stateless client regime, which will be proposed in the State Rent (from version 4). There, it is expected that the gas cost of state accessing operation will increase roughly proportional to the extra bandwidth required to transmit the “block proofs” as well as extra processing required to verify those block proofs.

## Specification
The gas per non-zero byte is reduced from 68 to TBD. Gas cost of zero bytes is unchanged.

## Rationale
Roughly speaking, reducing the gas cost of Calldata leads to potentially larger blocks, which increases the network delay associated with data transmission over the network. This is only part of the full network delay, other factors are block processing time (and storage access, as part of it). Increasing network delay affects security by lowering the cost of attacking the network, because at any given point in time fewer nodes are updated on the latest state of the blockchain.

Yonatan Sompolinsky and Aviv Zohar suggested in [1] an elegant model to relate network delay to network security, and this model is also used in the work of Rafael Pass, Lior Seeman and Abhi Shelat [2]. We briefly explain this model below, because we shall study it theoretically and validate it by empirical measurements to reach the suggested lower gas cost for Calldata.

The model uses the following natural parameters:
* _lambda_ denotes the block creation rate [1/s]: We treat the process of finding a PoW
solution as a poisson process with rate _lambda_.
* _beta_ - chain growth rate [1/s]: the rate at which new blocks are added to
the heaviest chain.
* _D_ - block delay [s]: The time that elapses between the mining of a new block and its acceptance by all the miners (all miners switched to mining on top of that block).

### _Beta_ Lower Bound
Notice that _lambda_ => _beta_, because not all blocks that are found will enter the main chain (as is the case with uncles). In [1] it was shown that for a blockchain using the longest chain rule, one may bound _beta_ from below by _lambda_/ (1+ D * _lambda_). This lower bound holds in the extremal case where the topology of the network is a clique in which the delay between each pair of nodes is D, the maximal possible delay. Recording both the lower and upper bounds on _beta_ we get

_lambda_ >= _beta_ >= _lambda_ / (1 + D * _lambda_) (*)

Notice, as a sanity check, that when there is no delay (D=0) then _beta_ equals _lambda_, as expected.

### Security of the network
An attacker attempting to reorganize the main chain needs to generate blocks at a rate that is greater than _beta_.
Fixing the difficulty level of the PoW puzzle, the total hash rate in the system is correlated to _lambda_. Thus, _beta_ / _lambda_ is defined as the the *efficiency* of the system, as it measures the fraction of total hash power that is used to generate the main chain of the network.

Rearranging (*) gives the following lower bound on efficiency in terms of delay:

_beta_ / _lambda_ >= 1 / (1 + D * _lambda_) (**)

### The _delay_ parameter D
The network delay depends on the location of the mining node within the network and on the current network topology (which changes dynamically), and consequently is somewhat difficult to measure directly.
Previously, Christian Decker and Roger Wattenhofer [3] showed that propagation time scales with blocksize, and Vitalik Buterin showed that uncle rate, which is tightly related to efficiency (**) measure, also scales with block size [4].

However, the delay function can be decomposed into two parts D = *D_t* + *D_p*, where _D_t_ is the delay caused by the transmission of the block and _D_p_ is the delay caused by the processing of the block by the node. Our model and tests will examine the effect of Calldata on each of _D_t_ and _D_p_, postulating that their effect is different. This may be particularly relevant for Layer 2 Scalability and for Stateless Clients (Rationales 2, 3 above) because most of the Calldata associated with these goals are Merkle authentication paths that have a large _D_t_ component but relatively small _D_p_ values.

## Test Cases
To suggest the gas cost of calldata we shall conduct two types of tests:
1. Network tests, conducted on the Ethereum mainnet, used to estimate the effect on increasing block size on _D_p_ and _D_t_, on the overall network delay D and the efficiency ratio (**), as well as delays between different mining pools. Those tests will include regression tests on existing data, and stress tests to introduce extreme scenarios.
2. Local tests, conducted on a single node and measuring the processing time as a function of Calldata amount and general computation limits.

## References
[1] Yonatan Sompolinsky, Aviv Zohar: [Secure High-Rate Transaction Processing in Bitcoin](https://eprint.iacr.org/2013/881.pdf). Financial Cryptography 2015: 507-527

[2] Rafael Pass, Lior Seeman, Abhi Shelat: [Analysis of the Blockchain Protocol in Asynchronous Networks](https://eprint.iacr.org/2016/454.pdf), ePrint report 2016/454

[3] Christian Decker, Roger Wattenhofer: [Information propagation in the Bitcoin network](http://www.gsd.inesc-id.pt/~ler/docencia/rcs1314/papers/P2P2013_041.pdf). P2P 2013: 1-10

[4] Vitalik Buterin: [Uncle Rate and Transaction Fee Analysis](https://blog.ethereum.org/2016/10/31/uncle-rate-transaction-fee-analysis/)

## Copyright
Copyright and related rights waived via [CC0](https://creativecommons.org/publicdomain/zero/1.0/).
28 changes: 14 additions & 14 deletions EIPS/eip-615.md
Original file line number Diff line number Diff line change
Expand Up @@ -69,9 +69,9 @@ Especially important is efficient translation to and from [eWasm](https://github
These forms
> *`INSTRUCTION`*
>
> *`INSTRUCTION x`*
> *`INSTRUCTION x`*
>
> *`INSTRUCTION x, y`*
> *`INSTRUCTION x, y`*

name an *`INSTRUCTION`* with no, one and two arguments, respectively. An instruction is represented in the bytecode as a single-byte opcode. Any arguments are laid out as immediate data bytes following the opcode inline, interpreted as fixed length, MSB-first, two's-complement, two-byte positive integers. (Negative values are reserved for extensions.)

Expand Down Expand Up @@ -102,7 +102,7 @@ To support subroutines, `BEGINSUB`, `JUMPSUB`, and `RETURNSUB` are provided. Br

#### Switches, Callbacks, and Virtual Functions

Dynamic jumps are also used for `O(1)` indirection: an address to jump to is selected to push on the stack and be jumped to. So we also propose two more instructions to provide for constrained indirection. We support these with vectors of `JUMPDEST` or `BEGINSUB` offsets stored inline, which can be selected with an index on the stack. That constrains validation to a specified subset of all possible destinations. The danger of quadratic blow up is avoided because it takes as much space to store the jump vectors as it does to code the worst case exploit.
Dynamic jumps are also used for `O(1)` indirection: an address to jump to is selected to push on the stack and be jumped to. So we also propose two more instructions to provide for constrained indirection. We support these with vectors of `JUMPDEST` or `BEGINSUB` offsets stored inline, which can be selected with an index on the stack. That constrains validation to a specified subset of all possible destinations. The danger of quadratic blow up is avoided because it takes as much space to store the jump vectors as it does to code the worst case exploit.

Dynamic jumps to a `JUMPDEST` are used to implement `O(1)` jumptables, which are useful for dense switch statements. Wasm and most CPUs provide similar instructions.

Expand Down Expand Up @@ -193,7 +193,7 @@ frame | 21
______|___________ 22
<- SP
```
and after pushing two arguments and branching with `JUMPSUB` to a `BEGINSUB 2, 3`
and after pushing two arguments and branching with `JUMPSUB` to a `BEGINSUB 2, 3`
```
PUSH 10
PUSH 11
Expand Down Expand Up @@ -256,7 +256,7 @@ _Execution_ is as defined in the [Yellow Paper](https://ethereum.github.io/yello

>**5** Invalid instruction

We propose to expand and extend the Yellow Paper conditions to handle the new instructions we propose.
We propose to expand and extend the Yellow Paper conditions to handle the new instructions we propose.

To handle the return stack we expand the conditions on stack size:
>**2a** The size of the data stack does not exceed 1024.
Expand Down Expand Up @@ -290,7 +290,7 @@ All of the remaining conditions we validate statically.

#### Costs & Codes

All of the instructions are `O(1)` with a small constant, requiring just a few machine operations each, whereas a `JUMP` or `JUMPI` must do an O(log n) binary search of an array of `JUMPDEST` offsets before every jump. With the cost of `JUMPI` being _high_ and the cost of `JUMP` being _mid_, we suggest the cost of `JUMPV` and `JUMPSUBV` should be _mid_, `JUMPSUB` and `JUMPIF` should be _low_, and`JUMPTO` and the rest should be _verylow_. Measurement will tell.
All of the instructions are `O(1)` with a small constant, requiring just a few machine operations each, whereas a `JUMP` or `JUMPI` must do an `O(log n)` binary search of an array of `JUMPDEST` offsets before every jump. With the cost of `JUMPI` being _high_ and the cost of `JUMP` being _mid_, we suggest the cost of `JUMPV` and `JUMPSUBV` should be _mid_, `JUMPSUB` and `JUMPIF` should be _low_, and`JUMPTO` and the rest should be _verylow_. Measurement will tell.

We suggest the following opcodes:
```
Expand All @@ -315,7 +315,7 @@ These changes would need to be implemented in phases at decent intervals:

If desired, the period of deprecation can be extended indefinitely by continuing to accept code not versioned as new—but without validation. That is, by delaying or canceling phase 2.

Regardless, we will need a versioning scheme like [EIP-1702](https://github.com/ethereum/EIPs/pull/1702) to allow current code and EIP-615 code to coexist on the same blockchain.
Regardless, we will need a versioning scheme like [EIP-1702](https://github.com/ethereum/EIPs/pull/1702) to allow current code and EIP-615 code to coexist on the same blockchain.

## Rationale

Expand All @@ -325,7 +325,7 @@ As described above, the approach was simply to deprecate the problematic dynamic

## Implementation

Implementation of this proposal need not be difficult. At the least, interpreters can simply be extended with the new opcodes and run unchanged otherwise. The new opcodes require only stacks for the frame pointers and return offsets and the few pushes, pops, and assignments described above. The bulk of the effort is the validator, which in most languages can almost be transcribed from the pseudocode above.
Implementation of this proposal need not be difficult. At the least, interpreters can simply be extended with the new opcodes and run unchanged otherwise. The new opcodes require only stacks for the frame pointers and return offsets and the few pushes, pops, and assignments described above. The bulk of the effort is the validator, which in most languages can almost be transcribed from the pseudocode above.

A lightly tested C++ reference implementation is available in [Greg Colvin's Aleth fork.](https://github.com/gcolvin/aleth/tree/master/libaleth-interpreter) This version required circa 110 lines of new interpreter code and a well-commented, 178-line validator.

Expand All @@ -346,7 +346,7 @@ Validating that jumps are to valid addresses takes two sequential passes over th
is_sub[code_size] // is there a BEGINSUB at PC?
is_dest[code_size] // is there a JUMPDEST at PC?
sub_for_pc[code_size] // which BEGINSUB is PC in?

bool validate_jumps(PC)
{
current_sub = PC
Expand All @@ -366,7 +366,7 @@ Validating that jumps are to valid addresses takes two sequential passes over th
is_dest[PC] = true
sub_for_pc[PC] = current_sub
}

// check that targets are in subroutine
for (PC = 0; instruction = bytecode[PC]; PC = advance_pc(PC))
{
Expand All @@ -390,7 +390,7 @@ Note that code like this is already run by EVMs to check dynamic jumps, includin

#### Subroutine Validation

This function can be seen as a symbolic execution of a subroutine in the EVM code, where only the effect of the instructions on the state being validated is computed. Thus the structure of this function is very similar to an EVM interpreter. This function can also be seen as an acyclic traversal of the directed graph formed by taking instructions as vertexes and sequential and branching connections as edges, checking conditions along the way. The traversal is accomplished via recursion, and cycles are broken by returning when a vertex which has already been visited is reached. The time complexity of this traversal is `O(|E|+|V|): The sum of the number of edges and number of verticies in the graph.
This function can be seen as a symbolic execution of a subroutine in the EVM code, where only the effect of the instructions on the state being validated is computed. Thus the structure of this function is very similar to an EVM interpreter. This function can also be seen as an acyclic traversal of the directed graph formed by taking instructions as vertexes and sequential and branching connections as edges, checking conditions along the way. The traversal is accomplished via recursion, and cycles are broken by returning when a vertex which has already been visited is reached. The time complexity of this traversal is `O(|E|+|V|)`: The sum of the number of edges and number of verticies in the graph.

The basic approach is to call `validate_subroutine(i, 0, 0)`, for `i` equal to the first instruction in the EVM code through each `BEGINDATA` offset. `validate_subroutine()` traverses instructions sequentially, recursing when `JUMP` and `JUMPI` instructions are encountered. When a destination is reached that has been visited before it returns, thus breaking cycles. It returns true if the subroutine is valid, false otherwise.

Expand Down Expand Up @@ -440,7 +440,7 @@ The basic approach is to call `validate_subroutine(i, 0, 0)`, for `i` equal to t
return false

if instruction is STOP, RETURN, or SUICIDE
return true
return true

// violates single entry
if instruction is BEGINSUB
Expand All @@ -463,10 +463,10 @@ The basic approach is to call `validate_subroutine(i, 0, 0)`, for `i` equal to t
if instruction is JUMPTO
{
PC = jump_target(PC)
continue
continue
}

// recurse to jump to code to validate
// recurse to jump to code to validate
if instruction is JUMPIF
{
if not validate_subroutine(jump_target(PC), return_pc, SP)
Expand Down
Loading