bytecode compression #91

Closed
axic opened this issue Apr 7, 2016 · 11 comments
@axic
Member

axic commented Apr 7, 2016

I've run a small experiment to see how compressible EVM bytecode is and the result is: very.

The two examples I took are the Multisig Wallet contract and Shapeshiftbot:

  • Wallet: 6993 bytes to 2706 bytes
  • Shapeshift: 6018 bytes to 1486 bytes

The compressor used was zlib on level 1 (best speed). Level 9 (slowest) only gives an improvement of around 10% in the case of the Wallet.
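A minimal sketch of how such a measurement could be reproduced with Python's standard `zlib` module. The `compression_report` helper and the `sample` bytes are illustrative assumptions, not the contracts or the script used in the experiment above; substitute real deployed bytecode to get meaningful numbers.

```python
import zlib


def compression_report(code: bytes, levels=(1, 9)) -> dict:
    """Return {zlib_level: compressed_size_in_bytes} for each requested level."""
    return {level: len(zlib.compress(code, level)) for level in levels}


# Placeholder input (assumption): a short byte pattern repeated many times.
# It is far more repetitive than real bytecode, so the ratio is not representative.
sample = bytes.fromhex("6060604052600a8060106000396000f3") * 64

report = compression_report(sample)
for level, size in report.items():
    print(f"level {level}: {len(sample)} -> {size} bytes ({size / len(sample):.0%})")
```

Level 1 favours speed and level 9 favours size, which is the same trade-off described above.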

It might be early to discuss this, but it may become more important when blockchain rent comes into effect. Compression would trade storage costs for decompression costs.

Would that make sense? In two cases it might:

  • very compressible code or
  • something which is rarely executed

This could be implemented either:

  • on the blockchain level
  • as a self-decompressing contract (i.e. in bytecode only). However, this would require jumping to a memory location, which is prohibited in the EVM.
@chriseth
Contributor

chriseth commented Apr 8, 2016

Haha, interesting. If you fancy, take a look at the very last item on the Solidity backlog:
https://www.pivotaltracker.com/n/projects/1189488

wanderer added the ERC label Apr 14, 2016
wanderer changed the title from "RFC: bytecode compression" to "bytecode compression" Apr 14, 2016
@Arachnid
Contributor

I've done some experiments with doing LZF decompression in the EVM. Here's a simple decompressor: https://gist.github.com/Arachnid/4bf0dc27432e325505ca7b06f6366114

It averages about 40 gas per byte decompressing bytecode, and the compressed bytecode is about 60% of the plaintext size. That's not quite enough to make it more efficient to compress transaction data (which costs 68 gas per byte), unfortunately.

More optimal decompressors are probably possible by reading the input data directly from calldata, instead of memory, but I wouldn't expect more than perhaps another 25% improvement in speed from that.
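The arithmetic behind that conclusion can be sketched as follows. It uses only the three figures quoted above and assumes the 40 gas figure is charged per decompressed (output) byte; the comment does not state this explicitly, and the function names are illustrative.

```python
TX_DATA_GAS_PER_BYTE = 68      # calldata cost quoted in the comment
DECOMPRESS_GAS_PER_BYTE = 40   # assumed to be per decompressed (output) byte
COMPRESSED_PERCENT = 60        # compressed size as a percentage of the plaintext


def gas_uncompressed(plain_bytes: int) -> int:
    """Gas to send the bytecode as plain transaction data."""
    return plain_bytes * TX_DATA_GAS_PER_BYTE


def gas_compressed(plain_bytes: int) -> int:
    """Gas to send the compressed bytecode and then decompress it on chain."""
    compressed_bytes = plain_bytes * COMPRESSED_PERCENT // 100
    return compressed_bytes * TX_DATA_GAS_PER_BYTE + plain_bytes * DECOMPRESS_GAS_PER_BYTE


def break_even_decompress_gas() -> float:
    """Decompression cost per output byte at which both options cost the same."""
    return TX_DATA_GAS_PER_BYTE * (100 - COMPRESSED_PERCENT) / 100


print(gas_uncompressed(1000))        # 68000
print(gas_compressed(1000))          # 80800
print(break_even_decompress_gas())   # 27.2
```

Under these assumptions, even a decompressor that is 25% faster (30 gas per byte) would still sit slightly above the break-even of about 27 gas per byte.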

@chriseth
Contributor

Oh and by the way, @axic: "Jumping to a memory location is prohibited by the EVM" - not quite: You can create a contract and then delegatecall into it.

@axic
Member Author

axic commented Aug 29, 2016

> You can create a contract and then delegatecall into it.

True, but that is a second instance of a contract. You cannot do it properly within a single instance AFAIK.

@gcolvin
Contributor

gcolvin commented Jul 11, 2017

Can't a client store the blockchain data any way it likes?

@nicksavers
Contributor

@axic Is this still relevant with #706 Snappy compression for DEVp2p?
Like @gcolvin said, for storage, each client can solve this in their own way.

@poemm

poemm commented Jan 20, 2020

There is new interest in stateless blocks, where bytecode is transmitted with each block that executes it. Recent experiments show that bytecode is one of the size bottlenecks (but not the biggest bottleneck). One proposal to reduce the bytecode size bottleneck is to merkleize bytecode.

Out of curiosity, I compressed 124 EVM contracts, including CryptoKitties, Uniswap, and other dapps which I greedily picked from ranking lists and recent transactions, until I had ~1 MB total. Then I measured sizes before and after compression.

Total size without compression: 1004806 bytes
Total size with off-the-shelf zstd compression: 364759 bytes
Total size compressed with zstd using custom dictionary: 235683 bytes

So EVM bytecode compressed to 36% without much effort, and 23% with a little more effort. Decompression of everything took around 250ms on my slow computer.
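To illustrate why a custom dictionary helps, here is a small sketch that swaps in zlib's preset-dictionary support (`zdict`) from the Python standard library in place of zstd. This is not the setup used for the measurements above; the helper names and the placeholder bytes are assumptions chosen only to show the effect.

```python
import zlib


def compress_with_dict(data: bytes, dictionary: bytes, level: int = 9) -> bytes:
    """Compress `data` with a preset dictionary shared by compressor and decompressor."""
    comp = zlib.compressobj(level, zdict=dictionary)
    return comp.compress(data) + comp.flush()


def decompress_with_dict(blob: bytes, dictionary: bytes) -> bytes:
    """Inverse of compress_with_dict; must be given the same dictionary."""
    decomp = zlib.decompressobj(zdict=dictionary)
    return decomp.decompress(blob) + decomp.flush()


# Placeholder bytes (assumption) standing in for a prologue shared by many contracts.
shared_prologue = bytes(range(0x40, 0x80))
dictionary = shared_prologue
contract = shared_prologue + bytes.fromhex("600a8060106000396000f3")

plain = zlib.compress(contract, 9)
with_dict = compress_with_dict(contract, dictionary)
print(f"raw {len(contract)} B, zlib {len(plain)} B, zlib + dictionary {len(with_dict)} B")
```

The dictionary has to be available to every party that decompresses, so it acts as shared state agreed on in advance rather than as part of the compressed payload.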

Compression may improve further by doing some of the following.

  1. Separately compress opcodes and immediates, i.e. split-stream compression.
  2. Tune the dictionary for popular contracts.
  3. Use many dictionaries, each tuned for different opcode frequency histograms.
  4. Tune algorithms to be more aggressive, being aware of space-time trade-offs.
  5. Compress leaves of merkleized bytecode.
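Item 1 can be sketched as below. The only EVM fact relied on is that PUSH1 through PUSH32 (opcodes 0x60 to 0x7f) are followed by 1 to 32 immediate bytes; the function names and the sample bytes are illustrative assumptions, and the two streams would each be handed to a compressor separately.

```python
PUSH1, PUSH32 = 0x60, 0x7F


def split_streams(code: bytes) -> tuple:
    """Split bytecode into (opcodes, immediates), keeping PUSH data out of the opcode stream."""
    opcodes, immediates = bytearray(), bytearray()
    i = 0
    while i < len(code):
        op = code[i]
        opcodes.append(op)
        i += 1
        if PUSH1 <= op <= PUSH32:
            n = op - PUSH1 + 1              # PUSHn carries n immediate bytes
            immediates += code[i:i + n]     # slicing tolerates a truncated final PUSH
            i += n
    return bytes(opcodes), bytes(immediates)


def merge_streams(opcodes: bytes, immediates: bytes) -> bytes:
    """Rebuild the original bytecode from the two streams."""
    code = bytearray()
    j = 0
    for op in opcodes:
        code.append(op)
        if PUSH1 <= op <= PUSH32:
            n = op - PUSH1 + 1
            code += immediates[j:j + n]
            j += n
    return bytes(code)


sample = bytes.fromhex("6060604052600a8060106000396000f3")  # placeholder bytes
ops, imms = split_streams(sample)
print(ops.hex(), imms.hex())
```

Whether the split actually improves the ratio depends on the compressor and the contracts, so it would need to be measured on real bytecode.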

@Arachnid
Contributor

Be careful when considering compression as a solution; an attacker can easily make a maximally-uncompressible contract in order to cause a DoS.
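A quick way to see the concern: a general-purpose compressor cannot shrink high-entropy input. The sketch below uses zlib and random bytes as a stand-in for an adversarial contract; the helper name and the inputs are assumptions for illustration only.

```python
import os
import zlib


def compressed_fraction(data: bytes, level: int = 9) -> float:
    """Compressed size divided by original size (values near 1.0 mean no savings)."""
    return len(zlib.compress(data, level)) / len(data)


repetitive = bytes.fromhex("6060604052") * 1000   # highly repetitive placeholder
random_like = os.urandom(len(repetitive))         # stand-in for adversarial input

print(f"repetitive: {compressed_fraction(repetitive):.2f}")
print(f"random:     {compressed_fraction(random_like):.2f}")
```

Any limit sized around the average compression ratio would therefore need to hold for input that does not compress at all.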

@poemm

poemm commented Jan 21, 2020

Yes, good point. Depending on how you define "valid block", there could be a DoS attack.

Merkleizing bytecode only gives a 50% bytecode size reduction (early estimate, ref: see section: Code "chunking"). My post was to remind people that compression can give closer to 30% on average, sometimes better.

Perhaps it is wise to allow individual contracts to choose whether/how to reduce their bytecode size, and the decompress/merkleize/bytecode_recover operations can be metered as EVM code or as precompiles, as wisely described by @Arachnid above.

Perhaps it is wise to allow popular contracts to be stored by every node, maybe as a consensus cache of recent blocks, or with storage deposit/rent costs. Then only infrequently used bytecode would need to be transmitted with blocks.

@github-actions

There has been no activity on this issue for two months. It will be closed in a week if no further activity occurs. If you would like to move this EIP forward, please respond to any outstanding feedback or add a comment indicating that you have addressed all required feedback and are ready for a review.

github-actions bot added the stale label Jan 16, 2022
@github-actions

This issue was closed due to inactivity. If you are still pursuing it, feel free to reopen it and respond to any feedback or request a review in a comment.
