
Investigate slow BLOCKHASH #5615

Open
chfast opened this issue May 29, 2019 · 8 comments
Comments

@chfast
Member

chfast commented May 29, 2019

The report https://arxiv.org/pdf/1905.00553.pdf shows that BLOCKHASH is very slow compared to its price.

This is also confirmed by @holiman's report https://github.com/holiman/vmstats (raw data: https://docs.google.com/spreadsheets/d/1zjF6YWrVxfNoj0fH3WRek-G4Bw0fbbOnRusC0gMtmL0/edit#gid=852228130)


Is GASPRICE something we should also investigate?

cc @cdetrio

@chfast
Member Author

chfast commented May 29, 2019

Considering Aleth's implementation, where block hashes are copied up front for EVM execution (I consider that inefficient, but there was never time to do anything about it), the problem might be getting the current block number. Although the timings for NUMBER do not confirm this.

@gumb0
Member

gumb0 commented May 29, 2019

Looking at the code, I see we look up the hashes the first time BLOCKHASH is called in the block (with any valid argument).

Maybe there's a place where we do this call (bc().lastBlockHashes().precedingHashes(number)) to warm it up before transaction processing, but I'm missing it. Have you seen such a place?
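For reference, here is a minimal sketch of the lazy, once-per-block pattern described above: the first BLOCKHASH executed in a block pays for fetching the window of preceding hashes, and later calls reuse the cached copy. The names are hypothetical (only `precedingHashes` comes from the thread), and the indexing is an assumption, not Aleth's actual code.

```cpp
#include <array>
#include <cstdint>
#include <optional>

using Hash = std::array<uint8_t, 32>;
using LastHashes = std::array<Hash, 256>;  // hashes of the 256 preceding blocks

struct BlockContext
{
    int64_t currentNumber = 0;
    std::optional<LastHashes> cachedHashes;  // empty until the first BLOCKHASH

    LastHashes loadPrecedingHashes(int64_t /*number*/)
    {
        // Stub; the real lookup would be bc().lastBlockHashes().precedingHashes(number),
        // which is the potentially expensive part we want to run at most once per block.
        return LastHashes{};
    }

    Hash blockhash(int64_t requested)
    {
        // BLOCKHASH only covers the 256 blocks preceding the current one.
        if (requested >= currentNumber || requested < currentNumber - 256)
            return Hash{};

        if (!cachedHashes)  // the first BLOCKHASH in the block pays the lookup cost
            cachedHashes = loadPrecedingHashes(currentNumber);

        // Indexing assumption: element 0 is the parent block's hash.
        return (*cachedHashes)[static_cast<size_t>(currentNumber - 1 - requested)];
    }
};
```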

@cdetrio
Member

cdetrio commented May 29, 2019

As @holiman pointed out, the paper Empirically Analyzing Ethereum’s Gas Mechanism analyzed aleth using the "Blockhash refactoring" EIP: ethereum/EIPs#210

From the paper:

The main bottleneck for this particular opcode is caused by the fact that the BLOCKHASH EVM operation is implemented as a smart contract instead of a built-in routine within the EVM. Therefore its execution requires traversing the state trie to load the contract code into RAM for execution. On top of that, in the reference implementation, there is also the overhead of creating an embedded EVM instance within the runtime of the base EVM executing the contract, incurring more overhead. We note that this analysis applies only to the C++ reference implementation.

I guess we don't know whether standard aleth, with BLOCKHASH as an opcode (not a system contract as in EIP 210), is slow or not. We do know the standard BLOCKHASH opcode is slow in geth, though, so it's worth checking aleth.
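To make the quoted explanation concrete, here is a deliberately tiny, hypothetical contrast between the two strategies; all names are illustrative stand-ins, not Aleth's or geth's actual APIs. With the plain opcode, the hash comes from an already-available window; with the EIP-210 route, every call also pays for loading the system contract's code from the state trie and running it in an embedded EVM instance.

```cpp
#include <array>
#include <cstdint>
#include <vector>

using Hash = std::array<uint8_t, 32>;

struct MockState
{
    // Stand-in for a state-trie traversal that fetches the system contract's code.
    std::vector<uint8_t> loadCode(uint64_t /*address*/) const { return {0x60, 0x00}; }
};

struct MockEVM
{
    // Stand-in for spinning up an embedded interpreter to run that code.
    Hash run(std::vector<uint8_t> const& /*code*/, int64_t /*arg*/) const { return Hash{}; }
};

// Plain opcode: a direct lookup in the already-available recent-hashes window.
Hash blockhashAsOpcode(std::array<Hash, 256> const& recent, size_t distance)
{
    return recent[distance];
}

// EIP-210 style: every BLOCKHASH becomes a call into a system contract, so the
// cost includes the code load from state plus a nested EVM execution.
Hash blockhashViaSystemContract(MockState const& state, int64_t number)
{
    auto code = state.loadCode(0xf0);  // trie lookup for the contract's code
    MockEVM nested;                    // embedded EVM instance -> extra overhead
    return nested.run(code, number);
}
```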

@chfast
Member Author

chfast commented May 29, 2019

Any suggestions on what kind of benchmarks we should do?

@cdetrio
Member

cdetrio commented May 30, 2019

I suppose reproducing the ones in holiman/vmstats. Some notes from the chat:

My vm stats were basically this commit on top of some other changes: holiman/go-ethereum@195b0e5

so basically, I have 256 counters. In each loop, I stop the running timer, add the time to the counter, start a new timer. Things like CALL will stop counting once it starts executing the next opcode, because the timer is global, not context-aware.
It becomes a bit skewed since e.g. SSTORE has a post-transaction overhead which is not reflected in the run time
I add the timers and spit it out every 10K blocks
Oh and I count both the number of invocations and the total time spent
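A rough C++ sketch of the counter scheme Martin describes above: one time counter and one invocation counter per opcode byte, a single global timer that is stopped and restarted on every interpreter iteration, and a periodic dump. The structure is assumed; the actual instrumentation is the go-ethereum commit linked above.

```cpp
#include <array>
#include <chrono>
#include <cstdint>
#include <cstdio>

struct OpcodeStats
{
    std::array<uint64_t, 256> totalNanos{};   // accumulated time per opcode
    std::array<uint64_t, 256> invocations{};  // how many times each opcode ran

    using Clock = std::chrono::steady_clock;
    Clock::time_point lastStart = Clock::now();
    int currentOp = -1;  // opcode whose time is currently being measured

    // Called at the top of the interpreter loop, before executing `op`.
    // As noted above, the timer is global and not context-aware: for CALL the
    // timer stops as soon as the callee's first opcode starts, so nested
    // execution is not attributed to the caller.
    void onOpcode(uint8_t op)
    {
        auto now = Clock::now();
        if (currentOp >= 0)
            totalNanos[currentOp] += static_cast<uint64_t>(
                std::chrono::duration_cast<std::chrono::nanoseconds>(now - lastStart).count());
        currentOp = op;
        ++invocations[op];
        lastStart = now;
    }

    // Dump and reset, e.g. every 10,000 imported blocks.
    void dump()
    {
        for (unsigned op = 0; op < 256; ++op)
            if (invocations[op] != 0)
                std::printf("0x%02x count=%llu total_ns=%llu\n", op,
                    static_cast<unsigned long long>(invocations[op]),
                    static_cast<unsigned long long>(totalNanos[op]));
        totalNanos.fill(0);
        invocations.fill(0);
    }
};
```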

@gumb0
Member

gumb0 commented Jun 5, 2019

I did the same thing as Martin here 932fdd9
and tried to run the sync.

So far this is what I got for the first 1M blocks:
[chart: run_aleth total-bars-0]

Looks like BLOCKHASH is not called much in the first 1M. And I don't know how CALLER can be so slow.

It's hard to sync continuously even to 2M because of sync issues (crashes/deadlocks), so I'm not sure how useful further attempts will be.

I can also try syncing from the snapshot, to get the stats from the middle of the chain...

@holiman
Contributor

holiman commented Jun 5, 2019

Cool! Here are the three runs for geth in the same segment:


Regarding CALLER -- I've found that the very, very cheap opcodes which place a new item on the stack (without popping anything) are sometimes pretty expensive, and stand out a bit. Just the runloop itself (look up the opcode, check stack requirements, check the gas cost, etc.) is difficult to manage on 2 gas, and if we also need to allocate a bit of memory, dereference a couple of pointers and copy bytes (perhaps a couple of times), the margins are slim. As you can see, the geth JUMPDEST is number 2 in heaviness in this segment, since 1 gas is really not sufficient for the runloop.
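To illustrate the fixed per-instruction overhead, here is a hypothetical, heavily simplified run-loop step (not geth's or Aleth's actual code): the metadata lookup, stack check, and gas check happen for every opcode, so for CALLER or JUMPDEST that fixed cost dwarfs the work the opcode itself does.

```cpp
#include <array>
#include <cstdint>
#include <stdexcept>
#include <vector>

struct Instruction
{
    int stackReq = 0;      // minimum stack items required
    uint64_t gasCost = 0;  // static gas cost
};

struct VM
{
    std::vector<uint64_t> stack;
    uint64_t gasLeft = 0;
    std::array<Instruction, 256> table{};  // per-opcode metadata

    void step(uint8_t op, uint64_t callerAddress)
    {
        Instruction const& info = table[op];                  // 1. opcode/metadata lookup
        if (stack.size() < static_cast<size_t>(info.stackReq))
            throw std::runtime_error("stack underflow");      // 2. stack check
        if (gasLeft < info.gasCost)
            throw std::runtime_error("out of gas");           // 3. gas check
        gasLeft -= info.gasCost;

        switch (op)                                           // 4. dispatch
        {
        case 0x33:  // CALLER: the "work" is a single push...
            stack.push_back(callerAddress);                   // ...which may also allocate
            break;
        case 0x5b:  // JUMPDEST: no work at all, yet steps 1-3 still ran for 1 gas
            break;
        default:
            break;
        }
    }
};
```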

I'm surprised that the aleth SLOAD is so heavy at this early chain segment!

@gumb0
Member

gumb0 commented Jul 24, 2019

Here's what I got for another run, importing the first 2M blocks:

[chart: run_aleth total-bars-0]
[chart: run_aleth total-bars-1]
