Block processing time estimates at scale #103
For the record, lighthouse is making this a priority. Thanks for the details on what to test :) We'll come back here with any questions/comments.
Current python benchmarks (on standard consumer laptop with tons of tabs open and music playing)
Raw CSV here: https://gist.github.com/djrtwo/663a031c984ef4796a9aff2ba68d03e5

Notes:
Here are some timings from Lighthouse (I'll add @djrtwo's first scenario once I complete it): Computer: Lenovo X1 Carbon 5th Gen with an Intel i5-7300U @ 2.60GHz running Arch Linux with each core idling around 2-8% before tests.
Note: this is using an in-memory database. Note: we're using concurrency for attestation validation.
Curious whether you see something closer to a 10x difference when you remove the concurrency @paulhauner
Without concurrency:
That ~4x difference still holds. In these benches I'm starting with an SSZ-serialized block and then de-serializing it (and all the
That's it. I'm not clocking the deserialization. Both are interesting numbers. I was looking specifically at block validity, and primarily at the signatures, because this was our estimated bottleneck when designing the protocol. Let's see what it is without the deserialization.
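The measurement split discussed above (deserialization clocked separately from validation) can be sketched as follows. `deserialize_block`, `validate_block`, and `state` are hypothetical stand-ins for a client's actual functions and beacon state, not anything from lighthouse or the spec:

```python
import time


def time_block_phases(ssz_bytes, deserialize_block, validate_block, state):
    """Clock SSZ deserialization and block validation as separate phases.

    All callables here are assumed placeholders: plug in the client's
    real deserializer and validator to reproduce the split timings.
    """
    t0 = time.perf_counter()
    block = deserialize_block(ssz_bytes)   # phase 1: bytes -> block object
    t1 = time.perf_counter()
    valid = validate_block(block, state)   # phase 2: validity (incl. signatures)
    t2 = time.perf_counter()
    return {
        "deserialize_s": t1 - t0,
        "validate_s": t2 - t1,
        "valid": valid,
    }
```

Reporting both numbers keeps the signature-verification bottleneck visible without conflating it with decoding cost.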
Presently lighthouse is structured to do "just-in-time" deserialization. I mention this for two reasons: (a) because it's a fun fact, and (b) to indicate that it'll take some amount of hacky refactoring to make these "no deserialize" tests, so I can get them done later today or tomorrow morning :)

On a side note: at some point it would be useful to get "bad block" benchmarks from clients, i.e. how quickly can you reject a bad block? I'm well on a tangent now, but it would also be worth considering introducing some form of entropy into the order in which
Looks great! Would performance increase when a
Issue
In the last eth2.0 implementers call, we decided it would be worthwhile to run some timing analysis on processing blocks with real-world amounts of attestations.
It would be great to get results from at least one other client. I know not everyone has a working BLS aggregate implementation yet, but anyone that does should give this a try and report results.
Proposed Implementation
Assuming 10M ETH deposited (at 32 ETH per validator) puts us at ~300k validators. With 64 slots per cycle, that is ~5000 validators per slot. With 1000 shards divided across the 64 slots, that is ~16 shards per slot.
If all of the validators coordinate and vote on the same crosslink, and their attestations are aggregated and included in the next slot, then there will be 16 attestations of ~300 validators each per block. This is a good place to start.
We can then make this estimate a worst case by assuming the validators split their votes across 2, 3, 4, or even 5 different crosslink candidates. If all committees split their votes across 2 candidates, then there would be 32 attestations per block with ~150 validators each.
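The arithmetic above can be checked with a quick back-of-the-envelope script. All constants are taken from the estimates in this issue (32 ETH per validator is an assumption of the deposit size); integer division rounds slightly below the prose's round numbers:

```python
# Rough committee arithmetic from the estimates above.
TOTAL_ETH = 10_000_000
ETH_PER_VALIDATOR = 32      # assumed deposit size
SLOTS_PER_CYCLE = 64
SHARD_COUNT = 1000

validators = TOTAL_ETH // ETH_PER_VALIDATOR          # 312_500 (~300k)
validators_per_slot = validators // SLOTS_PER_CYCLE  # 4_882 (~5000)
shards_per_slot = SHARD_COUNT // SLOTS_PER_CYCLE     # 15 (~16)
committee_size = validators_per_slot // shards_per_slot  # 325 (~300)

# Worst-case style estimate: committees split votes across N crosslink
# candidates, multiplying the attestation count and dividing signer count.
for candidates in (1, 2, 3, 4, 5):
    attestations_per_block = shards_per_slot * candidates
    signers_per_attestation = committee_size // candidates
    print(candidates, attestations_per_block, signers_per_attestation)
```

With 2 candidates this gives ~30 attestations of ~162 signers each, matching the ~32 attestations of ~150 validators sketched above.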
EDIT
My estimates on number of committees and size of committees were a bit off in practice. When using
BenchmarkParams { total_validators: 312500, cycle_length: 64, shard_count: 1024, shards_per_slot: 16, validators_per_shard: 305, min_committee_size: 128 }
, each slot has approximately 20 committees of size 244 (rather than 16 of ~300). This shouldn't drastically change the output, but is a better target because it reflects the actual shuffling algorithm (cc: @paulhauner)
EDIT2
My original assumption was correct and the spec was incorrect! Go with the original estimates.