In a beacon block, we have N data roots r[1] ... r[n] (in the worst case, N = 256), each of which represents data d[1] ... d[n] of different sizes: on average 128 kB, in the worst case 512 kB. We want to combine these roots into a combined root R = f(r[1] ... r[n]) which is a Merkle root of some kind of concatenation of d[1] ... d[n].
The main desiderata are:
- R can be computed from r[1] ... r[n] either directly or without adding much auxiliary data
- Minimal "wasted space" in R (consider that if we force all r[i] to have the same height, on average there will be 75% wasted space)
Possibilities I see:
1. Allow r[i] to have different heights depending on its size, so there's on average ~30% (maximum 49%) wasted space. Then create R by combining all r[i] values of each size, adding a bit of padding if needed, and then combining the different sizes. Main weakness is that variable heights are not SSZ-friendly.
2. Expose a maximum of 4 data sub-roots, where r[i] = root(r[i][0], r[i][1], r[i][2], r[i][3]). Reduces wasted space even further, and is cleaner (including being SSZ-compliant), but increases the number of block hashes from 1 to an average of 1.5 and a maximum of 4, increasing beacon chain load by 1.2x average-case and 2.33x worst-case (see the sketch after this list).
3. Expose 3 data sub-roots, where r[i][0] is the root of the first half of the data if the data is more than half the max size and otherwise zero, r[i][1] is the root of the first quarter of the data not covered by r[i][0], and r[i][2] is the root of the first eighth of the data not covered by r[i][1]. Reduces wasted space heavily (maybe ~10% wasted), but increases beacon chain load ~1.89x, and is more complicated (including the non-SSZ-friendliness from (1)).
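As a rough sketch of what (2) could look like (building on the merkleize/hash_pair helpers above; the quarter-splitting rule and the all-zero-quarter padding are my assumptions, not spec code):

```python
from math import ceil

# Sketch of proposal (2): each blob is merkleized as four equal quarters of the
# maximum size, and r[i] is the 2-level root over the four sub-roots. Only the
# sub-roots that actually cover data need to be exposed on the beacon chain;
# the remaining positions are the well-known root of an all-zero quarter.
MAX_DATA_SIZE = 512 * 1024
QUARTER_SIZE = MAX_DATA_SIZE // 4  # 128 kB covered by each sub-root

def data_root_v2(data: bytes) -> tuple[bytes, list[bytes]]:
    """Return (r[i], exposed sub-roots) under proposal (2)."""
    assert len(data) <= MAX_DATA_SIZE
    n_exposed = max(1, ceil(len(data) / QUARTER_SIZE))  # 1..4 sub-roots carry data
    sub_roots = [merkleize(data[j * QUARTER_SIZE:(j + 1) * QUARTER_SIZE], QUARTER_SIZE)
                 for j in range(4)]
    # r[i] = root(r[i][0], r[i][1], r[i][2], r[i][3])
    r_i = hash_pair(hash_pair(sub_roots[0], sub_roots[1]),
                    hash_pair(sub_roots[2], sub_roots[3]))
    return r_i, sub_roots[:n_exposed]

# An average 128 kB blob exposes 1-2 sub-roots, a worst-case 512 kB blob all 4.
root, exposed = data_root_v2(b"\x01" * (128 * 1024))
print(len(exposed))  # 1
```

One property of this layout is that anyone who knows the exposed sub-roots and the constant all-zero-quarter root can recompute r[i] without the underlying data.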
For (2), here are concrete estimates of data overhead. With 64 shards, a MAX_CATCHUP_RATIO of 4, and 4x slack for different attestations, there are a maximum of 1024 data roots in a block, plus 1024 state roots and 1024 lengths (that's 32 + 32 + 8 = 72 kB). Proposal (2) would increase the former to 1536 data roots on average (4096 in the worst case), hence 48 + 32 + 8 = 88 kB average case and 128 + 32 + 8 = 168 kB worst case. Note that average-case figures are expected to be ~8x less than these worst-case figures.
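For reference, a quick script reproducing those figures (parameter names are mine; 32-byte roots, 8-byte lengths):

```python
# Reproducing the beacon chain overhead figures above (sizes in bytes).
SHARDS, MAX_CATCHUP_RATIO, ATTESTATION_SLACK = 64, 4, 4
MAX_SHARD_BLOCKS = SHARDS * MAX_CATCHUP_RATIO * ATTESTATION_SLACK  # 1024

def overhead_kb(data_roots_per_block: float) -> float:
    data_roots = MAX_SHARD_BLOCKS * data_roots_per_block * 32
    state_roots = MAX_SHARD_BLOCKS * 32
    lengths = MAX_SHARD_BLOCKS * 8
    return (data_roots + state_roots + lengths) / 1024

print(overhead_kb(1.0))  # 72.0  (status quo)
print(overhead_kb(1.5))  # 88.0  (proposal (2), average case)
print(overhead_kb(4.0))  # 168.0 (proposal (2), worst case)
```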
I am currently favoring (2) but open to other ideas.
I think (2) sounds very reasonable. Could we save a lot of beacon chain load by making the target shard block size just a little bit smaller than 128 kB, say something in the range of 100-120 kB? If we are expecting that the distribution around the target does not have a huge variance, then this would make the second data root necessary for far fewer blocks (not sure if there is an obvious way to get the expected distribution from EIP1559?)
I do expect the distribution to have a considerable variance! Sure, if you assume it's a normal distribution, on average a block has ~120 txs so the standard deviation should be ~10, but that ignores uneven sizes of transactions, multi-transactions, etc. If I had to guess I'd expect the 25-75 confidence interval to be ~0.6-1.5 times the average.