Fetch blobs from EL prior to block verification #6600

Open · wants to merge 2 commits into base: unstable
Conversation

michaelsproul (Member)

Proposed Changes

Optimise fetch_blobs significantly, by fetching blobs from the EL prior to consensus and execution verification of the block.
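
For context, here is a minimal sketch of the reordering this describes, using hypothetical stand-in types and names rather than Lighthouse's actual API (the EL call corresponds to engine_getBlobsV1):

use std::sync::Arc;

// Illustrative stand-ins for the real types (hypothetical, for this sketch only).
type VersionedHash = [u8; 32];

struct GossipVerifiedBlock {
    blob_versioned_hashes: Vec<VersionedHash>,
}

struct ExecutionLayer;

impl ExecutionLayer {
    // Stand-in for engine_getBlobsV1: the EL returns any blobs (and proofs) it already
    // holds in its mempool for the given versioned hashes.
    async fn get_blobs(&self, _hashes: Vec<VersionedHash>) -> Result<Vec<Vec<u8>>, String> {
        Ok(vec![])
    }
}

// The change in ordering: query the EL as soon as the block passes gossip verification
// (i.e. the proposer's signature checks out), concurrently with full consensus and
// execution verification, rather than only after that verification has completed.
fn on_gossip_verified_block(block: GossipVerifiedBlock, el: Arc<ExecutionLayer>) {
    tokio::spawn(async move {
        if let Ok(_blobs) = el.get_blobs(block.blob_versioned_hashes).await {
            // Gossip-verify the returned blobs, then import and publish any that have
            // not yet been seen on gossip.
        }
    });
}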

We had noticed that we weren't getting many hits with fetch_blobs, because blobs were almost always arriving on gossip before we requested them. The fetch_blobs logic would only fire a few times an hour.

With this change I'm seeing much more frequent hits, without a substantial increase in publication bandwidth. In the last 30 mins running on mainnet there have been 116 hits, and 156 individual blobs published (out of 395 fetched).

Data here: https://docs.google.com/spreadsheets/d/1ZJIYbOPwNGa_veqUC0ywsOdzFYvh4aqJLMYqFoisA_E/edit?usp=sharing

This does imply that we're publishing around 35% of all blobs! But this will likely come down as more nodes chip in to publishing.

michaelsproul added the ready-for-review and optimization labels on Nov 21, 2024
dapplion (Collaborator)

Any security considerations with triggering this logic before validating the block? The most damage a proposer can do is waste bandwidth on a bad proposal, which doesn't seem like a big issue and is possible regardless of fetch_blobs anyway.

Otherwise, the experimental results look great.

michaelsproul (Member, Author)

We're doing this after gossip validation of the block, so we know that the proposer's signature is valid and they are a legitimate proposer for the slot.

Unless the proposer slashes themselves, the blob versioned hashes committed to in the block are the "true" (valid) versioned hashes for this slot. Alternatively the block could be completely invalid (but not slashable), in which case we will reject it upon completion of block processing.
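
For reference, the versioned hashes being committed to are derived directly from the KZG commitments the proposer signed over. A minimal sketch of that derivation (mine, not Lighthouse's code), following EIP-4844 and using the sha2 crate:

use sha2::{Digest, Sha256};

const VERSIONED_HASH_VERSION_KZG: u8 = 0x01;

/// EIP-4844 versioned hash: 0x01 followed by sha256(kzg_commitment)[1..32].
fn kzg_commitment_to_versioned_hash(commitment: &[u8; 48]) -> [u8; 32] {
    let mut hash: [u8; 32] = Sha256::digest(commitment).into();
    hash[0] = VERSIONED_HASH_VERSION_KZG;
    hash
}

fn main() {
    let commitment = [0u8; 48];
    let versioned_hash = kzg_commitment_to_versioned_hash(&commitment);
    assert_eq!(versioned_hash[0], VERSIONED_HASH_VERSION_KZG);
}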

As part of fetch_blobs we run gossip blob validation:

// Gossip verify blobs before publishing. This prevents blobs with invalid KZG proofs from
// the EL making it into the data availability checker. We do not immediately add these
// blobs to the observed blobs/columns cache because we want to allow blobs/columns to arrive on gossip
// and be accepted (and propagated) while we are waiting to publish. Just before publishing
// we will observe the blobs/columns and only proceed with publishing if they are not yet seen.
let blobs_to_import_and_publish = fixed_blob_sidecar_list
    .iter()
    .filter_map(|opt_blob| {
        let blob = opt_blob.as_ref()?;
        match GossipVerifiedBlob::<T, DoNotObserve>::new(blob.clone(), blob.index, &chain) {
            Ok(verified) => Some(Ok(verified)),
            // Ignore already seen blobs.
            Err(GossipBlobError::RepeatBlob { .. }) => None,
            Err(e) => Some(Err(e)),
        }
    })
    .collect::<Result<Vec<_>, _>>()
    .map_err(FetchEngineBlobError::GossipBlob)?;

So if they are malformed (e.g. bad KZG proof), they will be rejected at this point.

TL;DR on the whole I think it's security-equivalent to processing blobs on gossip:

  • We have a valid signature from the proposer committing to the blobs.
  • We verify the blobs themselves before importing them.

michaelsproul added the v6.0.0 and waiting-on-author labels and removed the ready-for-review label on Nov 21, 2024
self.executor.spawn(
    async move {
        self_clone
            .fetch_engine_blobs_and_publish(block_clone, block_root, publish_blobs)

Collaborator

Since this is running as a task, it's no longer bound by the beacon processor queue. Could someone spam gossip blocks and cause a lot of fetch blobs work?

Member Author

They would need to be beacon blocks with valid signatures, and this is a linear factor, so it can't really blow up much beyond the number of threads allocated to the beacon processor. E.g. if we have 16 threads in the BP, we might end up with 32 running tasks max, which are mostly I/O bound and should be handled just fine by Tokio.

We do this in a few other places, like when we check the payload with the EL:

// Spawn the payload verification future as a new task, but don't wait for it to complete.
// The `payload_verification_future` will be awaited later to ensure verification completed
// successfully.
let payload_verification_handle = chain
    .task_executor
    .spawn_handle(
        payload_verification_future,
        "execution_payload_verification",
    )
    .ok_or(BeaconChainError::RuntimeShutdown)?;
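
For illustration, the same spawn-now, await-later pattern in plain Tokio (a standalone sketch with made-up names, not Lighthouse's code):

use tokio::task::JoinHandle;

// Placeholder for the I/O-bound call out to the execution layer.
async fn verify_payload() -> Result<(), String> {
    Ok(())
}

#[tokio::main]
async fn main() -> Result<(), String> {
    // Spawn the verification future as its own task so it runs concurrently with the
    // rest of block processing, instead of blocking on it here.
    let handle: JoinHandle<Result<(), String>> = tokio::spawn(verify_payload());

    // ... other (consensus) verification work happens here ...

    // Await the handle later to make sure verification actually completed successfully.
    handle.await.map_err(|e| e.to_string())??;
    Ok(())
}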

michaelsproul added the ready-for-review label and removed the waiting-on-author label on Nov 22, 2024
michaelsproul added a commit that referenced this pull request Nov 25, 2024
Squashed commit of the following:

commit 5f563ef
Author: Michael Sproul <michael@sigmaprime.io>
Date:   Fri Nov 22 12:33:10 2024 +1100

    Run fetch blobs in parallel with block import

commit 3cfe9df
Author: Michael Sproul <michael@sigmaprime.io>
Date:   Thu Nov 21 10:46:34 2024 +1100

    Fetch blobs from EL prior to block verification
michaelsproul mentioned this pull request on Nov 25, 2024