Fetch blobs from EL prior to block verification #6600

Open · wants to merge 2 commits into base: unstable
Conversation

michaelsproul (Member)

Proposed Changes

Optimise fetch_blobs significantly, by fetching blobs from the EL prior to consensus and execution verification of the block.
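
For context, here is a minimal sketch of the reordering this describes, using hypothetical stand-in types and names rather than Lighthouse's actual API (the EL call corresponds to engine_getBlobsV1):

use std::sync::Arc;

// Illustrative stand-ins for the real types (hypothetical, for this sketch only).
type VersionedHash = [u8; 32];

struct GossipVerifiedBlock {
    blob_versioned_hashes: Vec<VersionedHash>,
}

struct ExecutionLayer;

impl ExecutionLayer {
    // Stand-in for engine_getBlobsV1: the EL returns any blobs (and proofs) it already
    // holds in its mempool for the given versioned hashes.
    async fn get_blobs(&self, _hashes: Vec<VersionedHash>) -> Result<Vec<Vec<u8>>, String> {
        Ok(vec![])
    }
}

// The change in ordering: query the EL as soon as the block passes gossip verification
// (i.e. the proposer's signature checks out), concurrently with full consensus and
// execution verification, rather than only after that verification has completed.
fn on_gossip_verified_block(block: GossipVerifiedBlock, el: Arc<ExecutionLayer>) {
    tokio::spawn(async move {
        if let Ok(_blobs) = el.get_blobs(block.blob_versioned_hashes).await {
            // Gossip-verify the returned blobs, then import and publish any that have
            // not yet been seen on gossip.
        }
    });
}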

We had noticed that we weren't getting many hits with fetch_blobs, because blobs were almost always arriving on gossip before we requested them. The fetch_blobs logic would only fire a few times an hour.

With this change I'm seeing much more frequent hits, without a substantial increase in publication bandwidth. In the last 30 mins running on mainnet there have been 116 hits, and 156 individual blobs published (out of 395 fetched).

Data here: https://docs.google.com/spreadsheets/d/1ZJIYbOPwNGa_veqUC0ywsOdzFYvh4aqJLMYqFoisA_E/edit?usp=sharing

This does imply that we're publishing around 35% of all blobs! But this will likely come down as more nodes chip in to publishing.

michaelsproul added the ready-for-review and optimization labels on Nov 21, 2024
dapplion (Collaborator)

Any security considerations with triggering this logic before validating the block? The most damage a proposer can do is waste bandwidth on a bad proposal, which doesn't seem like a big issue and is possible regardless of fetch_blobs anyway.

Otherwise, the experimental results look great.

michaelsproul (Member, Author)

We're doing this after gossip validation of the block, so we know that the proposer's signature is valid and they are a legitimate proposer for the slot.

Unless the proposer slashes themselves, the blob versioned hashes committed to in the block are the "true" (valid) versioned hashes for this slot. Alternatively the block could be completely invalid (but not slashable), in which case we will reject it upon completion of block processing.
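
For reference, the versioned hashes being committed to are derived directly from the KZG commitments the proposer signed over. A minimal sketch of that derivation (mine, not Lighthouse's code), following EIP-4844 and using the sha2 crate:

use sha2::{Digest, Sha256};

const VERSIONED_HASH_VERSION_KZG: u8 = 0x01;

/// EIP-4844 versioned hash: 0x01 followed by sha256(kzg_commitment)[1..32].
fn kzg_commitment_to_versioned_hash(commitment: &[u8; 48]) -> [u8; 32] {
    let mut hash: [u8; 32] = Sha256::digest(commitment).into();
    hash[0] = VERSIONED_HASH_VERSION_KZG;
    hash
}

fn main() {
    let commitment = [0u8; 48];
    let versioned_hash = kzg_commitment_to_versioned_hash(&commitment);
    assert_eq!(versioned_hash[0], VERSIONED_HASH_VERSION_KZG);
}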

As part of fetch_blobs we run gossip blob validation:

// Gossip verify blobs before publishing. This prevents blobs with invalid KZG proofs from
// the EL making it into the data availability checker. We do not immediately add these
// blobs to the observed blobs/columns cache because we want to allow blobs/columns to arrive on gossip
// and be accepted (and propagated) while we are waiting to publish. Just before publishing
// we will observe the blobs/columns and only proceed with publishing if they are not yet seen.
let blobs_to_import_and_publish = fixed_blob_sidecar_list
    .iter()
    .filter_map(|opt_blob| {
        let blob = opt_blob.as_ref()?;
        match GossipVerifiedBlob::<T, DoNotObserve>::new(blob.clone(), blob.index, &chain) {
            Ok(verified) => Some(Ok(verified)),
            // Ignore already seen blobs.
            Err(GossipBlobError::RepeatBlob { .. }) => None,
            Err(e) => Some(Err(e)),
        }
    })
    .collect::<Result<Vec<_>, _>>()
    .map_err(FetchEngineBlobError::GossipBlob)?;

So if they are malformed (e.g. bad KZG proof), they will be rejected at this point.

TL;DR on the whole I think it's security-equivalent to processing blobs on gossip:

  • We have a valid signature from the proposer committing to the blobs.
  • We verify the blobs themselves before importing them.

michaelsproul added the v6.0.0 and waiting-on-author labels and removed the ready-for-review label on Nov 21, 2024
self.executor.spawn(
    async move {
        self_clone
            .fetch_engine_blobs_and_publish(block_clone, block_root, publish_blobs)

Collaborator

Since this is running as a task, it's no longer bound by the beacon processor queue. Could someone spam gossip blocks and cause a lot of fetch blobs work?

Member Author

They would need to be beacon blocks with valid signatures, and this is a linear factor, so it can't really blow up much beyond the number of threads allocated to the beacon processor. E.g. if we have 16 threads in the BP, we might end up with 32 running tasks max, which are mostly I/O bound and should be handled just fine by Tokio.

We do this in a few other places, like when we check the payload with the EL:

// Spawn the payload verification future as a new task, but don't wait for it to complete.
// The `payload_verification_future` will be awaited later to ensure verification completed
// successfully.
let payload_verification_handle = chain
    .task_executor
    .spawn_handle(
        payload_verification_future,
        "execution_payload_verification",
    )
    .ok_or(BeaconChainError::RuntimeShutdown)?;
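
For illustration, the same spawn-now, await-later pattern in plain Tokio (a standalone sketch with made-up names, not Lighthouse's code):

use tokio::task::JoinHandle;

// Placeholder for the I/O-bound call out to the execution layer.
async fn verify_payload() -> Result<(), String> {
    Ok(())
}

#[tokio::main]
async fn main() -> Result<(), String> {
    // Spawn the verification future as its own task so it runs concurrently with the
    // rest of block processing, instead of blocking on it here.
    let handle: JoinHandle<Result<(), String>> = tokio::spawn(verify_payload());

    // ... other (consensus) verification work happens here ...

    // Await the handle later to make sure verification actually completed successfully.
    handle.await.map_err(|e| e.to_string())??;
    Ok(())
}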

michaelsproul added the ready-for-review label and removed the waiting-on-author label on Nov 22, 2024
michaelsproul added a commit that referenced this pull request Nov 25, 2024
Squashed commit of the following:

commit 5f563ef
Author: Michael Sproul <michael@sigmaprime.io>
Date:   Fri Nov 22 12:33:10 2024 +1100

    Run fetch blobs in parallel with block import

commit 3cfe9df
Author: Michael Sproul <michael@sigmaprime.io>
Date:   Thu Nov 21 10:46:34 2024 +1100

    Fetch blobs from EL prior to block verification
michaelsproul mentioned this pull request on Nov 25, 2024