fix(l1): fix hangs on the tx spammer #1413

rodrigo-o · 2024-12-05T13:18:20Z

Motivation

The spammer hanged in one of 2 forms, either it wait for tx that weren't added in the second phase of each spam and timeout after 5 minutes resuming the execution or it hangs ad infinitum if the issue happens in the first phase. This indicate that we were losing transactions.

Description

This is a mid-term solution, the explanation about what happened is done on the issue, in this particular comment, but TLDR: we were receiving engine_getPayload request with old payload_ids and we filled AGAIN the block but with new transactions. the block was discarded as orphan by consensus and we lost those txs from mempool on the immediate subsequent request. This was caused because we were missing the update of the payload when it was modified in the second step of our build process.

The current solution stores the whole payload, i.e. the block, block value and the blobs bunde. Given our current implementation (a naive 2-step build process) we either filled the transaction and ended the build process or not, a simple closed flag was added to the payload store to signal this and avoid refilling transactions, this is clearer than just check if there are any but could be changed if preferred.

Spammer logs	Dora explorer

Next Steps
Enhance the build process to make it async and interactive instead of the naive 2-step all-or-nothing implementation we had right now.

Resolves #1120

…ons on double getPayloads

rodrigo-o · 2024-12-09T11:58:31Z

Now the behaviour we've seen between lighthouse and geth is replicated with our node. Here are some logs showing it:

CL Logs

Dec 09 01:38:16.998 WARN Error signalling fork choice waiter     slot: 1237, error: ForkChoiceSignalOutOfOrder { current: Slot(1238), latest: Slot(1237) }, service: beacon
Dec 09 01:38:17.003 ERRO Producing block at incorrect slot       message: check clock sync, this block may be orphaned, current_slot: 1238, block_slot: 1237, service: beacon
Dec 09 01:38:17.003 WARN Producing block that conflicts with head, slot: 1237, message: this block is more likely to be orphaned, service: beacon
Dec 09 01:38:17.004 ERRO No valid eth1_data votes, `votes_to_consider` empty, outcome: casting `state.eth1_data` as eth1 vote, genesis_time: 1733693441, earliest_block_timestamp: 1733693421, lowest_block_number: 0, service: deposit_contract_rpc
Dec 09 01:38:17.006 WARN Duplicate payload cached, this might indicate redundant proposal attempts., service: exec
Dec 09 01:38:17.011 ERRO No valid eth1_data votes, `votes_to_consider` empty, outcome: casting `state.eth1_data` as eth1 vote, genesis_time: 1733693441, earliest_block_timestamp: 1733693421, lowest_block_number: 0, service: deposit_contract_rpc
Dec 09 01:38:17.102 INFO Signed block published to network via HTTP API, publish_delay_ms: 1, slot: 1238
Dec 09 01:38:17.140 INFO Valid block from HTTP API               slot: 1238, proposer_index: 16, root: 0xbfd8…a1ab, block_delay: 101.260668ms
Dec 09 01:38:22.998 INFO Synced                                  slot: 1238, block: 0xbfd8…a1ab, epoch: 38, finalized_epoch: 36, finalized_root: 0x1b45…15a9, exec_hash: 0xef5a…eca4 (verified), peers: 2, service: slot_notifier

El Logs

2024-12-09T01:38:17.005811Z  INFO ethrex_rpc::engine::payload: Requested payload with id: 0x030d7e0143865573
2024-12-09T01:38:17.005850Z  INFO ethrex_rpc::engine::payload: Current Fork: Cancun
2024-12-09T01:38:17.005900Z  INFO ethrex_rpc: RPC req method: "engine_getPayloadV3", req params: Some([String("0x030d7e0143865573")])
2024-12-09T01:38:17.005904Z  INFO ethrex_rpc: RPC response parent hash: String("0xd7e73b12be05a0524d10dcf508d52631af958a2b0d2dd86a6aff14993a437170"), blockhash: String("0x244d7ff371958f6943735adfa0617009c823768f411acc3e42ad31226f60bedb")
2024-12-09T01:38:17.012249Z  INFO ethrex_rpc::engine::payload: Requested payload with id: 0x039150aa7bf22245
2024-12-09T01:38:17.012297Z  INFO ethrex_rpc::engine::payload: Current Fork: Cancun
2024-12-09T01:38:17.012621Z  INFO ethrex_blockchain::payload: Fetching transactions from mempool
2024-12-09T01:38:17.012638Z  INFO ethrex_blockchain::payload: Fetching plain transactions from mempool
2024-12-09T01:38:17.012643Z  INFO ethrex_storage: Mempool tx count: 10
2024-12-09T01:38:17.012660Z  INFO ethrex_storage: Filtered tx count: 10
2024-12-09T01:38:17.012669Z  INFO ethrex_blockchain::payload: Fetching blob transactions from mempool
2024-12-09T01:38:17.012671Z  INFO ethrex_storage: Mempool tx count: 10
2024-12-09T01:38:17.012672Z  INFO ethrex_storage: Filtered tx count: 0
2024-12-09T01:38:17.051459Z  INFO ethrex_rpc: RPC req method: "engine_getPayloadV3", req params: Some([String("0x039150aa7bf22245")])
2024-12-09T01:38:17.051478Z  INFO ethrex_rpc: RPC response parent hash: String("0x244d7ff371958f6943735adfa0617009c823768f411acc3e42ad31226f60bedb"), blockhash: String("0xef5ae0e7c8534482302e5df9a4dddcb7d4407fbc3edb90ddba61679e3275eca4")

As we've seen with geth, 2 payload request were received, one with the old payload which was discarded by the consensus client due to being the same as a previous one, and the next one correct, not missing the block and publishing the correct one. There is also possible to just receive the old payload (which we've also seen in geth) and miss the block entirely.

rodrigo-o · 2024-12-09T14:44:44Z

Makefile

+ENCLAVE ?= lambdanet
+


This change was made to been able to compare two different executions at the same time using make localnet with different enclaves.

crates/common/types/blobs_bundle.rs

crates/networking/rpc/engine/payload.rs

…d process ended

rodrigo-o added 11 commits December 4, 2024 18:49

Implement delete payload for testing a solution to the lost transacti…

5025185

…ons on double getPayloads

Delete the payload when its already completed

2790f4e

Fixed a couple of issues and try to test the error is not there anymore

a0e09cb

Merge branch 'main' into check-transactions-hanged-from-the-spammer

84ba236

Added payload upload and check it working on incorrect slot errors

34a0efe

Applied format

7f8d8f0

Implemented payload_update on redb

b513430

Removed the now unused delete_payload

7faf91e

Removed unnedded diffs

d25e977

Merge branch 'main' into check-transactions-hanged-from-the-spammer

b8b304e

Removed some debugging code

4046433

rodrigo-o marked this pull request as ready for review December 9, 2024 11:53

rodrigo-o requested a review from a team as a code owner December 9, 2024 11:53

rodrigo-o commented Dec 9, 2024

View reviewed changes

Arkenan approved these changes Dec 9, 2024

View reviewed changes

mpaulucci reviewed Dec 9, 2024

View reviewed changes

crates/common/types/blobs_bundle.rs Show resolved Hide resolved

mpaulucci reviewed Dec 9, 2024

View reviewed changes

crates/networking/rpc/engine/payload.rs Outdated Show resolved Hide resolved

mpaulucci reviewed Dec 9, 2024

View reviewed changes

crates/networking/rpc/engine/payload.rs Outdated Show resolved Hide resolved

mpaulucci approved these changes Dec 9, 2024

View reviewed changes

rodrigo-o added 2 commits December 9, 2024 15:20

Renamed closed to completed as the variable for knowing that the buil…

65d33fd

…d process ended

Use default instead of manually calling new for every type in the struct

1ae230d

rodrigo-o added this pull request to the merge queue Dec 10, 2024

Merged via the queue into main with commit 3c05f02 Dec 10, 2024
12 checks passed

rodrigo-o deleted the check-transactions-hanged-from-the-spammer branch December 10, 2024 13:46

rodrigo-o added bug Something isn't working and removed bug Something isn't working labels Dec 10, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(l1): fix hangs on the tx spammer #1413

fix(l1): fix hangs on the tx spammer #1413

rodrigo-o commented Dec 5, 2024 •

edited

Loading

rodrigo-o commented Dec 9, 2024 •

edited

Loading

rodrigo-o Dec 9, 2024 •

edited

Loading

		ENCLAVE ?= lambdanet

fix(l1): fix hangs on the tx spammer #1413

fix(l1): fix hangs on the tx spammer #1413

Conversation

rodrigo-o commented Dec 5, 2024 • edited Loading

rodrigo-o commented Dec 9, 2024 • edited Loading

rodrigo-o Dec 9, 2024 • edited Loading

Choose a reason for hiding this comment

rodrigo-o commented Dec 5, 2024 •

edited

Loading

rodrigo-o commented Dec 9, 2024 •

edited

Loading

rodrigo-o Dec 9, 2024 •

edited

Loading