op-batcher: issues recovering from an expired sequencing window #13150

Open
zhiqiangxu opened this issue Nov 30, 2024 · 6 comments
Labels
A-op-batcher Area: op-batcher

Comments

@zhiqiangxu
Contributor

zhiqiangxu commented Nov 30, 2024

Currently, if a batch has already expired (the distance between the safe head and the unsafe head is more than one sequencing window), op-batcher will still publish it, and op-node will then drop it here.

op-batcher then detects this here, because the safe head has not advanced (this feature was introduced here).

op-batcher then re-publishes the expired batch, and the cycle repeats.

In the end, op-batcher never catches up with the sequencing window, and its balance is exhausted very quickly because it is constantly sending transactions.

Is there any documentation on how op-batcher is expected to catch up in this case?
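A minimal Go sketch of why re-publishing cannot help, assuming the expiry rule from the derivation spec (a batch is dropped when its epoch number plus the sequencing window size is less than the L1 inclusion block number). The function name `batchExpired` and all numbers are hypothetical and are not taken from the op-batcher or op-node code.

```go
package main

import "fmt"

// batchExpired reports whether a batch with L1 origin epochNum falls outside
// the sequencing window when it is included in L1 block inclusionBlock.
// Assumption: this mirrors the spec rule
// epoch_num + sequence_window_size < inclusion_block_number.
func batchExpired(epochNum, seqWindowSize, inclusionBlock uint64) bool {
	return epochNum+seqWindowSize < inclusionBlock
}

func main() {
	const seqWindowSize = 3600 // hypothetical window size, in L1 blocks
	epoch := uint64(1000)      // hypothetical L1 origin of the oldest unsafe block

	// Every retry is included in a later L1 block, so once a batch is
	// expired, publishing it again only spends more gas.
	for _, inclusion := range []uint64{4600, 4601, 4700} {
		fmt.Printf("inclusion block %d: expired=%t\n",
			inclusion, batchExpired(epoch, seqWindowSize, inclusion))
	}
}
```

Because the inclusion block only grows while the epoch of the stuck batch stays fixed, the condition stays true on every retry once it first becomes true.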

@zhiqiangxu zhiqiangxu changed the title op-batcher will publish an already expired batch liveness issue of op-batcher Nov 30, 2024
@geoknee geoknee added the A-op-batcher Area: op-batcher label Dec 2, 2024
@sebastianst sebastianst changed the title liveness issue of op-batcher op-batcher: issues recovering from an expired sequencing window Dec 5, 2024
@sebastianst
Member

Related #11228

@emilianobonassi
Contributor

@zhiqiangxu a workaround we've seen work in these situations is disabling span batches and setting a very short max channel duration.

@zhiqiangxu
Contributor Author

zhiqiangxu commented Dec 7, 2024

> @zhiqiangxu workaround we've seen working in these situations are disabling span batches and setting very short max channel duration

Hey @emilianobonassi, thanks for the suggestion!

The trick still works after the Holocene hard fork, but it needs to be done manually.

Here is a PR that tries to make it automatic.

@emilianobonassi
Contributor

yeah i know @zhiqiangxu - thanks for implementing the automation!

a very short max channel duration is what helps here: channels then contain just a few batches, so when one is discarded you minimize the drops/rewrites.

it would be great to implement a "recovery status" for the sequencer, as mentioned in #11228, and a good feedback loop with the batcher, so it's aware and doesn't post batches for empty blocks that will be discarded anyway (during a reorg the safe head moves anyway, even without reads)

in the past the batcher was decoupled and very simple. now with holocene, given the simplification of the derivation pipeline, we should be a little smarter about how we write, to minimize the probability of these situations.
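A rough Go sketch of the arithmetic behind this point, assuming the max channel duration is counted in L1 blocks and using hypothetical block times of 12 seconds on L1 and 2 seconds on L2. The helper `l2BlocksPerChannel` is illustrative only and does not exist in op-batcher.

```go
package main

import "fmt"

// l2BlocksPerChannel estimates how many L2 blocks can accumulate in a channel
// that stays open for maxChannelDuration L1 blocks. Block times are in seconds.
// Assumption: the channel is closed only by the duration limit, not by size.
func l2BlocksPerChannel(maxChannelDuration, l1BlockTime, l2BlockTime uint64) uint64 {
	return maxChannelDuration * l1BlockTime / l2BlockTime
}

func main() {
	const l1BlockTime, l2BlockTime = 12, 2 // hypothetical block times, in seconds

	// The shorter the channel, the fewer L2 blocks must be re-submitted
	// when a single channel is discarded.
	for _, d := range []uint64{1, 5, 300} {
		fmt.Printf("max channel duration %d L1 blocks: up to ~%d L2 blocks per channel\n",
			d, l2BlocksPerChannel(d, l1BlockTime, l2BlockTime))
	}
}
```

Under these assumptions, a duration of 1 L1 block puts about 6 L2 blocks at risk per discarded channel, while a duration of 300 puts about 1800 at risk.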

@zhiqiangxu
Contributor Author

> it would be great implementing a "recovery status" of the sequencer as mentioned #11228 and a good feedback loop with the batcher. so its aware and doesnt post batches related to empty blocks that will be discarded anyway (during reorg safe head moves anyway even without reads)

Understood. That would require a spec change though, which is not feasible at this point. The above PR might already be good enough in practice, and it's definitely better than the current behavior, which exhausts the balance while accomplishing nothing unless someone intervenes.

@emilianobonassi
Contributor

oh @zhiqiangxu - mine was just a proposal, not a direct call to action. it's more for the op team (@sebastianst et al) for the next iterations.
