fix: fix task stuck and not reassign bug in concurrent-fetch logic #50

Merged

krish-nr (Contributor) commented Jan 19, 2024

Description

This PR fixes an issue where the engine-sync process gets stuck in certain scenarios and cannot recover.

Rationale

In the current implementation, the concurrent-fetch logic can get stuck. The trigger is a task whose requested data is not yet available on any current peer. This typically happens when L2 TPS is high: an op-node has broadcast a block, but the geth instance on the same node has not finished writing it. Such a task is not retried or reassigned until some other event occurs (a peer joining or leaving, or a cancel signal); if no such event arrives, the task stays stuck indefinitely. This PR retries such tasks. If a task exceeds the maximum number of retries, it is proactively canceled so that the scheduler can reschedule it. In addition, this PR fixes a bug in the existing code where the progressed state was not reset after unreserve.
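
For reviewers, the retry-then-cancel behavior described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `fetchTask`, `onMissingData`, the `action` values, and the retry limit of 3 are all hypothetical names and numbers chosen for the example.

```go
package main

// maxTaskRetries is an assumed limit; the real value used by the PR is not stated here.
const maxTaskRetries = 3

// action is a hypothetical decision returned to the scheduler.
type action int

const (
	actionRetry  action = iota // ask the peers again for the same data
	actionCancel               // give the task up so the scheduler can reschedule it
)

// fetchTask is a hypothetical stand-in for one in-flight fetch request.
type fetchTask struct {
	retries int // number of times no peer could serve this task
}

// onMissingData is called when none of the current peers has the requested
// data. It retries until the limit is exceeded, then cancels the task instead
// of waiting for an unrelated event (peer join/leave, cancel signal).
func (t *fetchTask) onMissingData() action {
	t.retries++
	if t.retries > maxTaskRetries {
		return actionCancel
	}
	return actionRetry
}
```

With these assumed values, the first three failures yield `actionRetry` and the fourth yields `actionCancel`, so a task can no longer wait forever for a peer event that never arrives.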

Example

N/A

Changes

Notable changes:

  • handle failed task in beacon mode
  • reset progressed tag after task unreserve
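
The second change can be illustrated with a small sketch. It assumes that "progressed" is a flag meaning "this scheduling pass handed out at least one task" and that a stale `true` value could hide a stall; the names `passResult` and `schedulePass` and the `send` callback are hypothetical and do not come from the PR.

```go
package main

// passResult is a hypothetical summary of one scheduling pass.
type passResult struct {
	progressed bool // true only while at least one reservation is still held
	reserved   int  // number of tasks currently reserved for peers
}

// schedulePass reserves one task per idle peer and tries to send the request.
// send reports whether the request could actually be issued to the peer.
func schedulePass(idlePeers []string, send func(peer string) bool) passResult {
	var res passResult
	for _, peer := range idlePeers {
		// Reserving a task for a peer counts as progress.
		res.reserved++
		res.progressed = true

		if !send(peer) {
			// Unreserve: the task goes back to the queue.
			res.reserved--
			// Reset the flag when no reservation survives, so a failed
			// hand-off is not reported as progress.
			if res.reserved == 0 {
				res.progressed = false
			}
		}
	}
	return res
}
```

In this sketch, a pass in which every send fails reports `progressed == false`, which lets the caller detect that nothing was actually scheduled.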

krish-nr force-pushed the krish/enginesync-stuckbug-fix branch from 45b956b to 09b03e8 on January 30, 2024 09:03
krish-nr marked this pull request as ready for review on January 30, 2024 10:09
github-actions bot requested review from bnoieh and redhdx on January 30, 2024 10:09
owen-reorg merged commit 8377b0e into bnb-chain:develop on Feb 1, 2024
2 checks passed
andyzhang2023 pushed a commit to andyzhang2023/op-geth that referenced this pull request Mar 4, 2024