fix(rollup verifier): nil pointer due to missing CommittedBatchMeta
#1188
…nd add rewind of L1 sync height to recover from missing CommittedBatchMeta
Walkthrough
The patch increments the version constant and introduces new error handling in the rollup sync service. Specifically, it adds a sentinel error and logic to rewind the sync height by 100 blocks when missing batch events are detected, aiming to address missing or nil batch metadata scenarios during rollup event synchronization.
Sequence Diagram(s)
sequenceDiagram
    participant RollupSyncService
    participant L1Chain
    RollupSyncService->>L1Chain: fetchRollupEvents()
    alt updateRollupEvents returns ErrMissingBatchEvent
        RollupSyncService->>RollupSyncService: Rewind L1 sync height by 100
        RollupSyncService->>L1Chain: fetchRollupEvents() (recursive)
    else Other errors
        RollupSyncService->>RollupSyncService: Handle error as before
    end
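The flow in the diagram can be sketched in isolation as follows. This is a minimal illustration, not the PR's code: `syncer`, `fetchOnce`, `simulate`, and the lower-case `errMissingBatchEvent` are hypothetical stand-ins, the caller drives retries with a loop rather than recursion, and clamping the rewound height at zero is an assumption of the sketch.

```go
package main

import (
	"errors"
	"fmt"
)

// errMissingBatchEvent stands in for the ErrMissingBatchEvent sentinel (hypothetical name).
var errMissingBatchEvent = errors.New("missing batch event")

// rewindBlocks follows the 100-block figure quoted in the walkthrough.
const rewindBlocks uint64 = 100

// syncer is a hypothetical, stripped-down stand-in for RollupSyncService.
type syncer struct {
	l1Height uint64
	process  func(height uint64) error
}

// fetchOnce processes one L1 height. On a missing batch event it rewinds the
// height and returns a wrapped error so the caller can retry from further back.
func (s *syncer) fetchOnce() error {
	prev := s.l1Height
	if err := s.process(prev); err != nil {
		if errors.Is(err, errMissingBatchEvent) {
			// Assumption of this sketch: clamp at zero so the unsigned
			// subtraction cannot wrap around.
			if prev > rewindBlocks {
				s.l1Height = prev - rewindBlocks
			} else {
				s.l1Height = 0
			}
			return fmt.Errorf("missing batch event, rewinding L1 sync height by %d blocks: %w", rewindBlocks, err)
		}
		return err
	}
	s.l1Height = prev + 1
	return nil
}

// simulate models an event at height 950 that was missed on the first pass
// and is only noticed when height 1000 is processed.
func simulate() (finalHeight uint64, rewinds int) {
	recovered := false
	s := &syncer{l1Height: 1000}
	s.process = func(h uint64) error {
		if h == 950 {
			recovered = true // the replay re-creates the missing metadata
		}
		if h >= 1000 && !recovered {
			return errMissingBatchEvent
		}
		return nil
	}
	for s.l1Height <= 1000 {
		if err := s.fetchOnce(); err != nil {
			rewinds++
		}
	}
	return s.l1Height, rewinds
}

func main() {
	height, rewinds := simulate()
	fmt.Printf("final height %d after %d rewind(s)\n", height, rewinds)
}
```

In this toy run a single rewind is enough because the missed event lies within 100 blocks of the height at which the gap is detected; a gap further back would need several rewinds.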
Actionable comments posted: 0
🧹 Nitpick comments (2)
rollup/rollup_sync_service/rollup_sync_service.go (2)
230-238: Consider adding a maximum rewind limit.
While the rewind mechanism is effective, there's no limit to how many times it can be applied recursively. Consider adding a maximum rewind counter to prevent potential infinite rewinding if the batch event cannot be recovered after several attempts.
+const (
+	// maxRewindAttempts is the maximum number of times we'll attempt to rewind the L1 sync height
+	// before giving up and reporting a more serious error.
+	maxRewindAttempts = 5
+)

 func (s *RollupSyncService) fetchRollupEvents() error {
 	s.stateMu.Lock()
 	defer s.stateMu.Unlock()

+	// Keep track of rewind attempts within the current fetchRollupEvents call
+	rewindAttempts := 0
+
 	for {
 		prevL1Height := s.callDataBlobSource.L1Height()

 		daEntries, err := s.callDataBlobSource.NextData()
 		if err != nil {
 			if errors.Is(err, da.ErrSourceExhausted) {
 				log.Trace("Sync service exhausted data source, waiting for next data")
 				return nil
 			}

 			return fmt.Errorf("failed to get next data: %w", err)
 		}

 		if err = s.updateRollupEvents(daEntries); err != nil {
 			if errors.Is(err, ErrShouldResetSyncHeight) {
 				log.Warn("Resetting sync height to L1 block 7892668 to fix L1 message queue hash calculation")
 				s.callDataBlobSource.SetL1Height(7892668)

 				return nil
 			}
 			if errors.Is(err, ErrMissingBatchEvent) {
+				rewindAttempts++
+				if rewindAttempts > maxRewindAttempts {
+					return fmt.Errorf("maximum rewind attempts (%d) exceeded, still missing batch events: %w", maxRewindAttempts, err)
+				}
 				// If there's a missing batch event, rewind the L1 sync height by some blocks to re-fetch from L1 RPC and
 				// replay creating corresponding CommittedBatchMeta in local DB.
 				// This happens recursively until the missing event has been recovered as we will call fetchRollupEvents again
 				// with the `L1Height = prevL1Height - rewindL1Height`.
 				s.callDataBlobSource.SetL1Height(prevL1Height - rewindL1Height)
 				return fmt.Errorf("missing batch event, rewinding L1 sync height by %d blocks: %w", rewindL1Height, err)
 			}
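In the snippet above the rewind branch returns after every attempt, so a counter declared inside the call would start again from zero on the next call. One possible variant, sketched here under that reading, keeps the counter on a struct so it survives across calls. The names `rewinder`, `onResult`, `errRewindLimitExceeded`, and `limitExceeded` are hypothetical, and resetting the counter on success is a design choice of this sketch, not something the review specifies.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical sentinels for this sketch.
var (
	errMissingBatchEvent   = errors.New("missing batch event")
	errRewindLimitExceeded = errors.New("maximum rewind attempts exceeded")
)

const (
	rewindBlocks      uint64 = 100
	maxRewindAttempts        = 5
)

// rewinder keeps the attempt counter between calls, unlike a local variable
// that is reset every time the fetch function is entered.
type rewinder struct {
	height   uint64
	attempts int
}

// onResult is fed the outcome of one sync attempt. It resets the counter on
// success, rewinds on a missing batch event, and gives up once the limit is hit.
func (r *rewinder) onResult(err error) error {
	if err == nil {
		r.attempts = 0
		return nil
	}
	if !errors.Is(err, errMissingBatchEvent) {
		return err
	}
	r.attempts++
	if r.attempts > maxRewindAttempts {
		return fmt.Errorf("%w (%d), last error: %v", errRewindLimitExceeded, maxRewindAttempts, err)
	}
	// Clamp at zero so the unsigned subtraction cannot wrap around.
	if r.height > rewindBlocks {
		r.height -= rewindBlocks
	} else {
		r.height = 0
	}
	return fmt.Errorf("rewinding L1 sync height by %d blocks: %w", rewindBlocks, err)
}

// limitExceeded reports whether err signals that the rewind budget is used up.
func limitExceeded(err error) bool {
	return errors.Is(err, errRewindLimitExceeded)
}

func main() {
	r := &rewinder{height: 1000}
	for i := 0; i < maxRewindAttempts+1; i++ {
		err := r.onResult(errMissingBatchEvent)
		fmt.Printf("attempt %d: height=%d gaveUp=%t\n", i+1, r.height, limitExceeded(err))
	}
}
```

Returning a distinct sentinel for the give-up case lets the caller tell "retry from further back" apart from "stop and surface the failure" with `errors.Is`.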
230-238: Enhance logging for rewind operations.
Consider adding more detailed logging about the rewind operation to aid in debugging and tracking recovery attempts.
 if errors.Is(err, ErrMissingBatchEvent) {
 	// If there's a missing batch event, rewind the L1 sync height by some blocks to re-fetch from L1 RPC and
 	// replay creating corresponding CommittedBatchMeta in local DB.
 	// This happens recursively until the missing event has been recovered as we will call fetchRollupEvents again
 	// with the `L1Height = prevL1Height - rewindL1Height`.
+	rewoundHeight := prevL1Height - rewindL1Height
+	log.Warn("Missing batch event detected, rewinding L1 sync height",
+		"current_height", prevL1Height,
+		"rewound_height", rewoundHeight,
+		"rewind_amount", rewindL1Height,
+		"error", err)
 	s.callDataBlobSource.SetL1Height(prevL1Height - rewindL1Height)
 	return fmt.Errorf("missing batch event, rewinding L1 sync height by %d blocks: %w", rewindL1Height, err)
 }
📜 Review details
📒 Files selected for processing (2)
params/version.go (1 hunks)
rollup/rollup_sync_service/rollup_sync_service.go (4 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (2)
- GitHub Check: test
- GitHub Check: Analyze (go)
🔇 Additional comments (7)
params/version.go (1)
27-27: Version bump aligns with fix for CommittedBatchMeta nil pointer issue.
The version increment from 48 to 49 correctly reflects the protocol changes introduced by the new error handling and L1 sync height rewind mechanism.
rollup/rollup_sync_service/rollup_sync_service.go (6)
46-47: New constant for rewind height provides clear recovery mechanism.
The constant defines how many L1 blocks to rewind when a missing batch event is detected. The value of 10 should provide sufficient backtracking to recover from temporary L1 RPC inconsistencies.
53-53: New sentinel error correctly identifies missing batch event conditions.
The ErrMissingBatchEvent sentinel error allows for specific handling of missing batch metadata scenarios throughout the error chain.
230-238: Robust recovery mechanism for missing batch events.
The new error handling correctly identifies ErrMissingBatchEvent cases and implements a rewind mechanism to re-fetch potentially missed events from the L1 chain. This directly addresses the nil pointer issue mentioned in the PR description.
314-318: Properly identify and handle missing parent batch metadata.
The error handling for parent CommittedBatchMeta now explicitly checks for nil values and properly wraps errors to propagate the ErrMissingBatchEvent signal up the call stack.
322-326: Properly identify and handle missing current batch metadata.
Similar to the parent batch metadata handling, this ensures that missing current batch metadata is properly identified and signals the need for a rewind.
470-474: Consistent error handling in getCommittedBatchMeta function.
The error handling for missing parent committed batch metadata is consistently implemented across all relevant functions in the codebase.
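The nil checks described in the comments on lines 314-318 and 322-326 can be illustrated with a small, self-contained sketch. It is not the PR's implementation: `batchMeta`, `metaStore`, `lookup`, `loadPair`, and `isMissing` are hypothetical names, and an in-memory map stands in for the local DB. The point is that a nil lookup result is turned into an error wrapping the sentinel with `%w`, so `errors.Is` still matches after further wrapping up the call stack.

```go
package main

import (
	"errors"
	"fmt"
)

// errMissingBatchEvent stands in for ErrMissingBatchEvent (hypothetical name).
var errMissingBatchEvent = errors.New("missing batch event")

// batchMeta is a hypothetical stand-in for CommittedBatchMeta.
type batchMeta struct {
	version uint8
}

// metaStore is a hypothetical in-memory replacement for the local DB.
type metaStore map[uint64]*batchMeta

// lookup returns the metadata for a batch index, or an error wrapping the
// sentinel when nothing is stored, instead of handing back a nil pointer.
func (m metaStore) lookup(index uint64) (*batchMeta, error) {
	meta := m[index]
	if meta == nil {
		return nil, fmt.Errorf("committed batch meta is nil, batch index %d: %w", index, errMissingBatchEvent)
	}
	return meta, nil
}

// loadPair fetches the parent and current metadata, adding context while
// preserving the sentinel in the error chain.
func loadPair(store metaStore, index uint64) (parent, current *batchMeta, err error) {
	if index == 0 {
		return nil, nil, errors.New("batch 0 has no parent")
	}
	if parent, err = store.lookup(index - 1); err != nil {
		return nil, nil, fmt.Errorf("parent batch: %w", err)
	}
	if current, err = store.lookup(index); err != nil {
		return nil, nil, fmt.Errorf("current batch: %w", err)
	}
	return parent, current, nil
}

// isMissing reports whether err carries the missing-batch sentinel.
func isMissing(err error) bool {
	return errors.Is(err, errMissingBatchEvent)
}

func main() {
	store := metaStore{1: {version: 4}} // batch 2 was never recorded
	_, _, err := loadPair(store, 2)
	fmt.Println("missing:", isMissing(err), "err:", err)
}
```

Because callers receive an error rather than a nil pointer, the decision to rewind can be made in one place by matching the sentinel.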
e6deac3
Approved but might make a few changes before we merge. (Let's tag on this branch first.)
Pushed a tag
1. Purpose or design rationale of this PR
Fix a nil pointer dereference caused by CommittedBatchMeta not being found in the DB, and add a rewind of the L1 sync height to recover from a missing CommittedBatchMeta. This is most likely caused by a faulty L1 RPC (missing or out-of-order events). With this mechanism, the rollup verifier should be able to handle and recover from such a case.
This should also fix: #1142
2. PR title
Your PR title must follow conventional commits (as we are doing squash merge for each PR), so it must start with one of the following types:
3. Deployment tag versioning
Has the version in params/version.go been updated?
4. Breaking change label
Does this PR have the breaking-change label?
Summary by CodeRabbit
Bug Fixes
Chores