Conversation

Contributor

@powerslider powerslider commented Nov 26, 2025

Why this should be merged

Check #4603

How this works

Add a Finalizer interface to provide explicit cleanup operations for syncers. This ensures cleanup (like flushing batches to disk) is performed reliably even on cancellation or early returns.

  • Add a Finalizer interface to sync/types.go for explicit cleanup.
  • Implement Finalize() on CodeQueue to finalize code fetching via this new interface.
  • Gather finalization logic into a Finalize() for StateSyncer that flushes in-progress trie batches.
  • Implement Finalize() for AtomicSyncer to commit pending database changes.
  • Add FinalizeAll() to SyncerRegistry, invoked via defer, to ensure cleanup runs (sketched below).
  • Remove the OnFailure callback mechanism (replaced by Finalizer).
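
To make the lifecycle concrete, here is a minimal sketch of how these pieces fit together, assuming a simplified `SyncerRegistry`; the registry fields and the syncer element type are illustrative, not the exact code in this PR:

```go
package sync

import "log/slog"

// Finalizer provides explicit cleanup for syncers, e.g. flushing batches to disk.
type Finalizer interface {
	Finalize() error
}

// SyncerRegistry is a simplified stand-in for the registry in this PR.
type SyncerRegistry struct {
	syncers []interface{ Sync() error }
}

// FinalizeAll invokes Finalize on every registered syncer that implements
// Finalizer. Errors are logged and swallowed, so cleanup stays best effort
// and never masks the original sync result.
func (r *SyncerRegistry) FinalizeAll() {
	for _, s := range r.syncers {
		if f, ok := s.(Finalizer); ok {
			if err := f.Finalize(); err != nil {
				slog.Warn("syncer finalize failed", "err", err)
			}
		}
	}
}
```

The caller (e.g. `RunSyncerTasks()`) would then `defer registry.FinalizeAll()` before launching the syncers, which is what guarantees cleanup on success, failure, or cancellation alike.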

How this was tested

Existing unit tests.

Need to be documented?

no

Need to update RELEASES.md?

no

resolves #4603

Signed-off-by: Tsvetan Dimitrov (tsvetan.dimitrov23@gmail.com)

… shutdown

During graceful shutdown, syncers cancelled via context cancellation were being
logged as ERROR level. This is misleading since cancellation during shutdown is
expected behavior, not an error condition.

- Use `errors.Is()` to detect `context.Canceled` and `context.DeadlineExceeded`
  (which also matches wrapped errors) and log these as INFO instead of ERROR,
  as sketched below.
- Split `RunSyncerTasks()` into a synchronous wrapper plus a `StartAsync()`
  method, so callers can choose between blocking and asynchronous execution.
- Add an early return when the context is already cancelled.
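
A minimal sketch of that logging rule, using a hypothetical helper around a syncer's returned error (the helper name and logger are illustrative, not the PR's code):

```go
package sync

import (
	"context"
	"errors"
	"log/slog"
)

// logSyncerResult logs cancellation during shutdown at INFO and everything
// else at ERROR. errors.Is also matches wrapped cancellation errors.
func logSyncerResult(name string, err error) {
	switch {
	case err == nil:
		return
	case errors.Is(err, context.Canceled), errors.Is(err, context.DeadlineExceeded):
		slog.Info("syncer cancelled during shutdown", "syncer", name, "err", err)
	default:
		slog.Error("syncer failed", "syncer", name, "err", err)
	}
}
```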

Test improvements:
- Add tests for cancellation scenarios (`Canceled`, `DeadlineExceeded`, wrapped
  errors, early return).
- Fix flakiness by replacing channel-based coordination with WaitGroup
  synchronization (see the sketch after this list).
- Refactor tests to use `t.Context()` and extract common helpers.
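
Roughly, the synchronization pattern in the reworked tests looks like this (an illustrative shape only, not a test from this PR):

```go
package sync_test

import (
	"context"
	"errors"
	"sync"
	"testing"
)

// A fake "syncer" goroutine blocks until cancellation, a WaitGroup replaces
// ad-hoc channel coordination, and the base context comes from t.Context().
func TestSyncerStopsOnCancel(t *testing.T) {
	ctx, cancel := context.WithCancel(t.Context())

	var wg sync.WaitGroup
	errCh := make(chan error, 1)

	wg.Add(1)
	go func() {
		defer wg.Done()
		<-ctx.Done() // stand-in for a syncer blocked on work during shutdown
		errCh <- ctx.Err()
	}()

	cancel()
	wg.Wait()

	if err := <-errCh; !errors.Is(err, context.Canceled) {
		t.Fatalf("expected context.Canceled, got %v", err)
	}
}
```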

resolves #1410

During graceful shutdown, the State Syncer was hanging because multiple
blocking operations did not check context cancellation. When shutdown
occurred, these operations would block indefinitely, preventing syncers
from detecting cancellation and exiting gracefully.

- Add context.Context parameter to LeafSyncTask.OnLeafs() interface
  to enable context propagation through the leaf processing call chain.

- Update CodeQueue.AddCode() to accept a context and check ctx.Done()
  before blocking on channel sends, preventing indefinite blocking
  when the Code Syncer stops consuming during shutdown (sketched below).

- Update all OnLeafs implementations (mainTrieTask, storageTrieTask,
  trieSegment, atomic syncer) to accept and pass context through
  the call chain.

- Add context parameter to startSyncing() and createSegments()
  methods, checking cancellation before blocking channel sends to
  the segments work queue.

- Add context cancellation check in BlockSyncer before checking
  blocks on disk, ensuring it responds during the initial scan phase.

- Update sync/client/leaf_syncer.go to pass context to OnLeafs()
  callbacks.

This ensures all syncers detect cancellation immediately and exit
gracefully instead of hanging until timeout.
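
The context-aware channel send in `AddCode()` boils down to a select that races the send against cancellation. A minimal sketch, with the `CodeQueue` fields assumed rather than taken from this PR:

```go
package statesync

import (
	"context"

	"github.com/ethereum/go-ethereum/common"
)

// CodeQueue is a simplified stand-in for the queue in this PR.
type CodeQueue struct {
	codeHashes chan common.Hash
}

// AddCode checks ctx.Done() alongside every channel send, so the producer can
// never block indefinitely once the consumer stops draining during shutdown.
func (q *CodeQueue) AddCode(ctx context.Context, hashes []common.Hash) error {
	for _, h := range hashes {
		select {
		case <-ctx.Done():
			return ctx.Err()
		case q.codeHashes <- h:
		}
	}
	return nil
}
```
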
Add a `Finalizer` interface to provide explicit cleanup operations for
syncers. This ensures cleanup (like flushing batches to disk) is
performed reliably even on cancellation or early returns.

- Add `Finalizer` interface to `sync/types.go` for explicit cleanup.
- Implement `Finalize()` on `CodeQueue` to finalize code fetching via this new interface.
- Gather finalization logic into a `Finalize()` for StateSyncer that flushes in-progress trie batches.
- Implement `Finalize()` for AtomicSyncer to commit pending database changes.
- Add `FinalizeAll()` to SyncerRegistry with defer to ensure cleanup runs.
- Remove `OnFailure` callback mechanism (replaced by `Finalizer`).

resolves #1089

Signed-off-by: Tsvetan Dimitrov (tsvetan.dimitrov23@gmail.com)
@powerslider powerslider self-assigned this Nov 26, 2025
@powerslider powerslider requested a review from a team as a code owner November 26, 2025 18:55
@powerslider powerslider changed the title from "Powerslider/1089 finalize syncers" to "feat(statesync): introduce Finalizer interface for syncer cleanup" Nov 26, 2025
Comment on lines 152 to 157
```go
func (s *Syncer) Finalize() error {
	if s.db == nil {
		return nil
	}
	return s.db.Commit()
}
```
Contributor

Why is this committing the db? Isn't onFinish already doing the same thing? I think this interface is a bit confusing with onFinish already here.

Contributor Author

@powerslider powerslider Dec 1, 2025

So based on my understanding onFinish() and Finalize() serve very different goals:

onFinish()

  • Critical Completion Handler for the sync process.
  • called by CallbackLeafSyncer when a leaf sync task successfully finishes (all leaves fetched and processed).
  • its responsibilities include:
    • Commit the final trie state.
    • Insert the trie into the atomic trie database.
    • Accept the trie at the target height.
    • Commit the database.
    • Validate that the synced root matches the expected target root.
  • Errors propagate up -> syncTask() -> Sync() -> caller. Basically the sync process fails if this fails.
  • The root validation is critical - without it, we could accept a corrupted or incomplete sync.

Finalize()

  • Best-Effort Cleanup Handler
  • called by SyncerRegistry.FinalizeAll() via defer - always called after Sync() returns (success or failure).
  • its responsibilities include:
    • Commit any uncommitted database changes.
  • Errors are logged but not returned. Does not affect sync success/failure status.
  • Preserves partial progress when sync fails mid-way.

In the success case, Finalize() does a redundant db.Commit() since onFinish() already committed. Finalize() only provides unique value when sync fails after processing some leaves that haven't been committed yet (between commit intervals in onLeafs).

Contributor Author

NOTE: Given that db.Commit() on an already-committed database is essentially a no-op (very cheap), the added complexity of tracking when syncing completes may not be worth it. But if you want explicit clarity about what's happening, I can add a completed flag that gets set in onFinish and is then checked this way:

```go
func (s *Syncer) Finalize() error {
	if s.completed {
		return nil // Already committed in onFinish.
	}
	return s.db.Commit()
}
```

```go
// progress to restore.
func (t *stateSync) onSyncFailure() {
// Finalize checks if there are any in-progress tries and flushes their batches to disk
// to preserve progress. This is called by the syncer registry on sync failure or cancellation.
```
Contributor

isn't this also being called in success?

Contributor Author

@powerslider powerslider Dec 1, 2025

Yes. The reasons why this is OK, and part of the design direction with this centralized best-effort Finalize(), are the following (a rough sketch follows this list):

  • Finalize() in the state syncer flushes any uncommitted segment batches from in-progress tries to disk.
  • We are shifting away from the current design of dedicated on-success and on-failure handlers, because we don't want the syncers to self-manage their own lifecycles. We want to delegate that to the SyncerRegistry.
  • Finalize() is always called after Sync() returns (via defer in the registry).
  • On success: triesInProgress is empty -> no-op (tries were already removed after completion).
  • On failure: triesInProgress contains incomplete work -> flushes batches to disk.
  • We have a single code path and guaranteed cleanup via defer.
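
Concretely, the shape being described is roughly the following; the type and field names are inferred from the snippets in this thread and are assumptions, not the PR's exact code:

```go
package statesync

import (
	"sync"

	"github.com/ethereum/go-ethereum/common"
	"github.com/ethereum/go-ethereum/ethdb"
)

// Simplified stand-ins for the trie bookkeeping referenced above.
type trieSegment struct{ batch ethdb.Batch }

type trieToSync struct{ segments []*trieSegment }

type stateSync struct {
	lock            sync.RWMutex
	triesInProgress map[common.Hash]*trieToSync
}

// Finalize flushes the pending batch of every in-progress trie segment.
// After a successful sync triesInProgress is empty, so this is a no-op;
// on failure or cancellation it preserves partial progress on disk.
func (t *stateSync) Finalize() error {
	t.lock.RLock()
	defer t.lock.RUnlock()

	if len(t.triesInProgress) == 0 {
		return nil
	}
	for _, trie := range t.triesInProgress {
		for _, segment := range trie.segments {
			if err := segment.batch.Write(); err != nil {
				return err
			}
		}
	}
	return nil
}
```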

Contributor Author

Added an early return for the success path and more clarifications in the method docs.

```go
	defer t.lock.RUnlock()

	for _, trie := range t.triesInProgress {
		for _, segment := range trie.segments {
```
Contributor

(could be an over-cautious comment)
this looks like it should've been cleared on success, but if not then we might end up in a weird spot.

Contributor Author

check my previous reply #4623 (comment)

Development

Successfully merging this pull request may close these issues.

Implement a cleanup/finalize procedure preventing corruption of synced state due to sync process errors
