feat: custom indexer with multi index support #455

whankinsiv · 2025-12-05T00:03:03Z

Description

This PR extends the custom_indexer module to support multiple independent indexes within a single indexer instance. Each index is registered using add_index, which creates a dedicated channel for receiving transaction and rollback events. All messages include a oneshot response channel, allowing the manager task to update cursor state and remove corrupted indexes.

The error handling model is explicit and prevents corrupted index state from persisting across runs:

- Decode failure: Indicates invalid transaction bytes (should never occur for valid blocks). The index is halted immediately and awaits the next rollback event.
- Handle failure: Indicates an error in user-defined index logic. The index is halted and will be reset on the next process start due to the persisted halted = true flag.
- Rollback failure: If an index cannot roll back cleanly, the index attempts a full reset.
  - If reset succeeds, the chain sync point is rewound.
  - If reset fails, the index is removed from the runtime (senders). On next startup, the persisted halt flag causes a fresh reset.
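The error handling model described above can be read as a small decision table. The following is only a sketch of that reading: `Failure`, `Action`, and `on_failure` are hypothetical names used for illustration and are not the module's actual types.

```rust
/// Hypothetical failure kinds, mirroring the three cases in the description.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Failure {
    Decode,
    Handle,
    /// The rollback failed; `reset_ok` records whether the follow-up reset worked.
    Rollback { reset_ok: bool },
}

/// Hypothetical summary of what happens to an index after a failure.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Action {
    /// Value persisted as the cursor entry's halted flag.
    halted: bool,
    /// Whether the chain sync point is rewound.
    rewind_sync_point: bool,
    /// Whether the index is dropped from the live senders.
    remove_from_runtime: bool,
}

fn on_failure(failure: Failure) -> Action {
    match failure {
        // Decode and handle failures halt the index but keep it registered.
        Failure::Decode | Failure::Handle => Action {
            halted: true,
            rewind_sync_point: false,
            remove_from_runtime: false,
        },
        // A successful reset clears the halt and rewinds the sync point.
        Failure::Rollback { reset_ok: true } => Action {
            halted: false,
            rewind_sync_point: true,
            remove_from_runtime: false,
        },
        // A failed reset removes the index; the persisted halt flag
        // triggers a fresh reset on the next startup.
        Failure::Rollback { reset_ok: false } => Action {
            halted: true,
            rewind_sync_point: false,
            remove_from_runtime: true,
        },
    }
}
```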

Related Issue(s)

Completes #380

How was this tested?

Added full coverage for index_actor (apply tx, rollback, reset paths).
Verified that the example indexer correctly runs multiple indexes with different starting points.

Checklist

My code builds and passes local tests
I added/updated tests for my changes, where applicable
I updated documentation (if applicable)
CI is green for this PR

Impact / Side effects

Adds safe recovery behavior and multi-index support to the custom indexer.

Reviewer notes / Areas to focus

Rollback failure behavior and the halt/persist/reset flow


lowhung

Just so I understand the flow...

- When a new index is added, it spawns an actor per index, each with its own channel
- ChainSync messages come in on the subscription and fan out to all actors
- The actors process independently, then return a result with updated cursor entries
- Main loop collects responses + persists cursors + handles failures

lowhung · 2025-12-06T01:28:06Z

modules/custom_indexer/src/cursor_store.rs

-        let raw = self.cursor.get("cursor")?;
+    async fn load(&self) -> Result<HashMap<String, CursorEntry>> {
+        let mut out = HashMap::new();
+        let iter = self.partition.prefix("cursor/");


[nit] We reference this prefix "cursor/" in a few places, should we keep it as a const in this file?

Added "cursor/" as a const in 7b5d429.

lowhung · 2025-12-06T01:33:10Z

modules/custom_indexer/src/cursor_store.rs


-        self.cursor.insert("cursor", raw)?;
+        for (name, point) in tips {
+            let key = format!("cursor/{name}");


[nit] A method in the cursor store to format the key and return it / get the prefix could be nice.

Added key_for, name_from_key, and prefix_iter helper methods in 7b5d429.
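For readers without the diff at hand, here is a minimal sketch of what such helpers might look like. The names `key_for` and `name_from_key` come from the comment above, but the signatures and the constant name `CURSOR_PREFIX` are assumptions, not the code in 7b5d429.

```rust
/// Prefix shared by every cursor key, kept in one place (hypothetical name).
const CURSOR_PREFIX: &str = "cursor/";

/// Build the storage key for an index name, e.g. "cursor/utxo".
fn key_for(name: &str) -> String {
    format!("{CURSOR_PREFIX}{name}")
}

/// Recover the index name from a storage key, or `None` if the key
/// does not carry the cursor prefix.
fn name_from_key(key: &str) -> Option<&str> {
    key.strip_prefix(CURSOR_PREFIX)
}
```

Returning `Option` from `name_from_key` lets a caller skip foreign keys instead of panicking on them.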

lowhung · 2025-12-06T01:35:27Z

modules/custom_indexer/src/cursor_store.rs

+}
+
+#[derive(Debug)]
+pub struct CursorSaveError {


[nit] Could leverage thiserror here

Suggested change

-pub struct CursorSaveError {
+#[derive(Debug, thiserror::Error)]
+#[error("Failed to save cursor tips for: {failed:?}")]
+pub struct CursorSaveError {
+    pub failed: Vec<String>,
+}

Thanks for this suggestion! Switched to using thiserror in 7b5d429.
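For comparison, this is roughly the boilerplate the derive replaces: a hand-written `Display` and `Error` implementation using only the standard library. It is a sketch of equivalent behavior, not the macro's literal expansion.

```rust
use std::error::Error;
use std::fmt;

#[derive(Debug)]
pub struct CursorSaveError {
    pub failed: Vec<String>,
}

impl fmt::Display for CursorSaveError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Same message as the #[error(...)] attribute in the suggestion.
        write!(f, "Failed to save cursor tips for: {:?}", self.failed)
    }
}

// No methods need overriding; Debug + Display are enough.
impl Error for CursorSaveError {}
```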

lowhung · 2025-12-06T01:37:07Z

modules/custom_indexer/src/custom_indexer.rs

+            halted: false,
+        });
+
+        if force_restart || entry.halted {


Nice 👍🏻

lowhung · 2025-12-06T01:39:26Z

modules/custom_indexer/src/custom_indexer.rs

+                new_tips.insert(name.clone(), entry.clone());
+                change_sync_point(entry.tip, run_context.clone(), &sync_topic.to_string()).await?;
+            }
+            Ok(IndexResult::FatalResetError { entry, reason }) => {


Very clear flow here 👍🏻

lowhung · 2025-12-06T01:41:42Z

modules/custom_indexer/src/index_actor.rs

+    }
+
+    #[tokio::test]
+    async fn rollback_fails_then_reset_succeeds_clears_halt_and_updates_tip() {


Very clear tests 🔥 nicely done


whankinsiv · 2025-12-07T22:46:02Z

> Just so I understand the flow...
>
> - When a new index is added, it spawns an actor per index, each with its own channel
> - ChainSync messages come in on the subscription and fan out to all actors
> - The actors process independently, then return a result with updated cursor entries
> - Main loop collects responses + persists cursors + handles failures

Yes, that's the flow 😄 One extra detail: the index actors themselves handle halting on error and attempt their own reset on rollback failure.

The main loop/manager just forwards events to the actors, persists tip/halting state into the cursor store, and removes any actors from the runtime that couldn't recover.
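The flow above can be sketched with standard-library threads and channels. This is a simplified illustration under stated assumptions: the PR uses async tasks with oneshot reply channels, whereas this sketch uses `std::sync::mpsc` and one thread per actor, and `Event`, `Message`, `spawn_actor`, and `dispatch` are hypothetical names.

```rust
use std::collections::HashMap;
use std::sync::mpsc::{channel, Sender};
use std::thread;

/// Hypothetical event; the real module carries transactions and points.
#[derive(Debug, Clone)]
enum Event {
    Apply { slot: u64 },
    Rollback { slot: u64 },
}

/// Hypothetical reply: the actor's new tip, or an unrecoverable error.
type Reply = Result<u64, String>;

/// Each message carries a single-use reply sender, standing in for
/// the oneshot response channel described in the PR.
struct Message {
    event: Event,
    reply: Sender<Reply>,
}

/// Spawn one actor thread with its own channel. `fail_on_slot` simulates
/// an index that cannot recover at a given slot.
fn spawn_actor(fail_on_slot: Option<u64>) -> Sender<Message> {
    let (tx, rx) = channel::<Message>();
    thread::spawn(move || {
        for msg in rx {
            let result = match msg.event {
                Event::Apply { slot } | Event::Rollback { slot } => {
                    if Some(slot) == fail_on_slot {
                        Err(format!("could not recover at slot {slot}"))
                    } else {
                        Ok(slot)
                    }
                }
            };
            // The manager may have gone away; ignore a failed send.
            let _ = msg.reply.send(result);
        }
    });
    tx
}

/// Fan one event out to every actor, collect the replies, and return the
/// updated tips. Actors that fail or cannot be reached are removed.
fn dispatch(senders: &mut HashMap<String, Sender<Message>>, event: Event) -> HashMap<String, u64> {
    let mut pending = Vec::new();
    let mut dead = Vec::new();
    for (name, tx) in senders.iter() {
        let (reply_tx, reply_rx) = channel();
        let msg = Message { event: event.clone(), reply: reply_tx };
        match tx.send(msg) {
            Ok(()) => pending.push((name.clone(), reply_rx)),
            Err(_) => dead.push(name.clone()),
        }
    }
    let mut tips = HashMap::new();
    for (name, reply_rx) in pending {
        match reply_rx.recv() {
            Ok(Ok(tip)) => {
                tips.insert(name, tip);
            }
            _ => dead.push(name),
        }
    }
    for name in dead {
        senders.remove(&name);
    }
    tips
}
```

In the real module the returned tips would then be persisted to the cursor store; that step is left out here.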


SupernaviX · 2025-12-08T16:03:42Z

modules/custom_indexer/src/custom_indexer.rs

+        if force_restart || entry.halted {
+            index.reset(&default_start).await?;
+            entry.tip = default_start.clone();
+            entry.halted = false;
+        }


I'm a little confused by this block. If an index started on slot 50000 and then halted with an error on slot 60000, why are we resetting back to 50000? That implies we successfully processed a few thousand blocks, and I don't think we need to undo that work.

In general, we should avoid resetting indexes if the caller didn't explicitly ask. In production, it will almost certainly cause downtime.

The reason we reset to default_start when an index stops in a halted state is that we can't rely on receiving a corrective rollback. If a rollback fails or a block fails to decode, ChainSync won't send a second rollback message on the next run, which leaves the index in a corrupted state with no way to recover. Given that limitation, resetting on startup is the only reliable way to bring the index back to a healthy state.

SupernaviX · 2025-12-08T16:14:23Z

modules/custom_indexer/src/index_actor.rs

+        // If the rollback failed, attempt to reset the index
+        Err(_) => match wrapper.index.reset(&wrapper.default_start).await {


I'm a little concerned about resetting if a rollback fails; it could fail for arbitrary reasons, including for transient issues like a DB failure.

But I think in practice it'll be fine to do this:

- rollbacks are relatively rare compared to roll forwards
- most transient failures which happen in a rollback would probably happen in a rollforward, sooner rather than later
- resets will almost certainly result in downtime in production, but the system will eventually recover by itself

So if it's possible to recover from a rollback without triggering this reset, that'd be ideal. But we can live without it.

The reason the rollback failure path triggers a reset is that we don't get a second rollback message if the first one fails. Without that, the index is stuck in a corrupted state with no further corrective signal coming from ChainSync. A reset is the only guaranteed way to bring it back to a healthy state. If you think a retry inside the module would meaningfully reduce resets from transient failures, I can add that.
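For discussion, a retry in front of the reset could look something like the sketch below. This is not part of the PR: the synchronous `Index` trait, `rollback_with_retry`, `max_attempts`, and the `Flaky` test index are all hypothetical, and points are simplified to slot numbers. Only the `FatalResetError` name echoes the diff.

```rust
/// Hypothetical synchronous stand-in for the index trait.
trait Index {
    fn rollback(&mut self, to: u64) -> Result<(), String>;
    fn reset(&mut self, start: u64) -> Result<u64, String>;
}

#[derive(Debug, PartialEq, Eq)]
enum RollbackOutcome {
    RolledBack(u64),
    Reset(u64),
    FatalResetError(String),
}

/// Try the rollback up to `max_attempts` times before falling back to a
/// full reset, so a transient failure does not force a replay.
fn rollback_with_retry<I: Index>(
    index: &mut I,
    to: u64,
    default_start: u64,
    max_attempts: u32,
) -> RollbackOutcome {
    for _ in 0..max_attempts {
        if index.rollback(to).is_ok() {
            return RollbackOutcome::RolledBack(to);
        }
    }
    match index.reset(default_start) {
        Ok(tip) => RollbackOutcome::Reset(tip),
        Err(reason) => RollbackOutcome::FatalResetError(reason),
    }
}

/// Example index whose rollback fails a fixed number of times.
struct Flaky {
    rollback_failures_left: u32,
}

impl Index for Flaky {
    fn rollback(&mut self, _to: u64) -> Result<(), String> {
        if self.rollback_failures_left > 0 {
            self.rollback_failures_left -= 1;
            Err("transient failure".to_string())
        } else {
            Ok(())
        }
    }

    fn reset(&mut self, start: u64) -> Result<u64, String> {
        Ok(start)
    }
}
```

A real implementation would likely want a delay between attempts, and it assumes a failed rollback leaves the index unchanged so that retrying is safe.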

SupernaviX · 2025-12-08T16:20:07Z

modules/custom_indexer/src/custom_indexer.rs

+
+        let mut entry = cursors.get(&name).cloned().unwrap_or(CursorEntry {
+            tip: default_start.clone(),
+            halted: false,


Little thing, can we default this to true? That way, we call reset every time an index is created, which means callers can add (re)initialization logic in the reset method.

Updated to default to halted: true in 100d2d4.
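A minimal sketch of how the startup path behaves with this default, assuming a simplified synchronous trait and slot numbers in place of `Point`. `init_entry` and `CountingIndex` are hypothetical names; the body mirrors the `force_restart || entry.halted` block from the diff.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq, Eq)]
struct CursorEntry {
    tip: u64,
    halted: bool,
}

/// Hypothetical synchronous stand-in for the index trait.
trait Index {
    fn reset(&mut self, start: u64) -> Result<u64, String>;
}

/// Load the persisted entry for `name`, or create one that starts halted
/// so that `reset` runs the first time an index is registered.
fn init_entry<I: Index>(
    cursors: &HashMap<String, CursorEntry>,
    name: &str,
    index: &mut I,
    default_start: u64,
    force_restart: bool,
) -> Result<CursorEntry, String> {
    let mut entry = cursors.get(name).cloned().unwrap_or(CursorEntry {
        tip: default_start,
        halted: true,
    });
    if force_restart || entry.halted {
        index.reset(default_start)?;
        entry.tip = default_start;
        entry.halted = false;
    }
    Ok(entry)
}

/// Example index that counts how often it was reset.
struct CountingIndex {
    resets: u32,
}

impl Index for CountingIndex {
    fn reset(&mut self, start: u64) -> Result<u64, String> {
        self.resets += 1;
        Ok(start)
    }
}
```

With `halted: true` as the default, a brand-new index is reset once on creation, while an index with a healthy persisted entry keeps its tip.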

SupernaviX · 2025-12-08T16:39:27Z

modules/custom_indexer/src/chain_index.rs

+
+    async fn reset(&mut self, start: &Point) -> Result<Point>;


Is there a reason we return a point from this? I'm not sure what it makes sense to return besides the start it was passed.

The intent here was to give implementations the option to adjust their replay point on reset. For example, an index that only needs the last N blocks could implement reset to resume from a later point than the one provided. If we don't expect any index to diverge from the provided start, then returning a point isn't needed. I can switch it to () unless we want to support this kind of behavior.
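To make the two options concrete, here is a sketch of an index that returns the start it was given next to one that adjusts it. The trait is a simplified synchronous version of the one in the diff, with `String` errors; `Point` is reduced to a slot, and `FullHistoryIndex` and `RecentOnlyIndex` are hypothetical.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Point {
    slot: u64,
}

/// Simplified synchronous version of the trait method under discussion.
trait ChainIndex {
    fn reset(&mut self, start: &Point) -> Result<Point, String>;
}

/// An index that needs the whole history replays from the given start.
struct FullHistoryIndex;

impl ChainIndex for FullHistoryIndex {
    fn reset(&mut self, start: &Point) -> Result<Point, String> {
        Ok(*start)
    }
}

/// An index that only cares about recent blocks may resume later
/// than the start it was handed.
struct RecentOnlyIndex {
    earliest_needed_slot: u64,
}

impl ChainIndex for RecentOnlyIndex {
    fn reset(&mut self, start: &Point) -> Result<Point, String> {
        Ok(Point {
            slot: start.slot.max(self.earliest_needed_slot),
        })
    }
}
```

If no index is expected to behave like `RecentOnlyIndex`, returning `()` would keep the trait simpler.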


whankinsiv · 2025-12-08T20:14:18Z

Merging. @SupernaviX and I discussed a refactor to improve halt recovery on subsequent runs, which is outlined in #461. This will be implemented in a follow-up PR after our milestones are complete, as these changes are not required.

whankinsiv added 10 commits December 3, 2025 18:31

feat: multi index support for custom indexer

0498e03

Signed-off-by: William Hankins <william@sundae.fi>

refactor: simplify tip handling logic

83d5c35

Signed-off-by: William Hankins <william@sundae.fi>

refactor: convert index dispatch pipeline into actor model with per i…

d7875ef

…ndex channels Signed-off-by: William Hankins <william@sundae.fi>

fix: decode txs in actors instead of manager due to lifetime issues

27c6a72

Signed-off-by: William Hankins <william@sundae.fi>

fix: move logic to helper functions and improve rollback handling

0cce96c

Signed-off-by: William Hankins <william@sundae.fi>

fix: Cargo fmt/shear

768edca

Signed-off-by: William Hankins <william@sundae.fi>

merge origin/main

b8ef82b

Signed-off-by: William Hankins <william@sundae.fi>

test: add comprehensive coverage for index actor

32907cb

Signed-off-by: William Hankins <william@sundae.fi>

fix: update example to handle 2 indexes

d9d9a7c

Signed-off-by: William Hankins <william@sundae.fi>

merge origin/main

f377baf

Signed-off-by: William Hankins <william@sundae.fi>

whankinsiv marked this pull request as ready for review December 6, 2025 00:27

whankinsiv requested review from SupernaviX and lowhung December 6, 2025 00:27

lowhung approved these changes Dec 6, 2025


whankinsiv added 2 commits December 7, 2025 22:26

refactor: unify key handling and use thiserror for CursorSaveError

7b5d429

Signed-off-by: William Hankins <william@sundae.fi>

merge origin/main and update test_block to include tip_sot

ceba7a8

github-actions bot mentioned this pull request Dec 8, 2025

📊 Weekly Status - Week of 2025-12-08 #460

Open

SupernaviX reviewed Dec 8, 2025


fix: implement feedback

100d2d4

Signed-off-by: William Hankins <william@sundae.fi>

whankinsiv merged commit 2e90625 into main Dec 8, 2025
2 checks passed

whankinsiv deleted the whankinsiv/multi-index-custom-indexer branch December 8, 2025 20:14

github-actions bot mentioned this pull request Dec 15, 2025

📊 Weekly Status - Week of 2025-12-15 #496

Open

4 participants