[WIP] A new ChainIndexer that replaces the fragmented MsgIndex, EthTxHashIndex and EventsIndex #12388

Closed
wants to merge 3 commits into from

Conversation

aarshkshah1992
Contributor

@aarshkshah1992 aarshkshah1992 commented Aug 14, 2024

This PR implements a new ChainIndexer in Lotus that indexes tipsets, messages, transaction hashes, and events by consuming the ChainNotify API.

It aims to replace and subsume the existing MsgIndex, EventsIndex, and EthTxHashIndex, which are currently fragmented across multiple databases and have several known issues documented in filecoin-project/lotus#12293.

Key Features

The ChainIndexer offers the following key features:

  • Indexes all necessary state in a single database.
  • Indexes each tipset together with the state changes it causes, which makes the indexed state easier to reason about and more closely aligned with how state is persisted in the Chainstore.
  • Implements snapshot hydration.
  • Implements automated, config-driven index garbage collection (GC).
  • Provides automated backfilling.
  • Offers simplified configuration.
  • Wraps the asynchronous indexing in a synchronous read API for RPC endpoints to consume, avoiding the off-by-N errors caused by async indexing.

Here are the next implementation steps needed to get this PR ready for review and testing.

Switch RPC APIs to use the Chain Index

  • The Filecoin and ETH RPC APIs will switch to using the ChainIndexer instead of the MsgIndex, EthTxHashIndex and EventsIndex.
  • The EventFilterManager will read events from the ChainIndexer and prefill all registered filters rather than depending on the Indexer to do the pre-filling of filters.
  • The ChainIndexer will listen to Mpool message addition updates to index the corresponding ETH Tx Hash. The EthTxHashManager will no longer be used for this.

Read APIs Should Account for the Async Nature of Indexing

  • All APIs that read from the index will wait for the current head in the chainstore to be indexed if their first read attempt fails and then retry the read before returning a response to the user.
    • Note that we're only waiting for an already produced tipset in our chainstore to be indexed here; we're not waiting for a new tipset to be produced. This should work well in practice, but if it times out, it points to another underlying problem in our chainstore <-> indexing pathway that should be investigated.
  • This is a workaround to handle asynchronous indexing in Filecoin.
  • For events, it should be noted that indexing the current head T only indexes events in T-1 because of deferred execution.

ETH RPC APIs Should Only Expose Executed Tipsets and Messages

  • As part of this work, we should review all ETH RPC APIs to ensure they only expose and respond to requests for executed tipsets and transactions. This is because Filecoin uses a deferred execution model, unlike Ethereum:
    • In Filecoin, messages included in tipset T are executed in tipset T + 1.
    • In Ethereum, transactions included in block N are executed as part of block N.
  • Exposing messages and tipsets that have not been executed yet via ETH RPC APIs in Lotus causes errors when users ask for the corresponding execution state, receipts, or events for those tipsets/messages because they do not yet exist.
  • There have been proposals for an even more conservative implementation where ETH RPC APIs only expose "finalized" tipsets and messages post-F3. However, it remains to be determined how well this would work in practice, given that clients might end up waiting for up to 3 epochs (90 seconds) for already included messages in the worst case.
  • For now, we should go ahead with exposing all (even non-finalised) executed tipsets/messages and revisit the implementation after F3 ships.
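
To make the deferred execution rule concrete, here is a small sketch of an "is this tipset executed yet?" check. The `tipset` type is a simplified, hypothetical stand-in for the real Lotus type, and the helper names are made up for illustration.

```go
package main

import "fmt"

// tipset is a hypothetical, simplified stand-in for a Lotus tipset.
type tipset struct {
	height int64
	parent *tipset
}

// latestExecuted returns the most recent tipset whose messages have been
// executed, given the head. Under deferred execution this is the head's
// parent. It returns nil when the head has no parent.
func latestExecuted(head *tipset) *tipset {
	if head == nil {
		return nil
	}
	return head.parent
}

// isExecuted reports whether ts is a strict ancestor of head, i.e. whether
// some later tipset on the chain has already executed its messages.
func isExecuted(ts, head *tipset) bool {
	if ts == nil {
		return false
	}
	for cur := latestExecuted(head); cur != nil && cur.height >= ts.height; cur = cur.parent {
		if cur == ts {
			return true
		}
	}
	return false
}

func main() {
	t98 := &tipset{height: 98}
	t100 := &tipset{height: 100, parent: t98} // epoch 99 was a null round
	t101 := &tipset{height: 101, parent: t100}

	fmt.Println(isExecuted(t100, t101)) // parent of head: executed
	fmt.Println(isExecuted(t101, t101)) // the head itself: not executed yet
}
```

An ETH RPC handler following this rule would refuse, or report as pending, any request that targets the head tipset itself.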

Removing Re-orged Tipsets That Are No Longer Part of the Canonical Chain

  • The ChainIndexer will periodically prune all permanently re-orged/reverted tipsets from the index. It can do this by simply pruning all tipsets at a height less than (current head - finality policy - some buffer).
  • The use of foreign key-based cascading deletes in the DDL will greatly simplify this implementation. By simply deleting a tipset from the index, all associated indexed state will be deleted from the DB. See SQLite Foreign Keys for more information.
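
The pruning threshold described above is simple arithmetic. A minimal sketch, assuming the caller supplies the finality window and buffer in epochs; the table name in the SQL string is a guess made for illustration, and only the `reverted` column appears in this PR's diff.

```go
package main

import "fmt"

// revertedPruneCutoff returns the height below which reverted tipsets can be
// treated as permanently re-orged. finality and buffer are in epochs and are
// supplied by the caller. The result is clamped to 0 for short chains.
func revertedPruneCutoff(head, finality, buffer int64) int64 {
	cutoff := head - finality - buffer
	if cutoff < 0 {
		return 0
	}
	return cutoff
}

// pruneRevertedSQL is a hypothetical statement: the table name "tipsets" and
// the "height" column are assumptions for this sketch.
const pruneRevertedSQL = `DELETE FROM tipsets WHERE reverted = 1 AND height < ?`

func main() {
	cutoff := revertedPruneCutoff(5000, 900, 20)
	fmt.Println(pruneRevertedSQL, cutoff)
}
```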

Garbage Collection

  • Garbage collection (GC) will be configuration-driven. Users can specify how much history they want to retain, and the ChainIndexer can perform periodic GC based on this configuration.
  • GC should be straightforward in the ChainIndexer because of the use of foreign keys declared with ON DELETE CASCADE, as described in SQLite Foreign Keys.
  • When a tipset is deleted, all associated indexed state will be automatically deleted from the DB due to the cascading delete behavior.

Snapshot Hydration

  • When a node is synced from a snapshot, the index should be completely deleted, and a new index should be hydrated from the snapshot.
  • It's important to note that snapshots don't contain events. In order to hydrate events in the index, messages in the tipset will have to be re-executed.

Automated Backfilling

  • When a Lotus node starts up, it performs the following steps:
    • Looks up the latest non-reverted tipset in the ChainIndex for which the corresponding state exists in the statestore.
    • Instantiates the Observer with that tipset as the current head.
    • Starts the Observer.
  • This process ensures that the ChainIndexer will observe the (Apply, Revert) path between its last non-reverted indexed tipset and the current heaviest tipset in the chainstore before processing real-time updates, effectively performing automated backfilling.
  • One challenge arises for a niche use case when an RPC provider toggles the Indexing flag to ON after keeping it OFF for an extended period or for the first time. In such cases, the backfilling backlog could interfere with indexing real-time tipset changes, potentially impacting RPC queries that primarily target state at or near the head.
  • To address this issue, a configuration option can be exposed that allows such users to disable automated backfilling if their primary focus is serving RPC queries for new tipsets after enabling indexing.
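
The startup decision above might be sketched as follows. The `indexedTipset` type, the `disableBackfill` flag name, and the fallback to the current head when nothing usable is indexed are all assumptions for this example.

```go
package main

import "fmt"

// indexedTipset is a hypothetical summary of one row in the index.
type indexedTipset struct {
	height   int64
	reverted bool
	hasState bool // whether the statestore still holds state for this tipset
}

// latestUsable returns the height of the highest non-reverted indexed tipset
// whose state is still available, and false when none qualifies.
func latestUsable(indexed []indexedTipset) (int64, bool) {
	var best int64
	found := false
	for _, ts := range indexed {
		if ts.reverted || !ts.hasState {
			continue
		}
		if !found || ts.height > best {
			best, found = ts.height, true
		}
	}
	return best, found
}

// observerStartHeight picks the height the Observer is instantiated with.
// Starting below the head makes the Observer replay the path up to the head,
// which is what performs the backfill. With backfilling disabled, or with
// nothing usable in the index (an assumption of this sketch), it starts at
// the head.
func observerStartHeight(indexed []indexedTipset, head int64, disableBackfill bool) int64 {
	if disableBackfill {
		return head
	}
	if h, ok := latestUsable(indexed); ok {
		return h
	}
	return head
}

func main() {
	indexed := []indexedTipset{
		{height: 100, reverted: false, hasState: true},
		{height: 101, reverted: true, hasState: true},
		{height: 102, reverted: false, hasState: false},
	}
	fmt.Println(observerStartHeight(indexed, 500, false))
	fmt.Println(observerStartHeight(indexed, 500, true))
}
```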

Simplify Indexing Config

  • The current indexing configuration in Lotus is extremely complex, with partial indexing options that make it difficult for node operators to understand the state they are indexing or should be indexing.
  • To improve the user experience and simplify the implementation, the current config will be replaced with a simple "Indexing ON/Indexing OFF" switch. Users either index everything that Lotus needs to provide fast RPC responses or index nothing.
  • The niche use case of "Index X but do not Index Y" will no longer be supported.

Migration from Old Indices to the New ChainIndex

  • Develop a lotus-shed utility that allows users to migrate existing indices to the new ChainIndexer database. This command should only be executed when the Lotus node is offline to ensure data consistency and avoid potential conflicts.
  • When a Lotus node starts up, it should bypass any migration or backfilling processes and directly begin indexing new tipsets in the ChainIndexer. This approach offers several benefits:
  1. Users can migrate the historical index at their own pace without incurring a performance penalty during node startup.
  2. The node can quickly respond to queries for new tipsets since indexing for these tipsets commences as soon as the node is operational.
  • By decoupling the migration process from the node startup, users gain flexibility in managing the transition to the new indexing system while maintaining optimal performance for real-time tipset indexing.

Solid Unit Tests, itests and testing on a calibnet node

  • Goes without saying

@aarshkshah1992 aarshkshah1992 marked this pull request as draft August 14, 2024 16:48
github-actions[bot]

This comment was marked as duplicate.

@@ -23,15 +23,6 @@ const DefaultDbFilename = "msgindex.db"

var log = logging.Logger("msgindex")

var ddls = []string{
`CREATE TABLE IF NOT EXISTS messages (
cid VARCHAR(80) PRIMARY KEY ON CONFLICT REPLACE,
Contributor Author

We will completely remove the msgIndex once this PR lands.

Comment on lines +22 to +23
event_index INTEGER NOT NULL,
emitter_addr BLOB NOT NULL,
Member

space/tab issue here btw

if err != nil {
return xerrors.Errorf("error unreverting events for tipset: %w", err)
}
if rows > 0 {
Member

ok, so we're using this as our indicator of whether we've seen these events before and won't bother checking individual events like we do now. This seems sensible, I think, and avoids overhead, although it places more confidence in our having a consistent state!


if err == sql.ErrNoRows {
// wait till head is processed and retry
if err := ci.waitTillHeadIndexed(ctx); err != nil {
Member

The msg could be in the mpool, right? that's about the only other place that legitimate messages might be. This call will block for up to 30s, is that a reasonable thing to do in this case? Shouldn't we just bail and say "not found" earlier under the assumption that we'll pick it up in the mpool (when you eventually implement that)?

Contributor Author

@rvagg Eventually, mpool msgs will also be in this DB once I wire it up. This is to handle the case where the user is asking for something in the HEAD.

return ems, nil
}

func (ci *ChainIndexer) GetMsgInfo(ctx context.Context, msg_cid cid.Cid) (MsgInfo, error) {
Member

hopefully the linter is going to have issues with your underscore in msg_cid

Contributor Author

@aarshkshah1992 aarshkshah1992 Aug 15, 2024

Sorry, trying out some new tooling. Will fix.

}

func (ci *ChainIndexer) waitTillHeadIndexed(ctx context.Context) error {
ctx, cancel := context.WithTimeout(ctx, 30*time.Second)
Member

should we just use EpochDurationSeconds here?

Contributor Author

@rvagg Honestly, this has nothing to do with an epoch. I think we can change the timeout to 5 seconds. The idea is that the index should reflect the heaviest block we have in our statestore, that's it. We're not waiting for a new block to come in here.

@rvagg
Member

rvagg commented Aug 15, 2024

The "waitTill" up to 30s concerns me a little in these APIs. I think the problem we're trying to solve is the async updating of the index, correct? So we might be querying something that's just happened but it hasn't propagated to the db. But these waits could be up to 30s to find out if something just doesn't exist.

I wonder if there's an alternative strategy here in checking with the status of the chain synchronously. We could do something like:

  1. Check if what we want is there (like currently); return if it is, continue if it's not
  2. Find out what the heaviest tipset is with GetHeaviestTipSet which I believe is updated synchronously with execution
  3. Do a "wait for tipset" on that particular tipset
  4. Re-do the query and return that result

Then we deal with the async updating but only have to wait until what has already been processed has propagated. Does that work?

@aarshkshah1992
Contributor Author

@rvagg That is the plan and what I'm trying to do here, i.e. ensure that the index has indexed the current heaviest tipset in the chainstore before retrying a read if the first read attempt fails. We don't need to wait for a new block to come in here. I will update the code/docs/timeout to make this clearer.

reverted INTEGER NOT NULL,
message_cid BLOB NOT NULL,
message_index INTEGER NOT NULL,
events_processed INTEGER NOT NULL,
Member

what's the purpose of this field btw?

Contributor Author

@aarshkshah1992 aarshkshah1992 Aug 15, 2024

@rvagg Eventually, want the event read API to differentiate between "events processed and no events" vs "yet to process events". This is a simple denormalised field that will help with that.

Contributor Author

@aarshkshah1992 aarshkshah1992 Aug 15, 2024

For a tipset T, events in T will be processed when we index a tipset U such that Parent(U)=T, i.e. when tipset T is first "executed" and its execution tipset is indexed. This field just makes it easy to track that.

Member

but do we use this anywhere? I'm not seeing it, what's the plan?

Contributor Author

@aarshkshah1992 aarshkshah1992 Aug 15, 2024

@rvagg Yeah lemme add a "ReadEventsAPI" to show how it'll be used and wire it into the EventFilterManager.

Contributor Author

This PR was to show the general direction I am taking. Next one will be code complete.

@aarshkshah1992 aarshkshah1992 changed the title from "[WIP Draft] POC for a chain Index that replaces the (msg, tx, event) Index" to "[WIP] A new ChainIndexer that replaces the fragmented MsgIndex, EthTxHashIndex and EventsIndex" Aug 16, 2024
@rjan90
Contributor

rjan90 commented Aug 16, 2024

Please update the PR title to match https://github.com/filecoin-project/lotus/blob/master/CONTRIBUTING.md#pr-title-conventions

Maybe we should allow the [WIP] prefix in PR-titles as well, so that this PR-checker is not so noisy?

@aarshkshah1992
Contributor Author

@Stebalien In case you feel like taking a look.

@BigLep
Member

BigLep commented Aug 16, 2024

Maybe we should allow the [WIP] prefix in PR-titles as well, so that this PR-checker is not so noisy?

@rjan90 : I think ideally we'd allow [WIP] in the title when the PR is in draft but then have a tighter check when it's out of draft?

@BigLep
Member

BigLep commented Aug 16, 2024

Thanks for the awesome writeup @aarshkshah1992 about where this is going and what is remaining. Very thorough but easy to understand. Good stuff! I'm excited to see this land.

@github-actions github-actions bot left a comment

@aarshkshah1992
Contributor Author

@github-actions No sorry, cannot bro ❤️

@BigLep BigLep mentioned this pull request Sep 9, 2024
7 tasks
@aarshkshah1992
Contributor Author

The goal of this PR was to implement a POC for the ChainIndexer to flesh out the problem space and get initial feedback.

This PR has now been subsumed by #12421.

The description has been ported over to an issue created for the ChainIndexer work at #12453.
