Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Off-Chain] ModuleParamsClient & Historical Params #543

Open
4 tasks
bryanchriswhite opened this issue May 21, 2024 · 1 comment · Fixed by #993 · 4 remaining pull requests
Open
4 tasks

[Off-Chain] ModuleParamsClient & Historical Params #543

bryanchriswhite opened this issue May 21, 2024 · 1 comment · Fixed by #993 · 4 remaining pull requests
Assignees
Labels
off-chain Off-chain business logic

Comments

@bryanchriswhite
Copy link
Contributor

bryanchriswhite commented May 21, 2024

Objective

Enable support for off-chain processes to cache on-chain parameters and react to parameter updates asynchronously.

Origin Document

  1. Off-chain actors currently query the network much too frequently.

  2. Observations made while working on [Protocol] Add a NumBlocksPerSession governance parameter #517. In order for off-chain processes to use the correct governance params in the appropriate places in the protocol, they MUST be aware of the most recent param values. In some cases, off-chain processes MUST also know the historical values of some params.

Assumptions

  1. It doesn't make sense for changes to protocol parameters to take effect during a session.
  2. We SHOULD limit the amount of historical data that is persisted to only what is necessary; let external indexers support non-essential indexing.

Example

A param update message is committed in the middle of a session, modifying num_blocks_per_session & min_relay_difficulty_bits by the time a proof is submitted & validated on-chain. The num_blocks_per_session and min_relay_difficulty_bits at the time of the session to which the proof corresponds (i.e. will not be the current session) MUST be used. There may also be off-chain use cases (presently or in the future) which require previous values for params.

sequenceDiagram

participant SessionQueryClient as Supplier
participant Poktrolld
participant PNF

Note over SessionQueryClient,PNF: Session N Start

PNF->>Poktrolld: Update params

break
    SessionQueryClient-->Poktrolld: (wait for sesison end)
end

Note over SessionQueryClient,PNF: Session N End

break
    SessionQueryClient-->Poktrolld: (wait for grace period end)
end

Note over SessionQueryClient,PNF: Session N Grace Period End

SessionQueryClient ->> Poktrolld: Create claim

break
    SessionQueryClient-->Poktrolld: (wait for next grace period end)
end

Note over SessionQueryClient,PNF: Session N+1 Grace Period End

SessionQueryClient ->> Poktrolld: Submit proof

Poktrolld->>Poktrolld: Validate proof

Loading

Goals

  • To support off-chain reactivity to parameter updates (in-between sessions).
  • Support on-chain persistence (& pruning) of historical param values.

Design Choices

On-chain persistence of historical params

When a param update message is committed:

  1. The corresponding on-chain params value(s) are updated on commit.

  2. The previous param value(s), the current session number, & the height at which the param update message is committed are stored in the on-chain state. There are a couple of options for how to persist historical params that effect what their usage would look like:

    1. Add a cache of previous values to the existing params data structure
      • History: map[updateHeight]Params{}
    2. Persist historical param values in a new multistore store (w/ height & param name indices)

i. When an off-chain client queries for params, the response includes the current param values, as well as previous values, the last session number that each previous was set, and the height at which each previous value was updated (no longer valid after). An additional parameter would be introduced to determine the "retention block count" for (each or all?) param(s).

Pros

  • Does not require additional on-chain query logic but would support it.
  • Consistent with gateaway undelegation logic

Cons

  • Increases the size of all params query responses which support historical values.

ii. An off-chain client could query would make distinct queries whether querying for current params or historical params. Historical param queries would take a height argument & MAY return an error if the requested height is older than the "retention block count"

Pros

  • Could be more easily encapsulated (e.g. in a params module) to consolidate common behavior across modules.

Cons

  • Requires additional multistore store, key prefixes, etc.

Another degree of freedom is open regarding whether or not to add methods for querying param values a specific height, and if so, whether those methods should be on-chain, off-chain, or both.

  1. On-chain only (e.g. aroundKeeper#Params() & QueryParamsRequest)
    • Could apply to both i or ii.
  2. Off-chain only (e.g. on the Params go protobuf struct itself)
    • Only applies in i, where historical params are present on the Params data structure.

ModuleParamsClient

  1. Use the respective module's QueryClient to query for the params initially, followed by subscribing to block events via a BlockClient (or its EventsReplayClient deconstruction used with block.UnmarshalNewBlockEvent()) such that updated params may be emitted by an observable & a cache is accumulated over time. Params at height which are not in this cache MUST be queried for on-chain.

    Pros:

    • Easy to use & versatile
    • Compatible & consistent with other async client behavior

    Cons:

    • May beg refactoring in the block client. It would be convenient to have a method that returns an observable of block.CometNewBlockEvents (see: session_steps_test.go:131).
  2. Use a BlockClient & the respective module's QueryClient to query for the params at the start height of each session.

    Pros:

    • Reduces frequency (as compared to naive implementation) of queries

    Cons:

    • More imperative
    • Increased query overhead
    • Likely reduces to 1 just without the cache

Deliverables

  • ...

Non-goals / Non-deliverables

  • ...

General deliverables

  • Comments: Add/update TODOs and comments alongside the source code so it is easier to follow.
  • Testing: Add new tests (unit and/or E2E) to the test suite.
  • Makefile: Add new targets to the Makefile to make the new functionality easier to use.
  • Documentation: Update architectural or development READMEs; use mermaid diagrams where appropriate.

Creator: @bryanchriswhite

@bryanchriswhite bryanchriswhite added the off-chain Off-chain business logic label May 21, 2024
@bryanchriswhite bryanchriswhite self-assigned this May 21, 2024
@bryanchriswhite bryanchriswhite moved this to Draft in Shannon May 21, 2024
@Olshansk Olshansk moved this from Draft to 🔖 Ready in Shannon Jul 19, 2024
@bryanchriswhite bryanchriswhite moved this from 🔖 Ready to 🏗 In progress in Shannon Oct 9, 2024
@bryanchriswhite bryanchriswhite moved this from 🏗 In progress to 🔖 Ready in Shannon Oct 14, 2024
@Olshansk Olshansk moved this from 🔖 Ready to 🏗 In progress in Shannon Nov 4, 2024
@bryanchriswhite
Copy link
Contributor Author

bryanchriswhite commented Jan 17, 2025

Current Status

TL;DR, parking this work as-is for now to shift focus. Cold-start edge case IS critical-path; therefore, E2EApp testutil is an imminent dependency.
(Given that adding substantial complexity to the implementation solely to avoid the dependency for this case is NOT a rational outcome).

The current state of this work implements off-chain historical and non-historical, in-memory caching with event-driven cache warming (via comet RPC tx event query subscription). Given the options at this juncture (see the mindmap below), my view is that it would be too wasteful (and create unnecessary uncertainty) to try to postpone event-driven cache warming. Additionally, that branch of the decision tree is independent from that of how to test the cold-start edge case, which I am going to argue IS critical-path, especially in the alpha/beta regimes where there's a very non-zero possibility that binaries may getting bounced more than otherwise expected.

Options From Here

  1. Add feature flags to disable the new functionality by default while allowing the PRs to be merged.
  2. Park the PRs for a bit.

My view is that 1. requires extra work prior to merging and may introduce bugs. As the client caching architecture is fairly thoroughly described by class diagrams, collectively across the individual PRs, and since there's no currently planned work that should significantly overlap with these changes (because it's a foundational optimization), there's no good reason to try to rush a merge right now.

Related PRs:

Decision Tree

mindmap
    (event-driven cache warming)
        (POSTPONE event-driven cache warming<br>⏳⏳⏳⏳)
            (MUST SetVersion on cache miss)
                (how to know latest height?<br>❓❓❓)
                 (add BlockQueryClient dependency)
                    (what query frequency?<br>❓❓❓)
                        (block time - consensus param)
                    might as well use BlockClient subscription?<br>🤔
                        (BlockQueryClient depends on BlockQueryClient)
                        (BlockClient would waste more bandwidth)
        (IMPLEMENTATION OPTIONS<br>🏗️🚧)
            (subscribe to CONTAINS MsgUpdateParam<br>✅ ⭐⭐⭐⭐⭐)
                (TESTING)
                    (depends on E2E app testutil - websocket server)
            (subscribe to new block events)
                (OVERFETCHING)
                    (MOST blocks WON'T contain param update messages but WILL contain other irrelevant messages and events<br>❌ 💣💣💣💣💣)
        (cold-start edge case)
            ("query ABCI RPC (cometbft)")
                (TESTING)
                    (depends on E2E app testutil)
                        (gRPC gateway<br>❓❓❓)
                        (combetbft RPC server<br>❓❓❓)
                    (OR depends on configurable cache)
                        (requires many additional generic parameters)
Loading

@bryanchriswhite bryanchriswhite linked a pull request Jan 20, 2025 that will close this issue
15 tasks
@bryanchriswhite bryanchriswhite moved this from 🏗 In progress to 📋 Backlog in Shannon Jan 20, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment