Pass continuation for handling ledger events #402
base: main
Conversation
Now, the handler needs to be threaded down to 'addBlockSync' and back up to 'runWith' to be usable from the node. The goal for now is simply to print ledger events on the console of a running node.
…tack - I've skipped 'chainSelectionForFutureBlocks' and 'addBlockAsync', which seem not relevant to the use case of streaming events down to clients. Those functions are used in anticipation when preparing blocks to apply from the mempool, and should likely not lead to any event notification. - Similarly, the handler is set to 'const $ pure ()' in initialization functions which are simply replaying the database.
So that we can use the newest 'local' ouroboros-consensus-diffusion for building the node, at the same time as our newly patched consensus.
This is to prepare adding arguments to index events by block hash/slot number
@@ -165,6 +169,17 @@ class ( -- Requirements on the ledger state itself
  applyChainTick :: IsLedger l => LedgerCfg l -> SlotNo -> l -> Ticked l
  applyChainTick = lrResult ..: applyChainTickLedgerResult

  -- | Handler for ledger events
  newtype LedgerEventHandler m l =
This very much seems like a `Tracer`. Would you object to using the `Tracer` interface for it?
Is `l` somehow more convenient in your use case than the more primary `blk`? E.g. you could have `blk -> AuxLedgerEvent (LedgerState blk) -> m ()` (or maybe it's `ExtLedgerState`).
`Tracer` conveys the wrong semantics, doesn't it? I agree that the interface is similar, but they are used for vastly different things (although you could simply trace events as well, true).
Regarding the `l` parameter, I recall we had to stick to `l` because some functions in the consensus are really abstract and don't even have `blk` in context.
> Is l somehow more convenient in your use case than the more primary blk?

If we replace `l` with `blk`, then the line `(lrEvents result)` won't compile, because it needs the type parameter `l` and not `blk`.
@@ -85,10 +87,11 @@ withDB
     , ConvertRawHash blk
     , SerialiseDiskConstraints blk
     )
-  => ChainDbArgs Identity m blk
+  => LedgerEventHandler m (ExtLedgerState blk)
+  -> ChainDbArgs Identity m blk
Can you add the `LedgerEventHandler` to the `ChainDbArgs` instead of passing it alongside them?
ApplyVal b -> do
  result <- either (throwLedgerError db (blockRealPoint b)) return $ runExcept $
    tickThenApplyLedgerResult cfg b l
  forM_ (lrEvents result) $
Am I correct that this is the only place you're actually invoking the given event handler?
Correct. Though not every call to `applyBlock` will have a handler. In some situations, such as when blocks are re-applied after a restart or when blocks are applied to our own local chain, we pass down `discardEvent` to avoid yielding unnecessary events.
And we'd love your insights on that. The idea is to only yield events once for each block -- modulo rollbacks.
@@ -180,6 +181,9 @@ data RunNodeArgs m addrNTN addrNTC blk (p2p :: Diffusion.P2P) = RunNodeArgs {

      -- | Network PeerSharing miniprotocol willingness flag
    , rnPeerSharing :: PeerSharing

      -- | An event handler to trigger custom action when ledger events are emitted.
This current diff is invoking this whenever chain selection validates a block. It's unclear to me that that is the most useful thing to do.
As such, would you reword and expand this comment to specify (in plain terms that your anticipated users of this feature would be likely to correctly interpret---not just eg Consensus team members) exactly which ledger events will be passed to this handler?
For example, I would want to double-check whether the events for blocks on A would be emitted twice if a node switches from A to B and then back to an extension of A. And is that the desired behavior?
Does the client need to be informed when blocks whose events were previously emitted are rolled back? Or are they supposed to track that on their own, based on the emitted hashes and slots?
Do they definitely need events for the volatile blocks? Or could you instead side-step these questions by only ever emitting events for a block when it becomes part of the immutable chain?
> @nfrisby This current diff is invoking this whenever chain selection validates a block.

Correct, although the validation is what's driving the addition of blocks to the volatile db. The intended behavior here really is to emit an event when a block is added to our local db.

> @nfrisby exactly which ledger events will be passed to this handler?

Not sure I understand the question. This will pass ALL ledger events as emitted by the ledger. Which events are emitted depends on the era and the relative time within the epoch. Are you asking to list all those in the comments here? (At the risk of creating a discrepancy as soon as the ledger adds / removes events?) -- We could perhaps link to the relevant section of the ledger?
> @nfrisby For example, I would want to double-check whether the events for blocks on A would be emitted twice if a node switches from A to B and then back to an extension of A. And is that the desired behavior?

That is the desired behavior. In this model, we let clients cope with this, and that's why events are emitted alongside a block header hash and a slot number. With this added information, clients should be able to figure out by themselves that a chain switch has occurred and that they need to roll back some of their internal state.

> @nfrisby Do they definitely need events for the volatile blocks? Or could you instead side-step these questions by only emitting events for a block when it becomes part of the immutable chain?

That's a good question which can only be answered by use cases, I believe. Only relying on the immutable chain means that we are always lagging ~18h in the past. This may or may not be sufficient depending on the use case (and I need to give it some thought for my own use case 🤔 ...). I believe that emitting events for volatile blocks is the most flexible: if clients need immutability, they can wait for k blocks. The opposite isn't possible if we only emit events for immutable blocks.
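A minimal sketch of the "wait for k blocks" idea (hypothetical client-side code, not part of this PR): hold the per-block event batches in a buffer of depth k, and treat a batch as immutable only once k newer blocks sit on top of it. A rollback of depth at most k then only ever discards batches still in the buffer.

```haskell
module Main where

-- Client-side sketch (hypothetical; not part of this PR): buffer the
-- per-block event batches and release one as "immutable" only once k
-- newer blocks have been received on top of it.
data Buffer a = Buffer Int [a]  -- security parameter k, newest first

emptyBuffer :: Int -> Buffer a
emptyBuffer k = Buffer k []

-- Push a new batch; if the buffer then holds more than k entries,
-- the oldest one is released as immutable.
push :: Buffer a -> a -> (Buffer a, Maybe a)
push (Buffer k xs) x =
  let xs' = x : xs
  in if length xs' > k
       then (Buffer k (init xs'), Just (last xs'))
       else (Buffer k xs', Nothing)

main :: IO ()
main = do
  let step (buf, out) x =
        let (buf', released) = push buf x
        in (buf', out ++ maybe [] pure released)
      -- With k = 2, batches 1 and 2 become immutable once 3 and 4 arrive.
      (_, immutable) = foldl step (emptyBuffer 2, []) [1, 2, 3, 4 :: Int]
  print immutable
```

A real client would of course use the actual security parameter (2160 on mainnet) and index batches by hash/slot rather than integers.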
> exactly which ledger events will be passed to this handler?
> Not sure I understand the question.

I meant the ledger events from "exactly which" blocks will be passed to the handler. E.g. you could say "every block the Consensus Layer chooses to validate". However, that's probably too opaque for the end user. So how to describe it more clearly than that?
I think dispatching events to the consumer at the point where chain selection occurs is the behaviour we want. In this current design, we want something that's minimally intrusive, and we therefore lay the burden of storing, indexing, and managing events on the client, providing them "all" the relevant information, e.g. the block/slot and the event itself.
In any case, clients of a blockchain need to be able to understand and correctly handle rollbacks and forks. The block/slot index makes it easy to correlate the data of a block as provided by `ChainSync` with the events this block generated.
Not sure if this answers your questions @nfrisby :)
Yeah, I think so. I can imagine the intended spec is something along these lines:

An event handler that will be called with the ledger events arising from the validation of a block that was ever one of the k latest blocks on the node's selection. Note that this means the stream of ledger events will be affected by rollbacks, and that it will exclude events from blocks that were older than k (e.g. the entire historical chain) when the node started.

At the moment, though, it will also (with this diff) include events from blocks that were never (yet) selected, since currently the node does validate blocks from the near future even before they are selectable 😬 (that's something we're working to stop).
@nfrisby we've been quite careful to not wire in the event handler in functions like `chainSelectionForFutureBlocks` and `addBlockAsync`, to avoid precisely what you're describing.
Is this "pointless" in the sense that the selection of future blocks will still happen through other execution paths?
Sorry for the unbundled comments above; I wasn't expecting to do a full pass, but I did.
Just to clarify: I have not yet considered whether this seems like an appropriate overall architecture. On the other hand, maybe that ship has already sailed, given the existence of your PRs in this repo and others. I still will need to discuss it with the Consensus team and other architects, etc.
Perhaps we can discuss it in a Consensus office hours? cc @KtorZ @abailly-iohk @dcoutts @dnadales
@@ -165,6 +169,24 @@ class ( -- Requirements on the ledger state itself
  applyChainTick :: IsLedger l => LedgerCfg l -> SlotNo -> l -> Ticked l
  applyChainTick = lrResult ..: applyChainTickLedgerResult

  -- | Handler for ledger events
  newtype LedgerEventHandler m l blk =
@KtorZ Had to add the additional `blk` in order to get the previous block header hash. Doesn't seem like we can achieve it with just `l`.
That's odd. Why can't the previous header hash simply be a `HeaderHash l`? There's no structural or semantic difference between the current header hash and the previous one 🤨
> There's no structural or semantic difference between the current header hash and the previous one

There is a structural difference: the previous hash might be `Genesis` instead of actually a block's hash. (Though you could collapse that case down to the genesis block's hash as a block's hash, but that's not a built-in behavior we do ubiquitously/automatically.)
Ah. Correct. Bummer.
…clients can now know if a rollback occurred, or if they missed an event.
-> HeaderHash l        -- Block header hash of the applied block
-> SlotNo              -- Slot number of the applied block
-> BlockNo             -- Applied block number
-> [AuxLedgerEvent l]  -- Resulting 'AuxLedgerEvent's after applying 'applyBlock'.
@KtorZ Changed it to a list so that clients can know if they missed an event or if a rollback occurred.
@nfrisby Thanks a lot for such a quick review! The ship hasn't actually sailed; this is really preliminary work to gather feedback and insights from consensus and node experts. As pointed out by Matthias, the way to actually shovel that information towards clients through the node is an open space: we chose a very simple solution that works for a simple use case, but there might be a lot of other options to consider and experiment with (have an
:: ChainHash blk   -- Previous block header hash
-> HeaderHash l    -- Block header hash of the applied block
-> SlotNo          -- Slot number of the applied block
-> BlockNo         -- Applied block number
This is becoming crowded here. I believe that we can replace `SlotNo -> BlockNo -> HeaderHash l` by a `Tip blk`.
@@ -180,6 +181,9 @@ data RunNodeArgs m addrNTN addrNTC blk (p2p :: Diffusion.P2P) = RunNodeArgs {

      -- | Network PeerSharing miniprotocol willingness flag
    , rnPeerSharing :: PeerSharing

      -- | An event handler to trigger custom action when ledger events are emitted.
Regarding the long-term plan: I think this data flow would be unnecessary if the ChainDB instead stored the ledger events alongside each block. Then those events could simply be served alongside the (potentially immutable) blocks (and rollback messages!) by the local ChainSync client.
But that would of course require the additional on-disk database that the PR description posited.
So I wonder whether the effort should be spent there, instead of on what currently seems to me to be a stop-gap measure.
> what currently seems to me to be a stop-gap measure.
What do you mean?
While a dedicated database alongside the `ImmutableDB` and friends would be desirable, as it would remove the need for clients to track the data themselves, the proposed solution has the advantage of simplicity, and of not adding an additional responsibility on top of an already pretty busy consensus.
In practice, we thought a client that crashes or wants to catch up could simply reset the node to before the desired point and let the synchronisation magic happen. Doing it manually is error-prone but relatively straightforward; of course, one would want a tool for that, but this is orthogonal to streaming the events.
And having this solution in place does not preclude working on improvements, e.g. storing the events and serving them through a custom protocol or alongside the `ChainSync`
…meter for LedgerEventHandler
Right, events must be stored if we want a design that does not involve clients computing them. There are then two approaches to that:

The advantage of 1 is that it avoids ever having two copies of the ledger state (currently in memory, but in future on disk). The disadvantage of 1 is that it goes against one of the general design principles of the node: have the node only compute and store what it needs.

The advantage of 2 is that it does go with the existing node design principles, where the node provides the raw data and other clients provide the data in other ways that are more useful to applications. That's things like db-sync, other indexers, Ogmios and other event APIs.
@dcoutts I'd honestly be okay with that responsibility falling on a client if-and-only-if there are easy ways for a client to roll back the node to a specific state. This becomes necessary in rare situations where connections are lost and one needs to recover some lost events. At the moment, this is only achievable by manually removing
@KtorZ if a dedicated client is doing it, i.e. storing all the events and making them available to other applications (e.g. something like Ogmios), then there's no need for any changes in the node at all. It is already possible to follow the chain with a ledger state, as cardano-db-sync demonstrates. (Yes, it should be easier to write such applications using the Cardano API, but that's another story.) So imagine a client with a combo of features from db-sync and Ogmios:

This is possible today, without any node changes (though it could be easier with some Cardano API improvements). The key thing is: don't insist that there is only ever one copy of the ledger state. Allow this dedicated client to have its own copy. Then there are no tricky synchronisation problems or extra features needed from the node, or need to wind back or anything.
It is possible but impractical (hence the current PR), as clients must hold on to the ledger state and keep a (multiple-GB) copy of it in memory (surely UTxO on-disk storage will alleviate that resource constraint, but it will come at the cost of much more implementation complexity).

It is, but the developer experience is atrocious. What you're describing is the current state of affairs, which we deem unsatisfactory. The current situation (having a client follow the chain and replicate the ledger state) is precisely why we seek alternative solutions.

If it were easy to build such a copy without the cardano-ledger libraries, then I could hear the argument. Nevertheless, as it stands, the specs are insufficient to build another valid implementation. And even with complete and precise specs, this is a costly and cumbersome endeavor, which we generally do not want to undertake lightly. So what you're describing is only possible for clients written in Haskell, which excludes a large part of the ecosystem. Rather than re-doing the ledger calculations (be it in Haskell or not), it seems to me that working out an API to make ledger events accessible is a relatively low-hanging fruit. As demonstrated by this PR, a couple of days of work suffices. And allowing the node to roll back its volatile/immutable db should be quite straightforward as well. Surely, bundling that as a full mini-protocol is more work. But this isn't what we're asking for here.
I think that's a key point here, thanks for emphasizing it @KtorZ! And all the other points of course :) This proposed change is a first step, the beginning of a journey towards making Cardano data more open. It's a very minimal change to the consensus, which might or might not lead to changes in the mini-protocols, in the node, in specialised versions of the node, in clients...

You're missing my point. I also agree that applications need to be in any language the author chooses. The architecture is intended to look like this: node -> api adaptor -> application. API adaptors here include things like db-sync and Ogmios. Yes, it is intended that these adaptors be written in Haskell (using the appropriate Cardano libraries) and provide language-neutral APIs to other applications. So I'm not proposing you write all your applications in Haskell. I am proposing that this adaptor be written in Haskell (just as it would be if the feature were somehow integrated in the node).
Note that it does not get one any closer to the desired design, including the design proposed in this ticket.

This design idea does not need this continuation PR; it needs something different and rather larger. Alternatively, the client-based approach I propose needs no change in the node at all (but would benefit from improvements to the Cardano API instead).
This echoes what I wrote in the introduction of this PR, and why I think the current status quo isn't satisfactory:

Surely we can write another adaptor trying to solve some of those problems, but I'd rather not. Even
Perhaps then I am simply challenging that vision. I do see a world where we also have many:

Point being that people do not want to work with middleware. It usually adds cost and complexity (more things can go wrong). Since the community of builders is already walking down that path, why not make it easier for people to harness what's currently available -- especially when these are relatively low-hanging fruit?

I concede that this doesn't get us closer to the "long-term vision" (which we should probably call an "alternative vision") from a design perspective. But it does one important thing: it makes ledger events available so that client applications can start relying on them. Whether they get the events via an event stream from a socket or through a more sophisticated mini-protocol doesn't invalidate that. Clients can start thinking in terms of what's now possible to build with these, instead of being forced into one particular design (i.e. cardano-db-sync).
Correct. This PR is an intermediate solution that's far easier to implement than the mini-protocol approach. It has the main benefit of being non-intrusive and easily turned off in the case of block producers. The reason this PR exists is that far fewer discussions are needed to approve it than it would take to design a complete mini-protocol. So its chances of being accepted are high(er). Now, if we ever get to make a move on the mini-protocol idea, I am more than happy to come, remove, and clean up whatever this PR introduces. Pinky promise.
But that requires the client to run a process which uses more than ~10GB of RAM. That's exactly what we're trying to avoid. If the counter-argument is that we'll eventually have the LedgerState stored on disk, then the question is: when? My guess is that it's going to take a while, and we need a solution now; this solution provides a quick way for clients to get the info. Like @KtorZ said, once LedgerState doesn't use so much memory, we can always come back and delete what this PR introduced.
To clarify, are you referring only to
Let's try and classify the designs here. There are three:

Now I argue that design 1 is not something that people will actually want to use, nor is it something the node team would want to support. It is not nearly as useful as it looks: there's no way to get events for old blocks, so a client cannot start late or reconnect. There's no sensible way to synchronise the client with the node. It also means the node is limited by the speed at which the slowest client can consume the events. This is different from how the local chain sync works, which does not block the progress of the node growing its chain, irrespective of how slowly or late clients consume blocks. Trying to run a node in this mode means it is really being run as a client, and cannot follow Ouroboros in a timely manner.

More generally, these problems are not specific to the proposed intermediate design. As far as I can see, any design that tries to have just a single copy of the ledger state is going to have these problems. It means the ledger state (and thus progress of the chain) has to be synchronised between the node and a client. Yes, it would be nice to avoid needing two copies of the ledger state, but I think that's inevitable for any kind of robust design.
The on-disk storage project has a branch ready for the node that stores the UTxO on disk, and which passes benchmarks for the in-memory backend. We're waiting for system-level benchmarks for the on-disk backend.
@KtorZ as I'm sure you know, the node is complex and has high maintenance costs. Without middleware, we would have to integrate all those middleware features into the node itself. That's not a sustainable architecture; the complexity and resource costs would be unacceptable. It's fine not to use middleware where what the node can sensibly provide natively is sufficient for applications (i.e. just block streaming), but for extra things like indexing, or indeed storing and providing ledger events, the sensible thing is to use middleware. The real problem, imho, is that we've not put enough effort into making it easy to develop such middleware. It should be easy to write such things using the Cardano API, but it's currently much harder than it needs to be. We've got a proof of concept with

BTW, how does the intermediate design support switching forks? With local chain sync, that's built in, but with live streaming? 😬
@dcoutts Thanks for engaging in the above discussion. Regarding your latest question:

This is something we did discuss during the Consensus Office Hours yesterday. I have an action item to write the summary as a comment on this PR, but I've been preparing for my flight to Paris that departs in a few hours, so that will hopefully happen before Monday.

My notes summarizing our Consensus Office Hours call on Oct 12 were far too big for a GitHub comment, and deserve review. So I opened a dedicated draft PR for them: #440. Edit: we combined that PR into this one for simplicity; see the
Problem
The Cardano ledger rules are complex and aren't getting any simpler. This complexity stems from the rules themselves and from the various patches that occurred during the development lifecycle (e.g. intra-era hard forks). Some of the information computed by the ledger is made readily available to client applications through the local-state-query protocol.
However, this comes with a few limitations:

1. The local-state-query protocol can only query information that the ledger still has. Yet, the ledger can only rewind up to $\frac{3 \times k}{f}$ slots in the past. This pertains to the security of the consensus protocol and the need to roll back to some old state -- but not one that is too old. Accessing historical data from the ledger is, therefore, not possible.

2. The information exposed via the `local-state-query` protocol is already aggregated in a way suitable for the ledger, but not necessarily for client applications.

To cope with (1) and (2), the ledger eventually introduced Ledger Events into the small-steps semantics. For a couple of eras now, the ledger has emitted them as it validates blocks. Events contain various kinds of useful information. However, at the moment, ledger events are only available to clients who have access to the entire ledger state, through a fold-like Haskell API: `foldBlocks`. This is, in particular, used by `cardano-db-sync`.

Yet, this method is inconvenient, for it requires client applications to hold a copy of the ledger state in memory and redo all the calculations. This dramatically increases the resources needed by the application, even for Haskell programs (as of today, the ledger state(s) on mainnet require 13-15GB of RAM!). Moreover, it is simply unusable outside of the Haskell landscape, for it is a Haskell-only interface.
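For concreteness, plugging in parameter values not stated in this PR (mainnet's k = 2160 blocks, active slot coefficient f = 0.05, and one-second slots), the rewind window works out to roughly a day and a half:

```latex
\frac{3 \times k}{f} = \frac{3 \times 2160}{0.05} = 129\,600 \text{ slots} \approx 36 \text{ hours}
```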
Solution
Command-sourcing vs. Event-sourcing
From the point of view of software architecture, the Cardano ledger (and node) are designed as a command-sourced system: what's persisted is a sequence of blocks, which act as commands, whereby each block applied to a ledger state yields a new version of the ledger state. This works because the rules that govern the application of blocks to the ledger state are deterministic, thus guaranteeing that applying the same sequence of blocks to the same starting point will yield the same final state on any system running nodes of the same version.

This is great, but it has one drawback: in order to compute the state-transition function, fully or partially, one has to know the exact set of rules to implement; e.g. one has to run a ledger, which is not a trivial thing to do, as it requires a lot of resources (see above) and is not easily portable.

Events, on the other hand, do not require such knowledge, because they are the results: there's nothing more to compute; they are the direct output of the consensus and do not need to be verified. They are therefore intrinsically easier to understand, more portable, and can be interpreted partially in any way the client sees fit.

The proposed change combines the benefits of both approaches, making Cardano even more flexible and open.
State of affairs
Since the ledger/node already calculates and emits those events (by nature, it has access to the ledger state), our idea is to bubble those events up into a language-agnostic interface usable by local clients. We observed that events are currently discarded by the consensus layer, which remains the main driver of block application. From the abstract API of the consensus, we see the following:
https://github.com/input-output-hk/ouroboros-consensus/blob/49c7f76175b431ba4e9d16aa959db234cc6772bd/ouroboros-consensus/src/ouroboros-consensus/Ouroboros/Consensus/Ledger/Basics.hs#L71-L75
https://github.com/input-output-hk/ouroboros-consensus/blob/49c7f76175b431ba4e9d16aa959db234cc6772bd/ouroboros-consensus/src/ouroboros-consensus/Ouroboros/Consensus/Ledger/Abstract.hs#L80-L89
`LedgerResult` contains both the new ledger state (`a`, instantiated later) and a list of all events resulting from this block application. However, those events are, in fact, discarded by the actual implementation of the abstract interface:
https://github.com/input-output-hk/ouroboros-consensus/blob/49c7f76175b431ba4e9d16aa959db234cc6772bd/ouroboros-consensus/src/ouroboros-consensus/Ouroboros/Consensus/Ledger/Abstract.hs#L160-L166
Ledger events continuation
Using a continuation-passing style approach, we want to provide a handler which can be executed for each event. With a continuation, we can easily turn off event handling in situations where it does not matter or could even be harmful to performance (e.g. on block producers). It also strikes the right balance between flexibility and intrusiveness, as it only requires threading a simple function from the top level of the call stack down to `applyBlock`.

So, we introduce the following continuation:
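A sketch of that continuation, based on the diff in this PR (the field name and the discarding helper are assumptions for illustration):

```haskell
-- | Handler for ledger events (sketch; field name assumed).
newtype LedgerEventHandler m l =
  LedgerEventHandler
    { handleLedgerEvent :: AuxLedgerEvent l -> m () }

-- A handler that discards all events, e.g. for block producers
-- or when replaying the database on startup.
discardEvent :: Applicative m => LedgerEventHandler m l
discardEvent = LedgerEventHandler (const (pure ()))
```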
Where `m` is eventually `IO`, and `AuxLedgerEvent l` a multi-era ledger event GADT from the `cardano-ledger`. From the consensus standpoint, this bubbles up to the `NodeArgs` record object as a new field.

The `NodeArgs` is ultimately created by the `cardano-node` wrapper, where the consensus, ledger, networking and Plutus layers get stitched together.

## Cardano-node event hook
From there, it becomes straightforward to expose the ledger events in the cardano-node directly, by hooking into `handleNodeWithTracers` and `runNode`.

At the moment, our proof of concept mounts a rudimentary server listening on a TCP socket and streaming all events back to a client application. All events are indexed by the slot number and the header hash of the block which "produced" them. This index is necessary to allow clients to detect and deal with chain switches (a.k.a. rollbacks). With such an interface, filtering and aggregating events is thus the client's responsibility.
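To illustrate what "dealing with rollbacks" means on the client side, here is a minimal sketch (all names hypothetical): events are kept indexed by the point of the block that produced them, and a rollback to a given slot discards everything recorded after it.

```haskell
import Data.Word (Word64)
import Data.ByteString (ByteString)

-- Hypothetical client-side index: (slot, header hash) of the block
-- that produced the event.
type Point = (Word64, ByteString)

-- On a chain switch, drop every event recorded after the rollback slot.
rollbackTo :: Word64 -> [(Point, event)] -> [(Point, event)]
rollbackTo slot = filter (\((s, _), _) -> s <= slot)
```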
To make the event handler optional, we've integrated it into the cardano-node command line as a new optional flag, `--ledger-event-handler TCP/PORT`. When present, the option should point to a port to which the cardano-node will bind and stream events. If omitted, we fall back to the original behaviour of the node, which is to ignore all ledger events.

## Ledger event serialization
We had to settle on a transport format to write events to a socket/file and stream them to a client. We've opted for CBOR, which is already pervasive across the Cardano interfaces and well-suited to network serialization. We have therefore added CBOR encoders and decoders for most ledger events. These currently live in a fork of the cardano-node, with the intention of eventually moving into the cardano-ledger itself.
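As a sketch of the framing (hypothetical; the actual encoders live in the fork mentioned above), each event could be wrapped with its index using the `cborg` primitives:

```haskell
import Codec.CBOR.Encoding (Encoding, encodeListLen, encodeWord64, encodeBytes)
import Data.Word (Word64)
import Data.ByteString (ByteString)

-- Hypothetical framing: a 3-element array [slot, headerHash, event],
-- where 'encodeEvent' is the era-specific event encoder.
encodeIndexedEvent :: Word64 -> ByteString -> Encoding -> Encoding
encodeIndexedEvent slot headerHash encodeEvent =
     encodeListLen 3
  <> encodeWord64 slot
  <> encodeBytes headerHash
  <> encodeEvent
```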
We have documented the on-the-wire format of all the mapped events using a CDDL specification, adding descriptions and details from the (outdated) LedgerEvents.md document in the cardano-ledger, as well as from our own understanding gained by reading the ledger rules and asking questions. We invite the ledger team to review that document.
## Use-cases
### Explorers: historical rewards
Many tax jurisdictions require Ada holders to keep track of their rewards. As we explained earlier, this currently limits the tooling options to `cardano-db-sync`, which introduces a massive resource cost on any machine. Most commercial laptops cannot afford to run both the cardano-node and cardano-db-sync due to their resource usage. For third-party services which provide such access, the cost of running this infrastructure is also significant. It also adds complexity and increases the probability of faults; there have been several reports of bugs and inaccuracies in the past due to the logic duplication between the cardano-ledger and cardano-db-sync.

By streaming events directly from the node, we reduce complexity, resource usage, and the risk of "getting it wrong". The Cardano Foundation is particularly interested in this feature to power its new explorer, and in a number of similar use cases to help onboard financial institutions.
### Mithril: Stake distribution & more
The Mithril signature depends on knowledge of the stake distribution, which is currently acquired through the relevant `cardano-cli` command. However, since a mithril-signer or aggregator needs access to a cardano-node anyway to know what data to sign, it would make sense for it to interact with the node in a more direct way, e.g. using the mini-protocols.

Being notified of stake distribution changes immediately, ideally through a mini-protocol delivering those events, would streamline the integration between Mithril and Cardano.
Moreover, while mithril-signers currently only sign the full node DB, providing other data "certified" by a Mithril certificate, such as rewards, is certainly planned.
## Long-term vision
With this first proof of concept, we've been able to stream events from the ledger to a file and recompute the entire reward history of some stake credentials using only ledger events. The current proof of concept has a few limitations, which we acknowledge.
While it is sufficient to demonstrate the capabilities, we envision this becoming a more robust mini-protocol akin to the local-chain-sync protocol. Implementing such a protocol requires the node to store all ledger events and to be able to rapidly look them up by chain point (block header hash + slot number). A chunk-based approach with indexes, similar to the one used for the immutable and volatile databases, could work nicely.
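A sketch of how chunked lookup by slot could work, mirroring the immutable DB's fixed-size chunk layout (the chunk size and all names here are hypothetical):

```haskell
import Data.Word (Word64)

-- Hypothetical fixed chunk size, in slots.
chunkSize :: Word64
chunkSize = 21600

-- Which chunk file a given slot's events would live in.
chunkNoForSlot :: Word64 -> Word64
chunkNoForSlot slot = slot `div` chunkSize

-- Relative slot within its chunk, usable as the index key.
relativeSlot :: Word64 -> Word64
relativeSlot slot = slot `mod` chunkSize
```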
This idea is similar to the one proposed in CIP-0078, though we propose an entirely new protocol for the following reasons:
cc @abailly-iohk @koslambrou