Coupling between transaction and chainhead functions #79

Open · josepot opened this issue Aug 13, 2023 · 44 comments
@josepot
Contributor

josepot commented Aug 13, 2023

Problem Statement

Currently, the only way to send a transaction is through transaction_unstable_submitAndWatch. This method undertakes several tasks:

  • Validates the transaction against the latest finalized block.
  • Broadcasts the transaction.
  • Informs the consumer about the transaction's presence in the best and/or finalized block.
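
For illustration, here is a rough sketch (TypeScript) of driving this method over a raw JSON-RPC connection. The method name and event set follow the unstable spec under discussion; the notification name, the endpoint URL and all the surrounding plumbing are assumptions, not normative:

// Illustrative sketch only: raw JSON-RPC usage of
// transaction_unstable_submitAndWatch. The notification method name and
// the exact event set are assumptions based on the unstable spec.
const ws = new WebSocket("wss://example-node"); // placeholder endpoint

ws.addEventListener("open", () => {
  ws.send(JSON.stringify({
    jsonrpc: "2.0",
    id: 1,
    method: "transaction_unstable_submitAndWatch",
    params: ["0x..."], // hex-encoded SCALE transaction (placeholder)
  }));
});

ws.addEventListener("message", (msg) => {
  const { method, params } = JSON.parse(msg.data as string);
  if (method !== "transaction_unstable_watchEvent") return; // assumed name
  switch (params.result.event) {
    case "validated":              // validated against a recent block
    case "broadcasted":            // sent out to peers
      break;                       // progress notifications
    case "bestChainBlockIncluded": // seen in a best block
    case "finalized":              // seen in a finalized block: done
      break;
    case "invalid":
    case "error":
    case "dropped":                // terminal failures
      break;
  }
});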

Despite its comprehensive functionality, a significant challenge arises from its interaction with another function: chainhead_unstable_follow. This function, belonging to a different group, provides crucial information about the latest finalized block, which is indispensable for creating valid transactions.

However, the inherent separation of these function groups and the lack of synchronization mechanisms pose risks:

  1. Lack of Synchronization: If the transaction function lags behind the chainhead, it could result in users being unable to send transactions altogether.

  2. Delayed Events: Events from the transaction function might be delayed or misaligned with the actual block status, rendering them ineffective for many applications. In such cases, users are forced to fetch block bodies manually and verify transaction presence, adding unnecessary complexity and duplication of effort.

In essence, while the chainhead group of functions is properly isolated from other functionalities, the same can't be said for the transaction group. Its dependencies on the chainhead group bring up synchronization and information consistency concerns, which are pivotal for seamless transaction handling.

Proposed Solutions and Concerns

1. Tight Coupling of Function Groups

  • Description: Assume that the transaction group of functions is intrinsically linked to the chainhead group. Hence, if transaction is available, so is chainhead. This means that transaction would internally use chainhead.
  • Advantages: Straightforward approach; ensures that users can always send transactions, and guarantees ordered message arrival.
  • Drawbacks: Challenges the very essence of function separation for load-balancers; the spec cannot guarantee the presence of both function groups.

2. Decoupling transaction from chainhead

  • Description: The only role for the transaction group would be broadcasting a transaction. This would shift validation and block body transaction tracking responsibilities to the client.
  • Advantages: Simplicity; forward compatibility due to its low-level nature.
  • Drawbacks: Heavier responsibility on the client side; potential performance concerns; the API becomes challenging for users less acquainted with chainhead.

3. Merging Function Groups

  • Description: Combine the two groups of functions (transaction and chainhead) into a single group, indicating their close interdependency.
  • Advantages: Fewer events and edge cases.
  • Drawbacks: Nodes may want to avoid transaction support. Allowing only a subset of functions in a group is currently not possible and also undesirable.

4. Superset Approach with transaction

  • Description: Allow the transaction group of functions to encompass the responsibilities of the chainhead group, effectively making transaction a supergroup. Same solution as before, but leaving chainhead untouched.
  • Advantages: Workaround to the function subset challenge.
  • Drawbacks: Duplication.

I understand that these solutions aren't exhaustive and hope they provide insight into the possible ways forward. As a spec consumer, I find it crucial not only to highlight issues but also to offer potential resolutions. Eager to hear your insights on this.

cc: @tomaka

@tomaka
Contributor

tomaka commented Aug 14, 2023

Lack of Synchronization: If the transaction function lags behind the chainhead, it could result in users being unable to send transactions altogether.

The reason why transaction functions can lag behind the chainHead functions is the case of a load balancer that redirects transaction functions to one node and chainHead functions to a different node, where the node handling transaction functions is lagging behind.

But I want to point out that if a node is lagging behind, its chainHead functions will lag behind as well, and the transactions validated using this node might be invalid as a result.

Consequently, coupling transaction and chainHead functions doesn't solve this problem. In fact, nothing on the JSON-RPC level can solve this problem, it's just something to accept.

@josepot
Contributor Author

josepot commented Aug 14, 2023

Consequently, coupling transaction and chainHead functions doesn't solve this problem. In fact, nothing on the JSON-RPC level can solve this problem, it's just something to accept.

I concur that if a node's chainhead lags behind, the consumer won't be able to create valid transactions—given they're not basing those on the latest finalized block. This is a fundamental challenge that can't be addressed at the JSON-RPC level, as you rightly noted.

However, my central argument for merging the transaction and chainhead functions revolves around the operational challenges introduced by load balancers. If they are permitted to designate separate nodes for each function, it opens up potential synchronicity issues, especially when transaction_unstable_submitAndWatch implicitly depends on chainhead data.

By merging them:

  1. Unified Node Access: It ensures that both functions draw from the same node, eliminating discrepancies in the node data they rely on.

  2. Direct Synchronization: This would essentially eradicate issues stemming from desynchronized events.

  3. Simplified API: The consumer would have to deal with fewer events and edge-cases (no need for events like dropped, error and all the transaction specific events).

Considering solutions 3 and 4: If we choose the former (merging transaction into chainhead), it addresses the desynchronization risk at its root. Opting for the latter (making transaction a superset) still exposes us to some potential load balancer allocation issues, but they wouldn't be as important, because most consumers would use either one or the other.

While I acknowledge the limitations you highlighted, I still think that merging the 2 groups of functions (in one way or the other) would be the right thing to do if the transaction functions are going to be implicitly coupled to chainhead.

@jsdw
Collaborator

jsdw commented Aug 14, 2023

FWIW I can see the appeal of having eg a chainHead_submitAndWatch type function (ie merging them) if it means that load balancers would be expected to send all chainHead requests to the same node (and maybe that we could have some synchronisation of events produced in submitAndWatch + chainHead_follow).

The events (dropped, error etc) would still be useful I think (to me at least!), for the same reasons as before:

  1. Certain events tell me when I can stop expecting any future events/changes to some transaction state (concretely, in Subxt, this means I can close the Stream of transaction events being handed to the user to track its progress, or stop downloading block bodies if I were taking that route).
  2. The events like finalized and inBlock would (especially with some sort of synchronisation with chainHead_follow ones) still be useful for telling me which block to expect the tx in (yes, I could download all of the block bodies and check that myself, but eg CLI tools or other consumers may just want to report on the state of some submitted tx without needing to check themselves, and if the node is doing this check anyway then why require it to be duplicated on the client).

On these threads I feel like I am in the minority with my position on this, but there you go :)

@josepot
Contributor Author

josepot commented Aug 14, 2023

The events (dropped, error etc) would still be useful I think (to me at least!)

Sorry, it seems that I didn't explain myself correctly. It's not that you will no longer have access to these events, it's that they will be encompassed by the already existing events on chainhead_unstable_follow.

For instance the current dropped event reads:

The dropped event indicates that the client wasn't capable of keeping track of this transaction

If we merged transaction into chainhead then we wouldn't need dropped because we already have stop. Similarly, we wouldn't need a specific error event, b/c we would be able to use the already existing operationError event.

  1. The events like finalized and inBlock would (especially with some sort of synchronisation with chainHead_follow ones) still be useful for telling me which block to expect the tx in (yes, I could download all of the block bodies and check that myself, but eg CLI tools or other consumers may just want to report on the state of some submitted tx without needing to check themselves, and if the node is doing this check anyway then why require it to be duplicated on the client).

I mean, sure, we could have some redundant events for the transaction operation... why not? 🤷 Generally speaking, though, I'd rather leave the responsibility of deriving those events to the client. Mainly: because it's trivial to do so, and it has the advantage that it reduces the API surface. However, I wouldn't be opposed to having some "sugar" events if their existence is sufficiently justified.

@lexnv
Contributor

lexnv commented Aug 14, 2023

From my perspective, I believe that merging the two classes of functionality would be the cleanest from a Substrate perspective and for user interaction (see my original comment here).

The events produced by the actual transaction_unstable_submitAndWatch would still be valuable to the end user, at least from my view of the chain through the subxt lens.
In the end, the rpc-v2 is a low-level API that provides the details for all types of user interaction. The chainHead follow events could be extended with the transactionOperationEvent at no extra complexity.

I believe having this low-level API would allow us (subxt / capi) to expose nicer APIs that may be simpler and still efficient; we could easily filter out the events of the chainHead.

@jsdw
Collaborator

jsdw commented Aug 14, 2023

If we merged transaction into chainhead then we wouldn't need dropped because we already have stop. Similarly, we wouldn't need a specific error event, b/c we would be able to use the already existing operationError event.

Aah, gotcha, yup def makes sense to be consistent if we were to go this route!

I mean, sure, we could have some redundant events for the transaction operation... why not?

If the node is already checking for the tx in blocks, then having it tell the client about the result doesn't feel redundant to me; it's just sharing potentially useful information that it already knows and which you'd otherwise have to work out locally instead, duplicating what the node is doing (which can be easily ignored if not interesting to you). Maybe we'll just have to agree to disagree on their usefulness (but blah blah full node, performance win not downloading block bodies if I don't have to too :))

(I'd also add/question this: downloading block bodies when submitAndWatch can return blocks that chainHead_follow may not know about yet makes sense. But if these APIs were merged, I'd expect that any block hash submitAndWatch handed back was now pinned and guaranteed to be accessible, and so I'd see no more advantage in block body downloading?)

@josepot
Contributor Author

josepot commented Aug 14, 2023

If the node is already checking for the tx in blocks, then having it tell the client about the result doesn't feel redundant to me; it's just sharing potentially useful information that it already knows and which you'd otherwise have to work out locally instead, duplicating what the node is doing (which can be easily ignored if not interesting to you). Maybe we'll just have to agree to disagree on their usefulness (but blah blah full node, performance win not downloading block bodies if I don't have to too :))

Sorry, I mistakenly assumed that you were well acquainted with the proposal that I made on #78 . In that PR I suggested this. In a nutshell: the already existing newBlock event would have a new submittedTransactions field which:

submittedTransactions when the transactions submitted via chainHead_unstable_transaction end up being included in a block, then they will be present in this object. The keys of this object are the operationIds and the values are the index where the transaction is present inside the block.

If we went down this path (or a similar one), then the events that you are mentioning would certainly be redundant. Although, again, I wouldn't necessarily be opposed to having those redundant events (as long as I get to have the simplest one, of course).

@josepot
Contributor Author

josepot commented Aug 14, 2023

Sorry, I only saw your edit after I had already answered.

so I'd see no more advantage in block body downloading

If we went with something like what I was proposing in #78 , then yeah, of course... there wouldn't be a need for the client to download the bodies, no. At least not for the purpose of finding the transaction inside the block, nope.

@lexnv
Contributor

lexnv commented Aug 14, 2023

Regarding the events of the tx on the chainHead:

  • dropped

    This can happen for example if the JSON-RPC server's transactions pool is full

    Replacing the tx::Dropped event with the chainHead::Stop for this case would have the side-effect of stopping the block subscription together with the other pending method calls. We could use operationInaccessible; however, that has the side effect of not knowing whether the tx has been broadcasted and could potentially be included by some other peer.
    We could however be explicit in the spec and add a comment that users should rely on prior broadcasted events. If no broadcasted / operationTransactionBroadcasted has been sent, then receiving operationInaccessible means that the tx will not be included.
    If we decide to keep the transaction class, I wonder whether we shouldn't remove the broadcasted field from the dropped event (maybe there's more context to this that I don't have, @tomaka).

  • The proposal Integrating transaction into chainhead #78 adds to the chainHead::newBlock event a submittedTransactions array field
    One thing that is missing to keep parity between the transaction API and the chainHead::submitAndWatch is the index of the tx. We could extend the proposal by binding the index of the tx with the operation ID:

"submittedTransactions": [
     {
        "operationID": "...",
        "index": 0
     },
     ...
   ],

Then we could remove the need for the bestChainBlockIncluded event. We'd rely on newBlock in combination with bestBlockChanged to determine if the transaction is part of a best block, and on newBlock in combination with finalized to check whether the tx is part of the chain (if the block is finalized) or discarded (if the block of the tx is pruned).
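
To illustrate that derivation, here is a rough sketch; the event shapes are assumptions based on the discussion above, and pruning edge cases are simplified:

// Sketch of deriving inclusion client-side from chainHead_unstable_follow
// events, assuming newBlock carries the operationID/index binding proposed
// above. Event shapes are assumed; none of this is normative.
type FollowEvent =
  | { event: "newBlock"; blockHash: string;
      submittedTransactions?: Array<{ operationID: string; index: number }> }
  | { event: "bestBlockChanged"; bestBlockHash: string }
  | { event: "finalized"; finalizedBlockHashes: string[]; prunedBlockHashes: string[] };

// blocks (by hash) in which the watched transaction was found, with its index
const foundIn = new Map<string, number>();

function onFollowEvent(ev: FollowEvent, watchedOperationId: string) {
  switch (ev.event) {
    case "newBlock":
      for (const tx of ev.submittedTransactions ?? [])
        if (tx.operationID === watchedOperationId) foundIn.set(ev.blockHash, tx.index);
      break;
    case "bestBlockChanged":
      if (foundIn.has(ev.bestBlockHash)) {
        // equivalent of bestChainBlockIncluded (modulo ancestors of the best block)
      }
      break;
    case "finalized":
      if (ev.finalizedBlockHashes.some((h) => foundIn.has(h))) {
        // equivalent of the old `finalized` event
      } else if (ev.prunedBlockHashes.some((h) => foundIn.has(h))) {
        // the block(s) containing the tx were pruned: discarded on this fork
      }
      break;
  }
}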

We could adjust #78 to keep parity between the APIs while offering the guarantee of the pinned block of the transaction.

From a substrate standpoint, synchronization of the TransactionStatus with the generated blocks of the chainHead would introduce a certain level of complexity. This synchronization process should be approached thoughtfully.

@josepot
Contributor Author

josepot commented Aug 14, 2023

Replacing the tx::Dropped event with the chainHead::Stop for this case will have the side-effect of stopping the block subscription together with the other pending method calls

That's because, after looking at the implementation in smoldot, I was under the impression that the dropped event is a byproduct of the stop event on chainhead_unstable_follow. In which other case would "the client not be able of keeping track of this transaction"? 🤔

The proposal #78 adds to the chainHead::newBlock event a submittedTransactions array field.

No, it doesn't.

It adds a "dictionary", for which the keys are the operationIds and the values the index of the transaction inside the block. Please, pay closer attention.

One thing that is missing to keep parity between the transaction API and the chainHead::submitAndWatch is the index of the tx

Please, please, please read it well 🙂 . This is not correct. Notice the code of the example in the PR:

{
    "submittedTransactions": {
        "operationIdA": 0,
        "operationIdB": 3,
        ...
    },
}

And it's also in the description:

submittedTransactions when the transactions submitted via chainHead_unstable_transaction end up being included in a block, then they will be present in this object. The keys of this object are the operationIds and the values are the index where the transaction is present inside the block.

@tomaka
Contributor

tomaka commented Aug 14, 2023

In which other case would "the client not be able of keeping track of this transaction"? 🤔

The inability to download block bodies, or the transactions pool being full. On a full node, in case of a reorg, the transactions pool might need to be filled with older transactions, which kick out more recent ones.


I see a lot of arguments in these discussions which I think are wrong but I don't have the energy to answer everything. I'll do a write up in the near future.

@josepot
Contributor Author

josepot commented Aug 14, 2023

In which other case would "the client not be able of keeping track of this transaction"? 🤔

The impossibility to download block bodies, or the transactions pool being full.

Awesome! Then I stand corrected! So, it seems that if we went down this path, then we would need a specific event for the chainHead_unstable_transaction operation to communicate this.

I see a lot of arguments in these discussions which I think are wrong but I don't have the energy to answer everything.

😞

I'll do a write up in the near future.

Thanks @tomaka ! Really looking forward to it!

@tomaka
Contributor

tomaka commented Aug 14, 2023

Objective and how things work

What we want to do is: construct the transaction, send it out, then detect when it's included in the chain.

Constructing the transaction

In order to construct a transaction, a UI needs to know, amongst other things, a block hash (for mortal transactions), a spec version, transaction version, and metadata hash. All this data should be obtained by the UI through the chainHead functions.

After the transaction's body has been constructed, there exist three different situations:

  • The UI wants to display the fees of that transaction, which is done by calling BlockBuilder_apply_extrinsic. If this is done, the UI is able to attest that the transaction is valid (see the runtime-call sketch after this list).
  • The UI doesn't know and doesn't attempt to know whether the transaction is valid.
  • The UI can attest that the transaction is valid but without calling BlockBuilder_apply_extrinsic. For example, in the case of balance transfers the UI can simply track the nonce and balance locally and assume that the transactions it generates are valid.
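
As a sketch of the runtime-call mechanism behind the first bullet: the helper variables below are assumptions, not part of any spec, and TaggedTransactionQueue_validate_transaction is the conventional Substrate runtime API for validity checks (distinct from the BlockBuilder_apply_extrinsic call used for fee display):

// Hedged sketch: running a runtime call against a pinned block via
// chainHead_unstable_call to validate a transaction before sending it.
declare const ws: WebSocket;                 // open JSON-RPC connection
declare const followSubscription: string;    // from chainHead_unstable_follow
declare const finalizedBlockHash: string;    // a pinned block to validate against
declare const encodedParams: string;         // hex-encoded SCALE call parameters

ws.send(JSON.stringify({
  jsonrpc: "2.0",
  id: 2,
  method: "chainHead_unstable_call",
  params: [
    followSubscription,
    finalizedBlockHash,
    "TaggedTransactionQueue_validate_transaction",
    encodedParams,
  ],
}));
// The outcome arrives asynchronously on the follow subscription as an
// operationCallDone event (or operationError / operationInaccessible).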

Note that in the case of mortal transactions the block hash found within a transaction does not have to match the block the transaction is valid against. The block hash found within a transaction is meant to be the reference point at which the transaction starts being valid, and, in fact, as far as I can tell, the block hash found within a transaction is completely unused.

It is important to note that a transaction is only ever valid or not against a specific block. If multiple nodes disagree on which block is the best block, they might also disagree on whether a certain transaction is valid. A transaction might be valid against a certain block and not the next, or vice versa, or might be valid against a certain block but not its sibling.

Sending out the transaction

On a light client, submitting a transaction consists in sending the transaction to the full nodes it's connected to.
On a full node, submitting a transaction consists in sending the transaction to the other full nodes it's connected to, and also adding the transaction to a local pool in case the full node is authoring a block in the near future.

A full node that receives a transaction through its JSON-RPC server and adds it to its local pool always checks whether the transaction is valid. This is necessary in order to not fill the pool with invalid transactions.
Similarly, when a full node receives a transaction over the networking, it checks whether it is valid, and if not discards it and might ban the sender of the transaction to avoid spam attacks.
It is important to note that, as mentioned above, nodes might disagree over whether a transaction is valid.
While sending a small number of invalid transactions does not (and must not) result in a ban, repeatedly sending invalid transactions does.

When a full node receives a transaction over the networking, it might also discard said transaction if its local pool is already full. In that situation, no feedback is given to the sender. This is a niche situation, but one that can realistically happen.
It is important to note that in principle a transaction only ever needs to be sent once to every other node, but, because of this edge case where transactions are silently dropped, we might have to repeat sending the same transaction every now and then.
Furthermore, because peers might maliciously censor transactions or might be buggy, it is a good idea to rotate the peers a transaction sender is connected to and send the transaction to these new peers as well.
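
A minimal sketch of that re-send-and-rotate loop follows; every helper name here is made up for illustration, and no real networking API is implied:

// Illustrative only: periodic re-announce with peer rotation, as described
// above. The helpers stand in for whatever networking layer is used.
declare function connectedPeers(): string[];
declare function rotateSomePeers(): void;                // swap in fresh peers
declare function sendTransaction(peer: string, tx: Uint8Array): void;
const RESEND_INTERVAL_MS = 60_000;                       // arbitrary choice

async function gossip(tx: Uint8Array, done: () => boolean) {
  while (!done()) { // e.g. included in, or invalid against, the finalized chain
    // In principle once per peer suffices, but full pools silently drop
    // transactions, so we repeat every now and then anyway.
    for (const peer of connectedPeers()) sendTransaction(peer, tx);
    rotateSomePeers(); // defends against censoring or buggy peers
    await new Promise((resolve) => setTimeout(resolve, RESEND_INTERVAL_MS));
  }
}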

Detecting when the transaction is included

In order to detect when a transaction is included, a UI or JSON-RPC server has to download the block bodies of every block that was authored after the transaction has been sent out, and check whether the desired transaction is found in one of the bodies.
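
For example, each new block can be scanned with chainHead_unstable_body. This is a sketch only, assuming small rpc/waitOperation helpers that pair a request with its operation events:

// Hedged sketch: checking one block's body for our transaction via
// chainHead_unstable_body. `rpc` and `waitOperation` are assumed helpers.
declare function rpc(method: string, params: unknown[]): Promise<{ operationId: string }>;
declare function waitOperation(id: string): Promise<{ event: string; value?: string[] }>;

async function txIndexInBlock(
  followSubscription: string, blockHash: string, txHex: string,
): Promise<number | null> {
  const { operationId } = await rpc("chainHead_unstable_body", [followSubscription, blockHash]);
  const done = await waitOperation(operationId); // operationBodyDone carries the extrinsics
  if (done.event !== "operationBodyDone" || !done.value) return null; // body inaccessible
  const index = done.value.indexOf(txHex);       // bodies are hex-encoded extrinsics
  return index === -1 ? null : index;
}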

It is important to note that it is not possible to (reasonably) be sure that a transaction has been included. For example, one could send out a transaction then lose Internet connectivity for a few minutes during which the transaction might have been included. It is possible to check for example whether the nonce within the transaction has been used, but not whether a specific transaction has been included (except by downloading every single block that was missed, which is not a scalable solution). It might also for example simply be impossible to download block bodies due to peers refusing to respond.

Resolved questions

Should transactions be validated (either by the JSON-RPC client or server or both) before being sent out?

It depends.

If the number of transactions being gossiped in total at the same time by a node is significant (more than 2 or 3), then yes, in order to avoid the node being banned; otherwise it's not necessary.

Should the JSON-RPC server validate the transactions that it receives from the JSON-RPC client?

It depends.

As explained above, if the number of transactions being gossiped at the same time is very small, then no.

Additionally, if the JSON-RPC server trusts the JSON-RPC client to always submit valid transactions, then no.

In other situations (JSON-RPC client untrusted and a large number of transactions), then yes.

How does the JSON-RPC client-server combination that gossips a transaction know when to stop gossiping?

This question is less trivial than it seems.

It is possible, due to the asynchronous nature of blocks gossiping and transactions gossiping, for a full node to receive a transaction from the peer-to-peer networking after this transaction has been included in a block.

In that situation, because we don't want the same transaction to be included a second time, it is assumed that the runtime will designate a transaction as invalid if that transaction has been included in the chain in the recent past.

Thanks to this property, there is no point in gossiping a transaction after it has been included in the chain.

A node that gossips the transaction should therefore stop gossiping at the latest after the transaction has been included in the finalized chain or has become invalid in the finalized chain. This can be determined either by the JSON-RPC server itself, or with the help of the JSON-RPC client.
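
Compressed into a predicate (both inputs are judged against the finalized chain; who computes them, the server alone or with the client's help, is the open part):

// The stop rule above as a predicate. Inclusion eventually implies
// invalidity (runtimes reject recently-included transactions), so the
// first check just ends the gossip sooner.
function shouldStopGossiping(
  includedInFinalizedChain: boolean,
  invalidAgainstFinalizedBlock: boolean,
): boolean {
  return includedInFinalizedChain || invalidAgainstFinalizedBlock;
}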

If a transaction is not included in the chain, can a JSON-RPC client or server know the reason why not?

A transaction might not be included because other nodes consider it invalid, but also because the transaction pools of all other nodes are full, or maybe the chain is simply temporarily overloaded and the transaction will be included later.

Even if a JSON-RPC client or server validates the transaction it sends out, this does not offer a guarantee that other nodes consider the transaction as valid, and thus does not offer a guarantee that this transaction will be included in the chain in the future.

All in all, the answer to the question is no. The best a JSON-RPC client or server can do is send the transaction again after a little while and/or rotate its peers, and hope that it will be included.

On the other hand, if a transaction is invalid against the finalized block, then it can be assumed that it will remain invalid forever (see previous question).

In other words:

  • When will a valid transaction be included in the chain? Impossible to know, potentially never.
  • When will an invalid transaction be included in the chain? Never.

Does the JSON-RPC server used to generate or validate the transaction have to be the same JSON-RPC server that gossips the transaction?

If the JSON-RPC server that gossips the transaction doesn't validate the transaction, then obviously no, as there is no problem. If the JSON-RPC server that gossips the transaction validates the transaction, then also no for the reason explained below.

The JSON-RPC server that watches the chain might think that a transaction is valid while the JSON-RPC server that sends out the transaction might think that it's not, or vice versa. However, this is also the case for all nodes of the peer-to-peer network.

The vast majority of the time, there will be no mismatch: the transaction will either be considered as valid by both JSON-RPC servers or invalid by both JSON-RPC servers.

If there is a mismatch, it is in practice likely caused by one of the JSON-RPC servers being misconfigured or buggy or lagging behind, in which case there will also be a mismatch between one of the two JSON-RPC servers and the rest of the peer-to-peer network.
In other words, using a unique JSON-RPC server does in no way solve the problem, as that unique JSON-RPC server might also be misconfigured or buggy or lagging behind, and the mismatch will instead be between this unique JSON-RPC server and the rest of the peer-to-peer network.

When it comes to checking whether the transaction has been included, does the JSON-RPC server that follows the chain have to be the same JSON-RPC server that gossips the transaction?

No.

There is no connection whatsoever between gossiping a transaction and receiving the block in which it has been included.

In other words, the chances of finding the transaction in a block are exactly the same whether the chain is watched through the JSON-RPC server that has gossiped the transaction or through a different JSON-RPC server.

(provided that there is no netsplit, but since we're in a situation where the same JSON-RPC client is able to reach both JSON-RPC servers, these two JSON-RPC servers are normally also able to talk to each other)

(also note that this is not really true if the JSON-RPC server is a validator, in which case that validator might generate the block containing the transaction, which is obviously an advantage, but we ignore that possibility because validators aren't supposed to be publicly-available JSON-RPC servers)

Does it make sense for a JSON-RPC server to support sending out (i.e. gossiping) transactions without supporting following the chain?

Not really.

This question is relevant, because sending out transactions can be implemented in a few thousand lines of code (provided transactions aren't validated), while following a chain requires probably at the very least 20k to 30k lines of code.
It is relatively easy to implement from scratch a JSON-RPC server that only supports sending out transactions and not following the chain, and this JSON-RPC server might be able to run on very small embedded devices (like Arduinos).

Even ignoring implementation difficulty, gossiping transactions doesn't require downloading and compiling the runtime of the chain while following the chain does. If a JSON-RPC server supports both, it is still advantageous to not follow the chain and only gossip transactions.

However, if a JSON-RPC server (or a group of JSON-RPC servers behind a load balancer) doesn't support following the chain, then a UI can only generate immortal transactions and can't know whether a transaction has been included in the chain.
This could be useful, but is most likely a rare edge case.

Alternatively, we could imagine a micro-services architecture where some short-lived micro-services follow the chain and generate transactions, then dispatch these transactions to long-lived micro-services that only send them out, or something similar. Also a rare edge case.

Open/controversial questions

If the JSON-RPC server validates transactions that it gossips, should it report to the JSON-RPC client when the validation fails? And should the JSON-RPC client validate transactions?

A JSON-RPC client is expected to only ever submit valid transactions. There is in general no point in submitting invalid transactions except with malicious intent. When it comes to designing the API, we simply ignore situations where the JSON-RPC client intentionally submits invalid transactions.

There are then three possibilities:

  • If the JSON-RPC client continuously (i.e. at each block) validates the transactions that it submits, and the node used to validate the transaction is the same as the one gossiping it, then validation failures cannot happen.

  • If the JSON-RPC client continuously validates the transactions that it submits, and the node used to validate the transaction is not the same as the one gossiping it, then validation failures are extremely rare and indicate a high probability of a mismatch with the rest of the peer-to-peer network. Mismatches with the rest of the peer-to-peer network are not detectable. See the "Does the JSON-RPC server used to generate or validate the transaction have to be the same JSON-RPC server that gossips the transaction?" question above. In other words, reporting validation errors to the JSON-RPC client in that situation is an unreliable detection mechanism against problems such as the JSON-RPC server being misconfigured or lagging behind. Because it's unreliable (the problem might not be detected), I would argue that alone it's not a strong enough reason to report validation errors.

  • If the JSON-RPC client does not validate, or does not continuously validate, the transactions that it submits, then the answer is unclear. If the JSON-RPC server reports validation failures, then the JSON-RPC client could use that to give feedback and return/log some kind of error message.

Put differently, the only situation where reporting validation failures to the JSON-RPC client is useful is if the JSON-RPC client does not continuously validate the transactions it submits.

The vast majority of the time, JSON-RPC clients validate transactions that they submit at least once before submitting them (in order to calculate fees, see first question), and consequently validating them continuously doesn't seem like much more effort to me in terms of implementation.

Should the JSON-RPC server continuously validate the transactions that it receives from the JSON-RPC client?

(this is a revisit of the question above)

If the JSON-RPC server must indicate when validation fails, then obviously yes.

Otherwise, the answer is the same as above: if the number of simultaneously gossiped transactions is small (2 to 3) then no, otherwise yes.
This is also convenient when it comes to the two implementations of the API: light clients typically only gossip a small number of transactions and would like to avoid having to perform the expensive task of validating transactions, and full nodes typically gossip a larger number of transactions and have no problem performing validations.

Should the JSON-RPC server report to the JSON-RPC client in which block a transaction is included?

This is where the problem originally stems from. If the JSON-RPC server does report blocks in which a transaction is included, then these blocks might conflict with the blocks reported by the chainHead-namespaced functions.

This problem can be solved:

  • By tying these block reports to the chainHead namespace (merging namespaces, adding ordering guarantees, etc., it doesn't matter in this context).
  • Or by not reporting blocks in which transactions are included (the question would be answered by no).
  • By supposing that the JSON-RPC client doesn't want to know, or doesn't need to know precisely, when a transaction is included.

Should the JSON-RPC server follow the chain in order to find whether a transaction is included?

If the JSON-RPC server must report to the JSON-RPC client the block in which a transaction is included (see previous question), then yes.

Otherwise, the JSON-RPC server needs a way to know when to stop gossiping transactions.
If the JSON-RPC server always validates transactions, then it doesn't need to follow the chain, as it can simply stop gossiping transactions after they become invalid against the finalized block.
Even if the JSON-RPC server does not validate transactions, it doesn't need to follow the chain, as it can be told by the JSON-RPC client when to stop gossiping. The JSON-RPC client can't use this to abuse the JSON-RPC server, because the security properties of the server are the same no matter whether a transaction has been included or not.

Conclusion

The paragraphs above can be summarized as:

  • "JSON-RPC client validates transactions" => NOT("JSON-RPC server reports validation failures").
  • (opinionated) NOT("JSON-RPC client validates transactions") => "JSON-RPC server reports validation failures".
  • "JSON-RPC server reports validation failures" => "JSON-RPC server always validates transactions".
  • "JSON-RPC server reports inclusion" => "JSON-RPC server follows chain".
  • (maybe) NOT("JSON-RPC server reports inclusion") => NOT("JSON-RPC server follows chain").
  • "JSON-RPC server reports inclusion" => NOT("JSON-RPC client cares about inclusion") OR "Merge chainHead with transactions".

We can also add:

  • "JSON-RPC client easier to implement" <=> "JSON-RPC client doesn't validate transactions continuously" AND "JSON-RPC server reports when transaction is included".

There are therefore four valid possibilities:

  • The JSON-RPC client continuously validates transactions, the JSON-RPC server doesn't report validation failures, the JSON-RPC server doesn't always validate transactions (it's optional), the JSON-RPC client cares exactly when the transaction is included, the JSON-RPC server doesn't report inclusions.
  • The JSON-RPC client continuously validates transactions, the JSON-RPC server doesn't report validation failures, the JSON-RPC server doesn't always validate transactions (it's optional), the JSON-RPC client doesn't care exactly when the transaction is included.
  • The JSON-RPC client is easier to implement, doesn't continuously validate transactions, the JSON-RPC server reports validation failures, the JSON-RPC server always validates transactions, the JSON-RPC server reports when transactions are included, the JSON-RPC client doesn't care about exactly when a transaction is included.
  • The JSON-RPC client is easier to implement, doesn't continuously validate transactions, the JSON-RPC server reports validation failures, the JSON-RPC server always validates transactions, the JSON-RPC server reports when transactions are included, we merge the chainHead namespace with transactions.

My PR (#77) picks options 1 and 3 in this list.
Josep's PR (#78) proposes to merge chainHead with transactions, but this is IMO unnecessary. You can see in the logic above that "Merge chainHead with transactions" is just the end of an OR after an =>.

Overall I'm still in favor of my PR (#77).

@josepot
Contributor Author

josepot commented Aug 14, 2023

⚠️THIS MESSAGE WAS POSTED AT THE SAME TIME AS THE PREVIOUS ONE (A FEW SECS LATER). IT'S NOT A REPLY. ⚠️

While I didn't intend for this discussion to evolve into an indirect review of #78, it seems we've gravitated towards that. I’d like to reiterate that the PR was premature, and I acknowledge that.

One significant oversight in that PR is the expectation that a light-client can promptly confirm transaction presence upon emitting the newBlock event. That was my mistake.

Yet, I believe there's an opportunity to design an API providing more granular insights into the light-client's transaction verification processes in real-time. Here's a conceptual idea (not a formal proposal): As the light-client examines each block body, it might relay which blocks have been scrutinized for pending transactions and its findings up to that moment. This could be in the form of events being emitted post-evaluation of block bodies, indicating the discovery or lack thereof of ongoing transactions.

Such granularity might mean we reach the finalized block stage without conclusive knowledge of the presence of certain transactions. However, it equips clients with an understanding: the client knows that the answer still can't be known due to the light-client's ongoing verification, and it can adjust its operations based on this clarity.

I recognize that the approach in #78 might be seen as rudimentary, but my hope is for the core intent behind that proposal to resonate.

A special thank you to @tomaka for your patience and collaboration.

@josepot
Contributor Author

josepot commented Aug 15, 2023

Thanks, @tomaka, for the detailed write-up. I've gone over it multiple times to fully understand.

With the newfound context from the earlier issue, I must clarify a pivotal realization I had only recently. My primary concern, now clearer than before, is understanding in "real-time" which blocks the light-client (or the JSON-RPC server) has assessed for pending transactions. In essence, if I could have this insight, it would satisfy my main requirement. I regret not having articulated this from the start—it's only now that I truly recognized this as the crux of my needs.

Furthermore, I'd like to apologize for any instances where I may have conflated my needs as a consumer of the API with operational requirements. I appreciate your patience in helping clear up these distinctions.

Given this clarified stance, the questions remain: Is it feasible to convey this information to the client? If not, what alternatives exist? And if yes, how best to relay it?

If conveying block-scrutiny details isn't feasible, I can settle for just having #77, with one request: the inclusion of broadcasted events in transaction_unstable_broadcast.

If it is feasible, two options come to mind:

Option 1: Merging transaction with chainhead

An integrated approach might yield a simpler API. We could:

  • Introduce chainHead_unstable_transaction for transaction submission.
  • Retain all (or most of the) existing events of transaction_unstable_submitAndWatch as "operation events".
  • Include a cross-cutting chainHead_unstable_follow event to highlight scrutinized blocks and their corresponding transactions:
{
  "event": "txCheckedBlocks", // Ok, sure! Probably not the best name... :-)
  "blocks": [
    {
      "blockHash": "0x0000000000000000000000000000000000000000000000000000000000000000",
      "submittedTransactions": {
        "operationIdA": 0,
        "operationIdB": 3,
        ...
      },
    },
    {
      "blockHash": "0x0000000000000000000000000000000000000000000000000000000000000001",
      "submittedTransactions": {}, // nothing was found on this block, but now we know that it has been scrutinized
    },
    ...
  ]
}

Option 2: Keeping transaction separate from chainhead

Alternatively, we can maintain separation and append an event to transaction_unstable_submitAndWatch indicating the scrutinized blocks:

{
    "event": "blocksScrutinized", // Ok, sure! Probably not the best name... :-)
    "blocks": {
        // this block has been scrutinized and the transaction is not present, so the value is `null`
        "0x0000000000000000000000000000000000000000000000000000000000000000": null,
        // this block has been scrutinized and the transaction is present at index `2`
        "0x0000000000000000000000000000000000000000000000000000000000000001": 2,
    }
}

This approach seems superior, as it is better suited to covering all the different kinds of users, and it's also compatible with the inclusion of #77. To clarify: I wouldn't expect transaction_unstable_broadcast to emit these events, of course not! What I mean is that adding this event to transaction_unstable_submitAndWatch would be something completely independent from #77. So, we could have both.

I must stress that I've derived these options based on your feedback and my initial bias towards merging, which upon reflection, is almost certainly not in the best interest of all users.

I deeply appreciate your patience, @tomaka, and your dedication to understanding and addressing our concerns.

@tomaka
Contributor

tomaka commented Aug 15, 2023

Is it feasible to convey this information to the client?

Everything is feasible.
We don't want to constrain the API just because some things are harder to implement (well, with some exceptions such as #66).

understanding in "real-time" which blocks the light-client (or the JSON-RPC server) has assessed for pending transactions

Keep in mind that a transaction being found within a block doesn't mean that it won't also be found in the parent of that block.

@josepot
Contributor Author

josepot commented Aug 15, 2023

understanding in "real-time" which blocks the light-client (or the JSON-RPC server) has assessed for pending transactions

Keep in mind that a transaction being found within a block doesn't mean that it won't also be found in the parent of that block.

I'm well aware of this 😉. At the same time, I can totally see why you felt compelled to remind me of this 🙂 - so, thanks!

Is it feasible to convey this information to the client?

Everything is feasible. We don't want to constrain the API just because some things are harder to implement (well, with some exceptions such as #66).

Awesome! I was asking because it's something that I hadn't brought up before your write-up, and since it's well established by now that I'm capable of making wrong assumptions... 😅

So, what are your thoughts on the addition of the blocksScrutinized event?

@tomaka
Contributor

tomaka commented Aug 15, 2023

The principle sounds good to me 🤷
The details I'm not sure exactly.

@josepot
Contributor Author

josepot commented Aug 15, 2023

The principle sounds good to me 🤷

Awesome!

The details I'm not sure exactly.

Got it! For sure, the snippet that I suggested was just a draft... I certainly hope that we can -at least- come up with a better name for the event 🙈

Ok, so -as far as I'm concerned- we have a plan of action:

  1. Let's reopen #77, discuss in there the possible convenience of the broadcasted events and get it merged.
  2. Let's open a new PR adding the "scrutinized blocks" event to transaction_unstable_submitAndWatch.

Once both PRs are merged, we close this issue 🎉 .

@jsdw @lexnv are we all in agreement that this is the best way to go? 🤞

@tomaka
Contributor

tomaka commented Aug 15, 2023

Let's reopen #77, discuss in there the possible convenience of the broadcasted events and get it merged.

I don't think we need to merge #77 then?

The only reason why #77 is a good idea is if we want to have servers that implement transaction but not transactionWatch, but I feel like transactionWatch is too useful to not be implemented.

@josepot
Contributor Author

josepot commented Aug 15, 2023

Let's reopen #77, discuss in there the possible convenience of the broadcasted events and get it merged.

I don't think we need to merge #77 then?

The only reason why #77 is a good idea is if we want to have servers that implement transaction but not transactionWatch, but I feel like transactionWatch is too useful to not be implemented.

I suppose that's true, yes. I surely wouldn't use it.

I thought that perhaps it was a good idea to have it there, because it could be potentially beneficial to other users... However, you are right, I can't think of a single use-case in which the consumer would need it. So, if and when that need arises we can revisit its addition. Let's leave it out for now, yes 💯

So, @jsdw @lexnv are we all in agreement that the addition of this new TBN (to-be-named) event is the best way to go? 🤞

@jsdw
Collaborator

jsdw commented Aug 16, 2023

So, bestChainBlockIncluded fires once the transaction is seen in a current best block. finalized fires when it's in a finalized block (and we're done after this).

Is the idea that this blocksScrutinized event fires each time the server checks any new blocks looking for the transaction? Would these assumptions of mine about it be correct?:

  • The blocks that it informs about may not be in any particular order (just depends on which bodies are downloaded first)
  • And so eg it may report the transaction being found in a child block before later finding it in the parent block (and then the best/finalized events would speak of the parent block)
  • This blocksScrutinized event would be fired prior to the best/finalized ones, just to indicate the blocks that have been checked. Then, best/finalized events may follow straight after if the block is a best/finalized one.

In principle I have no objection to this sort of event, since it's just letting the client know a little more about what's happening, which doesn't feel like a bad thing! @josepot I'm curious to know what use case you actually have for such an event though? I guess you want to know ASAP which blocks don't contain a tx so you can unpin them sooner?

It also makes me wonder a bit on the overlap between bestChainBlockIncluded/finalized/blocksScrutinized events. Eg I could see an event like this existing just to note which blocks have been checked and don't contain it, like:

{
    "event": "notInBlock",
    "hash": "0x00000001"
}

Perhaps that gives the information needed without having any overlap with the best/finalized events (and we don't have to think about tx in parent+child blocks or whatever).

But anyway, it's all good for me (I made a lot of guesses above so I may well be way off the mark) :)

@josepot
Contributor Author

josepot commented Aug 16, 2023

Is the idea that this blocksScrutinized event fires each time the server checks any new blocks looking for the transaction?

Yes

Would these assumptions of mine about it be correct?:

  • The blocks that it informs about may not be in any particular order (just depends on which bodies are downloaded first)

correct

  • And so eg it may report the transaction being found in a child block before later finding it in the parent block (and then the best/finalized events would speak of the parent block)

correct

  • This blocksScrutinized event would be fired prior to the best/finalized ones, just to indicate the blocks that have been checked. Then, best/finalized events may follow straight after if the block is a best/finalized one.

I assume that you must be referring to the best/finalized events produced by transactionWatch, since it's already been established that we can't generate any guarantees between the events of different function groups. So, if that's the case then my answer to that question is that I couldn't care less. I mean, it would probably be the case, b/c probably most implementations would behave like that... I guess? However, I'm not requesting any guarantees in that regard, because I really couldn't care less. So, in other words: no guarantees about this should be offered or needed.

I mean, just to be clear: the server should try to communicate these "scrutinized" events to the client ASAP. However, if the server is batching the messages that it plans to send to the client, then I wouldn't expect the messages of that batch to arrive in any particular order. As long as the server sends them as soon as it's technically feasible (without having to sacrifice performance), then I'm cool, and I don't expect them to come in any particular order.

All that being said: we can discuss the possible convenience of these guarantees in the incoming PR.

In principle I have no objection to this sort of event, since it's just letting the client know a little more about what's happening, which doesn't feel like a bad thing!

Amazing! 🙌

@josepot I'm curious to know what use case you actually have for such an event though?

I think that I have explained this at length across multiple comments in different issues and PRs... I really don't have the energy right now to explain it again.

I guess you want to know ASAP which blocks don't contain a tx so you can unpin them sooner?

Partially... yes.

It also makes me wonder a bit on the overlap between bestChainBlockIncluded/finalized/blocksScrutinized events.

Please, don't assume that there is necessarily an overlap. This event should only be useful for those users who want to combine the information from this event with the info that they are getting through chainhead. It's just a "nice to have" so that -in most cases- they don't have to do again work that has already been done by the server. That's all it is, really.

Eg I could see an event like this existing just to note which blocks have been checked and dont contain it like:

{
    "event": "notInBlock",
    "hash": "0x00000001"
}

No, this alone wouldn't cover my needs, no.

Perhaps that gives the information needed without having any overlap with the best/finalized events (and we don't have to think about tx in parent+child blocks or whatever).

It doesn't.

But anyway, it's all good for me

Awesome!

(I made a lot of guesses above so I may well be way off the mark) :)

A little bit 🙂

@jsdw
Collaborator

jsdw commented Aug 16, 2023

I think that I have explained this at length across multiple comments in different issues and PRs... I really don't have the energy right now to explain it again.

Fair enough! The reason I ask is that I can't see much use myself for this new event. Because as I understand it, it will either report the TX in a block (but this may not be the eventual block that best/finalized will report it in, so it's not something I think I would react to) or it will report that it's not in a block (which is the thing that I can see some use for; helping to know if I can unpin things sooner).

But for this reason I also don't care much either way (and I appreciate that more information from the server is often good) so if you and @tomaka can see the utility in it then no objections from me :)

@lexnv
Contributor

lexnv commented Aug 16, 2023

If I understand correctly, option 2 introduces the "blocksScrutinized" event, containing various block hashes.

When the server responds, it includes block hashes within the "blocks" field / hash map. Optionally, an index is provided if the transaction exists in the block.

As long as the server sends them as soon as it's technically feasible (without having to sacrifice performance), then I'm cool and I don't expect them to come in any particular order.

In case servers choose to send individual events, might this resemble a subset version of the chainHead::NewBlock event?

According to the information shared at #77 (comment), my understanding is that CAPI's aim is to promptly notify users about transactions as soon as a block is encountered. Could you kindly confirm if my interpretation aligns with yours, @josepot? However, in certain situations there might be potential delays with the "blocksScrutinized" event, causing it to fall out of sync with the reported blocks in the "chainHead."

To address this gap more effectively, users have the option to utilize transaction_submitAndWatch, allowing them to discontinue event tracking once the "Broadcasted" event occurs. Additionally, it might prove more efficient to inspect the content of each block produced in the "chainHead" for transactions, rather than waiting for events from an alternative subscription.

Wouldn't it be beneficial to introduce a similar version of transaction_broadcast instead, which is effectively a subset of transaction_submitAndWatch?

@josepot
Contributor Author

josepot commented Aug 16, 2023

In case servers choose to send individual events, might this resemble a subset version of the chainHead::NewBlock event?

The specifics about this event have been deliberately omitted so that we can discuss them in detail in the incoming PR... Whether the server is allowed to report many blocks at once in the same event, whether the payload should be an array or a dictionary, the name of the event, the guarantees, etc. are things that IMO should be ironed out in the PR for that event.

According to the information shared at #77 (comment), my understanding is that CAPI's aim is to promptly notify users about transactions as soon as a block is encountered.

Partially... yes.

Could you kindly confirm if my interpretation aligns with yours, @josepot?

That is one of the reasons (not the only one).

However, in certain situations there might be potential delays with the "blocksScrutinized" event, causing it to fall out of sync with the reported blocks in the "chainHead."

I am well aware of that... That's why in my previous comment I said that: "It's just a "nice to have" so that -in most cases- they don't have to do again work that has already been done by the server. That's all it is, really."

To address this gap more effectively, users have the option to utilize transaction_submitAndWatch, allowing them to discontinue event tracking once the "Broadcasted" event occurs. Additionally, it might prove more efficient to inspect the content of each block produced in the "chainHead" for transactions, rather than waiting for events from an alternative subscription.

Sigh... Thanks for your "insight", but I wouldn't be "waiting for events". Instead, I would be checking if they happened in the past, so that by the time a block becomes the best block or the finalized block (via chainhead), I may already have that knowledge. The vast majority of the time I will have been able to get that insight before the block becomes best/finalized, and I won't need to do anything else. The other times, I will have to start a race between the call that tries to find that tx in the body of the block and the possible event that could come while that's happening... (yes, a race: if the event comes before the call, I would cancel the operation of the call. Please keep in mind that there is a limit to the number of concurrent operations that can be performed, so they shouldn't be abused).
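
A rough sketch of that lookup-then-race flow, with every helper hypothetical:

// Sketch only: consult past blocksScrutinized knowledge first; otherwise
// race a body download against a future scrutiny event, cancelling the
// loser to free the operation slot.
declare const scrutinized: Map<string, number | null>; // blockHash -> index (null = absent)
declare function downloadBodyAndFind(blockHash: string): {
  result: Promise<number | null>; cancel(): void;
};
declare function nextScrutinyOf(blockHash: string): Promise<number | null>;

async function txIndexWhenBlockMatters(blockHash: string): Promise<number | null> {
  if (scrutinized.has(blockHash)) return scrutinized.get(blockHash)!; // already known
  const download = downloadBodyAndFind(blockHash);
  const viaEvent = nextScrutinyOf(blockHash).then((index) => {
    download.cancel(); // the event won the race: cancel the body call
    return index;
  });
  return Promise.race([download.result, viaEvent]);
}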

Then there is the whole set of edge-cases related to the "discontinuity" of the tracking of finalized blocks, which, despite the fact that I already have a really good solution for them, I really don't think I should need to explain in detail right now.

Please, I have been developing complex applications that work with real-time data since way before I joined parity. So, please: unless you really understand the problem that I'm trying to solve, please stop trying to tell me how I should be solving it.

Wouldn't it be beneficial to introduce a similar version of transaction_broadcast instead? Which is effectively a subset of transaction_submitAndWatch?

No, as Pierre already pointed out: it wouldn't.

@josepot
Contributor Author

josepot commented Aug 16, 2023

I think that I have explained this at length across multiple comments in different issues and PRs... I really don't have the energy right now to explain it again.

Fair enough! The reason I ask is that I can't see much use myself for this new event. As I understand it, it will either report the TX in a block (but this may not be the eventual block that best/finalized will report it in, so it's not something I think I would react to), or it will report that it's not in a block (which is the thing I can see some use for: helping to know whether I can unpin things sooner).

Thank you for your response. I understand your perspective and I appreciate your patience.

I do agree that Subxt and the new CAPI have very different goals, and it seems our conversations sometimes end up in circles because we're trying to solve different problems. I'm sorry if my explanations have been repetitive, but I'm eager to break the current cycle of: "we can't be light-client first because the new JSON-RPC spec isn't stable" <-> "the new JSON-RPC spec isn't stable because we are not using it." As a developer working on the "new CAPI" (a light-client-first library which tries to break this vicious cycle), I have to address problems and constraints that might not yet be on the radar for Subxt (a library which is waiting for this vicious cycle to be broken).

I understand that you don't see much use for this new event, and you've made it clear that being light-client first is not Subxt's primary concern ATM. I respect that, and I want to clarify that I'm not trying to convince you to change your approach. However, when we discuss these changes, it feels like my points get brushed aside because Subxt is not able to see things from a light-client first perspective. This is quite frustrating, particularly when what I'm proposing has no impact on Subxt and could be beneficial for other projects.

Your interpretation of the event reporting is partially correct, but the complete picture is more nuanced. I have tried to explain it multiple times, and perhaps I've failed to communicate the full implications. I believe that the disconnect stems from our different focuses: Subxt's traditional RPC approach versus CAPI's light-client first requirements.

I would appreciate it if, in future discussions, we could respect the fact that we're coming from different perspectives and might not fully understand each other's needs. I kindly ask that you acknowledge that our goals are different, and what might not make sense for Subxt could be critical for CAPI.

Given that these proposed changes have no impact on Subxt and could potentially benefit other projects, would it be possible for you to simply say something like, "I don't mind, go for it!"? It would make my work on CAPI more straightforward and less frustrating, and I'd be able to focus more on making progress.

@lexnv
Contributor

lexnv commented Aug 17, 2023

Hey Josep,

Thanks a lot for your patience and for sharing your perspective regarding this!

Indeed, at the moment Subxt and CAPI are different in terms of implementation. That is to say, Subxt's implementation is heavily based on the traditional RPC APIs. We are making progress towards offering a common API that handles both the new RPC spec V2 and the traditional RPCs under the hood.

The RPC spec V2 will be supported by both the light client and the Substrate servers. My goals are to understand the spec better in order to implement it efficiently in Substrate, and to get a clearer picture of what would need to be done there, while also keeping user interactions in mind (be that from a biased Subxt perspective).

@josepot Since these new events could be beneficial for CAPI's use cases and could be filtered out by other users, would you be up for creating a PR to add them?
I believe @tomaka has a broader view on these topics (which ensures we'll deliver the best possible API -- which is what we all want in the end :D) and it might be easier to continue the discussion on that PR instead.

From my perspective, I would like to learn and understand more about these events since we could probably take a similar approach with Subxt and CAPI.

Again, thanks for your patience; I understand these discussions can sometimes cause frustration, and I look forward to making progress here! 👍

@jsdw
Collaborator

jsdw commented Aug 17, 2023

But anyway, it's all good for me (I made a lot of guesses above so I may well be way off the mark) :)

But for this reason I also don't care much either way (and I appreciate that more information from the server is often good), so if you and @tomaka can see the utility in it then no objections from me :)

@josepot I am indeed fine for these things to be added, as I said above a couple of times. I also very much do appreciate and understand the fact that we have different perspectives.

My goal in asking any further questions has simply been to try and understand your perspective and use cases. This obviously hasn't worked out in this case. As long as somebody else does, it's all good for me! As you say, this proposed event addition doesn't adversely impact Subxt.

@tomaka
Contributor

tomaka commented Aug 18, 2023

I thought I'd do a PR as requested in #80, but I fail to understand how adding the searched event would solve anything. I had just assumed it would, without thinking about it.

The searched event would only give you the blocks that were searched after the transaction has been gossiped, not the blocks that weren't searched because they were too early. Furthermore, the chainHead subscription might give you blocks that the transactions subscription doesn't even search, making you unable to unpin them.

If the chainHead subscription and the transactions subscription come from two different nodes, these two nodes can in principle have two completely different sets of non-finalized blocks, meaning that you couldn't unpin any of the blocks reported by chainHead.

To me the only solution to your problem remains #77

@josepot
Contributor Author

josepot commented Aug 19, 2023

The searched event would only give you the blocks that were searched after the transaction has been gossiped

perfect!

not the blocks that weren't searched because they were too early.

Also perfect!

Furthermore, the chainHead subscription might give you blocks that the transactions subscription doesn't even search,

Whether the transactions subscription searches for those blocks or not is secondary. The primary objective is to identify the finalized blocks and their relevance to the transaction in question.

making you unable to unpin them.

I'm unsure where the confusion arises. Even if the chainHead gives blocks that the transactions subscription doesn't search, I'll still have the ability to unpin them. Their presence or absence in the transactions subscription doesn't impede this.

If the chainHead subscription and the transactions subscription come from two different nodes, these two nodes can in principle have two completely different sets of non-finalized blocks,

My primary concern is the finalized blocks. Non-finalized blocks, regardless of their origin or difference, aren't relevant to the problem I'm addressing.

meaning that you couldn't unpin any of the blocks reported by chainHead

Again, I'm unclear as to the rationale here. I'll certainly have the ability to unpin those blocks.

To me the only solution to your problem remains #77

The assertion that #77 is the "only solution" is quite a bold one.

My problem:

I will try to explain -once again- what I'm after. I'm looking for a specific functionality, and I'll break it down simply:

Whenever a block is finalized, I need to determine the status of an ongoing transaction. Specifically:

  1. Whether it's within the finalized block body (and its index if it is).
  2. Whether it's absent from the finalized block.
  3. Whether the server hasn't yet determined this.
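As a tiny TypeScript illustration of those three outcomes (the names are mine, not the spec's):

    // The three possible answers when a block is finalized, as listed above.
    type TxStatusInFinalizedBlock =
      | { kind: "included"; index: number } // 1. in the body, at position `index`
      | { kind: "absent" }                  // 2. definitely not in this block
      | { kind: "unknown" };                // 3. the server hasn't determined it yet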

From your explanation, it seems that (with the searched event) the third situation will be a rare occurrence, which is great. However, even if it does happen more often, it's still better than the scenario in issue #77. Why? Because if option 3 arises, the searched event gives me an opportunity to race between fetching the finalized block body and the potential arrival of the searched event for that block. This approach can be more efficient and possibly reduce the time an operation slot is occupied. This is especially relevant when transaction and chainhead are implemented by different servers: the chainhead server will most likely not have the body of the finalized block around, whereas the transaction server has either already requested it or already has it, and is about to emit the searched event.

Also, I'd like to point out that consuming an operation slot at the precise moment when a block gets finalized is not ideal: this is the time when the consumer is keen to commence operations to gauge how the finalized block impacts the app's state.

Lastly, I'm unsure where the idea came from that I won't be able to "unpin" unrelated blocks. I'll handle them just as I do with blocks not in the finalized chain. The purpose of this "nice to have" event is to swiftly inform the user about the transaction's status in the finalized block. Prompt feedback means users can decide on any subsequent actions faster. Plus, the quicker I understand the user's interaction with that block, the sooner I can determine if it's ripe for unpinning.

Addressing Specific Queries:

1. Is the notInBlock event enough?

While some might think the notInBlock event would suffice, here's why it won't:

  • Comprehensive Information: If the server can notify about a transaction not being in a block, logically, it should also be able to notify if it's present. Providing half the information when the full context could be given doesn't make sense.

  • Avoiding Race Conditions: Let's imagine a scenario: the server identifies that a transaction is in a block, and that block will become finalized within the next few seconds. If notified right away, I could update my pinned blocks with this new info, and by the time chainhead emits its finalized event I will already know that the block has the transaction in it. However, with only a notInBlock event, I would have to deal with unnecessary race conditions between the finalized events of transaction and chainhead. It's just unnecessary complexity.

2. "the only solution to your problem remains #77"

Solution #77 has significant drawbacks:

  • Operational Overheads: Fetching and examining the bodies of all new blocks as they arrive is resource-intensive. This approach would hog operation slots that are taken away from consumers. Besides, there's a more efficient external process that can perform this task better. Generally speaking, especially in JavaScript, it's always better to offload these kinds of tasks to a separate process. So, if it can be done by the server, that's always better.

  • Unwanted Waterfall Effect: Since I can't scan all possible block bodies, I would have to resort to scanning the bodies of the finalized blocks, thereby creating a cascade effect: consumers have to delay certain operations until they ascertain whether a transaction exists in a finalized block. This lag affects how quickly the end user receives information about that block, which isn't ideal.

  • Potential Server Disparity: Remember, chainhead and transaction could be handled by different servers. If I request block bodies from a server that doesn't have them, it's counterproductive. The best-suited server for this request would be the one implementing the transaction group, since it's the one downloading the bodies in order to scan them. However, if the groups remain separated, that's not possible. Reflecting on this, #77 ("Rename the existing transaction to transactionWatch and add a new different transaction namespace") loses its appeal. Instead, I'd prefer using transaction_submitAndWatch. On the "off chance" that the transaction and chainhead functions are hosted on the same server, at least I'd know that the block body I'm inquiring about is either cached or previously requested.

In conclusion, the searched event remains ideal to me for the reasons outlined, and the proposed solution #77 doesn't really bring anything to the table. So, let's disregard that option.

Reflections on the Rejection of the searched Event:

  • The searched event was, in my view, a pragmatic middle ground. While not the ideal solution, it was a "good enough" answer to my challenges.

  • My preferred solution? Merging chainhead and transaction. In my opinion, this integration aligns best with user needs. Pierre has done a great job of explaining why it's technically feasible to keep these two groups of functions separate, but that doesn't negate the fact that it complicates the spec's consumption. And let's be honest, it's not a straightforward spec to begin with.

  • I designed the searched event to be universally beneficial and non-intrusive. It was set to "off" by default, emitting only when specifically requested during a transaction's initiation. I proposed a backward-compatible update to an API labeled "unstable"; ironically, a label meant to indicate that the API evolves based on user feedback.

  • When I introduced issue #80 ("transaction: add searched event"), @tomaka was still receptive to such an event. However, his emphasis was on blending it with other events. I suspect that, upon trying to integrate these events, he might have faced the challenging crossroads of merging chainhead and transaction. Perhaps this led to his shift in stance, favoring #77 over the searched event.

I've come to terms with the realization that I might need to work within the confines of the current spec. But please understand: my acceptance stems from exhaustion, not endorsement of the spec's adequacy.

@tomaka
Contributor

tomaka commented Aug 21, 2023

My primary concern is the finalized blocks. Non-finalized blocks, regardless of their origin or difference, aren't relevant to the problem I'm addressing.

That's what is extremely surprising to me. Being able to show in a UI when a transaction is included in the best chain, instead of only when it's finalized, seems like a critical feature to me.

If you only care about the finalized block, then half of this discussion is off-topic. Most of the discussion points are relevant only because nodes disagree on the non-finalized chain, whereas the finalized chain can be assumed to always be the same for everyone.

I will try to explain -once again- what I'm after

You have never actually explained anywhere what you were trying to do. You jump directly to the problems that you are facing.

Solution #77 has significant drawbacks:

The only valid drawback of #77 is that it makes the client-side implementation more complex. #77 strictly consists of stripping functionality out of the JSON-RPC server.
Any drawback to #77 that you mention also applies to not having #77; the only difference is that you don't see them as a JSON-RPC client implementer.

@josepot
Contributor Author

josepot commented Aug 21, 2023

My primary concern is the finalized blocks. Non-finalized blocks, regardless of their origin or difference, aren't relevant to the problem I'm addressing.

That's what is extremely surprising to me. Being able to show in a UI when a transaction is included in the best chain, instead of only when it's finalized, seems like a critical feature to me.

It might be surprising, but while indicating a transaction's presence in the best block is valuable, it's not critical. The primary utilities of such information are optimistic dApp updates (which may need to be reverted) and showing intermediate transaction states. These are beneficial but not vital. Hence, while CAPI will strive to provide this insight, its absence isn't detrimental. And, as you're likely aware, the searched event aids this secondary function.

You have never actually explained anywhere what you were trying to do.

From the onset, my issue title:

explicit guarantees on the finalized event of submitAndWatch

explicitly emphasized the finalized event. I believe I elaborated on my concerns, and on the importance of this information at the moment of a block's finalization, in various comments. Apologies if that wasn't clear enough. However, it's perplexing to see assumptions made about my problem, especially when they culminate in the dismissal of the searched event and the assertion that #77 is the sole solution. I'd be interested to know which parts of my earlier comments informed such conclusions.

Solution #77 has significant drawbacks:

The only valid drawback of #77 is that it makes the client-side implementation more complex. #77 strictly consists of stripping functionality out of the JSON-RPC server. Any drawback to #77 that you mention also applies to not having #77; the only difference is that you don't see them as a JSON-RPC client implementer.

Agreed, the primary concern with #77 is the added complexity on the client-side. But we should tread carefully. Relying heavily on the argument that client-side implementations can manage everything is a slippery slope. Where is the boundary when most tasks could potentially be handled by the client? In my view, if a function requires an external process for effective client-side implementation, it's better suited for the "server".

@tomaka
Contributor

tomaka commented Aug 21, 2023

Where is the boundary when most tasks could potentially be handled by the client?

The boundary is actually pretty clear: if there are multiple different ways to implement something, each with different trade-offs, and which way is best depends on the use case, then it should be on the client side. In other words, the server should be as un-opinionated as possible and just follow instructions.

According to this definition, searching for transactions in block bodies should clearly be a client-side process, because there are multiple different ways of searching for the transaction (all forks/only best fork/only finalized, accept/reject trees where some bodies are unknown, etc) that have different trade-offs in terms of bandwidth, latency, and amount of information provided.

If you only care about finalized blocks, you can actually do a more efficient search than what smoldot is doing right now, as you can search only in blocks that are children of the latest finalized block, which overall reduces the chances that you'll search in a block that will later not be finalized, and thus reduce the number of block bodies that are downloaded (by 12.5% if I'm not mistaken).
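Concretely, that narrower client-side search might look roughly like this (TypeScript; BlockInfo, downloadBody, and findTxInBody are illustrative stand-ins, not spec methods):

    // Only scan the direct children of the latest finalized block, rather than
    // every new head, trading a little latency for fewer wasted body downloads.
    interface BlockInfo {
      hash: string;
      parentHash: string;
    }

    async function searchChildrenOfFinalized(
      finalizedHash: string,
      knownBlocks: BlockInfo[],
      txHex: string,
      downloadBody: (hash: string) => Promise<string[]>,
      findTxInBody: (body: string[], tx: string) => number, // -1 when absent
    ): Promise<Map<string, number>> {
      const found = new Map<string, number>(); // blockHash -> tx index
      const children = knownBlocks.filter((b) => b.parentHash === finalizedHash);
      for (const child of children) {
        const body = await downloadBody(child.hash);
        const index = findTxInBody(body, txHex);
        if (index !== -1) found.set(child.hash, index);
      }
      return found;
    }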

The only reason why I think it's not a bad idea to look for solutions other than #77 is because I understand that this significantly complicates the client-side. But if we follow the guiding principles behind which this entire JSON-RPC API has been built around, #77 naturally follows.

@josepot
Contributor Author

josepot commented Aug 21, 2023

The only reason why I think it's not a bad idea to look for solutions other than #77 is because I understand that this significantly complicates the client-side. But if we follow the guiding principles behind which this entire JSON-RPC API has been built around, #77 naturally follows.

I agree that exploring alternatives to #77 seems prudent. And to clarify, I've never been against #77, but I don't see it as the sole solution for my concerns. Are you suggesting that there's been a change in perspective regarding the searched event?

If you only care about finalized blocks, you can actually do a more efficient search than what smoldot is doing right now, as you can search only in blocks that are children of the latest finalized block, which overall reduces the chances that you'll search in a block that will later not be finalized, and thus reduce the number of block bodies that are downloaded (by 12.5% if I'm not mistaken).

On the topic of efficient block searches: I might be overlooking something fundamental here. Outside of disruptions like disconnections (or a similar situation that makes it impossible for the server to maintain continuity on the current chainhead), is there a scenario where the best block isn't a descendant of the latest finalized block? My understanding was that when a block gets finalized, it might not directly follow the previously finalized block, due to potential intermediary finalized blocks. Shouldn't the same rationale apply to the best block?

In the absence of anomalies like disconnections, why would smoldot look for a transaction in blocks that aren't descendants of the most recent finalized block? It's not a critique of smoldot's approach; I'm genuinely trying to clarify a potential gap in my understanding. Could it be related to real-time uncertainties causing such "discontinuities"?

@tomaka
Contributor

tomaka commented Aug 21, 2023

Are you suggesting that there's been a change in perspective regarding the searched event?

I initially said yes to the principle of the searched event because you said that it solved the problem and because it would be a bit more refined than what you proposed. #80 is clearly just a hacky solution and not a proper design.

It does solve the problem if you only care about finalized blocks, but it doesn't if you also care about which best block the transaction is included in. As I've said, I'm extremely surprised that you don't care that a transaction has been included, and I'm very confident that this is something you will add (or be asked to add) later, because to me it's a pretty critical thing.

The API of transactions_watch clearly has a defect, and I don't like the idea of monkey-patching that defect for one specific use case of the API while leaving the problem open for other use cases.

In the absence of anomalies like disconnections, why would smoldot look for a transaction in blocks that aren't descendants of the most recent finalized block? It's not a critique of smoldot's approach; I'm genuinely trying to clarify a potential gap in my understanding. Could it be related to real-time uncertainties causing such "discontinuities"?

I said "children of the latest finalized block", and I think you read "descendants" instead of "children".

Smoldot tries to report as soon as possible when a transaction is in the chain, so as soon as a block appears at the head of the chain, smoldot tries to download its body. Unfortunately, 12.5% (I think) of the time this block will end up not being finalized. If instead you wait a little bit, this problem disappears.

@josepot
Contributor Author

josepot commented Aug 21, 2023

The API of transactions_watch clearly has a defect, and I don't like the idea of monkey-patching that defect for one specific use case of the API while leaving the problem open for other use cases.

I share your concerns about the searched event. While it aims to address a specific use case, its implementation feels like a patch rather than a holistic solution. Given our mutual understanding of the issue, I'd like to offer two alternatives that might provide a more comprehensive answer:

  1. Separate, Yet Enhanced Tracking: Maintain distinct transaction and chainhead groups, but introduce #77 ("Rename the existing transaction to transactionWatch and add a new different transaction namespace"). Alongside this, we can introduce a new operation like chainHead_unstable_searchTransaction, which would permit users to scan for a specific transaction within the blocks generated by chainhead (a rough usage sketch follows after this list). Notably:

    • Consumers can manually halt this operation.
    • If left running, the server should automatically cease producing events once the transaction reaches a finalized block.
    • The server retains the capability to issue a dropped event, covering scenarios where it can't effectively monitor the transaction.

    Furthermore, under this scheme, I would argue for the deprecation of transaction_submitAndWatch. It might offer a more elegant and consistent experience, though retaining it wouldn't be a deal-breaker if there's strong sentiment towards its utility.

  2. Unified Approach: This echoes the "Option 1" I outlined in my earlier comment. We would combine chainhead and transaction, subsequently introducing a txSearched event (or a similar name) specific to the newly established chainHead_unstable_transaction operation.

Both alternatives aim for a seamless and comprehensive experience. I believe they might address our current design challenges more effectively than the existing searched event. I'm eager to hear your thoughts.
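To make alternative 1 a bit more concrete, here is a rough usage sketch (TypeScript). The rpc.send wrapper, the parameter shapes, and even the method names are assumptions on my part, since none of this has been specced yet:

    // Start the search first, then broadcast, so that no block can be included
    // before the server has begun searching.
    async function submitWithSearch(
      rpc: { send: (method: string, params: unknown[]) => Promise<unknown> },
      followSubscription: string,
      txHex: string,
    ): Promise<unknown> {
      // 1. Ask the chainHead server to scan pinned and future blocks for the tx.
      const operationId = await rpc.send("chainHead_unstable_searchTransaction", [
        followSubscription,
        txHex,
      ]);
      // 2. Only once the search is confirmed to have started, broadcast.
      await rpc.send("transaction_unstable_broadcast", [txHex]);
      // Search results would then arrive as operation events on the follow
      // subscription, ending automatically once the tx is in a finalized block.
      return operationId;
    }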

@josepot
Contributor Author

josepot commented Aug 25, 2023

The more I think about it, the more I'm leaning towards the approach of splitting the responsibilities of transactionWatch_submitAndWatch into two separate methods: transaction_unstable_broadcast and chainHead_unstable_searchTransaction, each under their respective function groups. This not only offers a cleaner design but also addresses the concerns we've raised.

@tomaka, I'm particularly interested in your perspective on this proposal. Could you please share your thoughts when you have a moment? 🙏

@tomaka
Contributor

tomaka commented Aug 28, 2023

a cleaner design

I completely disagree with "a cleaner design".

  • You must be careful to call chainHead_unstable_searchTransaction before transaction_broadcast, as otherwise there is a chance that the transaction gets included before the searchTransaction starts. Not only that, but you must wait for the JSON-RPC server to send back the response to the searchTransaction, indicating that it has started, before broadcasting, in case the two calls get load-balanced between two different servers.

  • It's not clear to me whether you want the information about the search to be a separate event or to be included in the chainHead notifications. If it's the latter, then you're adding a lot of latency (potentially an infinite amount) to every block notification: the time it takes the JSON-RPC server to download the body. This also means that malicious full nodes can "attack" light clients by refusing to answer block body requests, as this will make them completely stop working as soon as they submit a transaction.

  • A chainHead notification that was queued while the chainHead_unstable_searchTransaction is being processed will not have been searched. This means that the JSON-RPC client might see weird things, for example blocks notified after the searchTransaction has started not being searched. If the search information is a separate event, the JSON-RPC client can't differentiate between a block that will never be searched and a block that is still being searched, and thus can't know whether it can unpin the block.

  • If the JSON-RPC server is unable to download block bodies, we don't want the whole chainHead_follow subscription to stop, I suppose, meaning that the search system needs its own separate search-dropped event. I really don't see this as any simpler from the JSON-RPC client's perspective.

  • It's not clear to me how the search behaves when it comes to parent-child relationships. If the search is unable to download the body of a block but has found the transaction in its child, should it immediately report the child search to the JSON-RPC client, or wait for the search in the parent to finish? If it reports immediately, then again the JSON-RPC client doesn't know precisely when the search started, and thus can't know whether there might be an earlier block containing that transaction. If it doesn't report immediately, then you're again adding more latency to the events.

You can solve some of these problems by adding an "event ID" to each event, and indicating in the confirmation to chainHead_unstable_searchTransaction at which "event ID" the search starts, but again that overcomplicates the API and the JSON-RPC client code.

I don't care so much about precise answers to these questions; what I want to demonstrate is that this design isn't straightforward at all. You can solve these problems by adding more complexity to the API, but that is not something we want.

@josepot
Contributor Author

josepot commented Aug 30, 2023

Thank you, @tomaka, for taking the time to enumerate your concerns. I'd like to address them individually:

  • On the sequencing of calls: You're right in noting the sequence, but this is akin to having #77 without chainHead_unstable_searchTransaction: in that case, the consumer would have to ensure the correct sequence with their own custom implementation. The introduction of chainHead_unstable_searchTransaction is there precisely to standardize this process, allowing the server, which is best positioned for this task, to handle it.

  • On the nature of the search information: It would indeed be a separate "operation event" akin to other operation events. An illustrative example:

    {
        "event": "operationTransactionSearched",
        "operationId": "...",
        "blocks": [
            {
                "hash": "..."
            },
            {
                "hash": "...",
                "index": "..."
            }
        ]
    }
  • Regarding queued chainHead notifications: If a chainHead_unstable_searchTransaction is in progress, the server should still search the body. The documented behaviour should be that the server searches for the transaction in the bodies of all the currently pinned blocks, plus all the newer blocks (unless they get unpinned). I anticipate the server should be capable of this without any issues.

  • On potential server unavailability: If the JSON-RPC server cannot access block bodies, it shouldn't disrupt the entire chainHead_follow subscription. Since this is a separate operation, we could use the already existing operationError event to signal issues specifically related to this operation.

  • On the parent-child block relationships: One of the highlights of this proposal is the client's ability to process block information as needed. Once the server completes its search in any block (regardless of the order), it can notify the client. This hands-off approach simplifies things for the server. This is beneficial because clients can then process the information tailored to their specific needs.

  • On the introduction of "event IDs": Ensuring uniform behavior with other operations should alleviate the need for extra identifiers like "event IDs".

Finally, while I respect your viewpoint on the complexity of this design, I believe this approach streamlines the client-side consumption of the API. By shifting some complexity server-side, we're simplifying things for the consumer, which is a worthy trade-off in my opinion.

@tomaka
Contributor

tomaka commented Aug 31, 2023

By shifting some complexity server-side

But you're not shifting complexity server side.

You're shifting some work server side. Yes, the server does more things automatically with your design than with #77. But I'm pretty convinced that the complexity of the code on the JSON-RPC client, in other words the number of possible edge situations to think about, is way higher with your proposal than with #77.

That's because you're splitting in two something that shouldn't be split in two. The code that follows blocks and downloads their bodies should be entirely either JSON-RPC-client-side or JSON-RPC-server-side, but putting some on the client side and some on the server side just creates an insanely complex interface between the two.

@josepot
Contributor Author

josepot commented Sep 6, 2023

I appreciate your feedback, but I have a differing perspective on the matter.

  1. Complexity:

    • You've mentioned that my proposal increases the complexity on the JSON-RPC client side. As the developer working on the JSON-RPC client, I've found that my approach actually simplifies the client-side logic, contrary to your observations.
  2. Allocation of Operation Slots:

    • If the server shoulders the search operation, the client is relieved from meticulously handling the 16 operation slots, which should ideally be preserved for the main client consumer.

    • I'd like to remind you of the key constraint:

      As long as the JSON-RPC client ensures that no more than 16 operations are in progress concurrently, all its operations will be accepted by the JSON-RPC server.

    • Thus, I advocate for an operation like chainHead_unstable_searchTransaction for two core reasons:

      a. Simplifying Client-side Logic:
      - By allowing a single API to search across potential bodies using just one slot, the complexity is vastly reduced. Without this, I'd be forced into sequential body requests, potentially compromising efficiency. Alternatively, I'd have to juggle parallel requests, pausing them as needed based on the primary client's demands. This adds undue complexity.

      b. Performance & Resource Consideration:
      - As previously mentioned, transaction searches within the body are not complex, but they are resource-intensive and blocking, making them unsuitable for main-thread execution. The optimal solution is offloading this to an external process or a Worker.
      - If I were to implement this on the client side, I'd face a dilemma: either compromise the performance for the main consumer or add the complexity of task delegation to a separate worker. This seems redundant when the server can capably execute this for the client.

In essence, I believe my approach, far from complicating matters, aims to streamline processes, prioritize performance, and make efficient use of resources.
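To illustrate the constraint quoted above, here is a minimal sketch of the slot bookkeeping a client otherwise has to do itself (TypeScript; a generic semaphore pattern, not an API from the spec):

    // Gate operations so that no more than `limit` run concurrently.
    class OperationSlots {
      private available: number;
      private waiters: Array<() => void> = [];

      constructor(limit = 16) {
        this.available = limit;
      }

      private async acquire(): Promise<void> {
        if (this.available > 0) {
          this.available -= 1;
          return;
        }
        await new Promise<void>((resolve) => this.waiters.push(resolve));
      }

      private release(): void {
        const next = this.waiters.shift();
        if (next) next(); // hand the freed slot straight to the next waiter
        else this.available += 1;
      }

      // Run `op` once a slot is free, always giving the slot back afterwards.
      async run<T>(op: () => Promise<T>): Promise<T> {
        await this.acquire();
        try {
          return await op();
        } finally {
          this.release();
        }
      }
    }

Every body fetch, storage query, and so on would have to go through something like slots.run(...), which is exactly the kind of juggling I'd rather not impose on every client.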

@jsdw
Collaborator

jsdw commented Sep 6, 2023

For what it's worth, the approach I will probably take in the first case when submitting a transaction is:

  1. Start paying attention to chainHead_follow events (I'll be doing this in the background already anyway) and keeping blocks pinned.
  2. Call transaction_submitAndWatch. When this returns block hashes, I'll probably wait a little to see if they appear in any pinned blocks from chainHead_follow and, if so, I can keep them pinned while the user is interested in said block hashes. If not, then the user won't be able to access block details (which would be a bummer, and likely lead to an error being reported back).

The hope is that most of the time, the block hashes I'm told about in the submitAndWatch will appear or have recently appeared in chainHead_follow already, so I can keep them pinned and give a nice experience (The common case is that users will be told whether the tx was successful or not, which involves downloading the events for that block etc).

I wonder whether chainHead_searchTransaction would realistically help with this association? It seems like it would have the same success rate as my current approach: it'll either find the block hashes in still-pinned blocks on the server, or it won't, and I'll end up giving up if I wait too long. For what it's worth, I won't be downloading block bodies locally in either case (transaction_submitAndWatch does that search for me already); I'm just looking for corresponding block hashes in my collection of pinned blocks.
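Roughly, that association step could be sketched like this (TypeScript; pinnedBlocks, onNewPinned, and the timeout are assumptions about my own client internals, not spec APIs):

    // Wait briefly for a reported block hash to show up among the blocks kept
    // pinned from chainHead_follow, in case the follow subscription is behind.
    async function awaitPinned(
      pinnedBlocks: Set<string>,
      reportedHash: string,
      onNewPinned: (cb: (hash: string) => void) => () => void, // returns unsubscribe
      timeoutMs = 6_000, // roughly one block time; an arbitrary choice
    ): Promise<boolean> {
      if (pinnedBlocks.has(reportedHash)) return true; // already pinned: done
      return new Promise<boolean>((resolve) => {
        const finish = (result: boolean) => {
          clearTimeout(timer);
          unsubscribe();
          resolve(result);
        };
        const unsubscribe = onNewPinned((hash) => {
          if (hash === reportedHash) finish(true); // follow caught up: keep pinned
        });
        const timer = setTimeout(() => finish(false), timeoutMs); // give up
      });
    }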

Another thought/question: transaction_submitAndWatch returns some status updates, like whether the transaction is in the best/finalized block, but also a few other states which I'm currently going to relay back to the user. Would we lose some of this information if we remove it and use transaction_broadcast + chainHead_searchTransaction instead?

@Polkadot-Forum

This issue has been mentioned on Polkadot Forum. There might be relevant details there:

https://forum.polkadot.network/t/new-json-rpc-api-mega-q-a/3048/16
