Extracting out a common utility makes sense, but I have a couple of comments on the approach:

DMP and HRMP (as implemented) and XCMP (as designed) all use a Message Queue Chain (MQC) system designed to separate messages out into their own hash-chain. This means that systems looking to import and dequeue messages only need a single state proof of a recent MQC entry. Any other 'synchronization' between the system's previous MQC and the new MQC can be handled without further relay-chain state proofs.

The queue should also have support for blob items. DMP, HRMP and XCMP are all blob transfer protocols at the moment, with XCM being a higher-level concern.
Yeah - I'm happy to move things over to blobs and abstract away much of XCM. MQC is cool, but AFAIK we process a minimum of one MQC page at a time. This means that we'll need to deal with a whole page worth of messages at once. It might easily be that we cannot execute a whole page worth of messages, so we'll need to be able to deal with storing pages and incrementally executing messages over several blocks, hence the need for this primitive.
At the moment there is no MQC paging mechanism in VMP or HRMP. The outstanding MQC contents in each case are stored in a Big Unbounded Vector and should be paged, but that's not the current state of things.

It is true that we cannot expect chains to execute all of, or even large swathes of, the outstanding MQC entries in a single block, and therefore there's a need for state machines to separate synchronization of the MQC (backwards) from processing of the MQC (forwards). It should never be necessary for state machines to evaluate more than a small constant number of relay-chain state proofs when updating their knowledge about the MQC head and processing messages, since MQCs already provide cryptographic guarantees of queue contents between some previous known link and the head. My mental model here is that parachain runtimes should have a multi-block algorithm along the lines of the sketch below.
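Purely as an illustrative sketch (the names, types, and placeholder hash are assumptions, not a specified design):

```rust
// Illustrative only: phase 1 synchronizes local knowledge of the MQC,
// phase 2 processes (imports/queues) messages forwards. For simplicity
// the gap data is verified oldest-first here; a real runtime might
// instead walk backwards from the proven head.

type Hash = [u8; 32];

/// An MQC entry commits to the previous head and one message (~68 bytes).
struct MqcLink {
    prev_head: Hash,
    message_hash: Hash,
}

/// Placeholder for a real cryptographic hash of an MQC link.
fn link_head(link: &MqcLink) -> Hash {
    let mut out = [0u8; 32];
    for i in 0..32 {
        out[i] = link.prev_head[i] ^ link.message_hash[i];
    }
    out
}

struct QueueSync {
    /// Latest head proven against the relay chain; needs only a single
    /// state proof per update.
    proven_head: Hash,
    /// Head of the portion we have already processed.
    known_head: Hash,
    /// Verified links awaiting processing, oldest first.
    pending: Vec<MqcLink>,
}

impl QueueSync {
    /// Phase 1, bounded per batch and spread over many blocks: accept a
    /// batch of links and check they extend our verified chain. Returns
    /// true once the chain has caught up to `proven_head`.
    fn sync(&mut self, batch: Vec<MqcLink>) -> bool {
        let mut head = self.pending.last().map(link_head).unwrap_or(self.known_head);
        for link in &batch {
            if link.prev_head != head {
                return false; // batch does not chain onto what we know
            }
            head = link_head(link);
        }
        self.pending.extend(batch);
        head == self.proven_head
    }

    /// Phase 2, also bounded per block: import/queue messages forwards.
    fn process(&mut self, max_messages: usize) {
        let n = self.pending.len().min(max_messages);
        for link in self.pending.drain(..n) {
            // Here the actual message would be imported/queued; we only
            // advance our local head.
            self.known_head = link_head(&link);
        }
    }
}
```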
We could introduce some more clever relay-chain indexing of recent MQC heads for optimization of (2) and (3), and I think there are a few viable solutions like Hierarchical MQCs, Merkle Mountain Ranges, or power-of-two indexes. Pagination makes more sense to me at the message storage level than at the MQC level itself. But since MQC entries are quite small (68 bytes apiece), a parachain runtime can update its local knowledge of the MQC trunk by ~15,000 messages per MB of PoV space and then process the messages once updated - it's not clear to me that this needs optimization. By processing, I don't mean execution but rather import/queuing/discarding.

It seems to me that we have two related problems here: 1. queuing messages within a state machine, and 2. retrieval and indexing of metadata about unprocessed portions of message queues stored outside of a state machine. The original post of this issue is mostly focused on the former, but HRMP/DMP (on both the relay-chain & parachain side) are also quite reliant on the latter. I bring up (2) because it needs to be accounted for in any general pallet that's used on the relay-chain side of HRMP/DMP, but if this is just about the parachain side of things we can certainly separate those concerns.

Regardless of storage optimizations, in HRMP/XCMP we need a mechanism to clear ancient message storage via a 'watermark' block number for backwards compatibility. There are possibly better approaches than the watermark, but those would require a lot of parachains-protocol-level changes that would be quite involved and are best bundled with a new major version of the candidate format.
Ok - I'm really not wanting to get deeply involved in the MQC side of things. I just want to figure out a reasonable API to connect the output of MQC reading/dequeuing to the input of the (potentially deferring, as required) message processing system.

My understanding from originally integrating this with HRMP is that each chain has at most one large data chunk (inter-chain message blob or ICMB) arriving into it in a block from any other chain. This one large data blob actually comprises many smaller (higher-level) message blobs (likely in XCM format) concatenated together, but this is not a concern of the transport layer.

I don't know if it's reasonable to expect to be able to process (by which I essentially mean enqueue, though in some circumstances it may be reasonable to execute some or all of the constituent XCMs also) ICMBs in integer amounts per block: it would simplify things for sure, but if they're too big then they could cause so much PoV bloat that blocks become exhausted just from message enqueuing.

It would be helpful for me to understand the exact logistics of ICMBs - how large they are, where they are fed into a block, and what expectations there are over how many are fed in at a time.
Draft PR open and in progress: paritytech/substrate#12485
Yeah, that makes sense. I misinterpreted the scope of the issue at first and we can continue the MQC conversations elsewhere. But to be clear, my underlying point is that the 'generalized XCM queue' is more useful for the Cumulus side of XCMP/HRMP/DMP than the relay-chain runtime side.
That is true. The maximum size of these blobs is determined by governance via the
Yes, though it has applications on the Relay-chain too, since there is UMP which needs to be serviced safely.
It would be helpful to confirm that the given API in paritytech/substrate#12485 (
This issue has been mentioned on Polkadot Forum. There might be relevant details there: https://forum.polkadot.network/t/polkadot-release-analysis-v0-9-36/1529/1
Right now we have several different XCM message queues spread across UMP, DMP, HRMP/XCMP and (a yet to be done) bridge/loopback. They all implement roughly the same thing - accept one or more messages, then execute them either immediately if the queue is empty, or store them for later execution. They then execute the contents of the queue at regular intervals, paying attention to the weight limits. When a message takes too much weight to execute autonomously, it is generally placed into its own storage item for later execution (or deletion after a timeout).
Instead of this, we should have a single XCM queue pallet. It should implement a new trait `EnqueueMessage` and dispatch messages regularly. DMP, UMP, HRMP/XCMP and bridge code can use this pallet through its trait to enqueue messages which should (eventually) be executed, either autonomously if light enough or manually if overweight.

The `EnqueueMessage` trait should look something like the sketch after the list below.

In terms of the state data format, it could be implemented in a few ways:

1. A single storage item containing the whole queue as one big `Vec` of messages.
2. A storage map with one item per message.
3. A paged queue: a storage map of fixed-size pages (say 64KB), each holding many messages back-to-back.
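A guess at the shape of such a trait (hypothetical signatures; the actual trait in paritytech/substrate#12485 may differ):

```rust
/// Sketch of an `EnqueueMessage` trait; signatures are guesses, not the
/// final API. `Origin` identifies where a message came from (the parent
/// relay chain, a sibling parachain, a bridge, ...).
trait EnqueueMessage<Origin> {
    /// Enqueue one encoded message blob from `origin` for later servicing.
    fn enqueue_message(message: &[u8], origin: Origin);

    /// Enqueue a whole batch of message blobs from the same `origin`.
    fn enqueue_messages<'a>(
        messages: impl Iterator<Item = &'a [u8]>,
        origin: Origin,
    );

    /// Drop all not-yet-serviced messages of `origin`.
    fn sweep_queue(origin: Origin);
}
```

The pallet itself would then service queued messages on a schedule (e.g. in `on_initialize`/`on_idle`) up to a weight limit, diverting overweight ones to manual execution.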
These storage options can also be augmented with the possibility of using the Preimages pallet to store each large message in its own storage item. We'll ignore that for now since the benefit and overhead will be mostly the same between them.
Let's assume we tend to have two modes of operation: Mostly-Processing (MP) where we receive 100 messages per block to enqueue but have 10,000 messages from previous blocks to execute, of which we process 1,000; and Mostly-Receiving (MR) where we receive 1,000 messages per block to be enqueued to the existing queue which is 2,000 messages big and of which we process 500.
We'll also assume messages (including metadata) range between 64 bytes and 256 bytes, averaging 128 bytes and occasionally going as high as 16KB.
Our storage item trie proofs require the key and value plus a trie overhead of 16 items * 32 bytes per level; the number of levels will typically be `2 + log_16(size)`, so around (2 * 16 * 32) = 1KB for singular values, 2KB for maps with up to around 200-300 values, and 3KB for maps with up to around 50,000 values.

For option 1/MP we'll need a proof of a regular value (1KB) plus all items in the queue (128 * 10,000) = ~1,200 KB, and 1/MR would be 1KB + 128 * 2,000 = ~240 KB. That's quite a lot of overhead.
For option 2/MP, it's 3KB * 1,100 = 3.3MB and 2/MR is 3KB * 1,500 = 4.5MB. This is impossibly high.
For option 3/MP, it's 1 page of access for enqueuing and 4 pages of access for processing. We'll be conservative and assume a total of 6 since we may be on a boundary. That's 6 * (2KB + 64KB) =~ 400 KB. For option 3/MR, we must deposit 2 pages' worth and enqueue 4; again being conservative that's 7 pages and therefore 7 * (2KB + 64KB) =~ 460 KB.
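To keep the arithmetic above auditable, here is a throwaway calculation encoding the same assumptions (128-byte average messages, 64KB pages, and the proof overheads estimated earlier); the constants are taken from this discussion, not from any pallet:

```rust
// Back-of-envelope proof-size estimates for the three layouts.
const AVG_MSG: u64 = 128;                 // average message size incl. metadata
const PAGE: u64 = 64 * 1024;              // page payload for option 3
const VALUE_PROOF: u64 = 1024;            // ~1KB proof for a singular value
const BIG_MAP_ITEM_PROOF: u64 = 3 * 1024; // ~3KB per item in a very large map
const PAGE_PROOF: u64 = 2 * 1024;         // ~2KB per page in a modest map

/// Option 1: one value proof plus every message in the queue.
fn option1(queue_len: u64) -> u64 {
    VALUE_PROOF + AVG_MSG * queue_len
}

/// Option 2: one large-map item proof per message touched.
fn option2(items_touched: u64) -> u64 {
    BIG_MAP_ITEM_PROOF * items_touched
}

/// Option 3: each touched page costs its proof plus its full payload.
fn option3(pages_touched: u64) -> u64 {
    pages_touched * (PAGE_PROOF + PAGE)
}

fn main() {
    // Mostly-Processing: 10,000 queued; 1,100 items / ~6 pages touched.
    println!("1/MP ~{} KB", option1(10_000) / 1024); // ~1,251 KB (the ~1,200 KB above)
    println!("2/MP ~{} KB", option2(1_100) / 1024);  // ~3,300 KB = 3.3MB
    println!("3/MP ~{} KB", option3(6) / 1024);      // ~396 KB
    // Mostly-Receiving: 2,000 queued; 1,500 items / ~7 pages touched.
    println!("1/MR ~{} KB", option1(2_000) / 1024);  // ~251 KB
    println!("2/MR ~{} KB", option2(1_500) / 1024);  // ~4,500 KB = 4.5MB
    println!("3/MR ~{} KB", option3(7) / 1024);      // ~462 KB
}
```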
Storing items in a linked-list heap may lead to substantial gains in performance since:

- no `Vec<u8>`s are decoded or instantiated for each message;
- it allows a `MaxEncodedLen` which actually reflects the data stored (a `BoundedVec<BoundedVec<>>` will always be quite an inaccurate MEL unless the inner `BoundedVec` has very little variance).

Furthermore, depending on realistic message sizes and processing dynamics, option 3 can be optimised in terms of page size.
Suggested Storage Format
A reasonable storage format might be:
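A paged linked-list heap, sketched below with illustrative names and field layouts that are guesses at the shape, not a finalized format:

```rust
// Conceptually, `Page`s live in a storage map keyed by
// (origin, page index), with one `Book` per origin and a separate map
// for parked overweight messages.

/// A fixed-size page holding many messages back-to-back as
/// length-prefixed blobs, so no per-message `Vec<u8>` is allocated.
struct Page {
    /// Byte offset of the first unprocessed message in `heap`.
    first: u32,
    /// Number of unprocessed messages remaining on this page.
    remaining: u32,
    /// Up to one page (e.g. 64KB) of concatenated encoded messages.
    /// A real pallet would use `BoundedVec<u8, PageSize>` for a tight MEL.
    heap: Vec<u8>,
}

/// Per-origin book-keeping for the linked list of pages.
struct Book {
    /// Index of the first live page.
    begin: u32,
    /// One past the index of the last live page.
    end: u32,
    /// Total unprocessed messages across all pages.
    message_count: u64,
}

/// An overweight or oversized message parked for manual execution
/// (or deletion after a timeout), keyed by an overweight index.
struct Overweight {
    origin: u32, // stand-in for a real origin type
    message: Vec<u8>,
}
```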
Messages are placed in the overweight index if either the message is beyond `max_message_size` (something like 1024) or the weight required for execution is beyond `overweight_threshold` (something like 0.5% of block weight).

Inline Execution
If the queue is empty when non-overweight messages arrive, then it may be reasonable to have them be executed immediately to avoid state churn.
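As a hedged sketch of that fast path, with every helper name invented for illustration:

```rust
// `execute` stands in for the XCM executor, `enqueue` for the paged
// queue, and `park_overweight` for the overweight index.

fn on_message_arrival(
    queue_is_empty: bool,
    msg: &[u8],
    msg_weight: u64,
    weight_remaining: u64,
    overweight_threshold: u64,
    max_message_size: usize,
) {
    if msg.len() > max_message_size || msg_weight > overweight_threshold {
        park_overweight(msg); // too big/heavy for autonomous servicing
    } else if queue_is_empty && msg_weight <= weight_remaining {
        execute(msg); // nothing queued ahead of it: skip the state churn
    } else {
        enqueue(msg); // serviced later, strictly in order
    }
}

fn execute(_msg: &[u8]) { /* dispatch to the XCM executor */ }
fn enqueue(_msg: &[u8]) { /* append to the message queue */ }
fn park_overweight(_msg: &[u8]) { /* store under an overweight index */ }
```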
QoS
This basic design has only very limited QoS safeguards: the two levels of prioritisation allow high-priority messages to take precedence over regular-priority messages, assuming that they can be identified at the delivery stage. However, if all peer parachains' messages are dispatched with regular priority, then one parachain could effectively pay to delay the execution of any messages from other parachains by introducing a large number of heavy messages into the queue immediately prior to the point at which it wants to DoS them.
This attack can be mitigated quite easily and cheaply (in terms of both performance and code complexity) by modestly expanding the number of queues to one per message origin. These queues would then need to be enumerated and processed in some fair fashion: round-robin with a cycling starting origin (sketched below) would be the simplest, but shuffling them with a secure random seed would bring extra security guarantees.
Note: This only works for the immediate delivery source. It will fail if a single delivery origin can enqueue messages from multiple upstream origins which it represents, since it will then be able to "sybil attack" the QoS system.
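A minimal sketch of the round-robin variant (illustrative names; a runtime version would meter weight rather than a message count, and the shuffled variant would permute the origin order with a secure seed first):

```rust
use std::collections::VecDeque;

/// One FIFO per message origin, plus a cursor remembering where the
/// previous block stopped, so every origin is reached over time even
/// under heavy load from a single origin.
struct Queues {
    per_origin: Vec<VecDeque<Vec<u8>>>,
    cursor: usize,
}

impl Queues {
    /// Service up to `budget` messages, taking at most one message from
    /// each origin per pass so no origin can starve the others.
    fn service(&mut self, mut budget: usize) {
        let n = self.per_origin.len();
        if n == 0 || budget == 0 {
            return;
        }
        loop {
            let mut progressed = false;
            for i in 0..n {
                let idx = (self.cursor + i) % n;
                if let Some(msg) = self.per_origin[idx].pop_front() {
                    let _ = msg; // execute here, or park if overweight
                    progressed = true;
                    budget -= 1;
                    if budget == 0 {
                        self.cursor = (idx + 1) % n; // resume here next block
                        return;
                    }
                }
            }
            if !progressed {
                // All queues drained; cycle the start for fairness anyway.
                self.cursor = (self.cursor + 1) % n;
                return;
            }
        }
    }
}
```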