Author: Andres Correa Casablanca <andres@thirdhash.com>
Status: Proposed
Created: 2018-12-14
Remove the current TTOR (Topological Transactions Ordering) implicit rule, and impose a canonical transactions order on blocks (transactions ordered by their IDs in lexicographical order).
- TTOR (Topological Transactions Ordering Rule): The transactions in a block
are sorted in a way that, if a transaction
A
depends onB
's output, thenB
precedesA
. This rule does not provide a total order. - AOR (Any Ordering Rule): No restrictions are applied on how transactions are ordered in a block, any order is valid according to this rule.
- CTOR (Canonical Transactions Ordering Rule): Mandatory total ordering rule.
- LTOR (Lexicographic Transactions Ordering Rule): Specific case of a CTOR. Transactions are ordered using their ID as a key, in lexicographical order.
- GTOR (Gavin's Transactions Ordering Rule): Specific case of a CTOR.
Transactions in a block follow a topological order, but when a transaction
A
is not ancestor nor descendant of a transactionB
, their relative order is given by their IDs, which MUST follow the lexicographical order.
Currently, when a proposer builds a block, or a relay node validates a block, it has to take into account the transactions topological order.
Getting rid of the topological ordering rule makes it easier to parallelize blocks' validation by applying the OTI (Outputs-then-Inputs) algorithm. It is possible to implement an OTI variant that works on TTOR, but requires keeping track of transaction indices in additional data structures, and a final extra check to ensure that the topological order is maintained [1].
Having a canonical order eases the introduction of better block propagation techniques, like Graphene [2][3][4], and allows for compact transaction inclusion/exclusion proofs applying algorithms to tackle the "set reconciliation problem" [5][6].
CTOR also enables the possibility of implementing simple and effective sharding techniques. By having a well known order, transactions can be grouped and locally processed in different threads or processes, taking advantage of multi-core and multi-processor systems. This applies to mempool acceptance, blocks validation and Merkle trees construction [7].
More specifically, LTOR gives us a simple rule on how to partition shards (by hash or ID prefix), and opens the door to efficiently use Merklix trees [8][9][10], a special case of Merkle trees (they are constructed like a trie, or radix tree). Merklix trees are interesting because they allow us to create presence or absence proofs for the whole represented set, while generic Merkle trees only allow us to create proofs for specific positions/indices.
As an extra "side effect" of implementing LTOR, protocol & implementations get slightly simplified, making testing easier, and making harder to design attacks based on "block malleability" (i.e. hash-flooding[11] or timing attacks, immunity to hash-flooding attacks would be trivial by using trees instead of hash tables for the mempool).
It is also worth to say that in our case, with a target spacing (expected time between blocks) of 16 seconds, the probability of having "0-conf" transactions is even lower than in Bitcoin or Bitcoin Cash, so the topological ordering property will hold with high probability at the intra-block level even if it's not proactively enforced (which, in any case, is not a real advantage).
As users of Unit-e, we only care about which transactions belong or don't belong to a block, but not about its order (we think in terms of sets, not in terms of lists), but a specific order is required to construct the Merkle tree that will be used to build the header.
By not having to specify the transactions order (because it's implicit) at the
time of block propagation, we have to send less data. If we have n
transactions in a block (without considering the coinbase transaction), we have
n!
different ways to sort them, so in the ideal case, encoding this
information would take log2(n!)
bits (i.e., for 1000 transactions we need at
least 1.04KB, and 14.46KB for 10000).
The Graphene[2] block propagation method (designed to decrease block propagation times) relies on sending an invertible bloom lookup table[12] and a Bloom filter in order to specify the block's transactions set.
These lightweight probabilistic data structures carry lossy-compressed information about the transactions set, but don't provide any hint about its order. We can obtain the order by different means:
- Relying on a canonical order: In this case, there's no extra overhead.
- Passing an explicit order: This would tamper Graphene's benefits as it
would increase messages' sizes. As an example, for 2000 transactions,
Graphene would need approximately
2.1KB
, adding the explicit order it would need2.1KB + 2.3KB = 4.4KB
, more than doubling the message's size. - Hard-coding different ordering methods and specifying which one is being used: This option adds extra complexity and maintenance burden without providing known benefits.
Notice how AOR only allow us to apply the 2nd and 3rd option, while CTOR allows to implement Graphene with zero overhead.
- All transactions in a block (except for the coinbase transaction, which must be the first one) MUST be in lexicographical order (using their ID as key). The order is relevant at the time of constructing the transactions Merkle tree.
- The implementation MUST NOT rely on the topological order assumption in any of its steps.
- Using the OTI algorithm is a good starting point, but it's also possible to implement parallel variants of the same idea.
- In case of block disconnection and/or chain reorganizations, the OTI algorithm has to be applied in reverse order (Inputs-then-Outputs).
Among the different known possibilities, LTOR has the best balance:
- TTOR: Does not provide any of the advantages presented in the Motivation section, and keeping a topological order does not scale well as the number of transaction grows [13][14].
- AOR: It gives us the possibility to implement some parallelization improvements, but does not help with sharding or block "malleability", and in the case of block propagation times it would require to change the Merkle root construct as well (for example using Merklix trees).
- GTOR: Although it's almost equivalent to LTOR, its implementation is more complex, and applying sharding is not as easy as with LTOR.
Prior research work on the topic has been done by the Bitcoin Cash community [7][15], and CTOR has been implemented at November 2018 as a hard fork[16].
- Less sources of entropy: In the case of PoW, some miners argued that they would have less sources of entropy to generate good enough hashes, but this does not apply to Unit-e because it's PoS-based.
- Some developers argued that there's still room for improvement for TTOR, and that it would be worth to exhaust the solutions space before switching to CTOR/LTOR. It's true that some processing speed could be gained, but choosing micro-optimizations & speculative improvements over clear & known improvements on the data structures & algorithms does not seem reasonable from an engineering perspective.
There are no backwards compatibility problems.
- Jonathan Toomin's proposal to apply OTI on top of TTOR
- Graphene: A New Protocol for Block Propagation Using Set Reconciliation, 2017, A. Ozisik, G. Andresen, G. Bissias, A. Houmansadr, B. Levine
- Graphene (Presentation slides)
- O(1) Block Propagation, Gavin Andresen
- Set Reconciliation With Nearly Optimal Communication Complexity, 2000, Yaron Minsky, Ari Trachtenberg, IEEE, and Richard Zippe
- What’s the Difference? Efficient Set Reconciliation without Prior Contex, 2011, David Eppstein, Michael T. Goodrich, Frank Uyeda, George Varghese
- Sharding Bitcoin Cash, 2018, Bitcoin ABC
- Introducing Merklix tree as an unordered Merkle tree on steroid, 2016
- Using Merklix tree to checkpoint an UTXO set, 2016
- Using Merklix tree to shard block validation, 2016
- Denial of Service via Algorithmic Complexity Attacks, 2003, Scott A. Crosby, Dan S. Wallach
- Invertible Bloom Lookup Tables, 2011, Michael T. Goodrich, Michael Mitzenmacher
- Practical performance of incremental topological sorting and cycle detection algorithms, 2016, Ragnar Lárus Sigurðsson
- Incremental Topological Sort and Cycle Detection in O(m sqrt(n)) Expected Total Time, Aaron Berstein, Shiri Chechik, January 2018
- Canonical Transaction Ordering for Bitcoin, 2018, Joannes Vermorel, Amaury Séchet, Shammah Chancellor, Tomas van der Wansem
- Canonical Transaction Ordering Code Review, Bitcoin ABC