Skip to content

Global State

Aviv Eyal edited this page Oct 1, 2019 · 7 revisions

Global State and Transaction Processing Notes

WARNING: This doc is outdated and needs to be updated.

The first stage of creating Spacemesh global computer is to allow users to transfer tokens between each other on the network. For this we need a way to represent There are currently two main methods to code transactions into the network: Bitcoins UTXO and Ethereums global state machine. There are advantages and disadvantages to each method. Simply put- in Bitcoins UTXO method there are no accounts, the status of an account is the last UTXOs committed. In Ethereum, the account state is stored in a global account tree, and is kept consistent according to the transactions. The global account tree allows querying of different account balances. Another good thing about the ethereum state tree is that due to the fact that it is implemented using a merkle tree, each state change of a single account changes the root hash of the tree and therefore creates a unique id at the root of the tree for every state and transition.

In blockchain, only one block is added to the chain at a time. It contains all the transactions that has been collected since the last block. When dealing with block mesh, each time period multiple blocks are added with many transactions. these blocks are created by different nodes which can include the same transaction in different blocks although they are incentivized not to, this can still happen. In order to process these transactions first all transaction need to be received from all blocks in the layer.

Receiving a block:

Since blocks are sent to the network and processed by different clients such as full nodes, wallets, light clients and others it makes sense to split the entire block to several parts in order to reduce the overall data sent and received by clients. Because of this, a block header can only consist of the following properties:

Block Header:

  • PoST [PUB key +PoST] - the post commitment
  • Timestamp - the timestamp of this block ceration
  • LayerID - number of id this block belongs to
  • List - votes on other blocks according to protocol
  • List<HASH(transactions)> Transactions : list of transactions
  • Signature - signature of sender validating transactions.

This is the basic block of data that each of the clients can accept, additional data can be queried from the origin. A full node that wishes to receive transaction information could request the entire block and the transactions it contains. (TODO: should we keep transaction in merkle tree? Yes: we could search these trees more easily to find transactions, No: duplicate transactions may be found in different trees, should we handle it? )

Block body:

  • List<Transactions>

Block body validations:

All blocks are validated for syntactic validity as described in the Spacemesh protocol. A key validation that must take place before trying to parse transactions from the block is the transaction size limit, currently set to 50K (TODO: at which state is this validated? If there are 2 parts to a block, are all of them downloaded by miners and nodes who are not mining now?)

Transaction Processing:

Performing transactions is applying a transformation to the global state tree. for a state tree S applying transaction will produce a new state S` in the following manner: T(S) = S`. Each state transition produces a new root hash that identifies the state uniquely. In order for the state tree to be consistent in between different clients transactions from a specific layer must be processed in the same order across all nodes (canonical state).

Transaction sorting

Transaction sorting must be defined by a deterministic function, in a way that for a given set of transactions the order of the sorted transaction will be the same in between all nodes. Only then the transaction data can be used to transform the global state tree and apply transactions to the accounts. The sorting algorithm poses a security issue, lets say that the values are sorted by the hash of the transaction, in case an attacker wishes his transaction to come first in the sorted order, he can grind the hash by changing some parameters of the transaction. To overcome this, transactions must be hashed by some random number which is known to all validators at the time of the validation such as the random beacon hash. TODO: discuss usage of the random beacon oracle for this.

Duplicate Transactions:

Since the same transaction can be included in several blocks in the same layer (TODO: what happens when the same transaction appears in the different blocks across layers? Is it possible?). After sorting, all duplicate transactions are eliminated and only unique transactions remain to be processed.

Running transactions:

Transaction must be run in order and sequentially, this is to keep the root hash consistent between nodes. Currently the only supported transaction is the transfer of tokens, and transactions can’t be run on parallel. For each transaction a gas fee must be paid, even if the transaction performed did not succeed. If the transaction didn’t succeed any change state must be reverted back to the original state. Currently these are the properties of a single transaction:

Transaction

  • Hash (txID) - calculated
  • Source account
  • Destination account
  • Amount
  • Source Sig
  • Gas price

(TODO: Transactions will be saved in the key value store as well?) (TODO: Think about a staging area as part of preparing for VM and for isolating the run of a transaction until it is finished and we can safely commit it) (TODO: think about integrating EOS solution for concurrent run of transactions) TODO: review the possibility to commit blobs of data through transactions. Refund gas??

Global state and account balance:

The transaction processors output is a new state of affected accounts. This state will be stored in each nodes persistent memory and contain account data and later on smart contract code.

Global state data structure

The data structure storing the global state must have the following properties:

  • Provide fast access to accounts data
  • Store state in an efficient manner - efficient read, write, modify, delete.
  • Allow fast traversing between global states according to a specific layer ID (given that this layer is in consensus)

These properties are needed for usability as well as for protocol constraints which will be explained later. To accommodate these needs we chose to represent the global state using a radix tree. By doing so we allow modifying and appending changes to the balance of accounts in the same manner as ethereum's global state Trie. In our implementation the state is represented by a Merkle Patricia tree on top of a LevelDB persistent key value store. Similar to Ethereum, the merkle patricia tree can be represented entirely on the key value store. In contrary to Ethereums approach on optimizing the Patricia Merkle tree with custom RLP encoding, currently we have chosen to use the protobuf as encoder and decoder for actual leaf data to be stored in the tree. (TODO: elaborate on the serialization mechanism and datatypes of nodes / data)

The problem of keeping state consistency in a mesh

A challenge the mesh network poses is how to know what is the current correct global state tree of all accounts. In blockchain, this is again an easy task since the transformations of the global state tree originate from one single block at a time. In blockmesh, the same state tree hash will be received only if the same blocks were received and applied by all nodes in the network. If a node haven’t received all the blocks or has been attacked and received malicious data, it will produce a different root hash and it’s account balances may be different. Ethereum overcomes this by including the root hash of the current global state tree in it’s blocks. In our mesh it is not quite easy. Blocks are received constantly by validators, they are then aggregated into “layers”. The layers are verified by two consensus algorithms. One is a real time algorithm that tries to achieve consensus over the valid blocks in the previous layer, and another internal consensus protocol that over time will calculate the validity of each block using a vote counting algorithm. The slower “tortoise” algorithm determines if the blocks are valid or not.

Global state consistency in Spacemesh:

In order to allow all new nodes to know if their state is accurate, a global state root hash must be published on the blockmesh. The published root hash must be derived from a state which is surely in consensus between all nodes. This state can be derived only by processing blocks and transactions from previous layers which are irreversibly valid. A layer will be considered irreversible:

  • All blocks in that layer are irreversible
  • All layers prior to the processed layer are irreversible as well

This period is called the “Global state threshold”. Each node will publish its own state tree root hash on a block that's being mined alongside an aggregated hash of all transactions that were applied to receive the published root.

Performing transactions in real time

If the global root hash will be determined only by blocks that have been many layers ago the actual account balances will be outdated and irrelevant for current transactions. To overcome this we will define a “local threshold” of blocks from which it can update it’s global root hash. The local threshold is the last contextually valid layer. A contextually valid layer is defined by the following parameters:

  • A layer in which all blocks are contextually valid
  • All previous layers are contextually valid as well

This will allow to perform transactions on the current state but on the other hand it will pose a security risk, since the blocks within the local threshold have not passed the global threshold for certain consensus, making it unclear whether the transactions are valid and should be applied to the global state tree.

Block state change handling

A node can detect a wrong state monitoring the blocks that were used to create the state. If a block is downvoted to be contextually invalid, The state machine will revert to a layer prior to the layer of the modified block and will replay all blocks until the current contextually valid layer. (TODO: what about the revert function? How far to the past it will revert the state? What happens when the state is reverted to all transactions performed on the chain?)

Other notes... TODO:

  • Data size limit: 50kb

  • Optimization: save only indexes of trans

  • Need to obtain consensus over values.

  • randomness of transaction intake into block

  • position of the account/ transaction