BIP: 134
Title: Flexible Transactions
Author: Tom Zander <tomz@freedommail.ch>
Status: Draft
Type: Standards Track
Created: 2016-07-27
This BIP describes the next step in making Bitcoin's most basic element, the transaction, more flexible and easier to extend. At the same time this fixes all known cases of malleability and resolves significant amounts of technical debt.
Flexible Transactions uses the fact that the first 4 bytes in a transaction determine the version and that the majority of the clients use a non-consensus rule (a policy) to not accept transaction version numbers other than those specifically defined by Bitcoin. This BIP chooses a new version number, 4, and defines that the data following the bytes for the version is in a format called Compact Message Format (CMF). CMF is a flexible, token based format where each token is a combination of a name, a format and a value. Because the name is added we can skip unused tokens and we can freely add new tokens in a simple manner in future. Soft fork upgrades will become much easier and cleaner this way.
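As an illustration of the version-based dispatch described above, the following is a minimal sketch and not part of the specification. It assumes the version is stored as a 4-byte little-endian integer, as in existing Bitcoin transactions; the function names are hypothetical.

```python
import struct

FLEXTRANS_VERSION = 4  # the version number chosen by this BIP


def transaction_version(raw: bytes) -> int:
    """Return the version stored in the first 4 bytes of a serialized transaction."""
    if len(raw) < 4:
        raise ValueError("transaction too short to contain a version")
    # Assumption: 4-byte little-endian integer, as in existing transactions.
    (version,) = struct.unpack_from("<i", raw, 0)
    return version


def uses_cmf_body(raw: bytes) -> bool:
    """True when the data after the version bytes should be read as CMF tokens."""
    return transaction_version(raw) == FLEXTRANS_VERSION
```

A parser could call `uses_cmf_body` first and fall back to the existing transaction format for any other version.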
This protocol upgrade cleans up past soft fork changes such as BIP68, which reuse existing fields, by expressing them in a system that is much easier to maintain and to parse. It creates the building blocks that allow new features to be added much more cleanly in the future.
It also turns out to be possible to remove signatures from transactions with minimal software upgrades while still maintaining a coherent transaction history. Tests show that this can reduce space usage to about 75%.
Token-based file formats are not new; systems like XML and HTML use a similar approach to allow future growth, and they have been quite successful for decades in part because of this property.
Bitcoin needs a similar way of making the transaction future-proof, because re-purposing unused fields for new features does not lead to maintainable code.
In addition, this protocol upgrade re-orders the data fields, which allows us to cleanly fix the malleability issue. As a result, future technologies like Lightning Network will depend on this BIP being deployed.
At the same time, due to this re-ordering of data fields, it becomes very easy to remove signatures from a transaction without breaking its tx-id, which is great for future pruning features.
In the compact message format we define tokens and in this specification we define how these tokens are named, where they can be placed and which are optional. To refer to XML, this specification would be the schema of a transaction.
CMF tokens are triplets of name, format (like PositiveInteger) and value. Names in this scope are defined much like an enumeration, where the actual integer value (id, below) is just as important as the written name. If any token is found that is not covered in the table below, the transaction that contains it is invalid.
Name | id | Format | Default Value | Description |
---|---|---|---|---|
TxEnd | 0 | BoolTrue | Required | A marker that is the end of the transaction. |
TxInPrevHash | 1 | ByteArray | Required | TxId we are spending |
TxPrevIndex | 2 | Integer | 0 | Index in prev tx we are spending (applied to previous TxInPrevHash) |
TxInScript | 3 | ByteArray | Required | The 'input' part of the script |
TxOutValue | 4 | Integer | Required | Amount of Satoshis to transfer |
TxOutScript | 5 | ByteArray | Required | The 'output' part of the script |
LockByBlock | 6 | Integer | Optional | BIP68 replacement |
LockByTime | 7 | Integer | Optional | BIP68 replacement |
ScriptVersion | 8 | Integer | 2 | Defines script version for outputs following |
NOP_1x | 1x | . | Optional | Values that will be ignored by anyone parsing the transaction |
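The table can be read as an enumeration plus a rejection rule for anything outside it. The sketch below is illustrative only: the names `Token`, `NOP_IDS` and `classify` are hypothetical, and it assumes that the wildcard "1x" stands for the ten ids 10 to 19.

```python
from enum import IntEnum


class Token(IntEnum):
    """Token ids as listed in the table above."""
    TxEnd = 0
    TxInPrevHash = 1
    TxPrevIndex = 2
    TxInScript = 3
    TxOutValue = 4
    TxOutScript = 5
    LockByBlock = 6
    LockByTime = 7
    ScriptVersion = 8


# Assumption: the "1x" wildcard denotes the ten ids 10..19.
NOP_IDS = range(10, 20)


def classify(token_id: int) -> str:
    """Return the token name, or "NOP" for a NOP_1x id.

    Raises ValueError for ids not covered by the table, since such a
    token makes the containing transaction invalid.
    """
    if token_id in NOP_IDS:
        return "NOP"
    try:
        return Token(token_id).name
    except ValueError:
        raise ValueError(f"token id {token_id} is not defined") from None
```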
In the current version of Bitcoin-script, version 1, there are various opcodes that are used to validate the cryptographic proofs that users have to provide in order to spend outputs.
The OP_CHECKSIG is the most well known and, as its name implies, it validates a signature. In the new version of 'script' (version 2) the data that is signed is changed to be equivalent to the transaction-id. This is a massive simplification and also the only change between version 1 and version 2 of script.
The tokens defined above shall be serialized in a certain order for the transaction to be valid. Not serializing transactions in the order specified would allow multiple interpretations of the data, which can't be allowed. There is still some flexibility, and for that reason it is important for implementors to remember that the actual serialized data is used for the calculation of the transaction-id. Parsing a transaction and writing it back out may produce different bytes, and when the txid changes, the signatures will break.
At a macro-level the transaction has these segments. The order of the segments can not be changed, but you can skip segments.
Segment | Description |
---|---|
Inputs | Details about inputs. |
Outputs | Details and scripts for outputs |
Additional | For future expansion |
Signatures | The scripts for the inputs |
TxEnd | End of the transaction |
The TxId is calculated by taking the serialized transaction without the Signatures and the TxEnd and hashing that.
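The following sketch illustrates this rule under stated assumptions; it is not the reference implementation. It assumes the hash is double SHA-256, as used for existing transaction ids (the text above only says "hashing"), and that the version bytes are part of the hashed data. Each token is represented by its id and its exact serialized bytes, which are placeholders here because the CMF encoding is not defined in this document. The function names are hypothetical.

```python
import hashlib
from typing import Sequence, Tuple

TX_END = 0        # token ids taken from the token table
TX_IN_SCRIPT = 3


def double_sha256(data: bytes) -> bytes:
    # Assumption: same hash construction as for existing transaction ids.
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def tx_id(version_bytes: bytes, tokens: Sequence[Tuple[int, bytes]]) -> bytes:
    """Hash the transaction as serialized, leaving out Signatures and TxEnd.

    `tokens` holds (token id, exact serialized bytes) pairs in wire order.
    The bytes are reused verbatim instead of being re-encoded, because the
    actual serialized data determines the id.
    """
    kept = [raw for token_id, raw in tokens
            if token_id not in (TX_END, TX_IN_SCRIPT)]
    return double_sha256(version_bytes + b"".join(kept))


# Usage: two copies that differ only in their signature share one id.
version = (4).to_bytes(4, "little")
body = [(1, b"<prev-hash token>"), (4, b"<value token>"), (5, b"<script token>")]
signed_a = body + [(3, b"<signature A>"), (0, b"<end>")]
signed_b = body + [(3, b"<signature B>"), (0, b"<end>")]
assert tx_id(version, signed_a) == tx_id(version, signed_b)
```

Because the signature tokens never enter the hash, a copy with its signatures removed also keeps the same id.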
Segment | Tags | Description |
---|---|---|
Inputs | TxInPrevHash and TxInPrevIndex | Index can be skipped, but in any input the PrevHash always has to come first |
Outputs | TxOutScript, TxOutValue | Order is not relevant |
Additional | LockByBlock LockByTime NOP_1x | |
Signatures | TxInScript | Exactly the same amount as there are inputs |
TxEnd | TxEnd | |
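The ordering rules in the two tables above can be checked over a sequence of token ids. The sketch below is one possible reading, not a normative validator: it assumes that "1x" means ids 10 to 19, that an index token must directly follow the TxInPrevHash it applies to, and it rejects ScriptVersion because that token is currently not allowed. All function names are hypothetical.

```python
from typing import Sequence

# Token ids from the token table.
TX_END, TX_IN_PREV_HASH, TX_PREV_INDEX, TX_IN_SCRIPT = 0, 1, 2, 3
TX_OUT_VALUE, TX_OUT_SCRIPT, LOCK_BY_BLOCK, LOCK_BY_TIME = 4, 5, 6, 7
NOP_IDS = range(10, 20)  # assumption: "1x" means ids 10..19


def segment_of(token_id: int) -> int:
    """Map a token id to its segment position (0 = Inputs ... 4 = TxEnd)."""
    if token_id in (TX_IN_PREV_HASH, TX_PREV_INDEX):
        return 0
    if token_id in (TX_OUT_VALUE, TX_OUT_SCRIPT):
        return 1
    if token_id in (LOCK_BY_BLOCK, LOCK_BY_TIME) or token_id in NOP_IDS:
        return 2
    if token_id == TX_IN_SCRIPT:
        return 3
    if token_id == TX_END:
        return 4
    raise ValueError(f"token id {token_id} is not allowed in a transaction")


def check_token_order(token_ids: Sequence[int]) -> None:
    """Raise ValueError if the token ids violate the segment rules."""
    current = 0
    inputs = signatures = 0
    previous = None
    ended = False
    for token_id in token_ids:
        if ended:
            raise ValueError("token found after TxEnd")
        segment = segment_of(token_id)
        if segment < current:
            raise ValueError("segments are out of order")
        current = segment
        if token_id == TX_IN_PREV_HASH:
            inputs += 1
        elif token_id == TX_PREV_INDEX and previous != TX_IN_PREV_HASH:
            # Assumed reading of "the PrevHash always has to come first".
            raise ValueError("index must directly follow a TxInPrevHash")
        elif token_id == TX_IN_SCRIPT:
            signatures += 1
        elif token_id == TX_END:
            ended = True
        previous = token_id
    if not ended:
        raise ValueError("missing TxEnd")
    if signatures != inputs:
        raise ValueError("number of TxInScript tokens must equal number of inputs")
```

Segments may be skipped, so the check only requires that the segment position never decreases; it does not require every segment to be present.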
TxEnd is there to allow a parser to know when one transaction in a stream has ended, allowing the next to be parsed.
Notice that the token ScriptVersion is currently not allowed because we don't have any valid value to give it. But if we introduce a new script version it would be placed in the outputs segment.
The default value of ScriptVersion is number 2, as opposed to version 1 of script that is in use today. Version 2 is mostly identical to version 1, including upgrades made to it over the years and in the future. The only exception is that OP_CHECKSIG is made dramatically simpler. The input-type for OP_CHECKSIG is no longer configurable: it is always '1', and the content that will be signed is the txid.
TODO: does check-multisig need its own mention?
Leaving the signatures out of the calculation of the transaction-id means that the signatures are also not used for the calculation of the merkle tree. Changes in signatures would therefore not be detectable, except by the fact that missing or broken signatures break full validation. It is important, however, to be able to detect modifications to such signatures without validating all transactions.
For this reason the merkle tree is extended to include (append) the hash of the v4 transactions. The merkle tree will continue to have all the transactions' tx-ids, but appended to that are the v4 hashes that include the signatures as well. Specifically, the hash is taken over a data-blob that is built up from:
1. the tx-id
2. the CMF-tokens 'TxInScript'
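A minimal sketch of this data-blob and of the extended leaf list follows. It is an illustration under assumptions, not the reference implementation: the hash is assumed to be double SHA-256, the TxInScript tokens are assumed to be concatenated in wire order using their exact serialized bytes, and the v4 hashes are assumed to be appended in the same order as the tx-ids. The function names are hypothetical.

```python
import hashlib
from typing import List, Sequence


def double_sha256(data: bytes) -> bytes:
    # Assumption: same hash construction as for existing transaction ids.
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def v4_hash(txid: bytes, tx_in_script_tokens: Sequence[bytes]) -> bytes:
    """Hash the data-blob: the tx-id followed by the TxInScript tokens."""
    return double_sha256(txid + b"".join(tx_in_script_tokens))


def merkle_leaves(txids: Sequence[bytes], v4_hashes: Sequence[bytes]) -> List[bytes]:
    """All tx-ids first, then the appended hashes of the v4 transactions."""
    return list(txids) + list(v4_hashes)
```

With this layout, altering a signature leaves the tx-id leaf untouched but changes the corresponding appended leaf, so the modification shows up in the merkle tree.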
The NOP_1x wildcard used in the table explaining tokens is actually a list of 10 values that currently are specified as NOP (no-operation) tags.
Any implementation that supports the v4 transaction format should ignore such a field in a transaction, interpreting and using the transaction as if that field were not present at all.
Future software may use these fields to decorate a transaction with additional data or features. Transaction generating software should not trivially use these tokens for their own usage without cooperation and communication with the rest of the Bitcoin ecosystem as miners certainly have the option to reject transactions that use unknown-to-them tokens.
Fully validating older clients are not compatible with this change.
SPV (simplified payment verification) wallets need to be updated to receive or create the new transaction type.
This BIP introduces a new transaction format without changing or deprecating the existing one or any of its practices. Therefore it is backwards compatible for any existing data or parsing-code.
Bitcoin Classic includes this in its beta releases and a reference implementation can be found at:
bitcoinclassic/bitcoinclassic#186
To be determined
Copyright (c) 2016 Tom Zander <tomz@freedommail.ch>
This document is dual-licensed under the Creative-Commons BY-SA license v4.0 and the Open Publication License v1.0 with the following licence-options:
Distribution of substantively modified versions of this document is prohibited without the explicit permission of the copyright holder. Distribution of the work or derivative of the work in any standard (paper) book form is prohibited unless prior permission is obtained from the copyright holder.