Add BIP for OP_TXHASH and OP_CHECKTXHASHVERIFY #1500
bip-txhash.mediawiki
Outdated
* If the first byte is exactly 0x00, the Script execution succeeds immediately.
//TODO(stevenroose) is this valuable? it's the only "exception case" that
could potentially be hooked for some future upgrade
Why not allow extra bytes at the end to mean OP_SUCCESS?
@roconnor-blockstream has previously warned about non-trivial OP_SUCCESS semantics. The current SUCCESS semantics are "any OP_SUCCESS opcode occurring in the script means SUCCESS". We could have different semantics that allow any opcode to trigger "instant success" internally, but (1) those are very different semantics that would require entirely different code and (2) they become much harder to reason about.
IIRC, @sanket1729 also noted that such SUCCESS semantics make reasoning about scripts for things like miniscript way harder.
Actually this BIP seems outdated, I have to push a small update. I decided to propose to make the 0x00 special case mean "ALL" to make this more ergonomic to use as a sighash together with CSFS. ("ALL" isn't valuable as a template check because it contains the prevout scriptPubkey which should contain the hash) Other suggestions welcome.
Alternatively, but slightly even more complicating the cases, since the first two fields (version, locktime), are not very valuable without anything else (especially since we have
I just pushed an updated version of this BIP. It has a reference implementation that produces test vectors that are tested against an implementation for Bitcoin Core and for rust-bitcoin. I think it should be ready for review. I have one small last TODO in the specification related to txfs malleability.
330d9c1 to 16dd455
Missing a section on backward compatibility
Sorry for the delay! I've finally found a round tuit, and have performed a more detailed review.
# Summary

## OP_CHECKTXHASHVERIFY
I realize it's traditional, but why are we adding new non-Taproot opcodes? Is there a case where this is desirable?
Yeah, bare OP_CHECKTXHASHVERIFY is really efficient. CTV also adds them. It's 34 bytes output script and 0 bytes witness/scriptsig. As opposed to 34 (spk) + 33 (cb: ver + internal key) + 34 (tapscript) for taproot.
Ah, I forgot that OP_SUCCESSx was only a taproot thing, not a segwit thing. Yuck!
* 3. `TXFS_INOUT_NUMBER | TXFS_INOUT_SELECTION_ALL`
* 4. `TXFS_INOUT_NUMBER | TXFS_INOUT_SELECTION_ALL`
* the `0x00` byte: it is set equal to `TXFS_SPECIAL_ALL`, which means "ALL" and is primarily
useful to emulate `SIGHASH_ALL` when `OP_TXHASH` is used in combination
Hmm, "would be useful if that were proposed which it isn't". I am skeptical of this magic value.
While I understand Russell O'Connor's dislike of runtime OP_SUCCESS, it is a lesser evil here than this kind of guessing of future utility which will no doubt prove suboptimal when we get there.
And for miniscript: sure, it will only generate and decode a push followed by TXHASH. But there are other things it can't decode too, and that's OK.
I think the SUCCESS argument has merit, though. Also IMO it's not too much of a pain to pick one of the many SUCCESS opcodes tapscript still has to make an OP_TXHASH2 if really needed. I also don't like that witness input can turn an opcode into a SUCCESS operation for the entire script. This can be tricky when collaboratively constructing scripts.
summary, followed by a reference implementation of the CalculateTxHash function.

* There are two special cases for the TxFieldSelector:
* the empty value, zero bytes long: it is set equal to `TXFS_SPECIAL_TEMPLATE`,
You re-use this term TXFS_SPECIAL_TEMPLATE twice for different things, which is confusing.
Ah sorry, one of them is a typo and should be TXFS_SPECIAL_ALL. Fixing.
bip-txhash.md
Outdated
* The last (highest) bit of the first byte (`TXFS_CONTROL`), we will call the
"control bit", and it can be used to control the behavior of the opcode. For
`OP_TXHASH` and `OP_CHECKTXHASHVERIFY`, the control bit is used to determine
whether the TxFieldSelector itself has to be included in the resulting hash.
(For potential other uses of the TxFieldSelector (like a hypothetical
`OP_TX`), this bit can be repurposed.)
That's a footnote, at best, mentioning how this could be expanded for a new OP_TX. But there's no reason to design for it now that I can see, except to leave a clear carve-out for future expansion.
So TXFS_CONTROL is a terrible name. TXFS_FIELD_SELECTOR perhaps?
bip-txhash.md
Outdated
For both inputs and then outputs, do the following:

* If the "in/outputs" field is set to 1, another additional byte is expected:
If I'm following this correctly, the (non-special) TxFieldSelector format is, in bytes:
CORE_SELECTOR [INOUT_SELECTOR] [IN_SELECTOR] [OUT_SELECTOR]
If TXFS_INPUTS is set in the CORE_SELECTOR, then INOUT_SELECTOR and IN_SELECTOR are present. If TXFS_OUTPUTS is set in CORE_SELECTOR, then INOUT_SELECTOR and OUT_SELECTOR are present?
Exactly. I'm thinking of changing this as follows:
- remove the TXFS_INPUTS and TXFS_OUTPUTS bits
- the reader will know the entire size of the txfs, so when a second byte is present, look at the bits present in the INOUT_SELECTOR byte to know whether to expect IN_SELECTOR and/or OUT_SELECTOR.
- this frees up two bits in the CORE_SELECTOR, one of which I'm thinking to repurpose for SPEND_SCRIPT (i.e. scriptCode for segwit v0 inputs and tapscript for v1 inputs, scriptPubkey for non-segwit)
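To make the layout confirmed in this exchange concrete, here is a minimal Python sketch of how a reader might work out which selector bytes to expect under the draft as it stood before the proposed change. The bit positions of TXFS_INPUTS and TXFS_OUTPUTS (bits 5 and 6) are an assumption inferred from the order in which the flags are listed in TXFS_ALL, the function name is hypothetical, and any index data that follows in individual selection mode is ignored.

```python
# Assumed bit positions within CORE_SELECTOR; not confirmed by the thread.
TXFS_INPUTS = 1 << 5
TXFS_OUTPUTS = 1 << 6


def expected_selector_bytes(core_selector: int) -> list[str]:
    """Return the names of the selector bytes implied by CORE_SELECTOR."""
    if not 0 <= core_selector <= 0xFF:
        raise ValueError("CORE_SELECTOR must fit in one byte")
    names = ["CORE_SELECTOR"]
    has_inputs = bool(core_selector & TXFS_INPUTS)
    has_outputs = bool(core_selector & TXFS_OUTPUTS)
    # INOUT_SELECTOR is shared: it appears once if either side is selected.
    if has_inputs or has_outputs:
        names.append("INOUT_SELECTOR")
    if has_inputs:
        names.append("IN_SELECTOR")
    if has_outputs:
        names.append("OUT_SELECTOR")
    return names
```

Under the change proposed above, the two CORE_SELECTOR bits would disappear and the same decision would instead be driven by the total length of the selector and the bits of INOUT_SELECTOR.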
bip-txhash.md
Outdated
* the leading in/outputs up to 8192
* up to 64 individually selected in/outputs
** using absolute indices up to 16384
** using indices relative to the current input index from -64 to +64.
This is incredibly complex, and seems to mismatch what I can see covenants being used for in practice. I anticipate fees being high in future, such that people will do a reasonable amount of engineering to minimize their total footprint. In particular, they will want to add fees after commitment, and want to batch transactions using stacking.
The first case implies you want to exclude a specific input and output, to allow for fees, or at least allow binding not to cover the final input/output. The second case implies you want to mul/divide an input number to get the corresponding range of outputs.
The simplest case is a single input and output pair: a-la SIGHASH_SINGLE. This both allows almost arbitrary fee inputs/outputs, and stacking.
But what if you want to bind a pair of inputs to one output? Or a pair of outputs to one input? Both seem reasonably common things to want to do (e.g. opening a dual-funded lightning channel, and closing a channel).
That means you need to be able to select outputs as "current input index / 2" or "current input index * 2 and current input index * 2 + 1". Numbers other than 2 are possible but this is the most likely case (since, in order to stack, all txs must be of same input-output number form, and I consider 1 and 2 by far the most likely numbers here).
Yeah this is true. Initially I didn't have relative indices. I'm still not entirely convinced they are useful. Precisely for the 1-in 2-out case which seems super common to me. I heard "you'd be surprised how easy it is to add an extra input".
My initial thought was that private aggregation (i.e. not through public broadcast media like mempools) would be easily possible as a user can just create/sign a thousand variants of their txs, for each possible input index. This works with absolute indices and doesn't need relative indices. It might even work with public broadcast.
The problem is that doing this with absolute indices only works if everyone in the protocol has the same in-out ratio. (Everyone needs 1-to-2 so you can sign 1in1,2out, 2in3,4out, 3in5,6out, etc). Otherwise you get a quadratic amount of data. With relative indices, you can sign XinX,1out, XinX,2out, XinX,3out,.. and this way the coordinator can put you in any place and put your second output in some arbitrary place and pick your signatures based on where your second output is placed.
Ok this doesn't really require relative indices, but it requires the ability to mix "current" with absolute.
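The index arithmetic discussed in this exchange ("current input index * 2 and current input index * 2 + 1", or "current input index / 2") can be sketched as follows. This is only an illustration: the function names are hypothetical, indices are 0-based, and nothing in the draft defines such helpers.

```python
def stacked_output_indices(input_index: int, outputs_per_input: int = 2) -> list[int]:
    """Outputs bound to an input when every stacked tx has the same 1-to-N shape."""
    if input_index < 0 or outputs_per_input < 1:
        raise ValueError("index must be non-negative and the ratio positive")
    start = input_index * outputs_per_input
    return list(range(start, start + outputs_per_input))


def shared_output_index(input_index: int, inputs_per_output: int = 2) -> int:
    """Output bound to an input when N consecutive inputs share one output."""
    if input_index < 0 or inputs_per_output < 1:
        raise ValueError("index must be non-negative and the ratio positive")
    return input_index // inputs_per_output
```

With the default ratio of 2, input 0 maps to outputs 0 and 1, input 1 to outputs 2 and 3, and so on, which is the 0-based form of the "1in1,2out, 2in3,4out" pattern described above.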
This leaves some protocols vulnerable to the partial signature attacks. Say the covenant requires your outputs go half to pubkeyA and half to pubkeyB. Now I have two identical 1BTC covenant UTXO inputs, but re-use the same outputs to satisfy both, and steal the other 1BTC.
The same problem applies to "tell me the outputs in the witness data".
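The attack described here can be shown with a toy model. The sketch below is not TXHASH semantics: the `Output` type, the `covenant_ok` check, and the key names are all hypothetical, and fees are ignored. It only shows that a per-input check against fixed absolute output positions is satisfied by the same pair of outputs for two identical inputs.

```python
from dataclasses import dataclass

COIN = 100_000_000  # satoshis per BTC


@dataclass(frozen=True)
class Output:
    pubkey: str
    sats: int


def covenant_ok(input_sats: int, outputs: list[Output]) -> bool:
    """Toy per-input rule: outputs 0 and 1 pay half each to pubkeyA and pubkeyB."""
    half = input_sats // 2
    return (
        len(outputs) >= 2
        and outputs[0] == Output("pubkeyA", half)
        and outputs[1] == Output("pubkeyB", half)
    )


# Two identical 1 BTC covenant inputs spent in the same transaction.
inputs = [COIN, COIN]
outputs = [
    Output("pubkeyA", COIN // 2),
    Output("pubkeyB", COIN // 2),
    Output("attacker", COIN),  # the second input's value goes elsewhere
]

# Each input is checked independently and both checks pass.
all_pass = all(covenant_ok(value, outputs) for value in inputs)
stolen = sum(inputs) - outputs[0].sats - outputs[1].sats
```

Both checks pass even though only 1 BTC of the 2 BTC spent reaches pubkeyA and pubkeyB, which is the reuse problem raised in the comment.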
bip-txhash.md
Outdated
future addition of byte manipulation opcodes like `OP_CAT`, an additional
cost is specified per TransactionHash execution. Using the same validation
budget ("sigops budget") introduced in BIP-0342, each TransactionHash
decreases the validation budget by 10. If this brings the budget below zero,
This needs much more justification. Why 10? It has an implied cost of 2 already, since you have to use the opcode and a selector. If it has to hash a lot, hasn't someone already paid that to make such a large transaction?
Yeah it's tricky. In practice, it has a similar amortized per-tx hash cost that sighashes have. It's hard to count those to the budget because they are amortized, it's basically hashing all the large tx fields once so that if they are repeatedly requested their hash can be used.
After the amortized hash cost, it's just a finite series of ~32-byte chunks with maximally 64 in/out which in total can have 8 fields that are each ~32 bytes. This is ~16,384 bytes max.
Then, another consideration is that it would be nice and reasonable if TXHASH+CSFS would not have a higher cost than what naturally would be placed in the witness, the 64-byte signature.
I see it like this: we have a 64-byte budget to divide over TXHASH+CSFS as I think it's reasonable that this combination doesn't cost more than 28% more than a CHECKSIG (which is 50).
So maybe it's right that TXHASH can actually cost more, something like 25 if CSFS would be priced at 35 or 40.
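The figures in this comment can be checked with a few lines of arithmetic. The constant names below are made up for illustration; the numbers themselves are the ones quoted above.

```python
# Worst-case data hashed after the amortized per-tx hashes, per the comment:
MAX_SELECTED = 64      # "maximally 64 in/out"
FIELDS_PER_ITEM = 8    # "8 fields"
CHUNK_BYTES = 32       # "each ~32 bytes"
max_hashed_bytes = MAX_SELECTED * FIELDS_PER_ITEM * CHUNK_BYTES

# Budget comparison against a plain CHECKSIG:
CHECKSIG_COST = 50     # validation budget consumed by one CHECKSIG
COMBINED_BUDGET = 64   # proposed ceiling for TXHASH + CSFS together
overhead = COMBINED_BUDGET / CHECKSIG_COST - 1


def combined_cost(txhash_cost: int, csfs_cost: int) -> int:
    """Total budget consumed by one TXHASH followed by one CSFS."""
    return txhash_cost + csfs_cost


# The two candidate splits mentioned above.
split_35 = combined_cost(25, 35)
split_40 = combined_cost(25, 40)
```

Note that a 25/35 split stays under the 64 ceiling, while a 25/40 split lands one unit above it.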
This is not a signature budget, it's a hashing budget. Perhaps we should make this a first-class citizen then?
See https://rusty.ozlabs.org/2023/12/22/script-limits-opcat.html#my-proposal-a-dynamic-limit-for-hashing
# Detailed Specification

A reference implementation in Rust is provided attached as part of this BIP
OK, I would really appreciate a table of all the bits and exactly what and how they encode them. It's particularly nasty because some values are little-endian 32 bit encoded, not CScriptNum encoded, and others are varint encoded?
But it's nice to be explicit in each case, for people like me who are not deep in the weeds of bitcoin's onchain representation, since it helps when considering how to use this alongside things like OP_CAT and extended arithmetic opcodes.
Ok, I agree. I think I tried to encode values the way they are consistently encoded in other contexts like sighashes and p2p. But I will go over them and list them in the BIP as well. It's true that I didn't consider the interactions between regular LE encoding and CScriptNum encoding which is what will be used when math is done in Script for things like values.
* The element on the stack is at least 32 bytes long, fail otherwise.
* The first 32 bytes are interpreted as the TxHash and the remaining suffix bytes specify the TxFieldSelector.
* If the TxFieldSelector is invalid, fail.
* The actual TxHash of the transaction at the current input index, calculated
Should maybe specify that the element is not popped off the stack, or is that implicit?
Hmm, it might be worth mentioning yeah, but I thought it was implicit as the other opcode explicitly mentions that it takes the items from the stack. It's kinda characteristic of a -VERIFY opcode to not touch the stack.
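The rules quoted at the top of this thread can be sketched as below. This is a rough illustration, not the reference implementation: `calculate_tx_hash` is a caller-supplied stand-in for the BIP's CalculateTxHash, assumed to raise `ValueError` on an invalid TxFieldSelector, and `fake_calculate_tx_hash` is a toy used only to exercise the flow. The element is read and never popped, matching the point about -VERIFY opcodes.

```python
import hashlib
from typing import Callable

TXHASH_LEN = 32


def split_element(element: bytes) -> tuple[bytes, bytes]:
    """Split the top stack element into (TxHash, TxFieldSelector)."""
    if len(element) < TXHASH_LEN:
        raise ValueError("stack element shorter than 32 bytes")
    return element[:TXHASH_LEN], element[TXHASH_LEN:]


def check_tx_hash_verify(
    element: bytes, calculate_tx_hash: Callable[[bytes], bytes]
) -> bool:
    """Compare the committed hash with the hash computed for the selector."""
    expected, selector = split_element(element)
    # A ValueError from calculate_tx_hash (invalid selector) means failure.
    return calculate_tx_hash(selector) == expected


def fake_calculate_tx_hash(selector: bytes) -> bytes:
    """Toy stand-in: hashes a fixed placeholder plus the selector bytes."""
    return hashlib.sha256(b"toy-tx" + selector).digest()
```

An empty suffix is allowed by the split itself, since the empty TxFieldSelector is one of the special cases discussed elsewhere in this review.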
bip-txhash.md
Outdated
* 4. `TXFS_INOUT_NUMBER | TXFS_INOUT_SELECTION_ALL`
* the `0x00` byte: it is set equal to `TXFS_SPECIAL_ALL`, which means "ALL" and is primarily
useful to emulate `SIGHASH_ALL` when `OP_TXHASH` is used in combination
with `OP_CHECKSIGFROMSTACK`.<br>Special case `TXFS_SPECIAL_TEMPLATE` is 4
This should be TXFS_SPECIAL_ALL? Maybe same as Rusty is saying.
Fixed I think.
summary, followed by a reference implementation of the CalculateTxHash function.

* There are two special cases for the TxFieldSelector:
* the empty value, zero bytes long: it is set equal to `TXFS_SPECIAL_TEMPLATE`,
"what" is set equal to TXFS_SPECIAL_TEMPLATE? Maybe define what the bytes of the field selector mean before the special cases.
I think I improved this section. What I mean is "The input txfield selector is set from empty to this one, so whatever that one means".
* 3. `TXFS_INOUT_NUMBER | TXFS_INOUT_SELECTION_ALL`
* 4. `TXFS_INOUT_NUMBER | TXFS_INOUT_SELECTION_ALL`

* The first byte of the TxFieldSelector has its 8 bits assigned as follows, from lowest to highest:
Found this section very hard to follow. Would it be an idea to more gently introduce an example field selector to show what it looks like (bit representation)?
Hmm, it might be the case. In the latest version I added some example bit selectors after the written explanation. Can you see if they make sense to you?
bip-txhash.md
Outdated
* all in/outputs
* the current input index
* the leading in/outputs up to 8192
* up to 64 individually selected in/outputs
Does this allow everlasting covenants (coins locked forever in a fixed set of addresses)? It is not clear from the text of BIP if this is possible and an intended use-case. IIUC, it is impossible because of chicken-and-egg problem: output script has to include a hash of itself to make an everlasting covenant which is impossible. But it is better to clarify this in BIP text explicitly, including a formal proof of why an everlasting covenant is impossible or (if it is possible) discuss use-cases and consequences.
Yeah, this is not possible AFAIK. But I wouldn't be comfortable making that claim, if you see what people can do. Especially if we get OP_CAT and OP_TWEAKADD.
Huh? Isn't this trivial, just by requiring prevout sPK/value = output sPK/value? Something like PUSH3[0x00 0x18 0x40] TXHASH PUSH3[0x00 0x60 0x40] TXHASH EQUAL (don't commit the field selector, grab the sPK/value, for just this input/output, check the txhashes are equal)? You mightn't be able to do anything very interesting though.
Assigned BIP 346.
Instead, add a notice about malleability.
There seem to be a lot of unresolved comments in this PR, and the document appears to still be missing the Backwards Compatibility section. Please resolve existing review comments and let us know when this is ready for an editor review.
BIP: tbd
Layer: Consensus (soft fork)
Title: OP_TXHASH and OP_CHECKTXHASHVERIFY
Author: Steven Roose <steven@roose.io>
Comments-URI: https://github.com/bitcoin/bips/wiki/Comments:BIP-tbd
Status: Draft
Type: Standards Track
Created: 2023-09-03
License: BSD-3-Clause
Please incorporate the assigned number, add the README table entry, and add the Post-History header to link to the mailing list discussion or other fora where this proposal was discussed.
* 8: values (`TXFS_OUTPUTS_VALUES`)

* We define as follows:
* `TXFS_ALL = TXFS_VERSION | TXFS_LOCKTIME | TXFS_CURRENT_INPUT_IDX | TXFS_CURRENT_INPUT_CONTROL_BLOCK | TXFS_CURRENT_INPUT_LAST_CODESEPARATOR_POS | TXFS_INPUTS | TXFS_OUTPUTS | TXFS_CONTROL`
I think this is a typo?
* `TXFS_ALL = TXFS_VERSION | TXFS_LOCKTIME | TXFS_CURRENT_INPUT_IDX | TXFS_CURRENT_INPUT_CONTROL_BLOCK | TXFS_CURRENT_INPUT_LAST_CODESEPARATOR_POS | TXFS_INPUTS | TXFS_OUTPUTS | TXFS_CONTROL`
* `TXFS_ALL = TXFS_VERSION | TXFS_LOCKTIME | TXFS_CURRENT_INPUT_IDX | TXFS_CURRENT_INPUT_CONTROL_BLOCK | TXFS_CURRENT_INPUT_LAST_CODESEPARATOR_POS | TXFS_INPUTS_ALL | TXFS_OUTPUTS_ALL | TXFS_CONTROL`
Did a first pass. Big Concept ACK from me. I have some notes about doc organization and plan to do another pass really drilling into the details soon.
The TxFieldSelector has the following semantics. We will give a brief conceptual
summary, followed by a reference implementation of the CalculateTxHash function.

* There are two special cases for the TxFieldSelector:
Can I suggest that we move the section on the special cases to be after the fundamental atomic selector flags? It's much easier to read this document by building up from the building blocks rather than starting with the template and having to squint to find where all the sub-properties are defined later.
I think the discussion for the optimized case follows very naturally once we understand the components here.
v outputs
<-> <---------> inputs
1 1 1 1 1 1 1 1
| | | | | | | ^ prevouts
| | | | | | ^ sequences
| | | | | ^ scriptSigs
| | | | ^ prevout scriptPubkeys
| | | ^ prevout values
| | ^ taproot annexes
| ^ scriptPubkeys
^ values
The "v" threw me off so I'm offering a suggestion that may help readability.
v outputs
<-> <---------> inputs
1 1 1 1 1 1 1 1
| | | | | | | ^ prevouts
| | | | | | ^ sequences
| | | | | ^ scriptSigs
| | | | ^ prevout scriptPubkeys
| | | ^ prevout values
| | ^ taproot annexes
| ^ scriptPubkeys
^ values
<-> outputs
| | <---------> inputs
1 1 1 1 1 1 1 1
| | | | | | | ^ prevouts
| | | | | | ^ sequences
| | | | | ^ scriptSigs
| | | | ^ prevout scriptPubkeys
| | | ^ prevout values
| | ^ taproot annexes
| ^ scriptPubkeys
^ values
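As a cross-check on the diagram, the bit-to-field mapping it depicts can be written as a small decoder. This sketch assumes the rightmost column of the diagram is the least significant bit, which matches the "from lowest to highest" wording in the draft but is not stated in the diagram itself. The names are the diagram's labels, not constants from the BIP.

```python
# Labels taken from the diagram, ordered from bit 0 (assumed rightmost) to bit 7.
INOUT_FIELD_BITS = [
    "prevouts",               # inputs
    "sequences",              # inputs
    "scriptSigs",             # inputs
    "prevout scriptPubkeys",  # inputs
    "prevout values",         # inputs
    "taproot annexes",        # inputs
    "scriptPubkeys",          # outputs
    "values",                 # outputs
]


def decode_inout_fields(byte: int) -> list[str]:
    """List the fields selected by one in/out field byte."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("expected a single byte")
    return [name for bit, name in enumerate(INOUT_FIELD_BITS) if byte & (1 << bit)]
```

For example, `decode_inout_fields(0b11000000)` selects only the two output fields, and `0b00111111` selects the six input fields.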
0 1 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 1
| | | | <--------------> second idx: 3
| | | | <--------------> first idx: 1
| | | | <-----> selection count: 0b10 == 2
| | | ^ index size 0: single byte per index
| | ^ absolute index
| ^ individual mode
^ don't commit the number of in/outputs
0 1 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 1
| | | | <--------------> second idx: 3
| | | | <--------------> first idx: 1
| | | | <-----> selection count: 0b10 == 2
| | | ^ index size 0: single byte per index
| | ^ absolute index
| ^ individual mode
^ don't commit the number of in/outputs
0 1 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 1
| | | | <-------------> second idx: 3
| | | | <-------------> first idx: 1
| | | | <-----> selection count: 0b10 == 2
| | | ^ index size 0: single byte per index
| | ^ absolute index
| ^ individual mode
^ don't commit the number of in/outputs
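The three-byte example in this diagram can be decoded mechanically. The sketch below handles only the case shown (individual mode, absolute indices, one byte per index) and assumes the selection count occupies the low four bits of the first byte, which is how the example reads but is not spelled out here. The function name and the returned keys are hypothetical.

```python
def decode_individual_selection(data: bytes) -> dict:
    """Decode the in/out selector bytes for the case shown in the diagram."""
    if not data:
        raise ValueError("empty selector")
    head = data[0]
    commit_number = bool(head & 0x80)     # highest bit: commit number of in/outputs
    individual_mode = bool(head & 0x40)
    relative = bool(head & 0x20)          # 0 means absolute indices
    two_byte_indices = bool(head & 0x10)  # 0 means a single byte per index
    count = head & 0x0F                   # assumed width of the selection count
    if not individual_mode or relative or two_byte_indices:
        raise NotImplementedError("only the diagram's case is sketched here")
    indices = list(data[1:])
    if len(indices) != count:
        raise ValueError("index bytes do not match the selection count")
    return {"commit_number": commit_number, "count": count, "indices": indices}


example = bytes([0b01000010, 0b00000001, 0b00000011])
decoded = decode_individual_selection(example)
```

Run on the diagram's bytes, this yields a count of 2 with indices 1 and 3, and the number of in/outputs is not committed.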
Hey @stevenroose, this pull request has had unaddressed review for over six months. Are you still working on this? If not, is there someone else that wants to pick this project up?
Semantic changes
I thought it might be valuable to keep track of actual semantic changes being made since the initial out-of-draft version.
Implementations
Add OP_TXHASH and OP_CHECKTXHASHVERIFY opcodes bitcoin#29050
txhash: Implement TxHashCache for OP_TXHASH and OP_CHECKTXHASHVERIFY rust-bitcoin/rust-bitcoin#2275