This repository has been archived by the owner on May 3, 2024. It is now read-only.

In-circuit anchor transaction checks #67

Closed
Brechtpd opened this issue Feb 23, 2023 · 27 comments · Fixed by #105

Comments

@Brechtpd

Brechtpd commented Feb 23, 2023

With the latest protocol changes, the anchor transaction checks have been moved from the smart contract to inside the circuit. So we now have to extend the circuits to be able to check what we previously did on-chain: https://github.com/taikoxyz/taiko-mono/blob/69ede688d9c51e8265b4724c32566992b290b223/packages/protocol/contracts/L1/libs/LibProving.sol#L325.

  1. The best approach is probably to do this in the public input circuit and check directly against the expected values there for the 1st transaction. Alternatively, it may also make sense to do this in the tx circuit.
  2. Another way is to follow the smart contract approach more closely, where some data is checked in the MPT and transaction tree, but that seems more complicated.

So unless there's a problem with the 1st method and the 2nd method somehow turns out to be easier, I would just stick with approach 1.

The data that needs to be checked, besides the standard transaction data:

  • The signature data is the expected signature. This requires implementing this code using halo2wrong: https://github.com/taikoxyz/taiko-mono/blob/69ede688d9c51e8265b4724c32566992b290b223/packages/protocol/contracts/libs/LibAnchorSignature.sol. The best place to check this would be the tx circuit because it already integrates with halo2wrong. If this turns out not to be easy, we can still generate the signature on L1 and put it in the circuit.
  • The L1 block height matches the value in the public input
  • The L1 block hash matches the value in the public input
  • The L1 signalRoot is the storage root of the L1 signal smart contract account, which needs to be checked using the MPT circuit. The witness data for this Merkle proof is L1 data, not L2 data like all other MPT checks!
  • The L2 signalRoot is the storage root of the L2 signal smart contract account, which needs to be checked using the MPT circuit. The witness data for this Merkle proof is L2 data like normal.

Because the Anchor transaction always has the same byte length, it should be easy to just paste and verify the necessary data in front of the main tx list data.

There are 2 additional MPT checks that need to be inserted in the MPT circuit, and witness data has to be supplied to be able to do so. In one case it's L1 witness data, so this may need some more work to get this data from somewhere (having the node pass it in directly seems to make sense, because an L2 node already needs to communicate with an L1 node, so the data should be more directly available there).

Because the anchor function can fail when the prover sets some incorrect values (like the basefee), we need to enforce that the tx is successful.

@Brechtpd Brechtpd converted this from a draft issue Feb 23, 2023
@Brechtpd Brechtpd moved this from 📝 Todo to 🏗 In progress in Taiko Project Board Mar 16, 2023
@johntaiko

johntaiko commented Apr 24, 2023

The Anchor Checks:

What do we need to check?

  • The first transaction from the witness MUST be the anchor transaction
  • The anchor transaction MUST be a new transaction type (like other transactions, but without a signature, so it needs a from field)
  • TxTable needs check:
    • TxFieldTag::Nonce == fixed-cell 0
    • TxFieldTag::Gas == fixed-cell 150000
    • TxFieldTag::GasPrice == fixed-cell 0
    • CallerAddress == fixed-cell GOLDEN_TOUCH (maybe we don't need the golden-touch address?)
    • CalleeAddress == fixed-cell LAYER2_CONTRACT_ADDRESS
    • IsCreate == fixed-cell 0
    • Value == fixed-cell 0
    • CallDataLength == fixed-cell fixed anchor call length
    • CallDataGasCost based on CallData
    • TxSignHash based on CallData (the transaction hash without signature)
    • CallData:
      • method_signature == fixed-cell 0x3d384a4b, the first 4 bytes of sha3(anchor(bytes32,bytes32,uint64,uint64))
      • bytes32 l1Hash == from public-input
      • bytes32 l1SignalRoot == from public-input
      • uint64 l1Height == from public-input
      • uint64 parentGasUsed == from public-input
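
The list above can be sketched outside the circuit as a plain field-by-field comparison. This is only an illustration, not the circuit code: the `Tx` and `PublicInput` classes, the function names and the two placeholder addresses are hypothetical, and the call data layout assumes standard ABI encoding (a 4-byte selector followed by four 32-byte words). The call data gas cost uses the EIP-2028 pricing of 4 gas per zero byte and 16 gas per non-zero byte.

```python
from dataclasses import dataclass

# Values quoted in the list above.
ANCHOR_GAS_LIMIT = 150_000
ANCHOR_SELECTOR = bytes.fromhex("3d384a4b")

# Hypothetical placeholders: the real addresses are not given in this thread.
GOLDEN_TOUCH = bytes(19) + b"\x01"
LAYER2_CONTRACT_ADDRESS = bytes(19) + b"\x02"

# Assumption: standard ABI encoding, 4-byte selector + four 32-byte words.
ANCHOR_CALL_DATA_LENGTH = 4 + 4 * 32


@dataclass
class PublicInput:
    l1_hash: bytes          # 32 bytes
    l1_signal_root: bytes   # 32 bytes
    l1_height: int
    parent_gas_used: int


@dataclass
class Tx:
    nonce: int
    gas: int
    gas_price: int
    caller: bytes
    callee: bytes
    is_create: bool
    value: int
    call_data: bytes


def expected_call_data(pi: PublicInput) -> bytes:
    """Build the anchor call data the public input implies."""
    return (
        ANCHOR_SELECTOR
        + pi.l1_hash
        + pi.l1_signal_root
        + pi.l1_height.to_bytes(32, "big")
        + pi.parent_gas_used.to_bytes(32, "big")
    )


def call_data_gas_cost(data: bytes) -> int:
    """4 gas per zero byte, 16 gas per non-zero byte (EIP-2028)."""
    return sum(4 if byte == 0 else 16 for byte in data)


def failed_anchor_checks(tx: Tx, pi: PublicInput) -> list:
    """Return the names of the checks that fail (empty list means all pass)."""
    checks = {
        "nonce": tx.nonce == 0,
        "gas": tx.gas == ANCHOR_GAS_LIMIT,
        "gas_price": tx.gas_price == 0,
        "caller": tx.caller == GOLDEN_TOUCH,
        "callee": tx.callee == LAYER2_CONTRACT_ADDRESS,
        "is_create": not tx.is_create,
        "value": tx.value == 0,
        "call_data_length": len(tx.call_data) == ANCHOR_CALL_DATA_LENGTH,
        "call_data": tx.call_data == expected_call_data(pi),
    }
    return [name for name, ok in checks.items() if not ok]


pi = PublicInput(b"\x11" * 32, b"\x22" * 32, l1_height=1234, parent_gas_used=21_000)
anchor = Tx(0, ANCHOR_GAS_LIMIT, 0, GOLDEN_TOUCH, LAYER2_CONTRACT_ADDRESS,
            False, 0, expected_call_data(pi))
print(failed_anchor_checks(anchor, pi))
```

In a circuit, the entries compared against constants would correspond to fixed cells and the call data words to cells copied from the public input.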

Annotations

  • fixed-cell means constant value in circuit
  • public-input means a public input that is verified by the L1 contract
  • based on means its value is calculated by other parts of the transaction

Circuit Layout

We don't need an additional circuit to prove the anchor transaction; we can use pi_circuit2 with an extra fixed column:

| pi_circuit2_layout | anchor (fixed) |
|---|---|
| ...... | Nonce 0 |
| ...... | GasLimit 150000 |
| ...... | GasPrice 0 |
| ...... | CallerAddress GOLDEN_TOUCH |
| ...... | ...... |

How to constrain

Use the constrain_equal method to constrain every field one by one.

New Public Input

The old public input format: https://www.notion.so/taikoxyz/New-public-input-circuit-3c556ac6452b47e486f2315c0dbe1fc0?pvs=4

The new one

Because we need to check the anchor transaction's parameters, we need extra inputs from the L1 contract:

taikoxyz/taiko-mono#13640

  • l1SignalRoot: the L1 signal root
  • parentGasUsed: the gas used by the L2 parent block (?) @davidtaikocha

Conclusion

The easiest way to check the anchor transaction is to constrain all the fields against fixed cells.
If that is not possible, we try to constrain them against the public input in the assignment stage.

TBD

  • How to deal with eip-1559?

@Brechtpd
Author

The first transaction from witness MUST be anchor transaction

What do you mean with "from witness" here?

Anchor transaction MUST be a new transaction type

So we are going with a new transaction type? I heard from David that that may not be the plan anymore.

We don't need an additional circuit to prove the anchor transaction; we can use pi_circuit2 with an extra fixed column

Can you explain a bit more how you will handle the anchor transaction and all other transactions, ensuring the anchor transaction has the expected data while still checking that all other transaction data matches the data passed into the smart contract?

How to deal with eip-1559?

What is still an open question about this?

@johntaiko

What do you mean with "from witness" here?

I mean the txlist from the L2 node.

So we are going with a new transaction type? I heard from David that that may not be the plan anymore.

Because we don't need to sign the anchor transaction anymore, it would not be easy to handle in the L2 node if we reused the existing tx types. Let me ask David to double check.

What is still an open question about this?

I think the main problem is that if we support both legacy and 1559, the rlp_decode and tx_table (and maybe other parts) need to be refactored, which moves us away from PSE 🤣

@Brechtpd
Author

I think the main problem is that if we support both legacy and 1559, the rlp_decode and tx_table (and maybe other parts) need to be refactored, which moves us away from PSE 🤣

Right, hopefully the rlp circuit @smtmfft is working on will make this easy!

@johntaiko

johntaiko commented Apr 24, 2023

Can you explain a bit more how you will handle the anchor transaction and all other transactions, ensuring the anchor transaction has the expected data while still checking that all other transaction data matches the data passed into the smart contract?

Maybe an extra selector is OK, but I haven't verified that. If it is too complex to check in pi_circuit2, we need an extra sub-circuit 😢
TBH, the second option would be better for me.

@johntaiko

Another important thing is that the execution of the anchor transaction MUST succeed (because our zkEVM allows invalid txs).

@Brechtpd
Author

Brechtpd commented Apr 24, 2023

I think just an extra fixed selector to handle it should work, it's basically just merging in the anchor transaction data into the txlist data right? Not sure why this would be too complex? Something like this:

| is_anchor | tx_list_bytes | anchor_tx_bytes | new_tx_list |
|---|---|---|---|
| 1 | 0 | anchor_tx_byte_0 | anchor_tx_byte_0 |
| 1 | 0 | anchor_tx_byte_1 | anchor_tx_byte_1 |
| 1 | ... | ... | ... |
| 0 | tx_list_byte_0 | 0 | tx_list_byte_0 |
| 0 | tx_list_byte_1 | 0 | tx_list_byte_1 |
| 0 | ... | ... | ... |

We rlc all tx_list_bytes and hash it so that we can check that it equals the value in the public input
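
A minimal sketch of the merge and the RLC described here, in plain Python rather than circuit code. The function names are hypothetical, and the modulus assumes the BN254 scalar field; the keccak hash of the tx list bytes is left out because the Python standard library does not provide it.

```python
# Assumption: the circuit works over the BN254 scalar field.
FIELD_MODULUS = 21888242871839275222246405745257275088548364400416034343698204186575808495617


def merge_rows(anchor_tx_bytes: bytes, tx_list_bytes: bytes):
    """Lay out the columns as in the table above and derive new_tx_list."""
    # Each row is (is_anchor, tx_list_byte, anchor_tx_byte).
    rows = [(1, 0, b) for b in anchor_tx_bytes] + [(0, b, 0) for b in tx_list_bytes]

    new_tx_list = []
    for is_anchor, tx_byte, anchor_byte in rows:
        # The selector must be boolean, like a fixed selector column.
        assert is_anchor in (0, 1)
        # The selector picks which source column feeds the combined column.
        new_tx_list.append(is_anchor * anchor_byte + (1 - is_anchor) * tx_byte)
    return rows, new_tx_list


def rlc(values, challenge: int) -> int:
    """Random linear combination: fold the values with powers of the challenge."""
    acc = 0
    for value in values:
        acc = (acc * challenge + value) % FIELD_MODULUS
    return acc


rows, new_tx_list = merge_rows(b"\xaa\xbb", b"\x01\x02\x03")
# Only the rows with is_anchor == 0 belong to the proposed tx list.
tx_list_only = [tx_byte for is_anchor, tx_byte, _ in rows if is_anchor == 0]
tx_list_rlc = rlc(tx_list_only, challenge=7)
```

The RLC over the `is_anchor == 0` rows stands in for the value that is tied to the tx list hash in the public input, while `new_tx_list` is the column the rest of the circuits would decode.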

@Brechtpd
Author

Another important thing is that the execution of the anchor transaction MUST succeed (because our zkEVM allows invalid txs).

No need to worry about that, with the fixed amount of gas supplied it cannot fail.

@johntaiko

it's basically just merging in the anchor transaction data into the txlist data right?

Yeah, I think so

Not sure why this would be too complex?

Oh, I mean the pi_circuit2 is complex enough 🤣

@Brechtpd
Author

Right :) Please think about how it could be simplified if you think it's a problem.

@johntaiko

johntaiko commented Apr 24, 2023

Right :) Please think about how it could be simplified if you think it's a problem.

Right now, the pi_circuit also checks the tx_table's constraints, which doesn't make sense. I mean, could we move these constraints into the tx_circuit?
The logic: https://github.com/privacy-scaling-explorations/zkevm-circuits/blob/main/zkevm-circuits/src/pi_circuit.rs#L312-L543

@smtmfft
Collaborator

smtmfft commented Apr 24, 2023

I think the main problem is that if we support both legacy and 1559, the rlp_decode and tx_table (and maybe other parts) need to be refactored, which moves us away from PSE 🤣

Right, hopefully the rlp circuit @smtmfft is working on will make this easy!

Not sure when PSE will support 1559, but for us, supporting 1559 seems inevitable now. I think we can enable it in the outer gadgets such as the RLP or TX circuit. However, since the protocol/node keeps moving ahead, it may take a long time before we can enable it in the evm/state circuits (the core circuits), which do not support 1559 txs.

@johntaiko

johntaiko commented May 22, 2023

The architecture

The new architecture is based on the master branch (the taiko-pi-test branch is too old, and the TxTable layouts differ between the two branches).

Data Flow

[Image: anchor-circuit]

How does it work?

So, we need three TxTables for the different txlist stages:

  • TxTable1 holds the txlist from the L1 contract, which comes from the propose stage
    • This txlist hash will be checked in the MetahashCircuit
  • TxTable2 holds the txlist from the L2 block, which has the Anchor transaction added
    • This txlist hash will be checked in the BlockCircuit
  • TxTable3 holds the txlist after it has been checked for validity (previously this was done in the L1 contract, @xiaodino)
    • It's useful for other circuits, e.g. the EvmCircuit

Finally, all these tables will be associated with the PiCircuit.

The AnchorCircuit's goal

In the AnchorCircuit we only care about TxTable2, so the circuit's input, output and constants are clear:

Input

We use the TxTable1 that comes from the TxCircuit as our AnchorCircuit input.

Output

TxTable2, which contains the Anchor transaction, is our AnchorCircuit's output.

Constants

We need some fixed fields for storing the Anchor's constants:

  • Gas: from witness
  • GasPrice: from witness
  • Caller: from witness
  • Callee: from witness
  • Value: constant
  • Data's method signature: constant

We can fill the fixed column with witness data for convenience; verification will fail if different constants are used.

TxTable2

The TxTable2 will be used in BlockCircuit for calculating the block hash

Q&A

Q: Why do we need to check both the L1 txlist and the L2 txlist?
A: Because the Anchor transaction is added on L2, and L1 doesn't need to know about it. The block hash in the public input comes from the taiko-client in the verify stage, so we can't trust its correctness. But the L1 txlist comes from the propose stage, and we can trust its correctness. So we need to know the transformation from the L1 txlist to the L2 txlist and calculate the hash in our circuit for verification.

Q: Why do we need three tables, TxTable1, TxTable2 and TxTable3?
A: Because different stages have different TxTable layouts:

TxCircuit

| tx_id | tag | index | value |
|---|---|---|---|
| 0 | Nonce | 0 | 1 |
| 0 | Gas | 0 | 100000 |
| 0 | GasPrice | 0 | 100000 |
| ... | | | |
| 1 | Nonce | 0 | 222 |
| ... | | | |

AnchorCircuit

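| tx_id | tag | index | value |
|---|---|---|---|
| 0 | Nonce | 0 | 2667 |
| 0 | Gas | 0 | 150000 |
| 0 | GasPrice | 0 | 0 |
| ... | | | |
| 1 | Nonce | 0 | 1 |
| 1 | Gas | 0 | 100000 |
| 1 | GasPrice | 0 | 100000 |
| ... | | | |
| 2 | Nonce | 0 | 222 |
| ... | | | |

The tx_id of every original transaction is increased by 1 in the AnchorCircuit.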
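
The transformation from the TxCircuit layout to the AnchorCircuit layout can be sketched as follows. This is an illustration in plain Python with a hypothetical helper name, using only the example rows from the two tables above; the elided rows are omitted.

```python
def prepend_anchor(tx_table1, anchor_rows):
    """Build TxTable2: anchor rows take tx_id 0, every other tx_id shifts by 1.

    tx_table1   -- rows of (tx_id, tag, index, value)
    anchor_rows -- rows of (tag, index, value) for the anchor transaction
    """
    table2 = [(0, tag, index, value) for tag, index, value in anchor_rows]
    table2 += [(tx_id + 1, tag, index, value)
               for tx_id, tag, index, value in tx_table1]
    return table2


# Example rows taken from the TxCircuit table above.
tx_table1 = [
    (0, "Nonce", 0, 1),
    (0, "Gas", 0, 100000),
    (0, "GasPrice", 0, 100000),
    (1, "Nonce", 0, 222),
]

# Anchor rows taken from the AnchorCircuit table above.
anchor_rows = [
    ("Nonce", 0, 2667),
    ("Gas", 0, 150000),
    ("GasPrice", 0, 0),
]

tx_table2 = prepend_anchor(tx_table1, anchor_rows)
for row in tx_table2:
    print(row)
```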

InvalidCircuit

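| tx_id | tag | index | value | invalid |
|---|---|---|---|---|
| 0 | Nonce | 0 | 2667 | 0 |
| 0 | Gas | 0 | 150000 | 0 |
| 0 | GasPrice | 0 | 0 | 0 |
| ... | | | | |
| 1 | Nonce | 0 | 1 | 1 |
| 1 | Gas | 0 | 100000 | 1 |
| 1 | GasPrice | 0 | 100000 | 1 |
| ... | | | | |
| 2 | Nonce | 0 | 222 | 0 |
| ... | | | | |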

An invalid flag is added to the TxTable.
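
As a rough sketch of this last layout, the invalid column can be derived from a set of invalid transaction ids. The helper name is hypothetical and how invalid transactions are detected is out of scope here; the only rule encoded is the one stated earlier in this thread, that the anchor transaction (tx_id 0) must always be valid.

```python
def add_invalid_flag(tx_table2, invalid_tx_ids):
    """Append an invalid column (0 or 1) to rows of (tx_id, tag, index, value)."""
    if 0 in invalid_tx_ids:
        # The anchor transaction sits at tx_id 0 and is forced to be valid.
        raise ValueError("the anchor transaction must be valid")
    return [(tx_id, tag, index, value, int(tx_id in invalid_tx_ids))
            for tx_id, tag, index, value in tx_table2]


# Example rows taken from the AnchorCircuit table above.
tx_table2 = [
    (0, "Nonce", 0, 2667),
    (0, "Gas", 0, 150000),
    (0, "GasPrice", 0, 0),
    (1, "Nonce", 0, 1),
    (1, "Gas", 0, 100000),
    (1, "GasPrice", 0, 100000),
    (2, "Nonce", 0, 222),
]

# Transaction 1 is marked invalid, as in the InvalidCircuit table above.
tx_table3 = add_invalid_flag(tx_table2, invalid_tx_ids={1})
```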

please review and comment

cc @smtmfft

@johntaiko

In addition, we don't need the parentHash and gasUsed in our public input, because these fields are contained in the blockHash.

@johntaiko

The Anchor's Nonce will be checked in the invalid circuit, because both the nonce and block hash are given by taiko-client. I can't check its correctness.

@Brechtpd
Author

In addition, we don't need the parentHash and gasUsed in our public input, because these fields are contained in the blockHash.

I think we need them (see the comment in the protocol PR).

The Anchor's Nonce will be checked in the invalid circuit, because both the nonce and block hash are given by taiko-client. I can't check its correctness.

I guess it depends on what is easiest, but I don't think it really needs to go through the invalid circuit checks. We just force the anchor transaction to always be valid, that means the prover has to fill in a valid nonce value.

@Brechtpd
Author

So, we need three TxTables for the different txlist stages:

  • TxTable1 holds the txlist from the L1 contract, which comes from the propose stage
    • This txlist hash will be checked in the MetahashCircuit
  • TxTable2 holds the txlist from the L2 block, which has the Anchor transaction added
    • This txlist hash will be checked in the BlockCircuit
  • TxTable3 holds the txlist after it has been checked for validity (previously this was done in the L1 contract, @xiaodino)
    • It's useful for other circuits, e.g. the EvmCircuit

Finally, all these tables will be associated with the PiCircuit.

So these will all have their own separate columns, as in actually different tables? If so, why not just insert the anchor tx bytes before the tx list bytes, check if the data is valid, and then decode the data in a single tx table?

@johntaiko

Yeah, this is the second way we discussed.

@johntaiko

johntaiko commented May 22, 2023

But we already have the invalid TxTable, and we would only add one extra table for the anchor transaction. If we instead insert the RLP bytes at the head of the txlist, we need to copy the input bytes and add more logic to our RlpCircuit, so the RlpCircuit may get more complicated.

@johntaiko

johntaiko commented May 22, 2023

maybe the RlpCircuit will get more complicated.

It needs to know the details of our Anchor transaction.

@Brechtpd
Author

But we already have the invalid TxTable, and we would only add one extra table for the anchor transaction. If we instead insert the RLP bytes at the head of the txlist, we need to copy the input bytes and add more logic to our RlpCircuit, so the RlpCircuit may get more complicated.

Just a copy from 2 columns to a new combined column like described here is pretty simple I think: #67 (comment). Compared to having to work with multiple tx tables with different data so you also kind of have to copy data from one to the other? Could be that I'm missing something!

It needs to know the detail of our Anchor Transaction

What does the RLP circuit need to know about the anchor transaction?

@johntaiko

Makes sense. We need to increase the RLP list length by one when we insert the anchor transaction's RLP into it, and check the list length according to the rules:

For lists: if the concatenated serialization of the elements (the payload) is less than 56 bytes long, the output is a single prefix byte of 0xc0 plus the payload length, followed by the payload. If it is longer, the prefix byte is 0xf7 plus the length in bytes of the payload length (in binary form), followed by the payload length, followed by the payload.
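
The list rule quoted above can be sketched like this. The helper names are hypothetical and this is plain Python, not RlpCircuit logic. Note that the prefix encodes the payload length in bytes rather than the number of items, so prepending the anchor transaction changes the header by the anchor's encoded byte length and can also change the size of the prefix itself.

```python
def rlp_list_prefix(payload_len: int) -> bytes:
    """Return the RLP list header for a payload of payload_len bytes."""
    if payload_len < 0:
        raise ValueError("payload length must be non-negative")
    if payload_len <= 55:
        return bytes([0xC0 + payload_len])
    length_bytes = payload_len.to_bytes((payload_len.bit_length() + 7) // 8, "big")
    return bytes([0xF7 + len(length_bytes)]) + length_bytes


def rlp_list_payload(encoded: bytes) -> bytes:
    """Strip the list header and return the payload, checking the length."""
    first = encoded[0]
    if first < 0xC0:
        raise ValueError("not an RLP list")
    if first <= 0xF7:
        header_len, payload_len = 1, first - 0xC0
    else:
        len_of_len = first - 0xF7
        header_len = 1 + len_of_len
        payload_len = int.from_bytes(encoded[1:header_len], "big")
    if header_len + payload_len != len(encoded):
        raise ValueError("list length does not match its header")
    return encoded[header_len:]


def prepend_item(encoded_list: bytes, encoded_item: bytes) -> bytes:
    """Insert an already RLP-encoded item at the head of an RLP-encoded list."""
    payload = encoded_item + rlp_list_payload(encoded_list)
    return rlp_list_prefix(len(payload)) + payload


# A list whose payload is exactly 55 bytes uses the short form (prefix 0xf7).
short_list = rlp_list_prefix(55) + b"\x01" * 55
# Prepending one more byte pushes it into the long form (prefix 0xf8 0x38).
longer_list = prepend_item(short_list, b"\x02")
```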

@johntaiko

What does the RLP circuit need to know about the anchor transaction?

My misunderstanding.

@johntaiko

johntaiko commented May 29, 2023

Updated architecture:
[Image: taiko-zkevm]

@Brechtpd
Author

Very cool!

  • parentGasUsed is also an input in the anchor circuit, is this arrow missing?
  • What is the purpose of Invalid Circuit?
  • What does the arrow calculate block hash mean between the evm circuit and blockHash? The blockhash will just be calculated in the public circuit/BlockCircuit somewhere, the current block hash won't be available in the EVM circuit I think.

@johntaiko

  • What is the purpose of Invalid Circuit?

I mean the circuit for checking invalid transactions (ruby's job)

  • What does the arrow calculate block hash mean between the evm circuit and blockHash? The blockhash will just be calculated in the public circuit/BlockCircuit somewhere, the current block hash won't be available in the EVM circuit I think.

Some transactions may turn out to be invalid after EVM execution, because the transactions come from L1's txList.
So we need to calculate the block hash after EVM execution.

@johntaiko

  • parentGasUsed is also an input in the anchor circuit, is this arrow missing?

Updated.

@johntaiko johntaiko linked a pull request Jun 1, 2023 that will close this issue
@Brechtpd Brechtpd moved this from 🏗 In progress to 🫡 In review in Taiko Project Board Jun 5, 2023
@github-project-automation github-project-automation bot moved this from 🫡 In review to ✅ Done in Taiko Project Board Jun 22, 2023