
Handle insert checkpoint #31

Merged: 3 commits into main on Jun 29, 2022

Conversation

@KonradStaniec (Contributor) commented Jun 25, 2022

Initial implementation of processing btc checkpoint transaction.

Things done:

  • All bitcoin related stateless validation.
  • OP_RETURN data extraction
  • Interfaces to light client and checkpointing

There are still a lot of TODOs left, but I would rather do them in separate PRs as this one is already too big for my taste.

Marking @vitsalis and @gitferry as reviewers mainly to check that the interfaces in expected_keepers for their modules are as expected.

@KonradStaniec KonradStaniec marked this pull request as ready for review June 25, 2022 10:29
@gitferry (Contributor) left a comment

Nicely done. I only left some minor comments; otherwise, LGTM. I would like to see how you deal with an out-of-order sequence of checkpoints, but it seems that will be addressed in a future PR. Please ping me then. Thanks!

reader := bytes.NewReader(headerBytes)

e := header.Deserialize(reader)

Contributor

Why would you keep a line break between error and error checking? It seems unnecessary...

also, maybe it would be better to use err instead of e to make it (1) more informative and (2) consistent across the project.

Contributor Author

I have seen projects naming it either err or e; it doesn't really matter to me, so we can stick with err. As for line breaks, tbh for me they make the code a bit more readable, and Go is already a pretty verbose language, so one line less here or there does not make much of a difference.

Comment on lines 67 to 69
str := fmt.Sprintf("block timestamp of %v has a higher "+
"precision than one second", header.Timestamp)
return errors.New(str)
Contributor

Maybe explicitly define the error in errors.go?

Contributor Author

Yeah, I already have a TODO for that, although I would rather have the whole flow be a bit more complete before deciding which layer propagates which error, and which errors ultimately need to be logged and propagated to Tendermint.
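
For reference, the kind of explicit definition being suggested could look roughly like this (a sketch only; the module name, error codes, and messages are assumptions, not what eventually landed in the repo):

package types

import sdkerrors "github.com/cosmos/cosmos-sdk/types/errors"

// Hypothetical sentinel errors for the btccheckpoint module, registered once in errors.go
// so handlers can return them instead of building ad-hoc error strings.
// ModuleName is assumed to be defined elsewhere in this package.
var (
	ErrHeaderTimestampPrecision = sdkerrors.Register(ModuleName, 2, "block timestamp has a higher precision than one second")
	ErrInvalidHeaderPoW         = sdkerrors.Register(ModuleName, 3, "block header does not satisfy the proof of work requirement")
)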

}

func (k Keeper) GetCheckpointEpoch(c []byte) (uint64, error) {
return k.checkpointingKeeper.CheckpointValid(c)
Contributor

Maybe it would be less confusing to name the interface CheckpointEpoch, because essentially you want an epoch number. You would not just get 0 if c is not valid; detailed info will be wrapped in the error. If you used CheckpointValid, it would be expected to return a boolean.

Contributor Author

sure I can go with that naming 👍
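
To make the agreement concrete, the renamed expected-keeper interface could look roughly like this (a sketch under the assumption that the signature otherwise stays the same):

// CheckpointingKeeper is the interface the btccheckpoint module expects from the
// checkpointing module (expected_keepers.go).
type CheckpointingKeeper interface {
	// CheckpointEpoch returns the epoch number of the given raw checkpoint,
	// or an error if the checkpoint is not known/valid.
	CheckpointEpoch(rawCheckpoint []byte) (uint64, error)
}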


// TODO for now we jsut store raw checkpoint with epoch as key. Ultimatly
// we should store checkpoint with some timestamp info, or even do not store
// checkpoint itelf but its status
Contributor

Maybe use epoch+hash as the key since you might have multiple checkpoints for the same epoch?

Contributor Author

Yeah, I still want to figure out some things; this should be treated as a placeholder.

One can always build the argument that I should delegate the whole checking of checkpoint state and semantics to the checkpointing module, although from what I remember we decided not to do that.

tl;dr: it will be improved/changed in the next PRs.
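
One way the epoch+hash key suggested above could be built, purely as an illustration since the schema is explicitly deferred to a later PR:

import (
	"crypto/sha256"
	"encoding/binary"
)

// checkpointStoreKey builds a key of the form bigEndian(epoch) || sha256(rawCheckpoint),
// so that multiple submissions recorded for the same epoch do not overwrite each other.
func checkpointStoreKey(epoch uint64, rawCheckpoint []byte) []byte {
	key := make([]byte, 8, 8+sha256.Size)
	binary.BigEndian.PutUint64(key, epoch)
	sum := sha256.Sum256(rawCheckpoint)
	return append(key, sum[:]...)
}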

@aakoshh (Contributor) left a comment

Wow, big push here, great effort!

It looks like there are some misunderstandings about the schema. There can definitely be more BTC checkpoints for the epoch, it's just that there can only be one raw checkpoint.

I'm sorry, I should have created ER diagrams for the modules to make it easier to discuss.

Comment on lines 63 to 70
// TODO get powLimit from some config
// result of parsed proog is notneeded, drop it
// whole parsing stuff is stateless
_, e := ParseTwoProofs(m.Proofs, btcchaincfg.MainNetParams.PowLimit)

if e != nil {
return e
}
Contributor

Suggested change
// TODO get powLimit from some config
// result of parsed proog is notneeded, drop it
// whole parsing stuff is stateless
_, e := ParseTwoProofs(m.Proofs, btcchaincfg.MainNetParams.PowLimit)
if e != nil {
return e
}
// TODO get powLimit from some config
// result of parsed proof is not needed, drop it
// whole parsing stuff is stateless
_, err := ParseTwoProofs(m.Proofs, btcchaincfg.MainNetParams.PowLimit)
if err != nil {
return err
}

// input in the first tx
func ParseTwoProofs(proofs []*BTCSpvProof, powLimit *big.Int) ([]*btcutils.ParsedProof, error) {
if len(proofs) != expectedProofs {
return nil, fmt.Errorf("expected at least two valid op return transactions")
Contributor

Suggested change
return nil, fmt.Errorf("expected at least two valid op return transactions")
return nil, fmt.Errorf("expected exactly two valid op return transactions")

expectedProofs = 2
)

// Parse and Validate provided OP_RETRUN transaction with their proofs.
Contributor

Suggested change
// Parse and Validate provided OP_RETRUN transaction with their proofs.
// Parse and Validate provided OP_RETURN transaction with their proofs.

Based on this I expected there to be a check about the content of the OP_RETURN itself, that it indeed looks like a Babylon checkpoint (by delegating to the other module for parsing). Otherwise anyone could just send us two random transactions which have OP_RETURN and some data.

Contributor Author

In general, ValidateBasic checks do not depend on other modules or state, and that's why this function is here.

If we wanted to do it, an AnteHandler would probably be the correct place for it, although is it really necessary?

There are more checks when the transaction is executed, and if someone did send such a transaction that fails during execution, they will be charged for it appropriately. And I would really like to keep this module from knowing anything about the structure of a checkpoint.

Contributor

Right, well, I wasn't necessarily talking about stateful checks, just whether the concatenated OP_RETURN data even looks like a checkpoint - that is (I think) supposed to be a pure function.

Ultimately you are right that you will do this check elsewhere, in the service layer, so if you would rather keep this module oblivious to what it's looking for, that's okay with me.

Although arguably even the knowledge that there need to be 2 transactions in the proof means the module does know something about the checkpoints after all, beyond the fact that they are in OP_RETURNS. I noticed that you concatenate all, possibly more than just two if there are more UTxOs created by the two transactions.

If anything, we can conclude that the phrase "Validate provided OP_RETURN" is a bit under specified here :)

Contributor Author

If anything, we can conclude that the phrase "Validate provided OP_RETURN" is a bit under specified here :)

agree :) I will improve it a bit

Although arguably even the knowledge that there need to be 2 transactions in the proof means the module does know something about the checkpoints after all, beyond the fact that they are in OP_RETURNS. I noticed that you concatenate all, possibly more than just two if there are more UTxOs created by the two transactions.

Yep, that is correct. Probably a more generic way to cap the data in op_returns would be to specify some byte limit for the provided data. Then it stops being implicit knowledge about checkpointing and becomes instead a sensible limit on data provided by an external user, although I am not sure this is worth doing, as ultimately both of those limits boil down to how much generic data we want to accept in one message.
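
If that generic cap were added, it could be as small as a length check in the stateless validation (the constant below is a placeholder, not an agreed value):

import "fmt"

// maxOpReturnPayloadBytes is a hypothetical cap on the concatenated OP_RETURN data
// accepted in a single message.
const maxOpReturnPayloadBytes = 160

func checkOpReturnDataSize(data []byte) error {
	if len(data) > maxOpReturnPayloadBytes {
		return fmt.Errorf("op return data is %d bytes, exceeding the %d byte limit", len(data), maxOpReturnPayloadBytes)
	}
	return nil
}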

Comment on lines 28 to 31
// Function should return epoch of given raw checkpoint or indicate that checkpoint
// is invalid
// If chekpoint is valid checkpointing module should store it.
CheckpointValid(rawCheckpoint []byte) (uint64, error)
Contributor

This method seems a bit vague to me.

I could interpret it as "try parse checkpoint", to see if the bytes can be interpreted as a valid checkpoint. The return value could be a RawCheckpoint record (I think we have one), and then it makes sense to say that we expect the keeper to remember it.

Or it could be interpreted as validating whether this raw checkpoint is something we expected to receive from BTC, because for example the epoch is known to us. But if that was the case, then all we should be checking really is whether this is the raw checkpoint we already have on record, as we decided there's no point in storing unknown checkpoints.

We can also interpret it to include or not include BLS checks.

Contributor

BTW the convention is to start these comments with the function name, e.g. CheckpointValid should return ...

Contributor Author

Tbh it was supposed to be a bit vague.

From the point of view of the op_return module, it is really not important what validations the checkpointing module will do. The only thing it cares about is whether the checkpointing module returns some index to it (I call it epoch, as this is what we use in Babylon, but really it could be called index to be more general) or an error. Most of the docs about which validations should be done will be placed in the checkpointing module.

From the Babylon system point of view, probably all validations are necessary:

  • format
  • whether this is a checkpoint we know about
  • BLS sigs

As I understand our system for now (correct me if I am wrong or missing something):

  • the checkpointing module creates/sends a checkpoint to BTC when, after processing some finalised block, at least 1/3 of validators have sent their checkpoint transactions (this all happens at the epoch boundary)
  • as everything goes through Tendermint, it is deterministic. This means a node processes some block, and after this block there are enough BLS sigs to construct the checkpoint. This checkpoint is treated as the correct checkpoint for this epoch, and there will be no more checkpoints for this epoch, i.e. if there are more such transactions in following blocks, they will be discarded and not used to update/create a new checkpoint
  • this implies there will be only one valid checkpoint for a given epoch
  • there is a clear happens-before relation between checkpoint submission to BTC and checkpoint delivery from BTC
  • therefore it makes sense to ask the checkpointing module whether the received checkpoint is known and expected

After that we can build the internal db of the op_return module however we want, exposing some methods/hooks for other modules which would like to refer to a checkpoint by this index.

Contributor

Yes, your understanding is the same as mine ✔️

Tbh it was supposed to be a bit vague.

I understand this as you did not want to list what checks the other module is going to do, which is cool.

If chekpoint is valid checkpointing module should store it.

I think it was only this sentence that threw me off: it sounds like the checkpointing module, where CheckpointValid is implemented, is the one that you expect to store a checkpoint. However, based on the bullet points you just listed, this is not the case, because the ledger must already have this raw checkpoint, from the time it was created.

If it didn't have this checkpoint, that would mean we are on a node which is still catching up with the rest, and therefore it won't be able to propose a block, so a finalized block with a transaction with a BTC checkpoint meeting a ledger without a checkpoint can only happen if that checkpoint is from a fork.

Now I am thinking maybe this sentence was a comment that the btccheckpoint module will save this checkpoint, if the epoch number is valid?

Contributor Author

Hmm, I probably mixed something up when thinking about who should store what 😅 Nevertheless, I have updated this comment to avoid any confusion.

Comment on lines 46 to 50
// TODO copy of the validation done in btc light client at some point it would
// be nice to move it to some commong btc module
func validateHeader(header *wire.BlockHeader, powLimit *big.Int) error {
Contributor

I think godoc is going to turn these comments into documentation. The body would be a better place for them, and reserve this for what you are including down below, e.g.

// validateHeader performs the checks that [checkBlockHeaderSanity](https://github.com/btcsuite/btcd/blob/master/blockchain/validate.go#L430) of `btcd` does.
// 
// We skip the "timestamp should not be 2 hours into the future" check
// since this might introduce undeterministic behavior.
func validateHeader(header *wire.BlockHeader, powLimit *big.Int) error {

Comment on lines 132 to 133
if txHash.IsEqual(merkleRoot) && index == 0 && len(intermediateNodes) == 0 {
return true
Contributor

Right, then I'm not barking up the wrong tree that just the root in verifyProof is not exactly valid, because you make sure it's never called with just the root.

I reckon a panic would be a better way of handling that case over there. The data should be at a minimum 62 bytes, in which case it could replicate the hash == root check. There's no point checking an inclusion proof when the prover includes nothing, right?

}

if proofLength == 64 {
return false
Contributor

Based on the other function, this can be a valid proof.
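
A rough sketch of the two edge cases discussed in these threads, with the single-node case made explicit (the real verifyProof in the PR has a different shape; this is only to illustrate the reviewer's point):

import (
	"bytes"
	"fmt"
)

// verifySingleNodeProof handles the degenerate Merkle tree with one transaction:
// the proof is just txHash || merkleRoot (64 bytes) and is valid iff they are equal.
// Anything shorter carries no information and is rejected outright.
func verifySingleNodeProof(proof []byte) (bool, error) {
	if len(proof) < 64 {
		return false, fmt.Errorf("proof of %d bytes is too short to be meaningful", len(proof))
	}
	if len(proof) == 64 {
		return bytes.Equal(proof[:32], proof[32:]), nil
	}
	return false, fmt.Errorf("proof has intermediate nodes; use the full verification path")
}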

Comment on lines 22 to 24
// returns fully assembled rawcheckpoint data and the latest header number of
// headers provided in the proof
func (m msgServer) getRawCheckPointAndBtchHeight(proofs []*btcutils.ParsedProof) ([]byte, uint64, error) {
Contributor

Suggested change
// returns fully assembled rawcheckpoint data and the latest header number of
// headers provided in the proof
func (m msgServer) getRawCheckPointAndBtchHeight(proofs []*btcutils.ParsedProof) ([]byte, uint64, error) {
// getRawCheckPointAndBtcHeight returns fully assembled rawcheckpoint data and the latest header number of
// headers provided in the proof
func (m msgServer) getRawCheckPointAndBtcHeight(proofs []*btcutils.ParsedProof) ([]byte, uint64, error) {

Sounded like "bitch height" 👯‍♀️

Comment on lines 41 to 52
if num > latestblock {
latestblock = num
}

rawCheckpointData = append(rawCheckpointData, proof.OpReturnData...)
Contributor

Oh, right, so we assume that the transactions are included in the order in which their OP_RETURN can be reassembled into something that makes sense, but also that this order is not necessarily the one in which they were included on Bitcoin.

Maybe these two things don't belong to the same function, then, the "max header height" and the concatenation.

Contributor Author

Yep, we have the assumption that this work should be done by an honest submitter. Although, now thinking about it, we still haven't defined how to do it.

Maybe these two things don't belong to the same function, then, the "max header height" and the concatenation.

Probably they do not, as those are quite different things. I combined them because I do not like iterating over things more than once, but those proof arrays will be small, so let's change it.
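
A sketch of the split discussed here, keeping the height bookkeeping and the OP_RETURN concatenation in separate helpers (names are illustrative, not necessarily what the follow-up commit uses):

// concatOpReturnData concatenates the OP_RETURN payloads of the proofs in the order
// provided by the submitter.
func concatOpReturnData(proofs []*btcutils.ParsedProof) []byte {
	var data []byte
	for _, p := range proofs {
		data = append(data, p.OpReturnData...)
	}
	return data
}

// maxHeaderHeight returns the highest of the BTC header heights resolved for the proofs.
func maxHeaderHeight(heights []uint64) uint64 {
	var latest uint64
	for _, h := range heights {
		if h > latest {
			latest = h
		}
	}
	return latest
}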

Comment on lines +85 to +103
// TODO consider handling here.
// Checkpointing module deemed this checkpoint as correc so for now lets just
// store it. (in future store some metadata about it)
// Things to consider:
// - Are we really guaranteed that epoch is unique key ?
// - It would probably be better to check for duplicates ourselves, by keeping some
// additional indexes like: sha256(checkpoint) -> epochNum and epochnNum -> checkpointStatus
// then we could check for dupliacates without involvement of checkpointing
// module and parsing checkpoint itself
// - What is good db layout for all requiremens
m.k.StoreCheckpoint(sdkCtx, epochNum, rawCheckPointData)
Contributor

So this is the keeper of the btccheckpoint module, and I assume all we store is what we see here.

I would have expected that this module stores all submitted BTC checkpoints for this epoch, including the BTC transactions they came from, so that later they can be queried in order to:

  • Figure out if they became stable, which requires the BTC header hash
  • Figure out who should get the reward, which is the first submitter

Another validation that I think is missing here is that checkpoint_n+1 should be included in a BTC block that is a descendant of the one(s) containing checkpoint_n. Only the light client can tell. See here. This was included so we can reject out-of-order checkpoints; that is, even if epoch_n, epoch_n+1 and epoch_n+2 exist, and their raw checkpoints are in our database, and epoch_n is checkpointed to Bitcoin, we should reject checkpoints for epoch_n+2 until we see one for epoch_n+1 first.
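
A sketch of the kind of per-submission record this storage suggestion implies (field names are guesses; the actual schema is defined in a later PR):

// btcCheckpointSubmission is one BTC submission of a raw checkpoint, of which there can be
// several per epoch (different submitters and/or different BTC forks).
type btcCheckpointSubmission struct {
	Epoch          uint64         // epoch the raw checkpoint belongs to
	BtcBlockHashes [][]byte       // hashes of the BTC block(s) containing the checkpoint txs, needed to track stability
	TxIDs          [][]byte       // the two BTC transactions the checkpoint was reassembled from
	Submitter      sdk.AccAddress // candidate for the reward if this submission becomes stable first
}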

@KonradStaniec (Contributor Author) commented Jun 27, 2022

I would have expected that this module stores all submitted BTC checkpoints for this epoch

This is probably related to the discussion in the previous comments: isn't it the case that there should be only one valid checkpoint for each epoch?

I mean, it can happen that it ends up on different forks of BTC, but those are really cases of the same checkpoint, different submission. So we can still use the checkpointing module to check if this is the expected checkpoint.

Figure out who should get the reward, which is the first submitter

Aren't there really two people who should get a reward:

  • the guy who submitted the checkpoint to BTC
  • the guy who submitted the checkpoint tx to Babylon

Also, the reward should probably be distributed when we deem the checkpoint stable (however we define it).

Another validation that I think is missing here is that checkpoint_n+1 should be included in a BTC block that is a descendant of the one(s) containing checkpoint_n

To be sure I understand correctly: for example, when receiving a checkpoint for epoch 9, we should retrieve checkpoint 8 with its header from the db and check whether the header provided with checkpoint 9 is a descendant of the header from epoch 8?
The only question I have is what happens if there is more than one checkpoint for epoch 8, as it is still not stable and the checkpoints ended up in different blocks. I would assume we must check all of the headers for epoch 8 as candidates for this ancestry.

Contributor

those are really cases of same checkpoint, different submission.

Exactly, there should be at most 1 raw checkpoint for the epoch, but it could be submitted to BTC by different submitters, and be on different forks, or on the same.

we can still use checkpointing module to check if this is expected checkpoint.

Absolutely, yes, the checkpointing module can tell if the raw checkpoint is the expected one. What I wanted to highlight is that the btccheckpoint module should store the submissions, and that's why there can be multiple records per epoch.

Also reward should probably be distributed when we deem that checkpoint is stable.

I included this part in the diagram here. It's checking for stable checkpoints every time there's a new header - when the first submission goes stable, the epoch checkpoint is confirmed, and rewards are paid out.

The reason this will not skip epochs is that this module should reject a submission if it doesn't already have submissions for the previous epoch in the BTC history of the submission.

I would assume we must check all of the headers for epoch 8 as candidates for this ancestry.

Yes, I think that's right. When you see a submission for epoch 9, you can look up all submissions for epoch 8, and check if any of them are in a BTC block which is an ancestor of the submission. That would ensure that by the time the submission for 9 is stable, 8 will already have been stable before it.

Contributor Author

Yes, I think that's right. When you see a submission for epoch 9, you can look up all submissions for epoch 8, and check if any of them are in a BTC block which is an ancestor of the submission. That would ensure that by the time the submission for 9 is stable, 8 will already have been stable before it.

Ok, so we are on the same page here :) though I would add this check in the next PR when I work on the schema, as only then will I have a db layout to retrieve the checkpoints from previous epochs.

I included this part in the diagram here. It's checking for stable checkpoints every time there's a new header - when the first submission goes stable, the epoch checkpoint is confirmed, and rewards are paid out.

Ok, then this should most probably be done after the correct schema layout is defined.
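
A very rough sketch of the ancestry check agreed on above, written against hypothetical keeper methods (GetSubmissions and IsAncestor do not exist yet; they stand in for whatever the schema PR introduces):

// checkSubmissionAncestry verifies that at least one known submission for the previous epoch
// sits in a BTC block that is an ancestor of the block carrying the new submission.
func (k Keeper) checkSubmissionAncestry(ctx sdk.Context, epoch uint64, newSubmissionBlockHash []byte) error {
	if epoch == 0 {
		return nil // nothing to anchor the first epoch to
	}
	for _, prev := range k.GetSubmissions(ctx, epoch-1) { // hypothetical accessor
		if k.btcLightClientKeeper.IsAncestor(ctx, prev.BlockHash, newSubmissionBlockHash) { // hypothetical light client query
			return nil
		}
	}
	return fmt.Errorf("no submission for epoch %d is an ancestor of the new submission for epoch %d", epoch-1, epoch)
}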

@KonradStaniec (Contributor Author)

It looks like there are some misunderstandings about the schema. There can definitely be more BTC checkpoints for the epoch, it's just that there can only be one raw checkpoint.

Oh, now that I have responded to all the comments I looked at the top one, and I think we are talking about the same thing a bit differently.
My current understanding is that there can be only one valid checkpoint per epoch, but there can be many valid checkpoint submissions from BTC.

Also, about the schema: it will be done in the next PR; I just wanted to cut this one at some point as it was getting too large. The most important parts here are all the BTC logic and the stateless validation. Other things, like what we are storing, can be considered a placeholder.

OpReturnData []byte
}

func readBlockHeader(headerBytes []byte) (*wire.BlockHeader, error) {
Member

This can take advantage of the BTCHeaderBytes interface. In #30 , I have also introduced a function NewBTCHeaderBytesFromBlockHeader which does this.


@@ -0,0 +1,205 @@
package btcutils
Member

I would suggest that the functions defined in this package are included in a global namespace (like the BTCHeaderBytes type etc.) so that they are accessible from both the btccheckpoint module and the btclightclient module.

For example, the checks performed by validateHeader are already performed by the btclightclient module. It would be nice if we did not duplicate these checks across different modules.

Contributor Author

Sure, I will move validateHeader as it is quite obviously useful in both modules :)

Although I would leave the others as they are until it is obvious they are useful in another context. Usually creating such a shared space upfront creates hidden dependencies between the consumers of those functions.
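
For the shared validateHeader, a minimal version of the stateless checks could look like the following (a sketch based only on the checks mentioned in this PR: proof of work against powLimit and one-second timestamp precision; exact placement and contents are up to the follow-up change):

package btcutils

import (
	"errors"
	"math/big"
	"time"

	"github.com/btcsuite/btcd/blockchain"
	"github.com/btcsuite/btcd/wire"
)

// ValidateHeader performs the stateless sanity checks shared by btccheckpoint and btclightclient.
func ValidateHeader(header *wire.BlockHeader, powLimit *big.Int) error {
	// Target must be positive and not above the proof-of-work limit.
	target := blockchain.CompactToBig(header.Bits)
	if target.Sign() <= 0 || target.Cmp(powLimit) > 0 {
		return errors.New("block target difficulty is out of range")
	}
	// The header hash must be below the claimed target.
	hash := header.BlockHash()
	if blockchain.HashToBig(&hash).Cmp(target) > 0 {
		return errors.New("block hash is higher than the claimed target")
	}
	// Timestamps must have at most one-second precision.
	if !header.Timestamp.Equal(header.Timestamp.Truncate(time.Second)) {
		return errors.New("block timestamp has a higher precision than one second")
	}
	return nil
}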

pkScriptLen := len(pkScript)
if pkScriptLen > 0 &&
pkScriptLen <= maxOpReturnPkScriptSize &&
pkScript[0] == txscript.OP_RETURN {
Member

Another useful check here would be whether the OP_RETURN data starts with our specified Babylon prefix (e.g. BBL).

Contributor Author

Could be.

Although one could argue that any attempt to parse the op_return data is the job of the checkpointing module. From the PoV of btccheckpoint, those are just bytes received in the form of provably correct Bitcoin transactions, and what is inside them is of no interest to it. This way we can later attach meaning to those bytes in other modules however we want.
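
If the prefix check were ever added at this layer, it could be as simple as the following (the "BBL" prefix is just the example from the comment above; as discussed, the author prefers to keep btccheckpoint unaware of the payload):

import "bytes"

// babylonOpReturnPrefix is the example prefix from the review comment; the real tag,
// if any, would live in the checkpointing module.
var babylonOpReturnPrefix = []byte("BBL")

// hasBabylonPrefix reports whether an extracted OP_RETURN payload starts with the prefix.
func hasBabylonPrefix(opReturnData []byte) bool {
	return bytes.HasPrefix(opReturnData, babylonOpReturnPrefix)
}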

return []sdk.AccAddress{}
// cosmos-sdk modules usually ignore possible error here, lets assume it cannot
// happen
submitter, _ := sdk.AccAddressFromBech32(m.Submitter)
Member

If an error happens here, we should panic. See here. The logic behind this is that GetSigners is called after ValidateBasic so if we get an error here, something went really wrong.

Contributor Author

cosmos-sdk modules usually ignore this error, e.g. https://github.com/cosmos/cosmos-sdk/blob/main/x/nft/msgs.go#L39, i.e. it is probably just impossible, but sure, we can panic.
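
The panicking variant referenced in the linked convention would look roughly like this (the message type name is inferred from this PR's context; treat the exact receiver as an assumption):

// GetSigners is called after ValidateBasic, so a bech32 failure here means something
// went badly wrong earlier; panicking is the conventional response.
func (m *MsgInsertBTCSpvProof) GetSigners() []sdk.AccAddress {
	submitter, err := sdk.AccAddressFromBech32(m.Submitter)
	if err != nil {
		panic(err)
	}
	return []sdk.AccAddress{submitter}
}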

@aakoshh (Contributor) left a comment

I'm confident we discussed everything and landed on the same page. Looking forward to the next PR to evolve the schema along those lines :)

@KonradStaniec KonradStaniec merged commit 741a57a into main Jun 29, 2022
@KonradStaniec KonradStaniec deleted the handle-insert-checkpoint branch June 29, 2022 06:19
KonradStaniec pushed a commit that referenced this pull request Oct 25, 2023