
Btccheckpoint oracle schema and processing #43

Merged
merged 3 commits into main on Jul 16, 2022

Conversation

KonradStaniec
Contributor

@KonradStaniec KonradStaniec commented Jun 30, 2022

Adds the schema and full processing of the btccheckpoint message.
The schema looks as follows:
Each checkpoint submission can be uniquely identified by the hashes of the block headers in which its transactions are included and by the indexes of those transactions, therefore the key for each submission is:

TransactionKey := (BlockHash, TransactionIdx)
SubmissionKey := List[TransactionKey]

Such keys enable an easy check for duplicate submissions right after parsing the provided message.
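A minimal Go sketch of that duplicate check (all type and function names here are illustrative, not the actual Babylon types):

```go
package main

import "fmt"

// TransactionKey identifies one checkpoint transaction: the hash of the
// block that contains it plus its index within that block.
type TransactionKey struct {
	BlockHash      [32]byte
	TransactionIdx uint32
}

// SubmissionKey is the ordered list of transaction keys of one submission.
type SubmissionKey []TransactionKey

// storeKey renders the submission key deterministically so it can be used
// as a lookup key right after the message is parsed.
func storeKey(sk SubmissionKey) string {
	s := ""
	for _, tk := range sk {
		s += fmt.Sprintf("%x:%d;", tk.BlockHash, tk.TransactionIdx)
	}
	return s
}

// isDuplicate reports whether this exact submission was seen before.
func isDuplicate(seen map[string]bool, sk SubmissionKey) bool {
	return seen[storeKey(sk)]
}

func main() {
	seen := map[string]bool{}
	sub := SubmissionKey{{BlockHash: [32]byte{1}, TransactionIdx: 0}}
	fmt.Println(isDuplicate(seen, sub)) // false
	seen[storeKey(sub)] = true
	fmt.Println(isDuplicate(seen, sub)) // true
}
```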

Another requirement is easy access to the previous epoch's checkpoints. Given that each submission key is pretty small, i.e. currently around 80 bytes (2x 32-byte hash + 2x 4-byte index + some proto overhead), we can easily store a list of those in the database, i.e. have the mapping

epoch -> List[SubmissionKey]

Even a list of 100000 submissions would only take around 8 MB, which is still reasonable to hold as a single value in the store and to load into memory at once. More realistically, I don't expect even 1000 such unique submissions, as each submission represents a separate valid bitcoin transaction with the same checkpoint, just spread around different blocks/transactions. With such an encoding we can get all info about checkpoints from the previous epoch with just one db lookup. Another advantage of this encoding is that the order of the List[SubmissionKey] can represent the order of submissions.

There are also values we hold with each submitted checkpoint; for now this is mostly the data required to pay out the reward.

So full schema looks like:

DataTypes:
TransactionKey := (BlockHash, TransactionIdx)
SubmissionKey := List[TransactionKey]
SubmissionsValue := (BabylonAccAddress, List[BtcTransactions])

Mappings:
epoch -> List[SubmissionKey]
SubmissionKey -> SubmissionsValue

Currently there is no pruning of List[SubmissionKey], but it is easy to imagine an end-block handler which gets the latest epoch's submission keys, checks them against the light client, and prunes the ones which lost their validity.
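A sketch of what such an end-block pruning handler could do, assuming a hypothetical light-client validity query (none of these names are the real keeper API):

```go
package main

import "fmt"

// SubmissionKey stands in for the real key type; the light client here is a
// toy with a hardcoded validity map. Hypothetical names throughout.
type SubmissionKey string

type lightClient struct{ onMainChain map[SubmissionKey]bool }

func (lc lightClient) StillValid(k SubmissionKey) bool { return lc.onMainChain[k] }

// pruneEpoch keeps only the submission keys the light client still considers
// part of the canonical chain, as the end-block handler described above would.
func pruneEpoch(lc lightClient, keys []SubmissionKey) []SubmissionKey {
	var valid []SubmissionKey
	for _, k := range keys {
		if lc.StillValid(k) {
			valid = append(valid, k)
		}
	}
	return valid
}

func main() {
	lc := lightClient{onMainChain: map[SubmissionKey]bool{"a": true, "b": false}}
	fmt.Println(pruneEpoch(lc, []SubmissionKey{"a", "b"}))
}
```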

Comment:
For data structure encoding I am using protobuf. If we would like to use less space, some structures could be encoded by simple concatenation, i.e. blockHash1 || txIndex1 as big-endian bytes || blockHash2 || txIndex2 as big-endian bytes.
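The concatenation idea can be sketched as follows; each transaction key packs into a fixed 36 bytes (32-byte hash + 4-byte big-endian index) with no protobuf framing. This is purely illustrative, not code from the PR:

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// encodeTxKey encodes one transaction key as blockHash || txIndex
// (big endian), yielding exactly 36 bytes per key.
func encodeTxKey(blockHash [32]byte, txIdx uint32) []byte {
	out := make([]byte, 36)
	copy(out, blockHash[:])
	binary.BigEndian.PutUint32(out[32:], txIdx)
	return out
}

func main() {
	var h [32]byte
	h[0] = 0xab
	enc := encodeTxKey(h, 7)
	fmt.Println(len(enc)) // 36
}
```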

@KonradStaniec KonradStaniec marked this pull request as ready for review July 13, 2022 16:15
Member

@vitsalis vitsalis left a comment

Nice work! Some minor comments. Most of those involve the usage of the BTCHeaderBytes and BTCHeaderHashBytes custom objects in order to abstract the usage of wire.BlockHeader and chainhash.Hash objects. However, the type updates kind of fall outside the scope of this PR, so it is fine with me if we decide to implement this in another PR.

// Each submission can generally be identified by this list of (txIdx, blockHash)
// tuples.
// Note: this could possibly be optimized as if transactions were in one block
// there would have the same block hash and different indexes, but each blockshash
Member

Suggested change
// there would have the same block hash and different indexes, but each blockshash
// they would have the same block hash and different indexes, but each blockhash

// tuples.
// Note: this could possibly be optimized as if transactions were in one block
// there would have the same block hash and different indexes, but each blockshash
// is only 33 (1 bytes for prefix encoding and 32 byte hash), so there should
Member

Suggested change
// is only 33 (1 bytes for prefix encoding and 32 byte hash), so there should
// is only 33 (1 byte for prefix encoding and 32 byte hash), so there should

// TODO: Maybe it is worth recovering senders while processing the InsertProof
// message, and store only those. Another point is that it is not that simple
// to recover sender of btc tx.
repeated bytes btctransaction = 2;
Member

Maybe we can add a custom type for this. If we follow the "fortress" approach which is defined for BTCHeaderBytes and BTCHeaderHashBytes, then verification would happen in only one place, while we could efficiently display it as a hex.

Contributor Author

@KonradStaniec KonradStaniec Jul 14, 2022

tbh I am not sure this thing will ultimately be here. For now I keep it here to extract the btc sender later, but I am not sure it is such an easy task to do, so maybe some other approach will be needed (like having the btc sender in the message directly?).

That's why for now I would leave it like that and add the approach you mention in the future.

Contributor

@fishermanymc suggested once to put the Babylon account to be rewarded into the OP_RETURN, mainly to do away with having to register the BTC->BBL mapping up front. It felt to me like that would be using a limited resource for something we can get around by other means. But it's an option.

BlockHeader wire.BlockHeader
Transaction *btcutil.Tx
OpReturnData []byte
BlockHeader wire.BlockHeader
Member

In the btclightclient module, the concept of wire.BlockHeaders has been entirely abstracted by the BTCHeaderBytes object. I suggest that you use that here.

Contributor Author

sure.

I mean, it doesn't make much difference to me, and if those are the types you prefer at the light client level then let's go for it. Consider all following comments regarding those types solved 👍

Although at some point we may also need to discuss passing the whole header at the interface level, as a second source of headers for the btccheckpoint module. (imo this is not critical right now, so we can just use the hash)

OpReturnData []byte
BlockHeader wire.BlockHeader
// keeping header hash to avoid recomputing it everytime
BlockHash chainhash.Hash
Member

Same, but with the BTCHeaderHashBytes object. It provides methods for efficiently performing the important operations of a chainhash.Hash object, and even more.

// TODO: light client lost knowledge of blockhash from previous epoch
// (we know that this is not rawSub as we checked that earlier)
// It means either some bug / or fork had happened. For now just move
// forward as we are not able to establish ancestry here
Member

Maybe panic?

Contributor Author

as above, hmm, this is maybe a question about light client guarantees then. If the block was known to the light client in the past, will it be known forever in the future, even if it is not included on the main chain?

Contributor

We do plan to have a pruning depth to forget old blocks after all, just to keep that in mind.

Contributor Author

ok, so it is possible to forget a once-known block, i.e. I will leave the check in for now


lastHashFromSavedCheckpoint := hs[len(hs)-1]

// do not check err as all those hashes were checked in previous validation steps
Member

Maybe check err and panic if it is set?

Comment on lines 190 to 200
if ed == nil {
newEd := types.NewEpochData(subKey)
m.k.SaveEpochData(sdkCtx, epochNum, &newEd)
m.k.SaveSubmission(sdkCtx, subKey, subData)
} else {
// epoch data already existis for given epoch, append new submission, and save
// submission key and data
ed.AppendKey(subKey)
m.k.SaveEpochData(sdkCtx, epochNum, ed)
m.k.SaveSubmission(sdkCtx, subKey, subData)
}
Member

Suggested change
if ed == nil {
newEd := types.NewEpochData(subKey)
m.k.SaveEpochData(sdkCtx, epochNum, &newEd)
m.k.SaveSubmission(sdkCtx, subKey, subData)
} else {
// epoch data already existis for given epoch, append new submission, and save
// submission key and data
ed.AppendKey(subKey)
m.k.SaveEpochData(sdkCtx, epochNum, ed)
m.k.SaveSubmission(sdkCtx, subKey, subData)
}
if ed == nil {
newEd := types.NewEpochData(subKey)
ed = &newEd
} else {
// epoch data already exists for given epoch, append new submission, and save
// submission key and data
ed.AppendKey(subKey)
}
m.k.SaveEpochData(sdkCtx, epochNum, ed)
m.k.SaveSubmission(sdkCtx, subKey, subData)

// BlockHeight should validate if the header with the given hash is valid and
// if it is part of the known chain. If this is true it should return the
// block's height; otherwise it should return an error.
BlockHeight(header chainhash.Hash) (uint64, error)
Member

The btclightclient module does not have the concept of chainhash.Hash. Please use BTCHeaderHashBytes instead. You can get a BTCHeaderHashBytes from chainhash.Hash by using the NewBTCHeaderHashBytesFromChainhash() method, although I believe you can replace all usages of chainhash.Hash in the codebase with BTCHeaderHashBytes from what I've seen.


// IsAncestor should check if the parentHash header is a direct ancestor of the
// childHash header; if either of these headers is not known to the light
// client it should return an error.
IsAncestor(parentHash chainhash.Hash, childHash chainhash.Hash) (bool, error)
Member

Same here.

}

// TODO just return checkpoint if exists
func (k Keeper) GetCheckpoint(ctx sdk.Context, e uint64) []byte {
func (k Keeper) SaveSubmission(ctx sdk.Context, sk types.SubmissionKey, sd types.SubmissionData) {
Contributor

Just wondering if instead of having a separate SaveEpochData and SaveSubmission there should be one AddSubmission(epochNumber, submission) that calculates the keys on its own - it could even get the epoch number from the submission but I'm not sure what the state of your thinking is about who knows what about the op_return content.

That way I could not use public methods of the keeper to:

  • set submission data with the wrong key
  • insert submission data that doesn't get recorded in the epoch
  • insert submission keys in the epoch that don't exist
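One possible shape for the combined method the bullets argue for (an in-memory sketch with hypothetical names, not the actual keeper; the real one would write to the Cosmos KVStore):

```go
package main

import "fmt"

type SubmissionKey string
type SubmissionData struct{ Submitter string }

// Keeper models the two mappings from the schema:
// epoch -> List[SubmissionKey] and SubmissionKey -> SubmissionData.
type Keeper struct {
	epochs      map[uint64][]SubmissionKey
	submissions map[SubmissionKey]SubmissionData
}

func NewKeeper() *Keeper {
	return &Keeper{
		epochs:      map[uint64][]SubmissionKey{},
		submissions: map[SubmissionKey]SubmissionData{},
	}
}

// AddSubmission records the submission under its key and appends the key to
// the epoch's list in one call, so callers cannot store one without the other.
func (k *Keeper) AddSubmission(epoch uint64, sk SubmissionKey, sd SubmissionData) {
	k.epochs[epoch] = append(k.epochs[epoch], sk)
	k.submissions[sk] = sd
}

func main() {
	k := NewKeeper()
	k.AddSubmission(2, "sub-1", SubmissionData{Submitter: "bbl-addr"})
	fmt.Println(len(k.epochs[2]), len(k.submissions)) // 1 1
}
```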

Contributor Author

I was also thinking about it, and currently have a method checkAndSaveEpochData in msg_server with the todo // TODO maybe move it to keeper,
but I wanted to consider it after the pr which will use the hooks from epoching and lightclient to manipulate submission data in some way. I imagine that SaveSubmission will come in handy on its own then, but we will see.

Comment on lines +155 to +159
if !m.checkHashesAreAncestors(hs) {
// checkpoint blockhashes no longer form a chain. Cannot check ancestry
// with new submission. Move to the next checkpoint
continue
}
Contributor

This should have been checked when the previous submission was admitted, no?

Contributor Author

yup, although we are checking if this still holds true, i.e. if some rollback happened which makes the checkpoint basically invalid and the two blocks ended up on different forks.

Not sure what the probability of this happening is, imo really low, but at least we should log it / be aware of it somehow. (logging will be added in a separate pr)

Contributor

Hm, so the checkpoint has blocks that used to be in an ancestor-descendant relationship - that can't change during the reorg 🤔

You are right that a checkpoint could be "torn", with the first block still on the main chain and the second no longer on it. But that just means that if the current block wants to use this as its previous checkpoint, it still has to be a descendant of the second block.

Contributor Author

not sure I understand, but as I see it the whole chain of checkpoints needs to be valid, i.e. if we receive a submission with blocks b9 and b10 then:

  1. we first check that b10 is a child of b9
  2. retrieve the checkpoint for the previous epoch. Let's say we got a submission with blocks b7 and b8
  3. we check that b7 is still an ancestor of b8. It was when we admitted it to the db; we just check that it did not change in the meantime due to some reorgs etc. Ultimately, I am not sure how probable this is or how to best handle it; for now I just stop treating such a checkpoint as valid. Probably the checkpointing module should be informed about it, but I am still thinking it through.
  4. Now that we have established that b9 and b10 have an ancestry relationship, and that b7 and b8 still have an ancestry relationship, we only check that b8 is an ancestor of b9 (which is basically this check: anc, err := m.k.IsAncestor(*lastHashFromSavedCheckpoint, rawSub.GetFirstBlockHash()))

Imo at some point I will revisit point 3 to be sure how to best handle it.
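The checks in steps 1, 3 and 4 can be sketched against a toy ancestry oracle (the real one is the btclightclient keeper; all names here are illustrative):

```go
package main

import "fmt"

// chain maps each block to its parent; IsAncestor(parent, child) walks up
// from child until it reaches parent or runs out of known blocks.
type chain map[string]string

func (c chain) IsAncestor(parent, child string) bool {
	for p, ok := c[child]; ok; p, ok = c[p] {
		if p == parent {
			return true
		}
	}
	return false
}

// validNewCheckpoint runs steps 1, 3 and 4 from the comment: the new pair is
// a chain, the previous pair is still a chain, and the new pair descends
// from the previous one.
func validNewCheckpoint(c chain, prevFirst, prevLast, newFirst, newLast string) bool {
	return c.IsAncestor(newFirst, newLast) && // 1. b9 <- b10
		c.IsAncestor(prevFirst, prevLast) && // 3. b7 still <- b8
		c.IsAncestor(prevLast, newFirst) // 4. b8 <- b9
}

func main() {
	c := chain{"b8": "b7", "b9": "b8", "b10": "b9"}
	fmt.Println(validNewCheckpoint(c, "b7", "b8", "b9", "b10")) // true
}
```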

Contributor

What I was trying to say is that it's impossible for b8 to suddenly not be a descendant of b7 because of a reorg. Blocks can't just change their parents. I thought what you meant is that b7 is still on the main chain but b8 isn't.

Say we have this chain:

b7 - b8 - b9 - b10
   \
     b8' - b9' - b10' - b11'

if someone sends us a submission in [b9, b10], and we're looking for a previous checkpoint and find a candidate in [b7, b8], all we need to check is that b9 <- b10 and b8 <- b9 because b7 <- b8 will always hold. However, it is possible that the longest chain already ends at b11', so while b7 is still in play, b8 is orphaned.

That doesn't mean we can't accept the submission, since in theory we could switch back to the upper fork.

Comment on lines +184 to +187
// TODO: SaveEpochData and SaveSubmission should be done in one transaction.
// Not sure cosmos-sdk has facilities to do it.
// Otherwise it is possible to end up with a node which updated the submission
// list but did not save the submission itself.
Contributor

This will happen during transaction processing: Cosmos creates a doubly cached data store, where the 1st level of cache is for the whole block processing and the 2nd is for individual transactions, so they can be rolled back in case of failure and committed at the end of the block.


Contributor

But I think ideally the "keeper" should be responsible for maintaining the integrity of the data model, otherwise it doesn't really live up to its name.

Contributor Author

thanks for the links, I will go through them. In a perfect world I would use something like WriteBatch to ensure the write will be atomic, but I did not see anything like that in the sdk. I definitely intend to look for it at some point after completing the whole flow.

Contributor

Yeah they take care of that for you I think. The commit happens here, and this is where the cache is initiated, called from BeginBlock.

Comment on lines 253 to 256
if epochNum > 0 {
// this is a valid checkpoint for a non-initial epoch, we need to check the
// previous epoch's checkpoints
previousEpochData := m.k.GetEpochData(sdkCtx, epochNum-1)
Contributor

IIRC from @SebastianElvis PRs the Genesis block is epoch 0, and then block 1 is in epoch 1, so the first epoch probably won't have a predecessor.

Contributor Author

oh ok. So I will check for ancestry here only if epochNum > 1 👍

}

func (m *MsgInsertBTCSpvProof) ValidateBasic() error {
if _, err := sdk.AccAddressFromBech32(m.Submitter); err != nil {
address, err := sdk.AccAddressFromBech32(m.Submitter)
Contributor

So the submitter isn't coming from the Bitcoin transaction? That would mean that any hyena can steal someone's BTC submission. This is why I added the Registered Submitter entity to the schema, and an initial registration step where submitters can tell us where they want the rewards to go. I took it for granted that we would get the address to reward from the BTC transaction.

You commented it's difficult to get out. Why so?

Contributor Author

So apparently it is not really easy to get the sender of a bitcoin transaction, for multiple reasons. Relevant links:

That's why I kinda skipped this part of the data (i.e. save the tx to db but do not do anything special with it), and intend to return to it when working on rewards.

I also intend to research the topic a bit then. Maybe we can somehow restrict the transaction shape to make it easier to retrieve the sender, or add additional data to the message; I am not sure for now.

The submitter address is saved mainly to refund them for gas.

Comment on lines 71 to 73
var tBytes [][]byte
tBytes = append(tBytes, rsc.Proof1.TransactionBytes)
tBytes = append(tBytes, rsc.Proof2.TransactionBytes)
Contributor

I looked at this for a while to realise it's not appending the two transactions into one array.

Is there no shorthand syntax, like [][]byte { rsc.Proof1.TransactionBytes, rsc.Proof2.TransactionBytes }?
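For reference, Go does support exactly that composite-literal shorthand (this is a language fact, not code from the PR; the variable names mirror the quoted snippet):

```go
package main

import "fmt"

func main() {
	proof1 := []byte{0x01}
	proof2 := []byte{0x02}
	// Composite literal: both transactions in one slice-of-slices, no appends.
	tBytes := [][]byte{proof1, proof2}
	fmt.Println(len(tBytes)) // 2
}
```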

Contributor

@aakoshh aakoshh left a comment

Great, the logic is easy to follow!

I have only two blockers:

  • The first epoch will not have a previous one to look for.
  • The submitter field should only be used to tell Cosmos who should sign the Babylon transaction; these relayer guys are not the ones getting the compensation. That should be the submitter who paid for BTC. The relayer could get some Babylon so they can keep paying gas, but I have no idea from whom and how, or when we consider a duplicate submission spam.

Contributor

@aakoshh aakoshh left a comment

Great progress 👏

So we discussed that the submitter is not necessarily the one we want to reward for the BTC fees incurred, but fixing that is a matter for further research.

@KonradStaniec KonradStaniec merged commit 09b04f9 into main Jul 16, 2022
@KonradStaniec KonradStaniec deleted the btcheckpoint-schema branch July 16, 2022 06:06