
Btccheckpoint oracle schema and processing #43

Merged
merged 3 commits into main on Jul 16, 2022

Conversation

KonradStaniec
Contributor

@KonradStaniec KonradStaniec commented Jun 30, 2022

Adds the schema and full processing of the btccheckpoint message.
The schema looks as follows:
Each checkpoint submission can be uniquely identified by the hashes of the block headers in which its transactions are included and by the indexes of those transactions, therefore the key for each submission is:

TransactionKey := (BlockHash, TransactionIdx)
SubmissionKey := List[TransactionKey]

Such keys enable an easy check for duplicate submissions right after parsing the provided message.
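A minimal Go sketch of that duplicate check (all type and function names here are illustrative, not the actual Babylon types):

```go
package main

import "fmt"

// TransactionKey identifies one checkpoint transaction: the hash of the
// block that contains it plus its index within that block.
type TransactionKey struct {
	BlockHash      [32]byte
	TransactionIdx uint32
}

// SubmissionKey is the ordered list of transaction keys of one submission.
type SubmissionKey []TransactionKey

// storeKey renders the submission key deterministically so it can be used
// as a lookup key right after the message is parsed.
func storeKey(sk SubmissionKey) string {
	s := ""
	for _, tk := range sk {
		s += fmt.Sprintf("%x:%d;", tk.BlockHash, tk.TransactionIdx)
	}
	return s
}

// isDuplicate reports whether this exact submission was seen before.
func isDuplicate(seen map[string]bool, sk SubmissionKey) bool {
	return seen[storeKey(sk)]
}

func main() {
	seen := map[string]bool{}
	sub := SubmissionKey{{BlockHash: [32]byte{1}, TransactionIdx: 0}}
	fmt.Println(isDuplicate(seen, sub)) // false
	seen[storeKey(sub)] = true
	fmt.Println(isDuplicate(seen, sub)) // true
}
```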

Another requirement is easy access to the previous epoch's checkpoints. Given that each submission key is pretty small, i.e. currently around 80 bytes (2x 32-byte hash + 2x 4-byte index + some proto overhead), we can easily store a list of those in the database, i.e. have the mapping

epoch -> List[SubmissionKey]

Even a list of 100000 submissions would only take around 8 MB, which is still reasonable to hold as a single value in the store and to load into memory at once. More realistically, I don't expect even 1000 such unique submissions, as each submission represents a separate valid bitcoin transaction with the same checkpoint, just spread around different blocks/transactions. With such an encoding we can get all info about checkpoints from the previous epoch with just one db lookup. Another advantage of this encoding is that the order of the List[SubmissionKey] can represent the order of submissions.

There are also values we hold with each submitted checkpoint; for now this is mostly the data required to pay out the reward.

So full schema looks like:

DataTypes:
TransactionKey := (BlockHash, TransactionIdx)
SubmissionKey := List[TransactionKey]
SubmissionsValue := (BabylonAccAddress, List[BtcTransactions])

Mappings:
epoch -> List[SubmissionKey]
SubmissionKey -> SubmissionsValue

Currently there is no pruning of List[SubmissionKey], but it is easy to imagine an end-block handler which gets the latest epoch's submission keys, checks them against the light client, and prunes the ones which lost their validity.
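A sketch of what such an end-block pruning handler could do, assuming a hypothetical light-client validity query (none of these names are the real keeper API):

```go
package main

import "fmt"

// SubmissionKey stands in for the real key type; the light client here is a
// toy with a hardcoded validity map. Hypothetical names throughout.
type SubmissionKey string

type lightClient struct{ onMainChain map[SubmissionKey]bool }

func (lc lightClient) StillValid(k SubmissionKey) bool { return lc.onMainChain[k] }

// pruneEpoch keeps only the submission keys the light client still considers
// part of the canonical chain, as the end-block handler described above would.
func pruneEpoch(lc lightClient, keys []SubmissionKey) []SubmissionKey {
	var valid []SubmissionKey
	for _, k := range keys {
		if lc.StillValid(k) {
			valid = append(valid, k)
		}
	}
	return valid
}

func main() {
	lc := lightClient{onMainChain: map[SubmissionKey]bool{"a": true, "b": false}}
	fmt.Println(pruneEpoch(lc, []SubmissionKey{"a", "b"}))
}
```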

Comment:
For data structure encoding I am using protobuf. If we would like to use less space, some structures could be encoded by simple concatenation, i.e. blockHash1 || txIndex1 as big-endian bytes || blockHash2 || txIndex2 as big-endian bytes.
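The concatenation idea can be sketched as follows; each transaction key packs into a fixed 36 bytes (32-byte hash + 4-byte big-endian index) with no protobuf framing. This is purely illustrative, not code from the PR:

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// encodeTxKey encodes one transaction key as blockHash || txIndex
// (big endian), yielding exactly 36 bytes per key.
func encodeTxKey(blockHash [32]byte, txIdx uint32) []byte {
	out := make([]byte, 36)
	copy(out, blockHash[:])
	binary.BigEndian.PutUint32(out[32:], txIdx)
	return out
}

func main() {
	var h [32]byte
	h[0] = 0xab
	enc := encodeTxKey(h, 7)
	fmt.Println(len(enc)) // 36
}
```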

@KonradStaniec KonradStaniec marked this pull request as ready for review July 13, 2022 16:15
Member

@vitsalis vitsalis left a comment

Nice work! Some minor comments. Most of those involve the usage of the BTCHeaderBytes and BTCHeaderHashBytes custom objects in order to abstract the usage of wire.BlockHeader and chainhash.Hash objects. However, the type updates kind of fall outside the scope of this PR, so it is fine with me if we decide to implement this in another PR.

// Each submission can generally be identified by this list of (txIdx, blockHash)
// tuples.
// Note: this could possibly be optimized as if transactions were in one block
// there would have the same block hash and different indexes, but each blockshash
Member

Suggested change
// there would have the same block hash and different indexes, but each blockshash
// they would have the same block hash and different indexes, but each blockhash

// tuples.
// Note: this could possibly be optimized as if transactions were in one block
// there would have the same block hash and different indexes, but each blockshash
// is only 33 (1 bytes for prefix encoding and 32 byte hash), so there should
Member

Suggested change
// is only 33 (1 bytes for prefix encoding and 32 byte hash), so there should
// is only 33 (1 byte for prefix encoding and 32 byte hash), so there should

// TODO: Maybe it is worth recovering senders while processing the InsertProof
// message, and store only those. Another point is that it is not that simple
// to recover sender of btc tx.
repeated bytes btctransaction = 2;
Member

Maybe we can add a custom type for this. If we follow the "fortress" approach which is defined for BTCHeaderBytes and BTCHeaderHashBytes, then verification would happen in only one place, while we could efficiently display it as a hex.

Contributor Author

@KonradStaniec KonradStaniec Jul 14, 2022

tbh I am not sure this thing will ultimately be here. For now I keep it here to extract the btc sender later, but I am not sure it is such an easy task to do, so maybe some other approach will be needed (like having the btc sender in the message directly?).

That's why for now I would leave it like that and add the approach you mention in the future.

Contributor

@fishermanymc suggested once to put the Babylon account to be rewarded into the OP_RETURN, mainly to do away with having to register the BTC->BBL mapping up front. It felt to me like that would be using a limited resource for something we can get around by other means. But it's an option.

BlockHeader wire.BlockHeader
Transaction *btcutil.Tx
OpReturnData []byte
BlockHeader wire.BlockHeader
Member

In the btclightclient module, the concept of wire.BlockHeaders has been entirely abstracted by the BTCHeaderBytes object. I suggest that you use that here.

Contributor Author

sure.

I mean, it doesn't make much difference to me, and if those are the types you prefer at the light client level then let's go for it. Consider all following comments regarding those types solved 👍

Although at some point we may also need to discuss passing the whole header at the interface level, as a second source of headers for the btccheckpoint module. (imo this is not critical right now, so we can just use the hash)

OpReturnData []byte
BlockHeader wire.BlockHeader
// keeping header hash to avoid recomputing it everytime
BlockHash chainhash.Hash
Member

Same, but with the BTCHeaderHashBytes object. It provides methods for efficiently performing the important operations of a chainhash.Hash object, and even more.

// TODO: light client lost knowledge of blockhash from previous epoch
// (we know that this is not rawSub as we checked that earlier)
// It means either some bug / or fork had happened. For now just move
// forward as we are not able to establish ancestry here
Member

Maybe panic?

Contributor Author

as above, hmm, this is maybe a question about light client guarantees then. If the block was known to the light client in the past, will it be known forever in the future, even if it is not included on the main chain?

Contributor

We do plan to have a pruning depth to forget old blocks after all, just to keep that in mind.

Contributor Author

ok, so it is possible to forget a once-known block, i.e. I will leave the check in for now


lastHashFromSavedCheckpoint := hs[len(hs)-1]

// do not check err as all those hashes were checked in previous validation steps
Member

Maybe check err and panic if it is set?

Comment on lines 190 to 200
if ed == nil {
newEd := types.NewEpochData(subKey)
m.k.SaveEpochData(sdkCtx, epochNum, &newEd)
m.k.SaveSubmission(sdkCtx, subKey, subData)
} else {
// epoch data already existis for given epoch, append new submission, and save
// submission key and data
ed.AppendKey(subKey)
m.k.SaveEpochData(sdkCtx, epochNum, ed)
m.k.SaveSubmission(sdkCtx, subKey, subData)
}
Member

Suggested change
if ed == nil {
newEd := types.NewEpochData(subKey)
m.k.SaveEpochData(sdkCtx, epochNum, &newEd)
m.k.SaveSubmission(sdkCtx, subKey, subData)
} else {
// epoch data already existis for given epoch, append new submission, and save
// submission key and data
ed.AppendKey(subKey)
m.k.SaveEpochData(sdkCtx, epochNum, ed)
m.k.SaveSubmission(sdkCtx, subKey, subData)
}
if ed == nil {
newEd := types.NewEpochData(subKey)
ed = &newEd
} else {
// epoch data already exists for given epoch, append new submission, and save
// submission key and data
ed.AppendKey(subKey)
}
m.k.SaveEpochData(sdkCtx, epochNum, ed)
m.k.SaveSubmission(sdkCtx, subKey, subData)

// BlockHeight should validate if the header with the given hash is valid and
// if it is part of the known chain. If this is true it should return the
// block's height; otherwise it should return an error.
BlockHeight(header chainhash.Hash) (uint64, error)
Member

The btclightclient module does not have the concept of chainhash.Hash. Please use BTCHeaderHashBytes instead. You can get a BTCHeaderHashBytes from chainhash.Hash by using the NewBTCHeaderHashBytesFromChainhash() method, although I believe you can replace all usages of chainhash.Hash in the codebase with BTCHeaderHashBytes from what I've seen.


// IsAncestor should check if the parentHash header is a direct ancestor of the
// childHash header; if either of these headers is not known to the light
// client it should return an error.
IsAncestor(parentHash chainhash.Hash, childHash chainhash.Hash) (bool, error)
Member

Same here.

}

// TODO just return checkpoint if exists
func (k Keeper) GetCheckpoint(ctx sdk.Context, e uint64) []byte {
func (k Keeper) SaveSubmission(ctx sdk.Context, sk types.SubmissionKey, sd types.SubmissionData) {
Contributor

Just wondering if instead of having a separate SaveEpochData and SaveSubmission there should be one AddSubmission(epochNumber, submission) that calculates the keys on its own - it could even get the epoch number from the submission but I'm not sure what the state of your thinking is about who knows what about the op_return content.

That way I could not use public methods of the keeper to:

  • set submission data with the wrong key
  • insert submission data that doesn't get recorded in the epoch
  • insert submission keys in the epoch that don't exist
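One possible shape for the combined method the bullets argue for (an in-memory sketch with hypothetical names, not the actual keeper; the real one would write to the Cosmos KVStore):

```go
package main

import "fmt"

type SubmissionKey string
type SubmissionData struct{ Submitter string }

// Keeper models the two mappings from the schema:
// epoch -> List[SubmissionKey] and SubmissionKey -> SubmissionData.
type Keeper struct {
	epochs      map[uint64][]SubmissionKey
	submissions map[SubmissionKey]SubmissionData
}

func NewKeeper() *Keeper {
	return &Keeper{
		epochs:      map[uint64][]SubmissionKey{},
		submissions: map[SubmissionKey]SubmissionData{},
	}
}

// AddSubmission records the submission under its key and appends the key to
// the epoch's list in one call, so callers cannot store one without the other.
func (k *Keeper) AddSubmission(epoch uint64, sk SubmissionKey, sd SubmissionData) {
	k.epochs[epoch] = append(k.epochs[epoch], sk)
	k.submissions[sk] = sd
}

func main() {
	k := NewKeeper()
	k.AddSubmission(2, "sub-1", SubmissionData{Submitter: "bbl-addr"})
	fmt.Println(len(k.epochs[2]), len(k.submissions)) // 1 1
}
```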

Contributor Author

I was also thinking about it, and currently have a method checkAndSaveEpochData in msg_server with the todo // TODO maybe move it to keeper,
but I wanted to consider it after the pr which will use the hooks from epoching and lightclient to manipulate submission data in some way. I imagine that SaveSubmission will come in handy on its own then, but we will see.

Comment on lines +155 to +159
if !m.checkHashesAreAncestors(hs) {
// checkpoint blockhashes no longer form a chain. Cannot check ancestry
// with new submission. Move to the next checkpoint
continue
}
Contributor

This should have been checked when the previous submission was admitted, no?

Contributor Author

yup, although we are checking if this still holds true, i.e. if some rollback happened which makes the checkpoint basically invalid and the two blocks ended up on different forks.

Not sure what the probability of this happening is, imo really low, but at least we should log it / be aware of it somehow. (logging will be added in a separate pr)

Contributor

Hm, so the checkpoint has blocks that used to be in an ancestor-descendant relationship - that can't change during the reorg 🤔

You are right that a checkpoint could be "torn", with the first block still on the main chain and the second no longer on it. But that just means that if the current block wants to use this as its previous checkpoint, it still has to be a descendant of the second block.

Contributor Author

not sure I understand, but as I see it the whole chain of checkpoints needs to be valid, i.e. if we receive a submission with blocks b9 and b10 then:

  1. we first check that b10 is a child of b9
  2. retrieve the checkpoint for the previous epoch. Let's say we got a submission with blocks b7 and b8
  3. we check that b7 is still an ancestor of b8. It was when we admitted it to the db; we just check that it did not change in the meantime due to some reorgs etc. Ultimately, I am not sure how probable this is or how to best handle it; for now I just stop treating such a checkpoint as valid. Probably the checkpointing module should be informed about it, but I am still thinking it through.
  4. Now that we have established that b9 and b10 have an ancestry relationship, and that b7 and b8 still have an ancestry relationship, we only check that b8 is an ancestor of b9 (which is basically this check: anc, err := m.k.IsAncestor(*lastHashFromSavedCheckpoint, rawSub.GetFirstBlockHash()))

Imo at some point I will revisit point 3 to be sure how to best handle it.
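The checks in steps 1, 3 and 4 can be sketched against a toy ancestry oracle (the real one is the btclightclient keeper; all names here are illustrative):

```go
package main

import "fmt"

// chain maps each block to its parent; IsAncestor(parent, child) walks up
// from child until it reaches parent or runs out of known blocks.
type chain map[string]string

func (c chain) IsAncestor(parent, child string) bool {
	for p, ok := c[child]; ok; p, ok = c[p] {
		if p == parent {
			return true
		}
	}
	return false
}

// validNewCheckpoint runs steps 1, 3 and 4 from the comment: the new pair is
// a chain, the previous pair is still a chain, and the new pair descends
// from the previous one.
func validNewCheckpoint(c chain, prevFirst, prevLast, newFirst, newLast string) bool {
	return c.IsAncestor(newFirst, newLast) && // 1. b9 <- b10
		c.IsAncestor(prevFirst, prevLast) && // 3. b7 still <- b8
		c.IsAncestor(prevLast, newFirst) // 4. b8 <- b9
}

func main() {
	c := chain{"b8": "b7", "b9": "b8", "b10": "b9"}
	fmt.Println(validNewCheckpoint(c, "b7", "b8", "b9", "b10")) // true
}
```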

Contributor

What I was trying to say is that it's impossible for b8 to suddenly not be a descendant of b7 because of a reorg. Blocks can't just change their parents. I thought what you meant is that b7 is still on the main chain but b8 isn't.

Say we have this chain:

b7 - b8 - b9 - b10
   \
     b8' - b9' - b10' - b11'

if someone sends us a submission in [b9, b10], and we're looking for a previous checkpoint and find a candidate in [b7, b8], all we need to check is that b9 <- b10 and b8 <- b9 because b7 <- b8 will always hold. However, it is possible that the longest chain already ends at b11', so while b7 is still in play, b8 is orphaned.

That doesn't mean we can't accept the submission, since in theory we could switch back to the upper fork.

Comment on lines +184 to +187
// TODO: SaveEpochData and SaveSubmission should be done in one transaction.
// Not sure cosmos-sdk has facilities to do it.
// Otherwise it is possible to end up with a node which updated the submission
// list but did not save the submission itself.
Contributor

This will happen during transaction processing: Cosmos creates a doubly cached data store, where the 1st level of cache is for the whole block processing and the 2nd is for individual transactions, so they can be rolled back in case of failure and committed at the end of the block.


Contributor

But I think ideally the "keeper" should be responsible for maintaining the integrity of the data model, otherwise it doesn't really live up to its name.

Contributor Author

thanks for the links, I will go through them. In a perfect world I would use something like WriteBatch to ensure the write will be atomic, but I did not see anything like that in the sdk. I definitely intend to look for it at some point after completing the whole flow.

Contributor

Yeah they take care of that for you I think. The commit happens here, and this is where the cache is initiated, called from BeginBlock.

Comment on lines 253 to 256
if epochNum > 0 {
// this is a valid checkpoint for a non-initial epoch, we need to check the
// previous epoch's checkpoints
previousEpochData := m.k.GetEpochData(sdkCtx, epochNum-1)
Contributor

IIRC from @SebastianElvis PRs the Genesis block is epoch 0, and then block 1 is in epoch 1, so the first epoch probably won't have a predecessor.

Contributor Author

oh ok. So I will check for ancestry here only if epochNum > 1 👍

}

func (m *MsgInsertBTCSpvProof) ValidateBasic() error {
if _, err := sdk.AccAddressFromBech32(m.Submitter); err != nil {
address, err := sdk.AccAddressFromBech32(m.Submitter)
Contributor

So the submitter isn't coming from the Bitcoin transaction? That would mean that any hyena can steal someone's BTC submission. This is why I added the Registered Submitter entity to the schema, and an initial registration step where submitters can tell us where they want the rewards to go. I took it for granted that we would get the address to reward from the BTC transaction.

You commented it's difficult to get out. Why so?

Contributor Author

So apparently it is not really easy to get the sender of a bitcoin transaction, for multiple reasons. Relevant links:

That's why I kinda skipped this part of the data (i.e. save the tx to db but do not do anything special with it), and intend to return to it when working on rewards.

I also intend to research the topic a bit then. Maybe we can somehow restrict the transaction shape to make it easier to retrieve the sender, or add additional data to the message; I am not sure for now.

The submitter address is saved mainly to refund them for gas.

Comment on lines 71 to 73
var tBytes [][]byte
tBytes = append(tBytes, rsc.Proof1.TransactionBytes)
tBytes = append(tBytes, rsc.Proof2.TransactionBytes)
Contributor

I looked at this for a while to realise it's not appending the two transactions into one array.

Is there no shorthand syntax, like [][]byte { rsc.Proof1.TransactionBytes, rsc.Proof2.TransactionBytes }?
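For reference, Go does support exactly that composite-literal shorthand (this is a language fact, not code from the PR; the variable names mirror the quoted snippet):

```go
package main

import "fmt"

func main() {
	proof1 := []byte{0x01}
	proof2 := []byte{0x02}
	// Composite literal: both transactions in one slice-of-slices, no appends.
	tBytes := [][]byte{proof1, proof2}
	fmt.Println(len(tBytes)) // 2
}
```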

Contributor

@aakoshh aakoshh left a comment

Great, the logic is easy to follow!

I have only two blockers:

  • The first epoch will not have a previous one to look for.
  • The submitter field should only be used to tell Cosmos who should sign the Babylon transaction; these relayer guys are not the ones getting the compensation. That should be the submitter who paid for BTC. The relayer could get some Babylon so they can keep paying gas, but I have no idea from whom and how, or when we consider a duplicate submission spam.

Contributor

@aakoshh aakoshh left a comment

Great progress 👏

So we discussed that the submitter is not necessarily the one we want to reward for the BTC fees incurred, but fixing that is a matter for further research.

@KonradStaniec KonradStaniec merged commit 09b04f9 into main Jul 16, 2022
@KonradStaniec KonradStaniec deleted the btcheckpoint-schema branch July 16, 2022 06:06