Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

zoneconcierge: replace IBC-Go with Babylon's fork, and implement DB schemas #184

Merged
merged 13 commits into from
Nov 7, 2022

Conversation

SebastianElvis
Copy link
Member

@SebastianElvis SebastianElvis commented Nov 1, 2022

Fixes BM-250

This PR implements the DB schema for ZoneConcierge, and replaces the IBC-Go dependency with our fork.

The verification logics of these CRUD operations will be done in the msg server, and will be implemented in subsequent PRs

Proposed DB schema

ZoneConcierge aims at maintaining all indexed headers that have a valid quorum certificate for all CZs. Each IndexedHeader only includes the header's position on Babylon's ledger, and the header's position on CZ ledger, so that one can efficiently generate inclusion proofs on Babylon for it and find it on CZ ledger. In addition we need to store all forks that have valid QCs.

In light of this, this PR proposes the following DB schema. The IndexedHeader is defines as follows:

// IndexedHeader is the metadata of a CZ header
message IndexedHeader {
  // chain_id is the ID of the chain
  string chain_id = 1;
  // hash is the hash of this header
  bytes hash = 2;
  // height is the height of this header on CZ ledger
  // (hash, height) jointly provides the position of the header on CZ ledger
  uint64 height = 3;
  // babylon_block_height is the height of the Babylon block that includes this header
  uint64 babylon_block_height = 4;
  // babylon_tx_hash is the hash of the tx that includes this header
  // (babylon_block_height, babylon_tx_hash) jointly provides the position of the header on Babylon ledger
  bytes babylon_tx_hash = 5;
}

Then we have the following indices:

  • canonical chain index:
    • prefix: chain ID
    • key: height
    • value: IndexedHeader
  • fork index:
    • prefix: chain ID
    • key: height
    • value: a list of IndexedHeader
  • chain info index:
    • key: chain ID
    • value: chain info, which includes the latest canonical header and the latest fork

@SebastianElvis SebastianElvis marked this pull request as ready for review November 2, 2022 06:37
// chain_id is the ID of the chain
string chain_id = 1;
// blocks is the list of blocks on this fork
repeated IndexedHeader blocks = 3;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am probably missing how this will be used, so question here, do you known how large those forks will be ? Is canonical chain also a fork, so this would end up with all CZ blocks from the genesis ?

Also maybe some doc would be good explaining what exactly is this Fork, as vanilla tendermint does not have forks unless there are more that 1/3 malicious nodes,so I assume this is this kind of malicious forks ?

Copy link
Member Author

@SebastianElvis SebastianElvis Nov 2, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for asking. I will add more doc on these message formats.

You are right that Fork represents a malicious fork. Each Fork object includes a list of IndexedBlock, starting from the diverging block to the end of the fork. In KVStore (more specifically, fork_indexer.go), the key is the height of the diverging block and the value is a Fork object.

For example, assuming the following blockchain:

A <- B <- C <- D <- E
          \ -- D1 <- E1

Then the fork will be {chain_id, [C, D1, E1]} where each item is in struct IndexedBlock. The correspopnding KVStore entry will be 2 -> {chain_id, [C, D1, E1]}.

You are also right that normally Tendermint / Cosmos SDK chains do not have forks under less than 1/3 maclicious voting power. Here we are considering CZs might have dishonest majority, thus forks with valid quorum certificates can exist. The ZoneConcierge module aims at indexing all headers with valid quorum certificates, and its underlying Tendermint layer stores the actual block headers. Babylon receives these headers via IBC, and our IBC-Go fork will be modified to not stop working upon forks with valid quorum certificates.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After some discussions with @fishermanymc, it looks like the current Fork struct is not compatible with our IBC-Go implementation. IBC-Go does not store fork headers and the QC of a non-first header on a fork cannot be verified due to the absence of the first fork header. Consequently, ZoneConcierge can only store the first header on a fork, but not the later ones on this fork.

Thus, I propose we change the meaning of Fork. Instead of representing all headers on a fork, we use Fork to denote all fork headers at the same height. For example, consider the following blockchain

A <- B <- C <- D <- E
          \ -- D1
          \ -- D2

Then Fork will be {chain_id, [D1, D2]}, and the corresponding KVStore will be 3 -> {chain_id, [D1, D2]}. How do you think about this proposal?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general looks good to me. (and I like new docs 👍 )

One more question I have, is this possible for this adversarial majority to create enough of this forked headers that of Fork data structure will be large enough that it will not fit in working memory, like 1M of headers ? or is it not practical as each header need to be trasported by relayer to babylon ? (so feed needs to be paid etc.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should be possible (the Fork data structure is in KVstore not memory though). In fact such attack can exploit any of the KVStore, simply by putting as many things in a certain KVStore as possible. You are right that exploiting Fork may not be optimal, depending on the amount of data that can be spammed by using a unit of tx fee.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, Forks is stored in kvstore. What I am worried a bit is attack where a lot of idexed headers will be created at some height and then GetForks function will be called which will read all of those indexed headers into memory.

But from what I see indexed header is small enough ( < 100bytes) so probably this is not practical in any way. i.e someone would need to create over 10m fork headers and certain height and then transport them over ibc to do it.

@fishermanymc
Copy link

fishermanymc commented Nov 3, 2022 via email

@SebastianElvis
Copy link
Member Author

why chain_id is needed in 3 -> (chain_id, [D1, D2])?

It's not strictly needed, since each IndexedHeader also carries it. Will remove 👍

@KonradStaniec KonradStaniec changed the base branch from main to dev November 5, 2022 08:17
Copy link
Contributor

@gitferry gitferry left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work! Two major concerns:

  1. In my understanding, ibc uses client_id other than chain_id to identify the counterpart. See here. Should we keep it consistent?
  2. I see that most of the CURD operations have no errors returned and just panic upon errors. Would that cause any safety problems? Say one calls the APIs maliciously causing the node to panic

// IndexedHeader is the metadata of a CZ header
message IndexedHeader {
// chain_id is the unique ID of the chain
string chain_id = 1;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we use chain_id rather than client_id? I thought the latter is used more often to identify the counterpart

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From my understanding Chain ID can uniquely identify a blockchain, while client ID is somewhat one-time identifier for chain clients. Our ZoneConcierge module maintains the index of headers/forks for each chain, so probably we need to use chain ID. Wdyt?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have strong argument against using chain_id. They may not make any difference. I just thought IBC assigns unique client ids to each chain that it connects, even if two chains have the same chain_id. Your call.

// Such forks exist since Babylon considers CZs might have dishonest majority.
// Also note that the IBC-Go implementation will only consider the first header in a fork valid, since
// the subsequent headers cannot be verified without knowing the validator set in the previous header.
message Fork {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to the description, it should be Forks?

}

// ChainInfo is the information of a CZ
message ChainInfo {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have the feeling that ibc cares more about Client but the relayer cares more about Chain. Is that true? Should we use ClientInfo here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar as above comment, each chain has a unique chain ID that is queryable online (e.g., here), but can have different client IDs in IBC connections with different zones.

// ChainInfo is the information of a CZ
message ChainInfo {
// chain_id is the ID of the chain
string chain_id = 1;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dito


func (k Keeper) InsertHeader(ctx sdk.Context, chainID string, header *types.IndexedHeader) {
store := k.canonicalChainStore(ctx, chainID)
store.Set(sdk.Uint64ToBigEndian(header.Height), k.cdc.MustMarshal(header))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it mean we only store one header at a specific height?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the canonical chain store, yep


func (k Keeper) setChainInfo(ctx sdk.Context, chainInfo *types.ChainInfo) {
store := k.chainInfoStore(ctx)
store.Set([]byte(chainInfo.ChainId), k.cdc.MustMarshal(chainInfo))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If marshalling chainInfo fails, it will just panic? Is it safe to panic here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should be safe. This function is only used internally by the module. If the marshal attempt is wrong, then it can only be programming errors (e.g., failed to configure the marshaler, etc..)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Technically it is true, but I think maybe it's better to keep an error as the return if we are writing something to the DB because we maybe need to do some checks before writing. Leaving an error return makes it easier for us to add more checks in the future.

chainInfo.LatestHeader = header
}
k.setChainInfo(ctx, chainInfo)
return nil
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems that there will never be any errors returned

}
chainInfo.LatestFork = fork
k.setChainInfo(ctx, chainInfo)
return nil
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dito

store := k.forkStore(ctx, chainID)
forkBytes := store.Get(sdk.Uint64ToBigEndian(height))
if len(forkBytes) == 0 {
return nil
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same thing here, shouldn't it return an error?

}

// InsertForkHeader inserts a forked header to the list of forked headers at the same height
func (k Keeper) InsertForkHeader(ctx sdk.Context, chainID string, header *types.IndexedHeader) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here, no errors returned?

@SebastianElvis
Copy link
Member Author

Thanks for the review Gai! I agree that the some of the CRUD functions should return errors. Initially I considered making all sanity checks in the msg server but returning errors here gave us more flexibility,. I have added error types and made the functions to handle errors when necessary. Pls feel free to have a look again :)

Copy link
Contributor

@gitferry gitferry left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the updates. Only one thing from my side. Please use store.Has to check existence because store.Get will return nil for non-existing items.

func (k Keeper) InitChainInfo(ctx sdk.Context, chainID string, latestHeader *types.IndexedHeader) error {
store := k.chainInfoStore(ctx)
// the chain info should not exist at this point
if chainInfoBytes := store.Get([]byte(chainID)); len(chainInfoBytes) > 0 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems that store.Get returns nill if the relevant value does not exist. I would suggest that we first use store.Hash to check its existence.


func (k Keeper) GetChainInfo(ctx sdk.Context, chainID string) (*types.ChainInfo, error) {
store := k.chainInfoStore(ctx)
chainInfoBytes := store.Get([]byte(chainID))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here. Use store.Has to check its existence first

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. Didn't know this API before 👍 will update

// GetForks returns a list of forked headers at a given height
func (k Keeper) GetForks(ctx sdk.Context, chainID string, height uint64) *types.Forks {
store := k.forkStore(ctx, chainID)
forksBytes := store.Get(sdk.Uint64ToBigEndian(height))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some here. Use store.Has to check its existence

@SebastianElvis SebastianElvis merged commit 2ad72d4 into dev Nov 7, 2022
@SebastianElvis SebastianElvis deleted the replace-ibc-go branch November 7, 2022 21:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants