draft: bridge analysis #15
I'll review focused on the content and structure. I think an editing pass is
needed, but if the structure is to change a lot, that would be premature
optimization.
First question: do you have plans to expand on this, or is this a full draft?
It would be good to take existing bridges and classify them using the proposed
taxonomy.
I think the structure is not entirely clear. We should focus on the taxonomy of
bridges first, then for each kind of bridge, look at the various criteria you
detailed (I would perhaps call them "considerations" rather than "metrics", they
can't really be measured most of the time and are pretty subjective). We should
discuss the characteristics there, the tradeoffs, and give examples from
existing bridges where applicable.
Schematically, here's the classification I'd propose based on our earlier call:
- Base forward bridges
  - native bridges
  - light client bridges
  - validator bridges
  - economic security optimistic bridges
- Overlay bridges
  - reverse bridges (for permissionless actions)
  - destination-chain challenges optimistic bridges
I'm renaming "action request bridge" to "reverse bridge" here, lmk if you think
that makes more sense.
I like writing to organize my thoughts, so here's a small writeup that I made to
sort out my thoughts on the above classification. Feel free to reuse all or part
of that:
{start writeup}
There are fundamentally two kinds of bridges, which differ based on the action
to be performed on the destination chain:
- Permissioned action: The action on the destination bridge must ascertain that
  another action occurred on the source bridge.
- Permissionless action: The action on the destination bridge is permissionless,
  no need to check anything.
Using the token transfer use case, the simplest possible bridge is the one where
Alice sends tokens to Bob on the source chain, and Bob sends tokens to Alice on
the destination chain. This is safe for Bob, who always receives the tokens on
the source chain, but unsafe for Alice, who has to trust that Bob will send the
tokens on the destination chain.
As a result, there is a need to prove on the source chain that the token
transfer (in general: the permissionless action) occurred on the destination
chain.
Therefore, executing permissionless actions safely requires the existence of a
bridge from the destination to the source chain.
Let's call bridges for permissioned actions a "forward bridge" and a bridge for
permissionless actions a "reverse bridge". Reverse bridges rely on the
existence of a forward bridge in the reverse direction (destination to source).
(Note: "reverse bridges" is what I called "action-request bridges" before. I
think the new term might be a bit clearer?)
Here we focus on forward bridges, and will talk about reverse bridges later.
(Other document / PR / task.)
The core of a forward bridge is relaying some kind of data from the source chain
to the destination chain. This could be just a binary acknowledgement that an
action (e.g. token transfer) occurred, or arbitrary data, which can even be
executed as a transaction on the destination chain.
Usually a distinction is made between "arbitrary message passing bridges" (AMB)
and asset bridges, but once we peel back the distinction between forward and
reverse bridges, the distinction doesn't make much sense — all the
characteristics are the same no matter what kind of data is transmitted. We'll
use the term "message" to designate this data.
Forward bridges can be:
- Native bridges: this is what rollups have, where nodes on the destination
  chain run a node of the source chain. Using ZK we can alleviate the need to
  re-execute and validate blocks, but it is still necessary to track a source of
  truth for the chain inputs (at the very least some kind of
  commitment/blockhash to the data availability layer).
- Light client bridges: where either nodes of the destination chain, or contracts
  deployed on that chain, verify the consensus of the source chain. These could
  use normal compute or zk proofs.
- Validator bridges: messages are signed by a committee of validators (this can
  be more or less decentralized, from a small multisig to a full-blown
  decentralized blockchain).
- Optimistic bridges: messages are relayed by a permissioned set of actors, but
  can be challenged by anyone. Fraud can be established on the source chain by
  posting an incorrectly relayed message signed by a permissioned relayer. Safety
  on the destination chain relies on the existence of another safe forward
  bridge from source to destination to relay the challenge. Or, in the absence
  of this, the optimistic bridge only provides economic security (the faulty
  relayer can be slashed on the source chain).
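
To make the "fraud can be established on the source chain" step concrete, here is a minimal sketch. Everything in it is an assumption for illustration: the `SourceChain` class, the nonce-indexed outbox, the bond bookkeeping, and the use of an HMAC as a stand-in for a real relayer signature. It only shows the shape of the check: a message that carries the relayer's signature but differs from what the source chain actually sent is grounds for slashing.

```python
import hashlib
import hmac


def sign(relayer_key: bytes, nonce: int, payload: bytes) -> bytes:
    # Hypothetical stand-in for a relayer signature (HMAC with a shared secret).
    # A real bridge would use a publicly verifiable signature scheme.
    data = nonce.to_bytes(8, "big") + payload
    return hmac.new(relayer_key, data, hashlib.sha256).digest()


class SourceChain:
    """Toy source chain: holds the true outbox and the relayers' bonds."""

    def __init__(self):
        self.outbox = {}  # nonce -> payload actually sent
        self.bonds = {}   # relayer id -> bonded amount
        self.keys = {}    # relayer id -> key used to check its signatures

    def register_relayer(self, relayer, key, bond):
        self.keys[relayer] = key
        self.bonds[relayer] = bond

    def send(self, nonce, payload):
        self.outbox[nonce] = payload

    def prove_fraud(self, relayer, nonce, payload, signature):
        """Slash the relayer if it signed a message that is not in the outbox."""
        expected = sign(self.keys[relayer], nonce, payload)
        if not hmac.compare_digest(expected, signature):
            return False  # not signed by this relayer, so nothing is proven
        if self.outbox.get(nonce) == payload:
            return False  # the message was relayed correctly
        self.bonds[relayer] = 0  # fraud established: slash the whole bond
        return True
```

Note that this only punishes the relayer on the source chain. Stopping the bad message on the destination chain still needs some way to carry the result across, which is the point made in the last bullet above.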
Because optimistic bridges either rely on other bridges or only provide economic
security, they are qualitatively different from the other forward bridges.
We can arrive at the following classification:
- Base forward bridges
  - native bridges
  - light client bridges
  - validator bridges
  - economic security optimistic bridges
- Overlay bridges
  - reverse bridges (for permissionless actions)
  - destination-chain challenges optimistic bridges
{end writeup}
Finally, I'd de-emphasize messaging formats, I think that's better left for
another section (there are other related questions).
Regarding push/pull distinction, I think that'd fit better as a note of the
advantage of native bridges: they don't need dedicated relayers.
I was a bit confused about the "admin rights" parts. One, it doesn't really feel
fundamental (if you relay a message, you can just check a signature there for
instance, or limit what the bridge can do); two, it's quite specific to that
scenario (though the concern generalizes, it's just safety in general).
I think "keeping safe owners in sync" is maybe too complex of an example,
because it invites concerns that are specific to that use case. e.g. what if
someone does an action using the old key after the key has been updated on the
source chain, but not yet updated on the destination chain.
Thanks for your input. For now, it was just a draft to especially get your and others' feedback. Let me try to address your points. I like the distinction and naming for forward bridges and reverse bridges. However, it is a bit complex to understand the origin of reverse bridging if one doesn't know the explanation. In this analysis, I actually would only care about forward bridges first, and continue to reverse bridges later as it mostly serves as an optimization for forward bridges for specific use cases.
I will also address analyzing and classifying existing bridges. But I think it makes sense to align on the taxonomy first and then move on to categorizing existing bridges.
What is a destination bridge and source bridge? Or do you mean source and destination chain?
> Let's call bridges for permissioned actions a "forward bridge" and a bridge for

I find this explanation a bit difficult to understand because it makes some implications which might not be known to every reader. From reading it, it is unclear why there is a need to prove on the source chain that the token transfer happened on the destination chain. The proof is typically needed by a protocol which Alice is using instead of directly sending the tokens to Bob in the first place. This protocol (the bridge) typically depends on the outcome of a forward bridge to distribute Alice's tokens to the rightful owner (in this case Bob). I would try to explain more why we need a proof on the source chain in the first place. More philosophically, reverse bridges only need the "reverse" part because the actions are triggered on another (source) chain. That means that incentivization for relayers/executors typically also happens on the source chain. In order to claim the incentive, the relayer has to prove his service on the source chain, and thus needs a forward bridge in the reverse direction.
In my draft I tried to be even a bit more specific (or primitive) by defining that the goal is not relaying data but rather proving the state of another chain. I think that the "data" we want to relay is actually the state of the other chain (or part of it). Everything else doesn't make sense. Everything that is not part of the state (so it's off-chain data) could be sent directly to the destination chain, so what we want to send must be part of the state. Let me know what you think about this.
IMHO AMBs and forward bridges are the same just different naming. Asset bridges are a use case specific bridge which could be a forward bridge or reverse bridge. So it looks to me that we try to define more general terms here.
In general, I agree with the categorization, but I would like to revisit the definitions because I think implicit judgments were made that apply to every bridge (or to most bridges). Let me try to express my thoughts:
I think it is important that we do not confuse bridge architectures and design parameters which are applicable to all (or most) bridge architectures and thus more orthogonal to it rather than part of an architecture.
Regarding message formats and the given example: the message has a net worth of 1B USD.
- Light client: The source chain validators equivocate and create a malicious block where the money goes to the validators. Slashing due to equivocation would have to be less than 1B USD. Now theoretically the system is not incentive-aligned anymore.
- Validators: Similar to the above.
- Optimistic bridge: Challengers get bribed to not challenge.
Note that even if the system is not incentive-aligned in this particular action, it might be more reasonable to not cheat if you take future income into account. However, I'm happy to discuss a better description of the actual message type (arbitrary data) and explain the impact of the data's indirect value in regard to the metrics (or considerations) evaluation.
After your answer, I will try to merge the results of our discussion
On another note, I think that there are a lot of deep, unknown waters behind the statement
Totally true, open to suggestions here.
I think it might be good to mention / explain them early, in the sense that if we're trying to do a taxonomy, we can say: "there are 2 kinds of bridges", and then focus on the forward bridges first. Otherwise, whatever boundary we draw, people will be screaming in their heads "they forgot this or that".
Yes, my bad!
Yes, we should expand this to explain the presence of a protocol, that would make more sense. We can build on top of this simple scenario, this was my intent, but it's too implicit. So in this simplest scenario, Alice needs to trust Bob. Instead we replace this with a protocol that is essentially an escrow, and Bob needs to bridge in the reverse direction to prove he sent the funds on the destination chain, and unlock the escrow.
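
A rough sketch of that escrow flow, to check we mean the same thing. All names here (`ReverseDirectionBridge`, `Escrow`, `lock`, `claim`) are hypothetical, and the bridge is reduced to a set of facts of the form "this party paid out transfer X on the destination chain". How those facts get relayed safely is exactly the forward-bridge question and is not modeled.

```python
class ReverseDirectionBridge:
    """Hypothetical destination -> source bridge.

    It only records which (transfer_id, payer) pairs are known, on the
    source chain, to have been paid out on the destination chain.
    """

    def __init__(self):
        self._confirmed = set()

    def relay(self, transfer_id, payer):
        self._confirmed.add((transfer_id, payer))

    def is_confirmed(self, transfer_id, payer):
        return (transfer_id, payer) in self._confirmed


class Escrow:
    """Toy escrow on the source chain holding Alice's tokens."""

    def __init__(self, bridge):
        self.bridge = bridge
        self.locked = {}  # transfer_id -> amount locked by the sender

    def lock(self, transfer_id, amount):
        # Alice locks her tokens instead of sending them to Bob directly.
        self.locked[transfer_id] = amount

    def claim(self, transfer_id, claimant):
        # Bob can only unlock the escrow once the bridge confirms that he
        # paid on the destination chain.
        if not self.bridge.is_confirmed(transfer_id, claimant):
            raise PermissionError("no proof of payment on the destination chain")
        return self.locked.pop(transfer_id)
```

With this, Alice no longer trusts Bob; she trusts the escrow and whatever bridge backs `is_confirmed`.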
Agreed. On the other hand, being too abstract is confusing. I think in this case it might be better to start with the example then generalize? We could also start with the generalization and give the example, but my intuition is it would flow less nicely in this case.
In the abstract this is true, but it can be confusing as it suggests a model where state roots are being relayed and then things proven against them via Merkle proofs (or zk proofs of inclusion). This is undesirable on rollups because of high calldata costs. I guess we can make this point. I'm not entirely sure it's important (as you say, it doesn't really make sense to bridge something that isn't in the state, though I'll say calldata can also be relevant here!) but it doesn't hurt.
I agree with what you wrote there, but my point was that there is a design of economic bridge that is ONLY secured by economic incentives. i.e. a relayer can lie and the action will go through with no reversal, but if that happens the relayer will be slashed. This design can work when your design doesn't allow the relayer to profit much more than your slashing (by putting limits in place). That's why I would single out these optimistic bridges. Other designs can add economic security in the form of slashing, but they have other means of ensuring security (decentralization, proofs, etc).
I don't really see how this can work without a forward bridge, except in the economic-security-only scenario (in that scenario, fraudulent messages are executed). If you want to prevent the fraudulent message from being executed, you'll need to relay from the source chain (where the challenge can be conducted, on the basis of signatures) to the destination chain. I got a bit lost in the sauce of your explanation on how to sidestep this issue with permissionless validators. I'm not quite clear on how this system would work. Without another bridge from the source chain, it's impossible to tell if a challenge is valid or not. And whatever "authority" (say a majority vote) you select to say if a challenge is valid is de facto a forward bridge (albeit it might be a very weird/unusual/slow one). This is what Across does with UMA as its challenge validator set. UMA is not generally understood as a bridge, but it fulfills this role in this case.
I'm not sure this is an issue: The existing validator set can sign off on upgrades, whether adding/removing members from the validator set, or updating a protocol for a light client.
Totally, let's just talk of arbitrary data, and then mention that this data can be calldata that will be reinterpreted as a transaction on the destination chain.
I think that example with how that looks for the different bridges is pretty good, let's include that somewhere!
👍
I agree. Let's start with an example and then generalize it.
Wow 🤯 I never thought that merklizing messages into a Merkle tree could actually be more expensive than just forwarding message by message in the case of rollups. But you are completely right. Having realized that, I suggest that we stick to the example of sending data (or messages) one by one. However, in the case of light client bridges, we have a state tree by default, right? So we kind of have to go with that. Nevertheless, it's not that crucial to the analysis I think, so I'm happy to be more specific here and less abstract, and to call it data or messages. I will add another comment regarding the second part.
I have thought about how I could explain a bit better what I'm concerned about when making design choices on one type of bridge but not on others. I understand that we have to make a tradeoff between being very abstract and providing examples, but what I would really like to avoid is that we imply design choices for one type of bridge but not for others and therefore inevitably skew the analysis results. Some examples:
Challenging
I noticed that in your arguments you describe that challenges happen on the source chain. I think this is a fundamental flaw. Challenges must happen on the destination chain, because this is the place where a claim or proposal was made. Please note that I'm not saying that this is also the place where the challenge can be resolved, and that's why I would like to introduce the notions of challenging and slashing (or resolving).
Definition of Permissioned
In your example, you described an optimistic bridge where there is a permissioned set of relayers. I would like to mention that this is again already a design choice, which influences the analysis. It could very well be a permissionless set of relayers. I think you mentioned permissioned because we need to make sure that relayers have something at stake. And the stake must be stored where the source of truth is, and that's the source chain. Ergo, the target chain cannot allow new messages from just any relayer; it must be an address whitelisted by the target chain bridge contracts, to make sure that the address has staked a security deposit on the source chain.
In general, I think it is completely solid to categorize forward bridges in
I see two sub-categories in these four.
Why do I make this distinction? In the simplest form, one could argue that the former verifies the state ON the destination chain, while the latter can only verify the state and also misbehavior on the source chain. Building reverse bridges like the ones you mentioned (i.e.
I will now try to make my point on why I'm worried that we make special design choices for one type of architecture and not for the other, even though we could, and thus do not only assess the architecture but rather the parameters we chose. I will try to explain it by comparing validator and optimistic-based bridges. I will start with the isolated mechanism in an abstract manner, completely trust-based, with few to no design choices made. I will then continue to apply an economic model to both. Based on applying parameters to both architectures, you will see the fundamental differences between the architectures. In the end, I will add additional considerations which are outside of the fundamental economic mechanisms of the architectures but influence the economic design in the long term and are worth mentioning.
Comparison
Let's assume the following situation. We have chain A, also named source chain, and chain B, also named target/destination chain. There is a need to send messages from chain A to chain B, where both chains do not know the state of the other chain at all. Technically, you can imagine an inbox where messages are registered and an outbox on the target chain where messages are incoming. Messages are sequentially ordered by nonces.
Validator bridges
In its very essence, a validator bridge has two different actors. Relayers forward messages to the target chain. Validators have the authority to validate the message. (Note that in reality a lot of validator bridges combine the relayer role and the validator duty into one actor, but technically they are two different actions.)
Only after a message is validated by the validators is the message considered correctly delivered, and it can then be used by the target chain (either it's an encoded execution and it gets executed, or another contract can read the data and do with it whatever it wants). If it's not validated, the message is considered non-delivered (basically as if it weren't relayed at all). I will make the following assumption: in order to validate a message, 2/3 of the validators need to authorize the message by signing it.
Optimistic bridges
Optimistic bridges also have two actors. Again, the relayer, who forwards any message to the target chain. And the challenger, who will be able to challenge the integrity of the message. If a message is not challenged, then it is considered correctly delivered after a time threshold X. After the threshold (BUT NOT BEFORE) the message can be executed. I will make an assumption here for now: challengers have the authority to stop any message by challenging. So it only needs one challenger to stop the execution by challenging before the timeout is reached (this is how Nomad worked btw).
With this very isolated example you can easily see the following: validator bridges need to actively authorize a message and messages are invalid by default, whereas in optimistic bridges messages are valid by default and need to be actively invalidated.
Stage 1: Trust-Based
In stage 1, we simply assume that we have a chosen set of relayers, validators and challengers who we trust to behave correctly in their respective duties. Obviously, that's not what we aim for. But here is a good moment to make some observations.
Validator bridges
Safety is ensured if more than 1/3 of the validators behave correctly. Relayers don't influence this because messages are invalid by default. Safety is broken if 2/3 of the validator set misbehave and collude. Liveness is ensured if 2/3 of the validators behave correctly. At least one relayer needs to continue to work (1 out of N).
Maybe I can spoil that there does not need to be a "relayer set": anyone can be a relayer because messages are invalid by default, thus we have a 1 out of infinity and we can neglect a non-functioning relayer set.
Optimistic bridges
Safety is ensured if 1 out of N challengers challenges invalid messages. Theoretically, relayers do not really influence safety. However, since messages are valid by default, in reality relayers do influence safety because they put pressure on the challengers. Thus in practice, it is important to limit the possibility to relay. Liveness is ensured if all challengers behave correctly. Since a single challenger can invalidate a message, we need all challengers to behave correctly in order to guarantee liveness in this scenario. This is obviously not a nice result for a bridge.
One can clearly see here already that attacks can in particular be mounted by putting a high load on the bridge and its actors or guardians. And since we also need to take transaction fees into account, it is obvious that we need to put an economic model around these architectures to make a more realistic scenario.
Stage 2: Economic-based
In this stage we incentivize correct behavior by applying an economic model to the actors and users/attackers. Economic models are analyzed with game theory, and the goal is to align the incentives of all actors with the goals of the system itself (the bridge). Economic incentives in their simplest form are either slashing of deposits or rewards. It's worth mentioning that misbehavior can only be proven on the source chain, as this is the source of truth. If this assumption is clear, then it is also clear that any stake/security deposit must be placed on the source chain. However, misbehavior only ever occurs on the target chain. We also assume that the target chain is aware of all actors who correctly deposited their stake on the source chain. In other words, the target chain knows who put up some stake and thus is part of a specific role.
Also we assume that every action of any actor is signed off with a signature by this actor (we need this for slashing later).
Validator bridge
Validators will put a stake on the source chain. As mentioned, there is no need for relayers to put in any stake, as they only harm themselves by paying tx fees if they register invalid messages on the target chain. Safety attacks can be proven by the signatures. Misbehavior cannot be stopped (IOW the invalid message will be executed) but punishment can be enforced. Liveness attacks cannot be slashed directly. There could be a mechanism where anyone could request the signature for message N on the source chain and, if the signature is not provided in time X, all validators would be slashed. However, requesting a signature for message N would also need to be economically incentivized. I think it's possible, but let's not overcomplicate things just now.
Optimistic bridges
Relayers and challengers will need to put some stake on the source chain. Whenever misbehavior occurs, actions can be slashed directly. Relayers: when relayers relay an invalid message, slashing on the source chain is applicable. It is not directly slashable if challengers do not invalidate an invalid message. Again here, an additional mechanism could be introduced where a user could request a challenge signature on the source chain, where the time lock is shorter than the optimistic time lock on the target chain which would validate the message. Safety attacks are a combination of relaying an invalid message (slashable event) and not invalidating it (slashable only with the additional mechanism). Liveness attacks are directly slashable. In theory, malicious challengers would lose all their money at some point, and thus a liveness attack would only be temporary.
With the economic model in place, you could derive some direct numbers on what economic value needs to be spent to break the security assumptions of both architectures.
It is important to mention that this is mostly theoretical, as other factors need to be taken into account if one wanted to do a holistic analysis. But for now, we can analyze the pure mechanism. For example, imagine a validator set of 1000, each with a stake of 32 ETH (~66k USD). To break safety, 2/3 of validators need to be willing to lose their stake, which is 44M USD. This could be described as the theoretical bandwidth of the bridge. Whereas in the case of having 1000 challengers, one would need to bribe all challengers to break safety, which is 66M USD. These numbers are mostly theoretical but provide a good ballpark.
Using fallback forward bridges
You mentioned optimistic bridges using another forward bridge. This is something that can be applied to both validator and optimistic bridges. In both cases, I would frame it as a shortcut to mitigate some of the shortcomings of these architectures.
Validator bridges
We can see that one would need to bribe 1/3 of validators to break liveness. In such a case another forward bridge, used as a fallback, could keep the bridge running.
Optimistic bridges
Here we can see that we have so far assumed that the challenger does not actually "challenge"; it can invalidate a message. This leads to the fact that only one malicious challenger is needed to attack the liveness of a bridge. This can be a fundamental drawback for a bridge, because as soon as the value to be extracted from delaying the execution is greater than the punishment, it is economically viable to do so. In reality, the bridge itself could be attacked by delaying the execution and thus harming user experience.
In both cases, the fundamental drawback is that the bridge now inherits the trust assumptions of the fallback bridge.
Additional considerations
Up until now, I have described both bridge architectures very theoretically and in an isolated manner. In reality, multiple additional considerations need to be taken into account if one wanted to do a proper analysis.
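
For reference, the ballpark figures in the stake example above can be reproduced with a few lines. This is only the arithmetic from that example (1000 actors, ~66k USD of stake each), under the stated assumptions that breaking validator safety takes 2/3 of the set and breaking optimistic safety takes bribing every challenger. The function names are made up for this sketch.

```python
def cost_to_break_validator_safety(n_validators: int, stake_usd: int) -> int:
    # Smallest whole number of validators that is at least 2/3 of the set.
    colluding = (2 * n_validators + 2) // 3
    return colluding * stake_usd


def cost_to_break_optimistic_safety(n_challengers: int, stake_usd: int) -> int:
    # Every challenger has to stay silent, so all of them must be bribed.
    return n_challengers * stake_usd


STAKE_USD = 66_000  # ~32 ETH in the example above

print(cost_to_break_validator_safety(1000, STAKE_USD))   # 44022000, i.e. ~44M USD
print(cost_to_break_optimistic_safety(1000, STAKE_USD))  # 66000000, i.e. 66M USD
```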
Indirect viability of correct behavior
Typically an attack is a one-time event, which can last for multiple messages but will only be temporary. After that, either the bridge ceases to exist or, at the very least, the attacker is removed from its duties. I would argue that in most cases it's the former.
Who pays the challengers?
It is known that there is a fundamental flaw in the design of optimistic bridges. Challengers only earn in the unhappy case. However, to earn, they need to run the software all the time. Unless there is a more sophisticated model to pay for challengers' activity, it is not economically viable, or IOW incentive-aligned, to be a challenger. Especially in the case of permissionless sets, challengers need to put down some stake on the source chain, which incurs opportunity costs.
Challenge games
This is just a note, but you may wonder why a challenger should be allowed to completely invalidate any message instead of entering into some sort of challenge phase. Invalidating is the trivial answer. On top, you can build additional challenge mechanisms as you wish.
Updating the actors' sets
As you mentioned in your previous comments, I assumed that in both cases, validator and optimistic, the bridge itself would update its own sets on the target chain. In other words, if actors join or leave roles by depositing or withdrawing their stakes, it would result in a message being sent via this very bridge to update its own whitelisted sets on the target chain. Here it should be mentioned that in validator-based bridges, 1/3 of the validators could freeze the validator set by colluding to freeze the bridge, and we don't have a direct economic incentive to change that. In the optimistic-based bridge, the challenger would lose all of his stake eventually. Then a message would be sent that he is going to be removed from the set, because he lost all his stake. However, he could still attack liveness by invalidating this message.
A common solution to that is that a challenger is not allowed to challenge a message that removes him as a challenger.
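
To summarize the two delivery rules compared in this comment, here is a minimal sketch. It models only the isolated mechanism under the assumptions stated above: a validator bridge treats a message as delivered once at least 2/3 of the validators have signed it, and an optimistic bridge treats a message as delivered once the window X has passed with no challenge. Class and method names are invented for the sketch, and time is a plain integer.

```python
from dataclasses import dataclass, field


@dataclass
class ValidatorInbox:
    """Messages are invalid by default and need a 2/3 quorum of signatures."""
    validators: frozenset
    signatures: dict = field(default_factory=dict)  # msg_id -> set of signers

    def sign(self, msg_id, validator):
        if validator in self.validators:
            self.signatures.setdefault(msg_id, set()).add(validator)

    def is_delivered(self, msg_id):
        signed = len(self.signatures.get(msg_id, ()))
        return 3 * signed >= 2 * len(self.validators)


@dataclass
class OptimisticInbox:
    """Messages are valid by default once the window passes unchallenged."""
    challengers: frozenset
    window: int  # the time threshold X
    relayed_at: dict = field(default_factory=dict)  # msg_id -> relay time
    challenged: set = field(default_factory=set)

    def relay(self, msg_id, now):
        self.relayed_at[msg_id] = now

    def challenge(self, msg_id, challenger, now):
        # A single challenger can stop a message, but only before the timeout.
        in_window = (msg_id in self.relayed_at
                     and now < self.relayed_at[msg_id] + self.window)
        if challenger in self.challengers and in_window:
            self.challenged.add(msg_id)

    def is_delivered(self, msg_id, now):
        if msg_id not in self.relayed_at or msg_id in self.challenged:
            return False
        return now >= self.relayed_at[msg_id] + self.window
```

The asymmetry shows up directly: `ValidatorInbox.is_delivered` is false until validators act, while `OptimisticInbox.is_delivered` becomes true unless a challenger acts.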
Addressing some of your top comments now, will review the rest later this week.
This is probably easier to discuss on specific examples, indeed. We totally should assess & discuss economic security on all types of bridges. What I'm underscoring with the distinction between an "economic-security-only optimistic bridge" and other kinds of bridges is that other kinds of bridges have a mechanism to ensure the truthfulness of claims (e.g. validator quorum) beyond trusting the word of a single/centralized actor. In an "economic-security-only optimistic bridge", you have no such mechanism, so a SINGLE misbehaving entity can post a bad message, and there is NO WAY to challenge this. So the entirety of the security of the bridge relies on the fact that the misbehaving entity must not be able to derive more profits from doing so than what it stands to lose. This is a pretty fundamental constraint, I believe, which other types of bridges do not have. You can usually add economic security via slashing on top of another mechanism, but you would not require the bond to cover the entirety of the potential economic damage (because often that's the entire TVL of the bridge). This is why I believe "economic-security-only optimistic bridges" deserve their own distinct section.
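
One way to picture that constraint is a per-relayer cap tied to the bond. The sketch below is an assumption of mine about how such a limit could look, not a description of any existing bridge: the relayer may only have as much unverified value in flight as it has bonded, so lying can never pay more than the slash. The names `EconomicOnlyBridge`, `relay` and `settle` are hypothetical.

```python
class EconomicOnlyBridge:
    """Toy bridge secured only by the relayer's bond (hypothetical design)."""

    def __init__(self, bond: int):
        self.bond = bond
        self.in_flight = 0  # value relayed but not yet checked on the source chain

    def relay(self, value: int) -> bool:
        # Refuse anything that would let the relayer gain more than it can lose.
        if self.in_flight + value > self.bond:
            return False
        self.in_flight += value
        return True

    def settle(self, value: int) -> None:
        # Called once a relayed message has been checked against the source chain.
        self.in_flight -= value
```

The bond has to cover everything that is outstanding at once, which is why this kind of bridge scales so differently from ones that have a quorum or a proof behind each message.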
The source of truth is indeed the source chain. Resolving a destination chain challenge will involve bridging from the source chain, hence the need for a forward bridge. If you cannot get the information from the source chain, then you cannot resolve the challenge. You can have other mechanisms to resolve the challenge, e.g. an UMA oracle vote. But then UMA (or any other such mechanism) basically acts like a very slow bastardized forward bridge that is only invoked in case of challenge. Optimistic bridges with permissionless relayers don't make sense, which is why I'm making that assumption.
I agree with this. Permissioned is not the right term for what I was talking about; "bonded" would be a better term. Optimistic bridges never make sense with unbonded relayers (economic-only because it's the only security mechanism, the other ones because unbonded actors can grief without costs).
layer zero
bridges / not to confuse with the brand)