Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stop inclusion on too many disputes #790

Closed
eskimor opened this issue Aug 18, 2022 · 13 comments
Closed

Stop inclusion on too many disputes #790

eskimor opened this issue Aug 18, 2022 · 13 comments
Assignees
Labels
I1-security The node fails to follow expected, security-sensitive, behaviour.

Comments

@eskimor
Copy link
Member

eskimor commented Aug 18, 2022

One important attack vector on disputes is to try to overwhelm the system with disputes, with the goal to prevent one particular dispute from concluding (the one which would actually resolve "invalid"). E.g. imagine an attacker hacked a few honest nodes and makes them trigger disputes at a high rate (e.g. dispute each and every candidate) to bring the system to its limit and once the attacker is sure the system can no longer keep up, try to back an invalid candidate.

There are of course a couple of counter measures in place, but we should add another safe guard. An important observation is that disputes are supposed to be exceptional, so a high rate of disputes is very suspicious and the only reasons that come to mind for such an event are:

  1. Load testing on a test network
  2. An attack such as described above
  3. A bug, which triggers honest validators to do validation mistakes

We don't need to worry about 1 obviously, we should worry about 2 and keep in mind 3.

Another observation is that we only care about disputes on unfinalized blocks. An attacker can dispute already finalized blocks, but honest validators will not participate in such disputes, hence the attack described above should not be possible with such disputes.

This means for triggering disputes at a high rate an attacker has to rely on unfinalized blocks, which are limited in number - hence the attacker can not hold back disputes for too long and try to fire them all at once, but will be required to produce them continuously - as candidates get included.

The final ingredient: Candidates can only get disputed (with priority at least) if they have been included on some chain, but this is something we can control - regardless of malicious nodes:

Proposal

When importing bitfields in a block, we take into account the number of active disputes we are aware of. If they surpass some conservative number MAX_ACTIVE_DISPUTES_INCLUSION_RATE_LIMIT honest block produces will stop including bitfields in a block and block importers will reject any block which does anyway. This way we limit the supply of new candidates that could get disputed, hence with a carefully picked number for above constant we ensure the dispute system cannot get overwhelmed. Considering that disputes result in serious slashes this number can be made quite low.

Considerations

Why active disputes?

I would suggest to base the rate limiting (not accepting new bitfields) on currently active disputes on the chain. With active meaning, disputes which either have not been concluded yet or have concluded just recently. One could also suggest to base the decision on the number of dispute votes/disputes that get imported in the very block. The issue here is that malicious block producers can decide to just import less disputes to not hit the limit, which would result in a less effective rate limiting.

Why inclusion?

We could also stop backing candidates, but preventing inclusion is actually more effective.

  1. Only included candidates can get disputes with priority.
  2. Backing is implicitly rate limited by stopping inclusion, as new candidates can only get backed once the core gets freed via timeout.
  3. By keeping the core occupied we back-pressure on the backing pipeline, which results in validators to stop backing new candidates (after a short while for asynchronous backing), which makes sense as we don't intend to process them on chain, meaning less wasted work. This in turn frees up resources for processing those disputes.

If we stopped putting backed candidates on chain, the full backing pipeline would keep working and would continue to produce candidates that then just go to waste.

What about the third cause (bugs)?

In case lots of disputes are happening because of bugs, rate limiting makes sense also. While we are still processing disputes at a fast rate, it certainly is good to rate limit disputes a bit that are actually unjustified. Although bugs might be local to only particular parachains. While we could adjust the threshold in a way, that a single parachain or even two cannot realistically reach it, only filtering out candidates for some particular parachains in case disputes are isolated is worth considering.

DoS?

Doesn't this open up a DoS vector, if we stop processing parachain blocks? Yes! But it is ridiculously cost inefficient. Let's say we set MAX_ACTIVE_DISPUTES_INCLUSION_RATE_LIMIT to 10 and max age for active disputes to 1 block. This means you get 10 slashes for making parachains skip two blocks!

Forks

Honest nodes only consider a limited number of forks, therefore an adversary crafting many forks, will not succeed in getting candidates included on those forks, hence they cannot get disputed.

Attacking Finality

An attacker could try via some means to slow down finality in order to concentrate an attack: E.g. wait 100 blocks (preventing finality of those) and then raise disputes for all those candidates (10_000 for 100 candidates per block) all at once.

One good news here is that if finality is halted, block production will slow down. On the other hand, once those disputes are released - candidate inclusion will get stopped immediately, meaning we will catch up eventually. With 6 sessions time for resolving disputes, this eventually will be very likely in time on production networks. If not, we will still roll back the chain when in doubt.

Conclusion

While it is still a good idea to resolve disputes fast, we should not play cat-and-mouse with attackers if we don't have to.

Conceptually it makes sense as well: Should we really keep building chains as if nothing happened, although we know that we are under some intense attack/have serious bugs in the system?

@eskimor
Copy link
Member Author

eskimor commented Aug 18, 2022

@burdges Any reasons this would be a bad idea? Considerations from your end on MAX_ACTIVE_DISPUTES_INCLUSION_RATE_LIMIT value and the maximum age we consider a dispute "active"? Goal would be to minimize effects malicious block producers can have on the inclusion blocking.

Even a max age of 1 is already a sensible protection. While an attacker could quite likely make all open disputes conclude in a single block, another malicious block producer right afterwards would be required to could not import any new disputes, then another malicious block producer would be needed not including any disputes, but only bitfields - assuming we also consider disputes imported in the same block. So even a value of 1, requires 3 dishonest block producers in a row to weaken the inclusion rate limit.

@eskimor
Copy link
Member Author

eskimor commented Aug 18, 2022

@rphmeier I would be interested in your thoughts as well.

@eskimor
Copy link
Member Author

eskimor commented Aug 18, 2022

DoS argument: If validators are hacked as in the introduction, they don't care about the slash. So this could indeed be used to cause DoS - triggering thousands of disputes would likely result in some level of DoS as well though.

@burdges
Copy link

burdges commented Aug 19, 2022

@AlistairStewart won't like impacting liveness but yeah I'm fine with blocking inclusions on disputes. We'll move to Sergi's deterministic WASM metering, which makes 3 less likely.

We'd block only parachains with disputes on them? We could block only parachains with two or more disputes..

We do not currently put disputes on chain, right? I've forgotten how this works now..

If disputes are off-chain, then relay chain blocks could simply delay inclusion for one block with an "excessive disputed flag". If enough say this then everything runs fine, except for that parachain. If only some say this, then anyone building upon their block can move the chain forward, but anyone believing there are too many disputes won't build upon or finalize that extension.

@eskimor
Copy link
Member Author

eskimor commented Aug 19, 2022

About Liveness: To improve matters here, we can also deploy additional mechanisms for disabling spamming validators and such. Anyhow limiting inclusion when the rate of disputes reaches some threshold is a good last resort mechanism in my opinion. If Polkadot is really making use of it's scalability, then the system by definition cannot keep up with disputes for each and every candidate. We could hope that the system is slowing down under such conditions anyway, but guarantees on dispute resolultion would likely get weakend.

I would block all parachains: We are talking about a spam condition here: Disputes can only be triggered by validators and they don't care about what parachain a candidate belongs to - they can dispute anything. For the bug case, it could be that only candidates of a particular parachain causes the bug. On the other hand disputes on only a single parachain, will not overwhelm the system so would not even trigger the blocking threshold. If more parachains are affected, it becomes more and more unrealistic that it is an isolated issue and filtering for particular parachains seems not to be worth the effort.

We do put dispute votes on chain - if only to have proof for slashing. So we have evidence on chain about the amount of disputes happening and can have a consensus based decision on counter measures.

@burdges
Copy link

burdges commented Aug 20, 2022

I agree disputes spam against all parachains sounds like one attack technique. It's not 100% clear how you leverage this attack yet, and clearly the bug problem exists, hence my question about whether its worth singling out parachains.

We do not believe this complicates consensus then I guess, since disputes already go on-chain. We just need rules like:

  1. if a parachain has two disputes on-chain then we do not back or include on this parachain until all its disputes are resolved or timeout.
  2. If three parachains have disputes then we block all inclusion & backing until we resolve them all and have no active disputes, and invalid dispute slashing increases from 1% (or whatever) to 20% (or 50% or whatever).

We already have a scheme for correlated slashing, but the correlation effect in 1 could stay small, while the correlation effect in 2 matters for liveness. We should ideally decide whether the existing scheme suffices or whether it needs further work. We should ideally eventually decide if we need cross slashing type correlations too, like if grandpa and disputes interact.

@pepyakin
Copy link
Contributor

I think actually we need to consult with the parachains team to see how much they could be affected by the loss of liveness.

One thing that comes to mind is CDP-like approaches, where if the para is stopped, it could lead to under collateralization. If we take a general purpose programmable chain (anything that supports permissionless deployed smart-contracts), then there opened ended list that might allow extracting profit from affecting the liveness.

@pepyakin
Copy link
Contributor

Sorry if a stupid question, but why do we care about disputes in finalized blocks at all? If there is a legit dispute in a finalized block, there is nothing that protocol can do about it, AFAIK.

@eskimor
Copy link
Member Author

eskimor commented Sep 13, 2022

Sorry if a stupid question, but why do we care about disputes in finalized blocks at all? If there is a legit dispute in a finalized block, there is nothing that protocol can do about it, AFAIK.

Exactly it happens that I just clarified that in the guide today.

Where do we say here that we do? (Can't find it)

@pepyakin
Copy link
Contributor

My question was prompted by this sentence:

An attacker can dispute already finalized blocks, but those disputes will be treated with lower priority than disputes for unfinalized blocks, hence the attack described above should not be possible with such disputes.

@eskimor
Copy link
Member Author

eskimor commented Sep 14, 2022

Thanks! Yeah that is obsolete already, as the definition of best-effort got changed. Fixed.

@eskimor
Copy link
Member Author

eskimor commented Jan 11, 2023

With recent updates nodes handled huge load of disputes way better than before. In particular they naturally started back pressuring on backing and inclusion themselves. All the system did was slowing down, which is exactly what this ticket is about. Hence, we can dramatically reduce priority of this ticket and might punt on it completely with further testing and documenting/enforcing the back pressuring behavior.

@Sophia-Gold Sophia-Gold transferred this issue from paritytech/polkadot Aug 24, 2023
@the-right-joyce the-right-joyce added I1-security The node fails to follow expected, security-sensitive, behaviour. T9-parachains_protocol and removed I2-security labels Aug 25, 2023
claravanstaden pushed a commit to Snowfork/polkadot-sdk that referenced this issue Dec 8, 2023
* Delete obsolete code

* Reword some comments
@eskimor
Copy link
Member Author

eskimor commented Apr 5, 2024

Importing dispute already takes precedence.

@eskimor eskimor closed this as completed Apr 5, 2024
@github-project-automation github-project-automation bot moved this from Backlog to Completed in parachains team board Apr 5, 2024
bkchr pushed a commit that referenced this issue Apr 10, 2024
* Adding message relayer scripts, reformating send message scripts

* Addressing PR feedback

* Update README.md

Valid .

Co-authored-by: Hernando Castano <HCastano@users.noreply.github.com>

* Fixing send-message-from-rialto-millau

* Fixing send message script from millau to rialot

Co-authored-by: Hernando Castano <HCastano@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
I1-security The node fails to follow expected, security-sensitive, behaviour.
Projects
Status: Completed
Development

No branches or pull requests

5 participants