-
Notifications
You must be signed in to change notification settings - Fork 728
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Stop inclusion on too many disputes #790
Comments
@burdges Any reasons this would be a bad idea? Considerations from your end on Even a max age of 1 is already a sensible protection. While an attacker could quite likely make all open disputes conclude in a single block, another malicious block producer right afterwards would be required to could not import any new disputes, then another malicious block producer would be needed not including any disputes, but only bitfields - assuming we also consider disputes imported in the same block. So even a value of 1, requires 3 dishonest block producers in a row to weaken the inclusion rate limit. |
@rphmeier I would be interested in your thoughts as well. |
DoS argument: If validators are hacked as in the introduction, they don't care about the slash. So this could indeed be used to cause DoS - triggering thousands of disputes would likely result in some level of DoS as well though. |
@AlistairStewart won't like impacting liveness but yeah I'm fine with blocking inclusions on disputes. We'll move to Sergi's deterministic WASM metering, which makes 3 less likely. We'd block only parachains with disputes on them? We could block only parachains with two or more disputes.. We do not currently put disputes on chain, right? I've forgotten how this works now.. If disputes are off-chain, then relay chain blocks could simply delay inclusion for one block with an "excessive disputed flag". If enough say this then everything runs fine, except for that parachain. If only some say this, then anyone building upon their block can move the chain forward, but anyone believing there are too many disputes won't build upon or finalize that extension. |
About Liveness: To improve matters here, we can also deploy additional mechanisms for disabling spamming validators and such. Anyhow limiting inclusion when the rate of disputes reaches some threshold is a good last resort mechanism in my opinion. If Polkadot is really making use of it's scalability, then the system by definition cannot keep up with disputes for each and every candidate. We could hope that the system is slowing down under such conditions anyway, but guarantees on dispute resolultion would likely get weakend. I would block all parachains: We are talking about a spam condition here: Disputes can only be triggered by validators and they don't care about what parachain a candidate belongs to - they can dispute anything. For the bug case, it could be that only candidates of a particular parachain causes the bug. On the other hand disputes on only a single parachain, will not overwhelm the system so would not even trigger the blocking threshold. If more parachains are affected, it becomes more and more unrealistic that it is an isolated issue and filtering for particular parachains seems not to be worth the effort. We do put dispute votes on chain - if only to have proof for slashing. So we have evidence on chain about the amount of disputes happening and can have a consensus based decision on counter measures. |
I agree disputes spam against all parachains sounds like one attack technique. It's not 100% clear how you leverage this attack yet, and clearly the bug problem exists, hence my question about whether its worth singling out parachains. We do not believe this complicates consensus then I guess, since disputes already go on-chain. We just need rules like:
We already have a scheme for correlated slashing, but the correlation effect in 1 could stay small, while the correlation effect in 2 matters for liveness. We should ideally decide whether the existing scheme suffices or whether it needs further work. We should ideally eventually decide if we need cross slashing type correlations too, like if grandpa and disputes interact. |
I think actually we need to consult with the parachains team to see how much they could be affected by the loss of liveness. One thing that comes to mind is CDP-like approaches, where if the para is stopped, it could lead to under collateralization. If we take a general purpose programmable chain (anything that supports permissionless deployed smart-contracts), then there opened ended list that might allow extracting profit from affecting the liveness. |
Sorry if a stupid question, but why do we care about disputes in finalized blocks at all? If there is a legit dispute in a finalized block, there is nothing that protocol can do about it, AFAIK. |
Exactly it happens that I just clarified that in the guide today. Where do we say here that we do? (Can't find it) |
My question was prompted by this sentence:
|
Thanks! Yeah that is obsolete already, as the definition of best-effort got changed. Fixed. |
With recent updates nodes handled huge load of disputes way better than before. In particular they naturally started back pressuring on backing and inclusion themselves. All the system did was slowing down, which is exactly what this ticket is about. Hence, we can dramatically reduce priority of this ticket and might punt on it completely with further testing and documenting/enforcing the back pressuring behavior. |
* Delete obsolete code * Reword some comments
Importing dispute already takes precedence. |
* Adding message relayer scripts, reformating send message scripts * Addressing PR feedback * Update README.md Valid . Co-authored-by: Hernando Castano <HCastano@users.noreply.github.com> * Fixing send-message-from-rialto-millau * Fixing send message script from millau to rialot Co-authored-by: Hernando Castano <HCastano@users.noreply.github.com>
One important attack vector on disputes is to try to overwhelm the system with disputes, with the goal to prevent one particular dispute from concluding (the one which would actually resolve "invalid"). E.g. imagine an attacker hacked a few honest nodes and makes them trigger disputes at a high rate (e.g. dispute each and every candidate) to bring the system to its limit and once the attacker is sure the system can no longer keep up, try to back an invalid candidate.
There are of course a couple of counter measures in place, but we should add another safe guard. An important observation is that disputes are supposed to be exceptional, so a high rate of disputes is very suspicious and the only reasons that come to mind for such an event are:
We don't need to worry about 1 obviously, we should worry about 2 and keep in mind 3.
Another observation is that we only care about disputes on unfinalized blocks. An attacker can dispute already finalized blocks, but honest validators will not participate in such disputes, hence the attack described above should not be possible with such disputes.
This means for triggering disputes at a high rate an attacker has to rely on unfinalized blocks, which are limited in number - hence the attacker can not hold back disputes for too long and try to fire them all at once, but will be required to produce them continuously - as candidates get included.
The final ingredient: Candidates can only get disputed (with priority at least) if they have been included on some chain, but this is something we can control - regardless of malicious nodes:
Proposal
When importing bitfields in a block, we take into account the number of active disputes we are aware of. If they surpass some conservative number
MAX_ACTIVE_DISPUTES_INCLUSION_RATE_LIMIT
honest block produces will stop including bitfields in a block and block importers will reject any block which does anyway. This way we limit the supply of new candidates that could get disputed, hence with a carefully picked number for above constant we ensure the dispute system cannot get overwhelmed. Considering that disputes result in serious slashes this number can be made quite low.Considerations
Why active disputes?
I would suggest to base the rate limiting (not accepting new bitfields) on currently active disputes on the chain. With active meaning, disputes which either have not been concluded yet or have concluded just recently. One could also suggest to base the decision on the number of dispute votes/disputes that get imported in the very block. The issue here is that malicious block producers can decide to just import less disputes to not hit the limit, which would result in a less effective rate limiting.
Why inclusion?
We could also stop backing candidates, but preventing inclusion is actually more effective.
If we stopped putting backed candidates on chain, the full backing pipeline would keep working and would continue to produce candidates that then just go to waste.
What about the third cause (bugs)?
In case lots of disputes are happening because of bugs, rate limiting makes sense also. While we are still processing disputes at a fast rate, it certainly is good to rate limit disputes a bit that are actually unjustified. Although bugs might be local to only particular parachains. While we could adjust the threshold in a way, that a single parachain or even two cannot realistically reach it, only filtering out candidates for some particular parachains in case disputes are isolated is worth considering.
DoS?
Doesn't this open up a DoS vector, if we stop processing parachain blocks? Yes! But it is ridiculously cost inefficient. Let's say we set
MAX_ACTIVE_DISPUTES_INCLUSION_RATE_LIMIT
to 10 and max age for active disputes to 1 block. This means you get 10 slashes for making parachains skip two blocks!Forks
Honest nodes only consider a limited number of forks, therefore an adversary crafting many forks, will not succeed in getting candidates included on those forks, hence they cannot get disputed.
Attacking Finality
An attacker could try via some means to slow down finality in order to concentrate an attack: E.g. wait 100 blocks (preventing finality of those) and then raise disputes for all those candidates (10_000 for 100 candidates per block) all at once.
One good news here is that if finality is halted, block production will slow down. On the other hand, once those disputes are released - candidate inclusion will get stopped immediately, meaning we will catch up eventually. With 6 sessions time for resolving disputes, this
eventually
will be very likely in time on production networks. If not, we will still roll back the chain when in doubt.Conclusion
While it is still a good idea to resolve disputes fast, we should not play cat-and-mouse with attackers if we don't have to.
Conceptually it makes sense as well: Should we really keep building chains as if nothing happened, although we know that we are under some intense attack/have serious bugs in the system?
The text was updated successfully, but these errors were encountered: