Check on-chain for attestations at validator launch #7985
Comments
I will advocate for no flag. We should minimize the number of flags for both the beacon node and the validator. More flags == more doc support == more room for errors and unnecessary friction. What if the db doesn't have previous-epoch info but a block attestation does? Can you also run this idea by the eth2 discord? It will be good to get opinions from other implementer teams.
Thanks for the tag, I wasn't allowed. I think if the db doesn't have info from the previous epoch but the block has it, the validator should stop and assume either the db is corrupted (after a dirty stop, for example) or there's another validator running the same keys. In both cases you should stop to prevent such attestations from happening. If you do not want to add an extra flag, that is fine too. In principle, if it was a corrupt db, 3 epochs into the future you'd be able to relaunch. And if you had another validator attesting, then you still do not want to launch anyway. So I am not against your suggestion of not having this flag.
A flag might be useful if the user understands the risk and still wants to proceed. Example: 1.) epoch 100: attest on computer A, stop validator. I'm not sure how applicable the above case really is, but it's a trade-off that's worth looking at.
@arnetheduck pointed me to this implementation of Nimbus, which is pretty much the same as I am suggesting except they default to 2 epochs instead of three. I haven't read it in detail because I do not know their code, but an implementation detail (that I admittedly might be reading wrong) is that they wait and listen for 2 epochs before they start broadcasting, instead of asking the network for the last 2 epochs' blocks and checking attestations.
Is there a reason for 2 or 3 epochs? Technically, you could be slashed as far back as the weak subjectivity period. (I think)
Yes, but this should only happen in a forked environment. It seems like a good compromise since this request will impact every launch of the validator. Notice that this is not meant to be a replacement for slashing protection but rather a safeguard against another client using the same keys. These are the only slashings we've seen so far.
In addition to the Nimbus implementation, here's Lighthouse's. Also, I think this newer issue can be safely closed since it is most likely covered by any solution to this issue.
Ok thanks, shifting the discussion over to #8513.
I would've closed the newer issue instead, as this one actually has a discussion.
Ah sorry, misread this. Porting the message over to this issue.
I wrote a short design doc on something with similar goals a while back. However, the main difference here is that instead of listening in on gossip we just search all attestations from the last N epochs.
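To make the approach concrete, here is a minimal sketch of that kind of search, assuming a hypothetical `BlocksByEpoch` client call and attestations whose attesting indices have already been resolved from the aggregation bitfields (none of these names are Prysm's actual API):

```go
package launchcheck

// Hypothetical, simplified types; the real Prysm structures differ.
type Attestation struct {
	AttestingIndices []uint64 // resolved from the aggregation bitfield and committee
}

type Block struct {
	Slot         uint64
	Attestations []Attestation
}

// BeaconClient is a stand-in for the RPC the validator would use.
type BeaconClient interface {
	// BlocksByEpoch returns all blocks whose slots fall in the given epoch.
	BlocksByEpoch(epoch uint64) ([]Block, error)
}

// hasRecentAttestation reports whether any of ourIndices appears in an attestation
// included on-chain during the last `lookback` epochs ending at currentEpoch.
func hasRecentAttestation(c BeaconClient, ourIndices map[uint64]bool, currentEpoch, lookback uint64) (bool, error) {
	start := uint64(0)
	if currentEpoch >= lookback {
		start = currentEpoch - lookback + 1
	}
	for epoch := start; epoch <= currentEpoch; epoch++ {
		blocks, err := c.BlocksByEpoch(epoch)
		if err != nil {
			return false, err
		}
		for _, b := range blocks {
			for _, att := range b.Attestations {
				for _, idx := range att.AttestingIndices {
					if ourIndices[idx] {
						// Something already attested with one of our keys recently.
						return true, nil
					}
				}
			}
		}
	}
	return false, nil
}
```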
I like this approach with the endpoint to check for sanity. I just wanted to point out that there is an economic trade-off between security and the price paid by not attesting at launch. While it's true that all current slashings would have been prevented by this method of looking back a few epochs and gathering attestations, it worries me a little in the long term. A validator is much more likely to be slashed in a forked/non-finalizing scenario. Currently we have seen validators running multiple instances of the same keys and not being slashed for a long time, because they were attesting on precisely the same data. This is less likely to happen in a forked scenario. In that situation we will be dropping blocks, and attestations possibly will not be included later if they were from different forks, so that when we request the attestations for the last n epochs and don't get any from our key, we may in fact have been attesting anyway on a different fork. Having said that, I think the approach of looking back is probably better anyway: even when forked, we would expect at least 1 attestation to be included in later blocks on our current fork, even if it was cast on a different fork at the time of attesting. I have no intuition nor hard numbers about how many attestations actually went missing on Medalla.
I love this idea: it's easier to implement.
Will implement the change based on this Design.
I wanted to leave here a method that I think has many advantages that outweigh the minor disadvantages, which I enumerate below. In short, the method consists of
This will stop any validator with a doppelganger running somewhere and an included attestation with target bigger than
The problem in a) is not what this issue tries to solve: running two validators on different computers while the network is highly forked is dangerous anyway. We are simply trying to prevent slashings caused by operator mistakes. The problem in b) is completely negligible given that this is the order of magnitude of all active validators now. For this to happen while at the same time running on two machines, the probability is just infinitesimal. b) can be completely mitigated by replacing positive balance by The advantages of this method, on the other hand, are several:
One argument that could be made is that if the user is running hundreds of validators, and they are running on two machines, this method will be slower, since it requires 3 states, while the method of looking back at blocks will likely find an attestation already in the very first block fetched. But this argument is flawed: for this user, in the vast majority of relaunches only one machine will be launching, and the block method will require grabbing 64 blocks and looping over the bitfields in each one of them to check that it hasn't attested.
Good, and definitely much faster than the brute-force search through attestations.
This would be a fair assumption, I guess; if the amount is less than 2, then going forward and skipping the check would be fine.
Slashings by operator mistakes do happen, and they are much more likely to happen in cases where the network is highly forked. For that reason we should default to the most secure method, which would be to check

On the cost of this, I would disagree that this is cheaper, etc. Iterating through attestations in each block is much cheaper compared to regenerating the state. All methods so far do require us to regenerate states and then use them. We have cached committees from shuffling, so we can simply use those to check whether each validator has attested or not. I am not against having the balance check instead, but it will require us to change it again in Altair. Also, any future forks that change validator accounting in any way will have to be handled correctly here. Doing the attestation check is more of a 'direct'/in-protocol check of a particular validator's participation, versus a balance check, which is more indirect.
@ahadda5 0.5% ^ 2 does not equal 0.5% * 0.5.
@nisdas ahh correct, 2.5 * 10^-5.
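For clarity, writing 0.5% as a decimal before squaring:

$$(0.5\%)^2 = (0.005)^2 = 2.5\times10^{-5}, \qquad \text{whereas} \qquad 0.5\% \times 0.5 = 0.0025 = 2.5\times10^{-3}.$$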
No, I am checking
No, if the validator was stopped less than 2 epochs ago (I mentioned 2 instead of 3 in my message), then if the balance changed we cannot infer that we have a duplicate, because attestations for
If you miss head and target and your attestation is included in the next slot you obtain
The cost of computing the actual lower bound adds an unnecessary complication: you need to factor in the participation rate and the base reward, which are slot- and fork-dependent. This would require going block by block, which defeats the purpose of my proposal and does not give you any real benefit. Instead, what you can do is simply decide to go back
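As a rough sketch of why that lower bound depends on participation and inclusion delay (this follows the phase 0 reward structure as a reference point, not a calculation from this thread): with per-validator base reward $B$, total active balance $T$, attesting balances $A_s, A_t, A_h$ for matching source/target/head, and inclusion delay $d$, a fully correct attestation earns roughly

$$\Delta \approx B\,\frac{A_s}{T} + B\,\frac{A_t}{T} + B\,\frac{A_h}{T} + \frac{7}{8}\,\frac{B}{d},$$

with a penalty of $B$ for each missed component. Both $B$ and the participation ratios vary by epoch and by fork, so a tight bound would indeed require replaying blocks.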
This could be worth a try, although it would still be vulnerable to future changes in the spec. After thinking it through, I am more amenable to using this method.
@nisdas ok, I will create another pull. Please explain the following in the new design
What I'll do is write up a detailed design and come back before I start coding.
@ahadda5 There isn't a need to redo this in a new pull request. You essentially just need to change what the beacon RPC method does. Previously it was checking all attestations in blocks for a validator's index. Now you just need to check a validator's balance for the relevant state. Everything else stays the same.
N is basically the number of epochs since the validator's last recorded history. Ex: the validator was last observed in epoch 8 and the current epoch is 12, so we check epochs 9 through 12.
You just need to check if the balance delta for a validator is more than 0 across the N epochs.
Sure thing, this doesn't change the basic structure of the feature much. The check is boxed into the beacon RPC method.
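A minimal sketch of what that RPC-side check could look like, assuming a hypothetical `BalanceAtEpoch` accessor for historical state (the real Prysm state access and RPC plumbing differ):

```go
package launchcheck

import "fmt"

// BalanceReader is a stand-in for the beacon node's state access; it returns a
// validator's balance as recorded at the end of the given epoch.
type BalanceReader interface {
	BalanceAtEpoch(validatorIndex, epoch uint64) (uint64, error)
}

// balanceChangedSince reports whether the validator's balance moved between its
// last recorded epoch and the current epoch. Any nonzero delta over those N
// epochs is taken as a sign that some instance has been attesting with this key.
func balanceChangedSince(r BalanceReader, validatorIndex, lastRecordedEpoch, currentEpoch uint64) (bool, error) {
	if currentEpoch <= lastRecordedEpoch {
		return false, nil // no epochs to look back over
	}
	prev, err := r.BalanceAtEpoch(validatorIndex, lastRecordedEpoch)
	if err != nil {
		return false, fmt.Errorf("balance at epoch %d: %w", lastRecordedEpoch, err)
	}
	curr, err := r.BalanceAtEpoch(validatorIndex, currentEpoch)
	if err != nil {
		return false, fmt.Errorf("balance at epoch %d: %w", currentEpoch, err)
	}
	return curr != prev, nil
}
```

Comparing only the two endpoint balances keeps this to two state reads; checking each of the N epochs individually would also catch rewards and penalties that happen to cancel out, at the cost of more state regeneration.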
For some reason replies to this thread got buried in my mailbox:
I think this is wrong; in the current situation, having a validator offline for several epochs will cost a lot unnecessarily, which is what my proposal wants to avoid: I want to set
I think we might be referring to different things here. I was referring to the historical lookback when a validator starts up, while you are referring to the doppelganger waiting period.
No, indeed I'm referring to the lookback as well, since I want to avoid requesting blocks/states.
@potuz for the case where Head - Target < 2, why can't we employ the signed-block search approach? It guarantees detection of duplicate activity.
That is all my proposal is: to look at balance changes and nothing else! But you are not going to see balances in the blocks; balances are in the beacon states. EDIT: Ohh, apologies, I misread the statement: you mean to use signed blocks for HEAD - TARGET < 2. I think if the last target was HEAD then it may be impossible to distinguish a duplicate instance from our own instance. If the last target was HEAD - 1 then it may be possible to see a vote for HEAD and call that a duplicate entry. I am not sure it's worth the trouble of implementing this.
Gotcha, I guess this is configurable. We can test it out to see which parameter value would be desirable.
🚀 Feature Request
Description
In the event of having a corrupted `validator.db`, an easy and cheap safeguard would be to grab attestations from the last 3 epochs from the beacon node and check whether the validating indexes appear there. This would prevent a double attestation in most cases. This test can be run only at launch time, and only if the `validator.db` does not show any activity for the last/current epoch, so that the impact on long-running validators will be minimal.

I'd like to see what the team thinks of possible design patterns before submitting a PR, but I think this can be as easy as pulling the last 3 epochs' blocks from the node, comparing the attestations against the indexes of the validating accounts, and stopping if something is found on-chain that is not found in the `validator.db`, with a possible flag `--disable-launchtime-check` or whatever to override this behavior.

Additional Info
Discussion in Discord: https://discord.com/channels/476244492043812875/730447046930071672/782041664327909417
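Putting the description together, a rough sketch of the launch-time gate (the flag handling, db accessor, and scan helper are illustrative stand-ins, not existing Prysm code):

```go
package launchcheck

import "errors"

// ErrDoppelganger is returned when on-chain activity is found that the local
// validator.db cannot account for.
var ErrDoppelganger = errors.New("on-chain attestation found that is missing from validator.db")

// ValidatorDB is a stand-in for the slashing-protection database.
type ValidatorDB interface {
	// HasActivitySince reports whether any local signing activity is recorded
	// at or after the given epoch.
	HasActivitySince(epoch uint64) (bool, error)
}

// launchCheck runs only at startup: if the local db shows no recent activity but
// the chain does, the validator refuses to start. disableCheck models the
// proposed --disable-launchtime-check override.
func launchCheck(db ValidatorDB, chainHasOurAttestations func(lookback uint64) (bool, error), currentEpoch uint64, disableCheck bool) error {
	if disableCheck {
		return nil // user explicitly accepted the risk
	}
	const lookback = 3 // epochs, as proposed in the description
	lastEpoch := uint64(0)
	if currentEpoch > 0 {
		lastEpoch = currentEpoch - 1
	}
	active, err := db.HasActivitySince(lastEpoch)
	if err != nil {
		return err
	}
	if active {
		return nil // long-running validator: the db already covers the recent epochs
	}
	found, err := chainHasOurAttestations(lookback)
	if err != nil {
		return err
	}
	if found {
		return ErrDoppelganger
	}
	return nil
}
```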