-
Notifications
You must be signed in to change notification settings - Fork 5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Sys 3925 fix primary validator #367
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
pallets/avn/src/lib.rs
Outdated
}; | ||
|
||
if index >= validators.len() { | ||
return Err(Error::<T>::InvalidValidatorIndex) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is an open question: Instead of throwing an error here, can we reset the index to 0 and start again? Will this cause any issues? I know we should never get in this state but if we can self heal in a safe way it would be good.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this makes sense.
We should not have any issues by resetting the index, going back to zero in the end of the array would be the correct fix for this error because it will follow the modular approach we currently have when advancing index.
Thanks for the comment, I think we should apply it.
OperationType::Avn => counters.avn as usize, | ||
}; | ||
|
||
if index >= validators.len() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit pick: if this ever happens on a regular basis, the validator at position 0 will be chosen more frequently than everyone else. We could use an alternative rule (to not always be 0) but that would make it more complex, so I'm ok with the 0 as long as we understand this will skew probabilities if it we frequently change the validator set. That is not expected to happen, so I guess it's just an academic point.
OperationType::Avn => counters.avn as usize, | ||
}; | ||
|
||
if index >= validators.len() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's probably ok to have this check here, but this is an indication that we did not do our work correctly somewhere else.
This can only happen if the validators have changed (one has been removed) after we've last chosen the primary collator. But then, that means that one action left the system in a broken state, which is not good.
How to solve this?
- ensure that all actions that are potentially destructive clean up after them. Removing validators has to update the primary collator.
Analysis: I don't like this, because it means we have to know, at each specific site (removing validators) all the other parts of the logic that depend on it. It is not the competence of the validators manager to know that, and so it should not be this doing the update of primary collator.
- make the operations associated with primary collator resilient to changes in their parameters:
This should be preferable. Ideally, we would not have to this check to update the primary collator in all the functions that make use of that storage item. Currently, we have this one (is_primary) but we semantically doing this verification and update here feels wrong.
This function should only check if a validator is the primary validator. But if there is ambiguity about what that primary is, because the value in storage may be inconsistent, then I think we should have instead a facade for that storage item, a function whose purpose it is to ensure it always reports a valid value.
This whole logic of checking could be migrated there. From the point of view of meaning of the code, it would be more correct. But we would have to discipline ourselves NOT to use the storage item directly.
From a performance aspect, every time we wanted to access the storage item we would do this check and update the item if needed. Most of times, we will do a couple of ifs and return the item's value.
The alternative, here, also incurs this extra cost, but every time we call is_primary
. So depending on how often we call each, the cost may be the same or not. But it never goes away. So I'd go with what makes more logical sense.
This PR aims to fix the validator chosen to perform certain tasks. Before we used to have this based on block number and that would always return the same validator if the given block number was a multiple of 10.
This new implementation fixes this issue by always incrementing the validator everytime we call
calculate_primary_validator
with the aim of balancing the validator resources.This PR includes:
calculate_primary_validator
).isPrimary
function.