PBTS: should synchrony parameters be adaptive? #371
My first response is to strongly agree that there should be a way to adapt the synchrony parameters over time. This is effectively a hard requirement. Without it, misconfigured consensus parameters cannot easily be changed: if you set the `MSGDELAY` too low, the network may be unable to decide blocks at all, and therefore unable to decide the parameter update that would fix it. If I'm reading correctly, there are two possible proposals here:

1. after some number of failed rounds, drop the timely check entirely;
2. relax (increase) the synchrony bounds as the round number grows.
While proposal (1) was partially my idea, I'm now more in favor of (2). (2) allows for the liveness needed to update potentially misconfigured values while not completely losing synchrony in bad rounds. It also allows you to calculate how large the synchrony bound was for each height by using the round recorded in the commit.
In principle there would also be a third option: (3) keep the original (static) synchrony bounds unchanged for the initial rounds of a height, and only start relaxing them after a number of failed rounds.
Proposal (1) we should rule out, as for such instances no relation between block time and real time (or reference time) can be ensured. (2) ensures that there is a relation; the potential difference just becomes larger. (3) might be the most complex one to implement. But it would allow us to keep the tight synchronization, e.g., in the case where the proposer of round 0 is crashed. At the same time it will ensure liveness after some rounds.
While I think that using adaptive synchrony parameters means giving up on synchrony, I see the need for a way to prevent a system from halting due to bad choices of parameters. After sleeping on the matter, I tend to be more aligned with option (3) proposed by Josef. Like option (1), it has no impact on the regular operation of the protocol, as the original (static) timely predicate will be adopted in practice. But, unlike option (1), we don't fully drop all timing bounds (and guarantees) in case of bad choices of parameters. For me, two questions remain open:

1. After how many failed rounds should the synchrony bounds start being relaxed?
2. How, and by how much, should they be relaxed at each step?
For the first question, we might consider the number of failed rounds that indicates that `MSGDELAY` is actually misconfigured. For the second question, the possibilities are really open; we can just double `MSGDELAY` after each such period. In terms of implementation, I don't think it would be complicated, while it can be precisely defined. I assume it would look something like the sketch below.
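A minimal sketch of this idea in Go. The names here (`SynchronyParams`, `isTimely`, `roundLimit`) are illustrative, not the actual Tendermint implementation:

```go
package pbts

import "time"

// SynchronyParams mirrors the two PBTS parameters under discussion.
type SynchronyParams struct {
	Precision    time.Duration // bound on clock drift between correct processes
	MessageDelay time.Duration // configured bound on Proposal delivery delay
}

// isTimely checks a proposal timestamp against the local receive time,
// doubling the message-delay bound once every roundLimit rounds.
func isTimely(proposalTime, receiveTime time.Time, round, roundLimit int32, sp SynchronyParams) bool {
	// Relaxation factor: 1 for rounds 0..roundLimit-1, then 2, then 4, ...
	msgDelay := sp.MessageDelay * (time.Duration(1) << uint(round/roundLimit))

	// Timely iff: proposalTime - Precision <= receiveTime <= proposalTime + msgDelay + Precision.
	return !receiveTime.Before(proposalTime.Add(-sp.Precision)) &&
		!receiveTime.After(proposalTime.Add(msgDelay+sp.Precision))
}
```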
OK, then let's introduce `round_limit` and double the factor applied to `MSGDELAY` in the timely function every `round_limit` rounds. I guess `round_limit` should be the same at all validators, so we might need to make it a consensus parameter? @williambanfield, what do you think?
Maybe we can take the opportunity to adopt clearer, more understandable names for these parameters. A first proposal:
I think that the idea of adding a 'round limit' value at which the synchrony bounds begin to be relaxed makes sense. A possible option would be to do it after 1/3+ of the validator set has had a chance to propose. This way, there must have been some correct process that attempted to propose but that was not able to. Thoughts?
Not really a suggestion, but this value or parameter must be known by light clients, or by anyone planning to use block times for something. Associating it with the number of validators seems reasonable, but this also means that its value will be dynamic...
A point in favor of having this round limit as a parameter is to consider other uses it can have (possibly in the future), other than relaxing synchrony requirements.
I do think that this makes sense and I like the idea of doubling it after 1/3+ of the validating power has proposed untimely headers.
At the moment the light client never reads the consensus params (which it definitely could do) to understand the evidence age (and derive the trusting period from) nor to understand the timing params (which it uses as part of verification). They are currently all manually set by the operator and need to be updated each time these parameters change. We could definitely consider something more dynamic in the future.
I think there's probably some validity to what you're saying, but I'm more of the opinion that we shouldn't be using future uses that we currently don't have an idea about as justification for adding parameters. Adding fields is generally a lot easier than removing them, so it's often better to wait until there are actual solid use cases before introducing a field.
I don't think that this proposal is possible, because it requires knowledge of why a round didn't succeed. And this, to my understanding, will be an internal derivation that will be hard to "export". Of course, we can consider an approximation of that 1/3+ by considering the number of elapsed rounds.
If adding it as a parameter adds too much complexity, can we just fix it to a constant value, say 10 rounds?
While I agree with this solution, I still have to ask: why 10?
To clarify here, I don't think we need to actually know why the consensus round failed for us to use it, since it's just an estimate. Picking a value like 10 is fine with me as well. This value just exists to ensure liveness in the case of misconfiguration, not to really be theoretically perfect.
Not sure about 10 ;-) From a theoretical viewpoint, liveness should be ensured for all values of `round_limit`. We could also try to actually find a good approximation for +1/3, e.g., assuming the current voting powers, how long it takes to complete a "round-robin" that includes all validators as proposer, and perhaps try to understand what that means if the number of validators goes from 150 to 200?
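For intuition, a rough back-of-the-envelope estimate, under the (unrealistic) simplifying assumption of equal voting power:

```go
package pbts

// Rough estimate only, assuming equal voting power among n validators:
// with round-robin proposer selection, at least 1/3 of the voting power
// has had a chance to propose after ceil(n/3) rounds.
func roundsForOneThird(n int) int {
	return (n + 2) / 3 // n=150 -> 50 rounds; n=200 -> 67 rounds
}
```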
iiuc, we want to start increasing `MSGDELAY` after `round_limit` failed rounds. At which point should it stop increasing?
It should never.
Definitely. The proposed approach is a workaround for tolerating bad consensus parameters. The point that @williambanfield has made from the beginning is that it will not be possible to update `MSGDELAY` via a parameter change if the network cannot decide blocks in the first place.
This is a very good point, which we should address. My opinion in this regard is that a bad `MSGDELAY` should ultimately be fixed by updating the parameter, not permanently compensated for by the protocol. This was one of the reasons for which I prefer the approach of not making this parameter actually adaptive, i.e., not updating it often, after each failed round, so that when it is relaxed (after `round_limit` failed rounds) it clearly signals that the configured value should be revised.
* PBTS: second version of system model
* PBTS: new model referred in algorithm spec
* PBTS: removed model discussion from algorithm spec
* PBTS: corrections on the system model
* PBTS: a pretty complex problem statement
* PBTS: minor fixes on the problem specification
* PBTS: liveness part of problem specification
* PBTS: link updated, outdated note on sysmodel_v1
* Update spec/consensus/proposer-based-timestamp/pbts-algorithm_002_draft.md (Co-authored-by: William Banfield <4561443+williambanfield@users.noreply.github.com>)
* Apply William's suggestions from code review (Co-authored-by: William Banfield <4561443+williambanfield@users.noreply.github.com>)
* PBTS: new discussion and definition for accuracy
* Apply Josef's suggestion from code review (Co-authored-by: Josef Widder <44643235+josef-widder@users.noreply.github.com>)
* PBTS: some tags added to sysmodel
* PBTS: motivation and link to Issue #371
* PBTS: fixing lint error

Co-authored-by: William Banfield <4561443+williambanfield@users.noreply.github.com>
Co-authored-by: Josef Widder <44643235+josef-widder@users.noreply.github.com>
I came up with a simple prototype for using the amount of voting power seen on the network to determine the synchrony bound for the block timestamp. This prototype can be seen here. This would add the requirement that light clients wishing to verify the synchrony of a particular block be able to determine the validator 'priorities' in a trusted way at a given height. This can be queried via the node's RPC. The light client would then run the proposer selection algorithm starting at round 0 up to the round at which the block was decided, keeping track of the total amount of voting power seen. Every time the light client observes that more than 1/3 of the total voting power has had a chance to propose, it doubles the `MSGDELAY` bound used for verification. This mechanism adds some complexity to light clients over the hardcoded alternative, but has the upside of being more capable of adapting to the size of the network. I think that if we don't mind adding this complexity to light clients, this is a somewhat cleaner alternative.
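Not the prototype's actual code, but a condensed sketch of the mechanism; the types and the round-robin `proposer` helper below are simplified stand-ins for Tendermint's priority-based proposer selection:

```go
package pbts

import "time"

type Validator struct {
	Address     string
	VotingPower int64
}

type ValidatorSet struct {
	Validators []Validator
	TotalPower int64
}

// proposer stands in for Tendermint's deterministic, priority-based
// proposer selection; plain round-robin keeps the sketch short.
func proposer(vals *ValidatorSet, round int32) Validator {
	return vals.Validators[int(round)%len(vals.Validators)]
}

// adaptedMsgDelay doubles the base message-delay bound each time more than
// 1/3 of the total voting power has had a chance to propose before the
// round in which the block was decided.
func adaptedMsgDelay(vals *ValidatorSet, decisionRound int32, base time.Duration) time.Duration {
	seen := make(map[string]bool)
	var seenPower int64
	thirdsSeen := 0
	for r := int32(0); r <= decisionRound; r++ {
		p := proposer(vals, r)
		if !seen[p.Address] {
			seen[p.Address] = true
			seenPower += p.VotingPower
		}
		// Another third of the power has proposed: double and reset.
		if 3*seenPower > vals.TotalPower {
			thirdsSeen++
			seenPower = 0
			seen = make(map[string]bool)
		}
	}
	return base * (time.Duration(1) << uint(thirdsSeen))
}
```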
Nice work, I have to say that this solution looks surprisingly simple, in the sense that I assumed it would require more lines, more complex tricks. From a specification viewpoint, however, I would prefer to alter the `timely` predicate itself, so that the adaptation is part of its definition. In other words, I would make this change more visible to the code reader, which is a less clever solution, but clearer.
From the light client point of view, my first question is: how much does it care about the timestamps of blocks decided in high rounds? In the regular operation (one or few rounds per height) we don't need all this complexity. But if the round at which a block was decided is not zero (or is above some limit), the client should ask itself why, and then use the method to retrieve how many times `MSGDELAY` was doubled. Again, I think this solution is great, but I would make it clearly a non-regular case, something inside an if clause, in order to make this explicit, to possibly generate some log message, and to allow the client to react to this, maybe skipping this block as it is not as reliable as a block decided in round zero (one, or two).
I think what we should do now is to precisely specify what we can expect from a block, e.g., if the commit is for round 1, the time is within bounds derived from `PRECISION` and `MSGDELAY` for that round. However, as a next step, once we have the precise properties of the time in the block with respect to the round number, we can plan to add complexity to the light client and/or the analysis of the light client security.
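To illustrate what such a per-round property could look like (my notation, not the spec's): under the doubling-every-`round_limit` scheme sketched above, a block committed in round $r$ with timestamp $ts$ would relate to the real time $t$ at which the committed proposal was received as

$$
t - \mathit{PRECISION} - 2^{\lfloor r/\mathit{round\_limit}\rfloor}\cdot\mathit{MSGDELAY} \;\le\; ts \;\le\; t + \mathit{PRECISION}
$$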
I was worried about this as well, which is part of why I did the prototype: I wanted to see if it was really that bad. Although the complexity is somewhat masked by the fact that the proposer-selection algorithm is already implemented here.
Yeah, we should definitely log this to the user and make it clear when it changes. The updated bound can be included in that log message.
Agreed, this is not the regular case and we can make that obvious to users and readers of the code. I am waiting on user feedback from 1-2 application developers, but still believe that this 'thirdsSeen' mechanism is a good way forward.
After further consideration, I think that my solution will not work. Since this solution is proving to be even more difficult than I had anticipated, I would advocate against using it and instead using a simple hardcoded value. To build confidence in this decision, I spoke with @ValarDragon from Osmosis, who similarly doesn't see much issue with the hardcoded value. I also spoke with @liamsi, who is still investigating but seems to be leaning in a similar direction. Let's hardcode 10 rounds before relaxation. We can re-consider during the next release if needed.
This change adds logic to double the message delay bound after every 10 rounds. Alternatives to this somewhat magic number were discussed, specifically whether or not to make '10' modifiable as a parameter. Since this behavior only exists to ensure liveness in the case that these values were poorly chosen to begin with, a method to configure this value was not created. Chains that notice many 'untimely' rounds per the [relevant metric](tendermint/tendermint#7709) are expected to take action to increase the configured message delay to more accurately match the conditions of the network. closes: tendermint/spec#371
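A sketch of the rule as described here, not the literal code from the PR:

```go
package pbts

import "time"

// Effective message-delay bound used by the timely check: the configured
// value doubles after every 10 rounds (10 is hardcoded by design).
func effectiveMessageDelay(configured time.Duration, round int32) time.Duration {
	return configured * (time.Duration(1) << uint(round/10))
}
```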
Implemented in tendermint/tendermint#7739, closing this issue.
In order to ensure that blocks have timestamps that reflect the real time at which a block was proposed, PBTS adopts two synchrony parameters for validating proposal times:

- `PRECISION` bounds the difference between values read from clocks of processes at the same instant of real time. In terms of validating proposals, it is meant to prevent Byzantine validators from assigning times in the future to proposals.
- `MSGDELAY` bounds the time it takes for a `Proposal` message to be delivered to all (correct) validators. In terms of validating proposals, it is meant to prevent Byzantine validators from assigning times in the past to proposals.

While meant to prevent Byzantine proposers from assigning arbitrary times to proposed values, a bad choice of values for these parameters can represent a liveness issue. This is particularly relevant for the `MSGDELAY` parameter, expected to be a conservative estimation of the maximum delay for broadcasting a `Proposal`. If a Tendermint network adopts a value for the `MSGDELAY` parameter that is likely to be violated in real executions, correct validators may reject proposals from correct proposers because they are deemed `untimely`. More specifically, a correct validator will assume that the proposer assigned a timestamp in the past to the proposal, a malicious behavior, while the proposer was actually correct but the `Proposal` took longer than expected to be received.

An alternative to circumvent this scenario, i.e., bad choices for synchrony parameters, is to relax the assumptions as rounds of consensus fail, so that eventually the timely requirements will not prevent a correct validator from accepting a value proposed (and timestamped) by a correct proposer.
A reference for how to relax timing parameters is already present in Tendermint, as the duration of the timeouts is an increasing function of the round number. As rounds fail, and new rounds are required, the timeouts become longer, so that eventually they stop expiring prematurely. The same approach could be adopted for the synchrony requirements of PBTS, but we would need to specify how the bounds will be relaxed as the round number increases; a sketch of the existing timeout pattern follows.
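This mirrors the `timeout_propose` / `timeout_propose_delta` configuration pair; the function name is mine:

```go
package pbts

import "time"

// Proposal timeout for a given round: the base timeout plus a fixed
// per-round increment, so timeouts grow linearly as rounds fail.
func proposeTimeout(base, delta time.Duration, round int32) time.Duration {
	return base + time.Duration(round)*delta
}
```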
Another alternative, mentioned in discussions, is to restrict the adoption of timing requirements for validating proposal times to a number of rounds, and then drop them. This would represent a conceptually similar approach, but instead of slowly relaxing the requirements, they would be "fully" relaxed (dropped) at some point.