Fine tuning view 0 timeouts based on last block timestamp (limited by a maximum offset) #626
Conversation
That's true Vitor... looks like the first backup timeout for the primary should happen sooner... the picture is improved now:
neo/Consensus/ConsensusService.cs
// unsigned int does not need the next line, right? TODO
theoreticalDelay = theoreticalDelay < 0 ? 0 : theoreticalDelay;
Log($"(long)-theoreticalDelay:{(long)-theoreticalDelay}");
reference_block_time.AddSeconds((long)-theoreticalDelay);
There may be differences in system time between consensus nodes, and the error is likely to reach a dozen seconds. So this adjustment may not help the consensus time.
We tested with delays and timestamps not synced and it helped, Erik. It really does not completely solve the problem, but it is an easy and elegant formula that may help us at this moment without making complex changes (for a better solution we should consider a real advance of the PrepReq based on the expected delays of the next actions - but this requires more insight).
Many users and projects will like it if blocks are closer to 15s. Furthermore, unexpectedly, this also helped a lot on a local privatenet, ensuring more stable block times even with 1s blocks.
There are no side effects of this draft PR (based on A1): we are still using local timestamps for writing blocks; this is just a minor, limited adjustment for the latency incurred during the 3~4 communications needed to create the last block.
- A1: As we may have noticed, currently there are rare blocks that take less than 15s (meaning that few timestamps are advanced). This PR will not change this, because theoreticalDelay only advances the PrepareRequest when the timestamp of the last block is delayed compared to local UTC time, which is still used for time-stamping. In summary, it is a fine-tuning of the period from when the last block was signed until the next primary receives it, limited to maxDelayToAdvance.
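A minimal sketch of that idea, reusing the names that appear in this thread (reference_block_time, maxDelayToAdvance); the exact wiring of the view 0 timer is an assumption for illustration, not the PR's code:

```csharp
// Sketch only. On block persist, remember a reference time shifted back by the
// (clamped) gap between the last block's timestamp and our local clock, so the
// next view 0 timeout behaves as if the block had been received when it was signed.
// Block timestamps themselves are still taken from the local clock.
private void OnPersistCompletedSketch(Block block)
{
    DateTime now = TimeProvider.Current.UtcNow;
    uint localTimestamp = now.ToTimestamp();

    // Observed latency (or clock drift) between the block's timestamp and persist time.
    uint delay = localTimestamp > block.Timestamp ? localTimestamp - block.Timestamp : 0;

    // maxDelayToAdvance caps how much the view 0 timeout can be shortened.
    uint maxDelayToAdvance = (uint)Math.Ceiling(Blockchain.SecondsPerBlock * 0.3);
    if (delay > maxDelayToAdvance) delay = maxDelayToAdvance;

    reference_block_time = now.AddSeconds(-(double)delay);
}
```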
Which limit do you recommend for maxDelayToAdvance, @erikzhang?
Talking with Jeff, the idea of setting it to 1/3 of the block time also came up, but I still prefer a static value based on the values found on the mainnet.
There may be differences in system time between consensus nodes, and the error is likely to reach a dozen seconds. So this adjustment may not help the consensus time.
@erikzhang how is that possible? System clocks don't naturally drift apart by up to a dozen seconds unless something is very wrong with the system. Standard NTP gets a precision of around 100ms, or am I very wrong? Timed protocols like this also depend on having a precise timer, so if we don't have that, it needs to be fixed.
This is very common in winter, if there is no automatic timing.
This is very common in winter, if there is no automatic timing.
I see that clocks drift depending on temperature, several seconds a day. I'm just saying that automatic time synchronization should be necessary for this kind of algorithm to work well... perhaps having a trusted source of time raises some security issue, but I don't see any way around that. Anyway, it's outside the scope of this discussion.
@vncoelho without precise timing, there's little we can change now; better to keep only simple changes, such as improving the primary changeview timeout.
Got it, brother.
However, this formula does not change the way we time out; it is just an error correction. It has no side effects that I detected, just benefits.
But maybe I am missing some case.
I will change that line in another PR and then we can keep this discussion here.
Created #627
This is very common in winter, if there is no automatic timing.
The CN nodes are not running outside in harsh environments though, are they? Also, if CN operators are generally supposed to use a source of time that is synced to an atomic clock, shouldn't that be the goal? Syncing by using agreement between separately controlled, reputable NTP servers that represent atomic clock time should be sufficient, right?
neo/neo/Consensus/ConsensusService.cs Line 532 in c78adfb
I suggest that we only keep this line.
At first, this PR would be just this line, as you suggested. But as far as I saw there are no side effects, are there?
@erikzhang, @shargon, @jsolman, @igormcoelho, let's continue that discussion here about setting max times and other possible adjustments. Maybe Peter @PeterLinX could give us a hand here when possible. Peter, it would be great to check the value of this delay on the testnet; I will add a Console.WriteLine.
Guys, do you think it's possible to set https://github.com/neo-project/neo/blob/master/neo/Consensus/ConsensusService.cs#L271 This would be ideal in terms of all clocks, and would make sure the system is well synchronized. However, it could cause more change views than usual, due to network delays. What are your thoughts on that?
I talked to Vitor about this offline. In my opinion, it is not a great idea to allow it to adjust by more than 30% of the block time compared to when it is received, since the clocks may have drifted; but if the difference is less than a threshold, it seems reasonable to me.
@jsolman, by drifted do you mean the next primary's clock or a comparison between CN clocks?
@vncoelho both: the next primary's time relative to the previous one is a comparison between CN clocks. I think the change seems fine when clamped to the 30% threshold.
@jsolman, maybe we could
Minor comment. I think this change is reasonable for slightly improving block times. Since it only decreases block times by up to 30%, and generally should just correct for network delays, it seems reasonable to me.
+1
I moved it out of draft in order to follow master more easily (drafts still need this feature to update from the branch). In case we decide to merge it, maybe we should test on the testnet before merging, with at least one or two nodes. Unless you think it's really safe.
neo/Consensus/ConsensusService.cs
if (localClockTimestamp > lastBlockTimestamp)
{
    var currentTheoreticalDelay = localClockTimestamp - lastBlockTimestamp;
    Log($"localClock:{localClockTimestamp}\nlastBlockTimestamp:{lastBlockTimestamp}");
Clean these logs before merging.
Talking with @igormcoelho, we just remembered that backups should also be adjusted in order to keep better view change synchronization.
@@ -270,11 +274,30 @@ private void OnConsensusPayload(ConsensusPayload payload)
private void OnPersistCompleted(Block block)
{
    Log($"persist block: {block.Hash}");
    block_received_time = TimeProvider.Current.UtcNow;
    reference_block_time = TimeProvider.Current.UtcNow;
    theoreticalDelay = calcNetworkLatencyOrClockDriftWithMaxOffset(reference_block_time.ToTimestamp(), block.Timestamp);
Suggest renaming:
theoreticalDelay to blockReceivedOffset
calcNetworkLatencyOrClockDriftWithMaxOffset to CalculateAllowedBlockReceivedOffset
Both look good, thanks.
Log($"localClock:{localClockTimestamp}\nlastBlockTimestamp:{lastBlockTimestamp}"); | ||
Log($"diff:{currentTheoreticalDelay}"); | ||
double maxTimeoutOffset = Math.Ceiling(Blockchain.SecondsPerBlock * 0.3); | ||
Log($"maxTimeoutOffset:{maxTimeoutOffset}"); |
You should combine the log statements.
These logs will probably be removed; they are only here for tests now.
Log($"diff:{currentTheoreticalDelay}"); | ||
double maxTimeoutOffset = Math.Ceiling(Blockchain.SecondsPerBlock * 0.3); | ||
Log($"maxTimeoutOffset:{maxTimeoutOffset}"); | ||
currentTheoreticalDelay = currentTheoreticalDelay > (uint) maxTimeoutOffset? (uint)maxTimeoutOffset : currentTheoreticalDelay; |
Instead of a ternary this is cleaner as:
if (currentTheoreticalDelay > maxTimeoutOffset) currentTheoreticalDelay = maxTimeoutOffset;
For the compiler it will be the same I think, right @igormcoelho?
I was saying it looks cleaner.
var currentTheoreticalDelay = localClockTimestamp - lastBlockTimestamp;
Log($"localClock:{localClockTimestamp}\nlastBlockTimestamp:{lastBlockTimestamp}");
Log($"diff:{currentTheoreticalDelay}");
double maxTimeoutOffset = Math.Ceiling(Blockchain.SecondsPerBlock * 0.3);
Perhaps this method should just return a double, then you don't have to round this up, and you don't have to cast it to a double outside the method where it is used.
Sounds good.
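Putting these review suggestions together, a hypothetical shape of the helper (renamed, returning a double so no rounding or casting is needed at the call site, clamping with an if); this is an assumption about the final form, not the code that was merged:

```csharp
// Hypothetical refactor combining the suggestions above: renamed to
// CalculateAllowedBlockReceivedOffset, returns a double, and clamps with an if.
private double CalculateAllowedBlockReceivedOffset(uint localClockTimestamp, uint lastBlockTimestamp)
{
    // If the local clock is behind the last block's timestamp, do not advance anything.
    if (localClockTimestamp <= lastBlockTimestamp) return 0;

    double blockReceivedOffset = localClockTimestamp - lastBlockTimestamp;

    // Cap the offset at 30% of the block time.
    double maxTimeoutOffset = Blockchain.SecondsPerBlock * 0.3;
    if (blockReceivedOffset > maxTimeoutOffset) blockReceivedOffset = maxTimeoutOffset;
    return blockReceivedOffset;
}
```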
I don't like these changes. If the system time of the consensus nodes is inconsistent, the timeout for each node to change view will be inconsistent after these changes. And if there is a node whose system time is slower than the other nodes, then the block time may oscillate periodically.
Let's put this rocket on, @erikzhang...aahauahauah
@vncoelho I agree with Erik though. Also, a CN could maliciously send a block time 3 seconds in the past to shorten the next validator's time to accept transactions.
Yes, but this is easy to detect, @jsolman. Check my last comment; I think there will be no oscillation because the header is still signed without the offset. I will try to design some tests in the next few days.
This is incorrect. There would be oscillation of block time durations with this code; a node whose time is ahead of the others will consistently adjust to using the block timestamp as the block received time, which means it will consistently create shorter block times than the others. For the change not to cause inconsistent block times would require a protocol keeping the node times in sync with atomic clock time. I do think such a protocol should be employed on consensus nodes in some manner, though. As it stands, the fundamental assessment of what will happen with node times out of sync is block times of oscillating duration, by up to 30% of a block time. Whether that is an acceptable trade-off for having block times generally not exceed the configured block time is debatable.
As I am trying to formulate the analysis: my comment above describes that, after a node with an advanced clock, the next primary with an older clock will release the block normally. Even nowadays (without this PR), if we had an advanced clock we would see the "oscillation" behavior (overshoot), because the timeout would be correct but the timestamp newer; thus, the block would appear slower when created by that CN. I think there will be no oscillation (the steady-state error will stay the same). In this sense, a node with an advanced clock would alone publish faster blocks (for example, instead of 20s it would be 10s for that CN). The formula does not look like it creates an avalanche effect; it can transform an overshoot into an undershoot for that specific drifted clock (which would be a one-off). Maybe I missed something, but check the case described above once more.
@vncoelho I think you missed something. Without this change, all nodes just depend on the network latency of when they receive the block; this is currently not affected by clock drift unless the clock drifts so much that the current time is more than a block time earlier than that of the previous CN node that created a block.
@jsolman I created this summary with an example; check if there is any error in the logic or values. I still did not see more steady-state oscillation with this PR (on the contrary, I even see less), as explained; it is just a change that filters/cuts the timeout in one direction.
Without this PR:
// block 1 timestamp
// primary has a drift of 5s
// primary has clock synced precisely
// primary has a drift of 3s
// primary has clock synced precisely
The following would be the block time currently reported:
With this PR:
// block 1 timestamp
// primary of block 2 timestamp when received b_1_t
// primary of block 3 timestamp when received b_2_t
// primary has a drift of 3s
// primary has clock synced precisely
This would be the block time reported with this PR.
@erikzhang, check this last example above.
The same example with negative drifts and atomic clock time (as viewed by observers inside the P2P protocol):
Without this PR:
// block 1 timestamp
// primary has a drift of 5s negatively
// primary has clock synced precisely
// primary has a drift of 3s negatively
// primary has clock synced precisely
The following would be the block time currently reported:
With this PR:
// block 1 timestamp
// primary of block 2 timestamp when received b_1_t
// primary of block 3 timestamp when received b_2_t
// primary has a drift of 3s negatively
// primary has clock synced precisely
This would be the block time reported with this PR.
Talking to @jsolman, he suggested including atomic clock block time in the numbers as well, which now shows a more favorable result for the current code. However, delays were not considered; I will soon update the table with delays. Thus, the discussion turned into a trade-off between block timestamps and block time as observed by an atomic clock. However, the current code does not consider delays between relaying
I think we have two different things here that require two separate discussions. For (1), I'm not very sure of the impacts, since clocks may drift and we still don't have a guaranteed time source on the CNs. But for (2), I think it's reasonable to do now, since it does not affect behavior in any negative way.
I agree, broda @igormcoelho. I have been thinking about (2) for plenty of hours.
In the end, we would use all this information and apply it quite similarly to what is being proposed here, but the theoretical basis behind the adjustment would be different.
Leave it closed until we finish the evaluation of the current dBFT and its behavior.
* update * update * Fix several places where angle brackets were not displayed * update
Currently, the formula is the following:
Considering 15 second blocks: 15 << 1 is 30; 15 << 2 is 60; 15 << 3 is 120; 15 << 4 is 240
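Since the formula itself is not quoted above, the following is a reconstruction from those examples (assuming the usual dBFT exponential back-off in ConsensusService, doubling the timeout on each change of view):

```csharp
// Reconstructed from the 15 << 1, 15 << 2, ... examples; assumed, not quoted.
// With 15s blocks, view 0 waits 30s, view 1 waits 60s, and so on.
private TimeSpan TimeoutForView(byte viewNumber)
{
    return TimeSpan.FromSeconds(Blockchain.SecondsPerBlock << (viewNumber + 1));
}
```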
Figure extracted from the draft of the Yellow Paper.
In the first round all backups timeout with 30s.
Primary has 2 different timeouts:
This is a draft of the idea. We would also like to include a modification in the OnPersistCompleted function (which could be in another PR), focusing on considering the last block timestamp instead of the current time when blocks are received.
By printing both timestamps we can see the delay between the real timestamp at which the last block was created and the one actually used:
[19:44:40.405] persist block: currentLocalTimeStamp:1551987883 previousBlockTimeStamp:1551987880
[19:44:40.405] persist block: diff:3
We could have logic that uses the previousBlockTimeStamp if the difference is lower than a maximum value, let's say 4-5 seconds.
For the mainnet we could consider this limit close to the average delay that we are seeing in real operation, around 5 seconds.
5 seconds also sounds reasonable in accordance with some of the statistics available.
Checking some websites (https://www.dotcom-tools.com/internet-backbone-latency.aspx and https://wondernetwork.com/pings), we found worst cases of around 400ms from Beijing, and around the same from Australia to Johannesburg.
Currently, we need around 3~4 messages (prepReq + prepBackups + Commits + Block, if you did not relay it) to achieve consensus, plus the different sources we need to receive from.
In this sense, we could consider 1.2 seconds as a standard for ping, but we also have to account for clock synchronization, plus our payloads are heavier than a normal ping.
Considering http://monitor.cityofzion.io/, we can see cases where pings of 1000ms occur.
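A sketch of that OnPersistCompleted logic with a fixed cap (the 5-second value discussed above); the helper name and the fallback behavior are assumptions for illustration, not the proposed code:

```csharp
// Sketch: use the previous block's timestamp as the reference when it is within
// the cap of the local clock; otherwise fall back to the local time, as today.
private const uint MaxBlockTimestampDelay = 5; // seconds, value discussed above

private uint ChooseReferenceTimestamp(uint currentLocalTimeStamp, uint previousBlockTimeStamp)
{
    uint diff = currentLocalTimeStamp > previousBlockTimeStamp
        ? currentLocalTimeStamp - previousBlockTimeStamp
        : 0;
    return diff <= MaxBlockTimestampDelay ? previousBlockTimeStamp : currentLocalTimeStamp;
}
```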