context: add simple rtt estimator #140

Merged 1 commit into master from rtt on Jan 10, 2025
Conversation

roman-khimov
Member

After b3c1c3b the only source of block time inaccuracy left is PrepareRequest propagation. In general it's nondeterministic and can vary from node to node and from round to round. But in practice it's relatively stable and can be averaged out over some number of samples. While a backup node can't know when a request was sent to it, it can measure what happens to its own requests when it's the primary, since healthy nodes are expected to respond immediately, which makes the request/response exchange pretty similar to a ping/pong roundtrip. This data can't be perfect either, messages can travel different ways from A to B and from B to A on a P2P network, but in practice it's good enough and it's much better than having no data at all.
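
To illustrate the approach, here is a minimal sketch of such an estimator in Go: a fixed-size ring buffer of roundtrip samples with an incrementally maintained average. The names and the window size here are assumptions for illustration only; the actual implementation lives in rtt.go of this PR.

```go
package dbft

import "time"

// rttsCount is an assumed sample window size; the real value in rtt.go may differ.
const rttsCount = 32

// rtt keeps a ring buffer of recent request/response roundtrip samples
// and their running average, all measured by the local clock.
type rtt struct {
	times [rttsCount]int64 // samples in nanoseconds, zero means "empty slot"
	idx   int              // next slot to overwrite
	avg   int64            // current average in nanoseconds
}

// addTime adds a roundtrip sample and updates the average incrementally.
func (r *rtt) addTime(t time.Duration) {
	old := r.times[r.idx]
	if old != 0 {
		// Window is full: the oldest sample leaves, the new one enters.
		r.avg += (int64(t) - old) / rttsCount
	} else {
		// Window is still filling up: recompute over idx+1 samples.
		r.avg = (r.avg*int64(r.idx) + int64(t)) / int64(r.idx+1)
	}
	r.times[r.idx] = int64(t)
	r.idx = (r.idx + 1) % rttsCount
}

// avgTime returns the current roundtrip estimate.
func (r *rtt) avgTime() time.Duration {
	return time.Duration(r.avg)
}
```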

Notice that this timer adjustment scheme uses the local timer only: we inevitably trust our local clock and we don't need to mess with any view of time that other nodes have (including the one from the block header).
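
And a hedged sketch of how the estimate could feed back into the timer: onResponse, resetTimerForNextBlock and the service stub are hypothetical names, only the general shape (shorten the timer by half the average roundtrip, since that's roughly how late a PrepareRequest arrives, and clamp at zero) follows the description above.

```go
// service is a stub carrying only the fields this sketch needs.
type service struct {
	rtt   rtt
	timer *time.Timer
}

// onResponse is called by the primary when a response to its request
// arrives; the roundtrip is measured entirely by the local clock.
func (s *service) onResponse(sentAt time.Time) {
	s.rtt.addTime(time.Since(sentAt))
}

// resetTimerForNextBlock compensates for PrepareRequest propagation:
// backups see the request roughly half a roundtrip after it was sent,
// so the timer is shortened by that amount.
func (s *service) resetTimerForNextBlock(interval time.Duration) {
	delay := interval - s.rtt.avgTime()/2
	if delay < 0 {
		delay = 0
	}
	s.timer.Reset(delay)
}
```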

It can be attacked by a malicious peer or node delaying messages, but since we expect 2F+1 nodes to be correct, that's enough to collect the proper number of responses and move on; a seriously delayed response will be outdated by the time it arrives. And if some intermediary can delay messages to all nodes for an arbitrary time on our P2P network, we can't have any reliable timing anyway.

To me it fixes as much of neo-project/neo#2918 as is possible.

roman-khimov added this to the v0.3.2 milestone on Jan 10, 2025

codecov bot commented Jan 10, 2025

Codecov Report

Attention: Patch coverage is 82.35294% with 3 lines in your changes missing coverage. Please review.

Project coverage is 58.49%. Comparing base (27db04c) to head (47f0422).
Report is 2 commits behind head on master.

Files with missing lines | Patch %  | Lines
rtt.go                   | 70.00%   | 2 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master     #140      +/-   ##
==========================================
+ Coverage   58.31%   58.49%   +0.17%     
==========================================
  Files          32       33       +1     
  Lines        2262     2279      +17     
==========================================
+ Hits         1319     1333      +14     
- Misses        859      861       +2     
- Partials       84       85       +1     



roman-khimov commented Jan 10, 2025

Synthetic tests relative to #56 (which is now a part of the master branch). A single node is not affected by this change, so it's omitted.

4 nodes, 5s block time, no delays, nothing to talk about:
[charts: ms_per_block_4_nodes_5s_no_delay, tps_4_nodes_5s_no_delay]

The most interesting case: 5s block time, 200ms delay, with a 1000 RPS load (much lower than the maximum). Adjustments are clearly visible there and they haven't reached full effect yet (70 blocks are needed), but TPS is the same (as expected):
[charts: ms_per_block_4_nodes_5s_delay_200ms_RPS_1000, tps_4_nodes_5s_delay_200ms_RPS_1000]

Now pushing this network to its limits: 5s block time, 200ms delay and no RPS limit. Suddenly we're regaining some of the TPS lost in #56 (with better block time):
[charts: ms_per_block_4_nodes_5s_delay_200ms, tps_4_nodes_5s_delay_200ms]

Let's make it faster with a 1s block time; without delays it's slightly better:
[charts: ms_per_block_4_nodes_1s_no_delay, tps_4_nodes_1s_no_delay]

Now 1s block time with a 200ms delay:
[charts: ms_per_block_4_nodes_1s_delay_200ms, tps_4_nodes_1s_delay_200ms]

Overall it's just better or about the same in every synthetic test.


roman-khimov commented Jan 10, 2025

And testnet. A real network scattered all over the globe: 6 countries for 7 CNs, 1s target time with 200-300ms RTT. This also includes data for #56 in isolation (but only for two nodes out of seven!), since the original data here predates the lastBlock change. Almost no transactions, a 1000-block sample:
[chart: testnet_block_time]

Zooming in on 100 blocks:
[chart: testnet_block_time, zoomed]

We're deviating by ~5% on average with an RTT of 20-30% of the block time (with a 1s target and ~250ms RTT that's roughly 50ms), good enough for me.

AnnaShaleva (Member)

We need to publish an article.

AnnaShaleva merged commit d718d92 into master on Jan 10, 2025
12 checks passed
AnnaShaleva deleted the rtt branch on January 10, 2025 16:03