Paused time is slightly nondeterministic #3179

sfackler · 2020-11-26T00:50:47Z

Version

└── tokio v0.3.4
    └── tokio-macros v0.3.1

Platform
Linux DESKTOP-DHO88R7 4.19.104-microsoft-standard #1 SMP Wed Feb 19 06:37:35 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux

Description

The time control exposed under the test-util feature allows test code involving the advance of time to run without actually needing to wait around for wall-clock time to move. In theory it should also allow those tests to be completely deterministic, with time advancing exactly the same way each time the test runs (assuming no other sources of entropy in the test of course).

However, it appears that this is not actually the case right now. The first Sleep after time is paused will observe a slightly larger amount of time passing than requested, and that surplus is nondeterministic between runs:

use tokio::time::{self, Instant, Duration};

#[tokio::main]
async fn test() {
    time::pause();
    for _ in 0..5 {
        let start = Instant::now();
        time::sleep(Duration::from_secs(1) / 11).await;
        println!("woke after {:?}", start.elapsed());
    }
}

fn main() {
    test();
    println!("------");
    test();
}

woke after 91.0741ms
woke after 91ms
woke after 91ms
woke after 91ms
woke after 91ms
------
woke after 91.1666ms
woke after 91ms
woke after 91ms
woke after 91ms
woke after 91ms

I believe this is because the timer wheel has 1ms resolution, and when time pauses the instant is at some random point in the middle of a millisecond. That random interval ends up being added to the first sleep call.

Making the paused time infrastructure absolutely deterministic would be helpful for things like simulations of behavior of an async library in various scenarios. If the execution of the simulation is deterministic, you can know that any change must have happened due to whatever modification you were making to the library itself rather than being random noise.

The text was updated successfully, but these errors were encountered:

sfackler · 2020-11-26T16:06:55Z

As a workaround, you can just stick a sleep immediately after pausing time to make everything after deterministic.

kstrafe · 2020-12-07T22:36:04Z

@sfackler sleeping directly after the pause doesn't appear to help. Tried with 0 and 1 ms.

carllerche · 2020-12-09T18:46:40Z

I don't think fixing this requires a breaking change, so I will remove the 1.0 tag.

sfackler · 2021-01-31T01:32:26Z

In 1.0, the new timer implementation is very slightly nondeterministic, and it doesn't seem like the 1ms sleep fixes things. If you run the program below twice, you often (but not always) can see the output diverge even though it should in theory be entirely deterministic.

use rand::SeedableRng;
use rand::{rngs::StdRng, Rng};
use tokio::runtime;
use tokio::time::{self, Duration, Instant};

fn main() {
    let runtime = runtime::Builder::new_current_thread()
        .enable_time()
        .build()
        .unwrap();

    runtime.block_on(async {
        time::pause();

        time::sleep(Duration::from_millis(1)).await;
        let mut rng = StdRng::seed_from_u64(1);

        let start = Instant::now();
        for _ in 0..1000 {
            let delay = rng.gen_range(Duration::from_secs(0)..Duration::from_secs(1));

            println!("{:?}: sleeping for {:?}", start.elapsed(), delay);
            time::sleep(delay).await;
            println!("{:?}: awoke", start.elapsed());
        }
    });
}

Here's diff -y output at the point at which they diverged for me locally:

...
49.92s: sleeping for 55.984549ms                                49.92s: sleeping for 55.984549ms
49.977s: awoke                                                  49.977s: awoke
49.977s: sleeping for 919.966044ms                              49.977s: sleeping for 919.966044ms
50.897s: awoke                                                | 50.898s: awoke
50.897s: sleeping for 165.606289ms                            | 50.898s: sleeping for 165.606289ms
51.063s: awoke                                                | 51.064s: awoke

The two runs eventually diverge when one wakes up 1ms later than the other, even though the sequence of sleep durations was identical in both cases.

sfackler · 2021-01-31T02:25:16Z

I think the right approach here may be to force the test-util clock to operate at 1ms resolution, even when unpaused. That would avoid any time travel when pausing/unpausing but also ensure there's no "drift" of sub-millisecond values that accumulate in the wheel.

carllerche · 2021-01-31T04:29:47Z

@bdonlan any thoughts?

sfackler · 2021-01-31T15:57:00Z

I think the actual problem is that the time driver creates its ClockTime, which is used to convert Instants to integers immediately when the runtime is created, before it has a chance to pause time. This means that there is a nondeterministic delay between when that offset is created and time is paused. That small delay means that milliseconds will round just slightly differently when the Instants passed in are close enough to a millisecond boundary with respect to the delay.

As an example, we can shrink the example program above down to just this:

use tokio::runtime;
use tokio::time::{self, Duration};

fn main() {
    let runtime = runtime::Builder::new_current_thread()
        .enable_time()
        .build()
        .unwrap();

    runtime.block_on(async {
        time::pause();

        time::sleep(Duration::from_millis(1)).await;
    });
}

And adding println!("tick {:?}, {}", dur, ms) to ClockTime::instant_to_tick, we see this output:

$ cargo run
tick 2.031699ms, 2
tick 31.7µs, 0
parking from 0 to 2
tick 2.0317ms, 2
$ cargo run
tick 2.031199ms, 2
tick 31.2µs, 0
parking from 0 to 2
tick 2.0312ms, 2

Here you can see that while we're asking to sleep for just 1ms, which should be rounded up to 1.999999ms by deadline_to_tick, the calculcated duration is actually slightly larger, with the delta being 0.0001 different between the two runs. I think that extra time is the time between runtime creation and when time is paused.

The time driver stores an Instant internally used as a "base" for future time calculations. Since this is generated as the Runtime is being constructed, it previously always happened before the user had a chance to pause time. The fractional-millisecond variations in the timing around the runtime construction and time pause cause tests running entirely in paused time to be very slightly deterministic, with the time driver advancing time by 1 millisecond more or less depending on how the sub-millisecond components of the `Instant`s involved compared. To avoid this, there is now a new option on `runtime::Builder` which will create a `Runtime` with time "instantly" paused. This, along with a small change to have the time driver use the provided clock as the source for its start time allow totally deterministic tests with paused time. Closes tokio-rs#3179

sfackler added C-bug Category: This is a bug. A-tokio Area: The main tokio crate labels Nov 26, 2020

Darksonn added the M-time Module: tokio/time label Nov 26, 2020

carllerche added this to the v1.0 milestone Dec 8, 2020

carllerche removed this from the v1.0 milestone Dec 9, 2020

sfackler mentioned this issue Jan 31, 2021

Upgrade to tokio 1.0 palantir/conjure-rust-runtime#91

Merged

carllerche assigned bdonlan Jan 31, 2021

sfackler mentioned this issue Jan 31, 2021

Make test-util paused time fully deterministic #3492

Merged

Darksonn closed this as completed in #3492 Feb 5, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Paused time is slightly nondeterministic #3179

Paused time is slightly nondeterministic #3179

sfackler commented Nov 26, 2020

sfackler commented Nov 26, 2020

kstrafe commented Dec 7, 2020

carllerche commented Dec 9, 2020

sfackler commented Jan 31, 2021 •

edited

Loading

sfackler commented Jan 31, 2021

carllerche commented Jan 31, 2021

sfackler commented Jan 31, 2021 •

edited

Loading

Paused time is slightly nondeterministic #3179

Paused time is slightly nondeterministic #3179

Comments

sfackler commented Nov 26, 2020

sfackler commented Nov 26, 2020

kstrafe commented Dec 7, 2020

carllerche commented Dec 9, 2020

sfackler commented Jan 31, 2021 • edited Loading

sfackler commented Jan 31, 2021

carllerche commented Jan 31, 2021

sfackler commented Jan 31, 2021 • edited Loading

sfackler commented Jan 31, 2021 •

edited

Loading

sfackler commented Jan 31, 2021 •

edited

Loading