test(subscriber): add initial integration tests #452

hds · 2023-07-18T15:39:25Z

The console-subscriber crate has no integration tests. There are some
unit tests, but they don't cover many of its features.

Recently, we've found or fixed a few errors which probably could have
been caught by a medium level of integration testing.

However, testing console-subscriber isn't straightforward. It is
effectively a tracing subscriber (or layer) on one end, and a gRPC
server on the other end.

This change adds enough of a testing framework to write some initial
integration tests. It is the first step towards closing #450.

Each test comprises 2 parts:

- One or more "expected tasks"
- A future which will be driven to completion on a dedicated Tokio runtime.

Behind the scenes, a console subscriber layer is created and its server
part is connected to a duplex stream. The client of the duplex stream
then records incoming updates and reconstructs "actual tasks". The layer
itself is set as the default subscriber for the duration of block_on,
which is used to drive the provided future to completion.
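
To illustrate the "record updates and reconstruct actual tasks" step, here is a minimal sketch. It is not the framework's code: it swaps the gRPC duplex stream for a plain `std::sync::mpsc` channel, and the `Update` enum, `ActualTask` struct, and function names are all hypothetical stand-ins chosen for the example.

```rust
use std::collections::HashMap;
use std::sync::mpsc;
use std::thread;

/// Hypothetical, simplified stand-in for the updates the client receives.
#[derive(Debug)]
enum Update {
    NewTask { id: u64, name: String },
    Wake { id: u64 },
    Done,
}

/// Hypothetical "actual task" rebuilt from the update stream.
#[derive(Debug, Default, PartialEq)]
struct ActualTask {
    name: String,
    wakes: u64,
}

/// Drains updates until `Done` (or until the sender is dropped) and
/// rebuilds the state of every task seen along the way.
fn record_actual_tasks(rx: mpsc::Receiver<Update>) -> HashMap<u64, ActualTask> {
    let mut tasks = HashMap::new();
    for update in rx {
        match update {
            Update::NewTask { id, name } => {
                tasks.insert(id, ActualTask { name, wakes: 0 });
            }
            Update::Wake { id } => {
                // Wakes for unknown tasks are ignored in this sketch.
                if let Some(task) = tasks.get_mut(&id) {
                    task.wakes += 1;
                }
            }
            Update::Done => break,
        }
    }
    tasks
}

/// Plays the "server" side on another thread and returns what the client saw.
fn run_example() -> HashMap<u64, ActualTask> {
    let (tx, rx) = mpsc::channel();
    let server = thread::spawn(move || {
        tx.send(Update::NewTask { id: 1, name: "my-task".to_string() }).unwrap();
        tx.send(Update::Wake { id: 1 }).unwrap();
        tx.send(Update::Done).unwrap();
    });
    let tasks = record_actual_tasks(rx);
    server.join().unwrap();
    tasks
}
```

The point of the sketch is only the shape of the client side: it never inspects the subscriber directly, it sees nothing but the stream of updates.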

The expected tasks have a set of "matches", which is how we find the
actual task that we want to validate against. Currently, the only value
we match on is the task's name.

The expected tasks also have a set of expectations. These are other
fields on the actual task which are validated once a matching task is
found. Currently, the two fields which can have expectations set on them
are the wakes and self_wakes fields.
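
The split between "matches" and "expectations" could be modelled roughly as below. This is a sketch under assumptions, not the framework's implementation: the field layout, the `matches` and `validate` methods, and the simplified `ActualTask` are hypothetical. Only the builder method names `match_name` and `expect_wakes` come from this description.

```rust
/// Hypothetical, simplified actual task as reconstructed by the test client.
#[derive(Debug, Clone)]
struct ActualTask {
    name: Option<String>,
    wakes: u64,
    self_wakes: u64,
}

/// Sketch of an expected task: "match" fields select an actual task,
/// "expect" fields are validated once a matching task has been found.
#[derive(Debug, Default, Clone)]
struct ExpectedTask {
    match_name: Option<String>,
    expect_wakes: Option<u64>,
    expect_self_wakes: Option<u64>,
}

impl ExpectedTask {
    fn match_name(mut self, name: &str) -> Self {
        self.match_name = Some(name.to_string());
        self
    }

    fn expect_wakes(mut self, wakes: u64) -> Self {
        self.expect_wakes = Some(wakes);
        self
    }

    fn expect_self_wakes(mut self, self_wakes: u64) -> Self {
        self.expect_self_wakes = Some(self_wakes);
        self
    }

    /// A task matches only if a name matcher is set and the names are equal.
    fn matches(&self, actual: &ActualTask) -> bool {
        match &self.match_name {
            Some(name) => actual.name.as_deref() == Some(name.as_str()),
            None => false,
        }
    }

    /// Finds the first matching task, then checks every expectation that was set.
    fn validate(&self, actual_tasks: &[ActualTask]) -> Result<(), String> {
        let actual = actual_tasks
            .iter()
            .find(|task| self.matches(task))
            .ok_or_else(|| format!("no task matched {:?}", self.match_name))?;
        if let Some(expected) = self.expect_wakes {
            if actual.wakes != expected {
                return Err(format!("wakes: expected {expected}, actual {}", actual.wakes));
            }
        }
        if let Some(expected) = self.expect_self_wakes {
            if actual.self_wakes != expected {
                return Err(format!(
                    "self_wakes: expected {expected}, actual {}",
                    actual.self_wakes
                ));
            }
        }
        Ok(())
    }
}
```

Keeping the two sets separate means a failure can say whether no task was found at all or whether a task was found but had the wrong counts.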

So, to construct an expected task, which will match a task with the name
"my-task" and then validate that the matched task gets woken once, the
code would be:

ExpectedTask::default()
    .match_name("my-task")
    .expect_wakes(1);

A future which passes this test could be:

async {
    task::Builder::new()
        .name("my-task")
        .spawn(async {
            tokio::time::sleep(std::time::Duration::ZERO).await
        })
}

The full test would then look like:

#[test]
fn wakes_once() {
    let expected_task = ExpectedTask::default()
        .match_name("my-task")
        .expect_wakes(1);

    let future = async {
        task::Builder::new()
            .name("my-task")
            .spawn(async {
                tokio::time::sleep(std::time::Duration::ZERO).await
            })
    };

    assert_task(expected_task, future);
}

The PR depends on 2 others:

- fix(subscriber): correct retain logic #447, which fixes an error in the logic that determines whether a task is retained in the aggregator or not.
- feat(subscriber) expose server parts #451, which exposes the server parts and is necessary to allow us to connect the instrument server and client via a duplex channel.

This change contains some initial tests for wakes and self wakes which
would have caught the error fixed in #430. Additionally, there are tests
for the functionality of the testing framework itself.

hawkw

overall, this looks great, thanks for working on this! i left a bunch of relatively minor comments, but overall, I'd be happy to merge this change!

console-subscriber/tests/support/state.rs

console-subscriber/tests/support/task.rs

hds · 2023-08-01T23:23:49Z

> overall, this looks great, thanks for working on this! i left a bunch of relatively minor comments, but overall, I'd be happy to merge this change!

@hawkw Thanks for the review!

Unfortunately, I just realised today that the CI is telling me that I've got a race condition (and I really thought I'd removed all of them). I'm not seeing it locally, and it behaves differently on different platforms: some platforms fail, and some actually seem to hang forever. So this PR is going to need a bit of work to find and fix that issue.

hawkw · 2023-08-23T17:15:34Z

@hds is this branch ready for a review now?

hds · 2023-08-23T22:26:14Z

@hawkw unfortunately not yet. I've still got a problem with the test run on CI when executing on Ubuntu. It hangs because the message that the test has been run never gets to the instrumentation client.

Work has been busy lately, so getting to the bottom of this problem has been slow going.

hawkw · 2023-08-23T23:04:57Z

@hds okay, cool, thanks for letting me know!

Rather than relying on all the tasks becoming visible N update iterations after the test ends, we spawn a signal task which we then look for. Once the test has completed (which will almost certainly happen first) and the signal task has been read, we finish parsing the current update and then finish immediately.
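
The termination rule described here can be sketched as a small loop over updates. Everything in this block is an assumption made for illustration: the signal task's name, the `TestState` enum, and the representation of an update as "test state plus the names of newly seen tasks" are hypothetical, not taken from the framework.

```rust
/// Hypothetical name; the real signal task's name is not given here.
const SIGNAL_TASK_NAME: &str = "signal-task";

#[derive(Debug, Clone, Copy, PartialEq)]
enum TestState {
    Running,
    Completed,
}

/// Each update carries the test state at that point and the names of the
/// tasks that first appeared in it. Returns how many updates were consumed
/// before the recorder could finish, or `None` if it never could.
fn updates_until_finished(updates: &[(TestState, Vec<&str>)]) -> Option<usize> {
    let mut signal_seen = false;
    for (index, (state, new_tasks)) in updates.iter().enumerate() {
        // Finish parsing the whole update before deciding whether to stop,
        // so tasks that arrive alongside the signal task are not lost.
        for name in new_tasks {
            if *name == SIGNAL_TASK_NAME {
                signal_seen = true;
            }
        }
        if *state == TestState::Completed && signal_seen {
            return Some(index + 1);
        }
    }
    None
}
```

Compared with waiting for a fixed number of further updates, the stop condition here depends on something observed in the stream rather than on timing.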

hds · 2023-09-05T16:01:20Z

@hawkw This one is ready for re-review when you have a moment. Thanks!

hawkw · 2023-09-05T16:11:47Z

> @hawkw This one is ready for re-review when you have a moment. Thanks!

awesome, thanks for all the time you've spent on this!

hawkw

overall, this looks good to me! i left a bunch of small suggestions, but none of them are major blockers.

console-subscriber/tests/framework.rs

console-subscriber/Cargo.toml

console-subscriber/tests/support/mod.rs

hawkw · 2023-09-05T16:18:55Z

console-subscriber/tests/support/mod.rs

+pub(crate) fn assert_tasks<Fut>(expected_tasks: Vec<ExpectedTask>, future: Fut)
+where
+    Fut: Future + Send + 'static,
+    Fut::Output: Send + 'static,
+{
+    run_test(expected_tasks, future)
+}


tiny nit, take it or leave it: is there a reason this is a whole additional function, rather than just being a re-export of run_test? we could

pub use subscriber::run_test as assert_tasks;

if we want it to be named assert_tasks (but also, we could just name the original function that...)

It was more a case of putting the "internal public" functions together to make the documentation clearer. Otherwise we'd have "public" docs here and on the run_test function. Not sure what the best practice is in this case to be honest.
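
For comparison, the two options discussed in this thread look roughly like this. The module names and the simplified `run_test` signature are stand-ins for the example only; the real function takes expected tasks and a future.

```rust
mod subscriber {
    // Stand-in for the real `run_test`; here it just counts the expected tasks.
    pub fn run_test(expected_tasks: Vec<&str>) -> usize {
        expected_tasks.len()
    }
}

mod wrapper {
    /// Option 1: a thin wrapper function. Its documentation lives here,
    /// next to the other "internal public" functions.
    pub fn assert_tasks(expected_tasks: Vec<&str>) -> usize {
        super::subscriber::run_test(expected_tasks)
    }
}

mod reexport {
    // Option 2: re-export the same function under a different name.
    // Documentation shown for it comes from `subscriber::run_test`.
    pub use super::subscriber::run_test as assert_tasks;
}
```

Both compile to the same call; the difference is only where the documentation for the public entry point is written.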

console-subscriber/tests/support/state.rs

console-subscriber/tests/support/subscriber.rs

console-subscriber/tests/support/task.rs

hds · 2023-09-06T13:38:07Z

> overall, this looks good to me! i left a bunch of small suggestions, but none of them are major blockers.

@hawkw Thank you so much for all these suggestions! I really appreciate the effort, the PR is much better for them.

A flakiness problem has been discovered with the `console-subscriber` integration tests introduced in #452. Issue #473 is tracking it. It has been observed that we only "miss" the wake operation event when it comes from `yield_now()`, but not when it comes from a task that yielded due to `sleep`, even when the duration is zero. It is likely that this is due to the nature of the underlying race condition. This change removes all the calls to `yield_now()` from the `framework` tests, except those where we wish to actually test self wakes. Additionally, all the sleeps have been moved out into a separate function which describes why we're using `sleep` instead of `yield_now` when either of them would be sufficient.
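
As background on why `yield_now()` is tied to self wakes: a yield-style future wakes its own waker from inside `poll` and then returns `Pending`. The sketch below shows that pattern using only the standard library. It is not Tokio's implementation of `yield_now`, and `YieldOnce`, `CountingWaker`, and `poll_to_completion` are names made up for the example.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Minimal yield-style future: the first poll wakes its own waker and
/// returns `Pending`, the second poll completes.
struct YieldOnce {
    yielded: bool,
}

impl Future for YieldOnce {
    type Output = ();

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        if self.yielded {
            return Poll::Ready(());
        }
        self.yielded = true;
        // The wake happens while the future itself is being polled:
        // this is what a "self wake" refers to.
        cx.waker().wake_by_ref();
        Poll::Pending
    }
}

/// Waker that only counts how many times it was woken.
struct CountingWaker {
    wakes: AtomicUsize,
}

impl Wake for CountingWaker {
    fn wake(self: Arc<Self>) {
        self.wakes.fetch_add(1, Ordering::SeqCst);
    }
}

/// Polls `YieldOnce` to completion and returns (number of polls, number of wakes).
fn poll_to_completion() -> (usize, usize) {
    let counter = Arc::new(CountingWaker { wakes: AtomicUsize::new(0) });
    let waker = Waker::from(counter.clone());
    let mut cx = Context::from_waker(&waker);
    let mut future = Box::pin(YieldOnce { yielded: false });
    let mut polls = 0;
    loop {
        polls += 1;
        if future.as_mut().poll(&mut cx).is_ready() {
            break;
        }
    }
    (polls, counter.wakes.load(Ordering::SeqCst))
}
```

A zero-duration `sleep`, by contrast, is woken by the runtime's timer rather than by the task itself, which is consistent with the two behaving differently in the flaky tests.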

hds requested a review from a team as a code owner July 18, 2023 15:39

hds force-pushed the hds/subscriber-tests branch 2 times, most recently from aac5675 to b68ef76 July 18, 2023 15:40

hds force-pushed the hds/subscriber-tests branch from b68ef76 to 42fb829 August 1, 2023 15:34

fixed warning (unneeded mut)

813d020

hawkw reviewed Aug 1, 2023

hds added 2 commits August 17, 2023 13:11

wait for additional update from subscriber

82abe73

After the test ends, we were waiting for a single further update before evaluating the actual tasks (vs. the expected tasks). Now we wait for 2 updates.

single threaded tests to debug

5d3245d

hds mentioned this pull request Aug 18, 2023

DO NOT MERGE test(subscriber): additional testing of integration tests on CI #458

Closed

hds added 4 commits August 23, 2023 00:06

some more debugging stuff

c1a2565

Trace to stdout for CI debugging

b56c123

sigh.

enable all OSes

380345a

bit more tracing

695644e

hds and others added 9 commits August 24, 2023 13:38

output traces from within runtime under test to console

bd9c315

confirming a suspicion

a61bdd6

Increase test state channel capacity

76bb061

maybe avoid a data race

7b32fdc

undo all testing changes

3057541

This commit has the same changeset as the original commit.

Apply suggestions from code review

d774483

Co-authored-by: Eliza Weisman <eliza@buoyant.io>

Merge branch 'main' into hds/subscriber-tests

e24d484

improved rightward drift in update loop

fee7b0a

The one in the instrumentation client.

hds mentioned this pull request Sep 5, 2023

task: fix spawn_local source location for console tokio-rs/tokio#5984

Merged

hawkw approved these changes Sep 5, 2023

hds and others added 4 commits September 6, 2023 13:48

Apply suggestions from code review

514aba1

Co-authored-by: Eliza Weisman <eliza@buoyant.io>

Apply suggestions from code review (part 2)

4af221a

Co-authored-by: Eliza Weisman <eliza@buoyant.io>

Update console-subscriber/tests/support/subscriber.rs

7619741

Co-authored-by: Eliza Weisman <eliza@buoyant.io>

Suggestions from code review

6fce209

Thanks!

hds mentioned this pull request Sep 6, 2023

Don't send task names as strings #462

Open

hds merged commit 90ae016 into main Sep 6, 2023

hds deleted the hds/subscriber-tests branch September 6, 2023 13:42

hds mentioned this pull request Oct 11, 2023

Flaky console-subscriber integration tests #473

Open

hds mentioned this pull request Oct 11, 2023

test(subscriber): prefer sleep over yield_now in tests #475

Merged