Missing logs when OTLP tests are run in parallel #1223

Open
orhun opened this issue Aug 23, 2023 · 2 comments
Labels: bug (Something isn't working), triage:accepted (Has been triaged and accepted)

Comments


orhun commented Aug 23, 2023

Problem Description

I have a logger server and I'm testing exporting trace data to it with an OTLP exporter.
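Here is a trimmed-down repro. The imports are omitted from the snippets below; roughly, they would look like this (the gRPC client types are generated from the project's proto definitions, so the commented-out path is only a placeholder, not the real module path):

use opentelemetry::KeyValue;
use opentelemetry_otlp::WithExportConfig; // brings .with_endpoint() into scope
use serde_json::{json, Value};
use tonic::Request;
use tracing::{error, instrument, warn};
use tracing_subscriber::prelude::*; // brings .with() and .set_default() into scope
// Generated gRPC types; the real module path lives in the repo's proto crate.
// use shuttle_proto::logger::{logger_client::LoggerClient, LogLevel, LogsRequest};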

/// Used for assertions.
#[derive(Debug, Eq, PartialEq)]
struct MinLogItem {
    level: LogLevel,
    fields: Value,
}

/// Spawns the logger server and returns the port number.
async fn spawn_server() -> u16 {
  // strip: implementation details
  // ...
}

#[tokio::test]
async fn test_get_logs1() {
    let deployment_id = "test-deployment1";
    let port = spawn_server().await;

    // Start a subscriber and generate some logs.
    generate_logs(port, deployment_id.into());

    // Get the generated logs
    let dst = format!("http://localhost:{port}");
    let mut client = LoggerClient::connect(dst).await.unwrap();
    let response = client
        .get_logs(Request::new(LogsRequest {
            deployment_id: deployment_id.into(),
        }))
        .await
        .unwrap()
        .into_inner();

    // Assert the results.
    let expected = vec![
        MinLogItem {
            level: LogLevel::Info,
            fields: json!({"message": "[span] log_things", "deployment_id": format!("\"{deployment_id}\"") }),
        },
        MinLogItem {
            level: LogLevel::Error,
            fields: json!({"message": "error"}),
        },
        MinLogItem {
            level: LogLevel::Warn,
            fields: json!({"message": "warn"}),
        },
    ];
    assert_eq!(
        response
            .log_items
            .into_iter()
            .map(MinLogItem::from)
            .collect::<Vec<_>>(),
        expected
    );
}

#[tokio::test]
async fn test_get_logs2() {
    let deployment_id = "test-deployment2";
    // the rest is identical with the test above
    // ...
}

/// Set up a tracing subscriber and produce logs.
fn generate_logs(port: u16, deployment_id: String) {
    let tracer = opentelemetry_otlp::new_pipeline()
        .tracing()
        .with_exporter(
            opentelemetry_otlp::new_exporter()
                .tonic()
                .with_endpoint(format!("http://127.0.0.1:{port}")),
        )
        .with_trace_config(opentelemetry::sdk::trace::config().with_resource(
            opentelemetry::sdk::Resource::new(vec![
                KeyValue::new("service.name", "test"),
                KeyValue::new("deployment_id", deployment_id.clone()),
            ]),
        ))
        .install_batch(opentelemetry::runtime::Tokio)
        .unwrap();
    let otel_layer = tracing_opentelemetry::layer().with_tracer(tracer);

    let _guard = tracing_subscriber::registry()
        .with(otel_layer)
        .set_default();

    log_things(deployment_id);
}

#[instrument]
fn log_things(deployment_id: String) {
    error!("error");
    warn!("warn");
}

When I run this with cargo test, some logs are missing from the second test, which is weird because the tests shouldn't affect each other:

$ cargo test

running 2 tests
test test_get_logs2 ... FAILED
test test_get_logs1 ... ok

failures:

---- test_get_logs2 stdout ----
thread 'test_get_logs2' panicked at 'assertion failed: `(left == right)`

Diff < left / right > :
<[]
>[
>    MinLogItem {
>        level: Info,
>        fields: Object {
>            "deployment_id": String("\"test-deployment2\""),
>            "message": String("[span] log_things"),
>        },
>    },
>    MinLogItem {
>        level: Error,
>        fields: Object {
>            "message": String("error"),
>        },
>    },
>    MinLogItem {
>        level: Warn,
>        fields: Object {
>            "message": String("warn"),
>        },
>    },
>]

', logger/tests/repro.rs:168:5
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

failures:
    test_get_logs2

But if I run them in order with a single thread, they work as expected:

$ cargo test -- --test-threads 1

running 2 tests
test test_get_logs1 ... ok
test test_get_logs2 ... ok

Things that I tried

  • Switching to async batch mode with .install_batch(opentelemetry::runtime::Tokio)
  • Using the multi-threaded scheduler with #[tokio::test(flavor = "multi_thread")]
  • Exporting remaining spans with opentelemetry::global::shutdown_tracer_provider

No luck with any of them; a rough sketch of how the attempts were wired follows.
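For illustration only (a minimal sketch, not the exact test code), the second and third attempts amounted to something like this:

#[tokio::test(flavor = "multi_thread")]
async fn test_get_logs2() {
    let deployment_id = "test-deployment2";
    let port = spawn_server().await;

    generate_logs(port, deployment_id.into());

    // Try to flush/export any spans still buffered in the batch processor
    // before querying the logger server.
    opentelemetry::global::shutdown_tracer_provider();

    // ... connect with LoggerClient and assert, as in test_get_logs1 ...
}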

Reproduce steps

The code is available here: https://github.com/orhun/shuttle/blob/logger_tests_repro/logger/tests/repro.rs

$ git clone https://github.com/orhun/shuttle
$ cd shuttle
$ git checkout logger_tests_repro
$ cd logger/
$ cargo test

Additional context

shaun-cox (Contributor) commented

I see you mentioned you tried opentelemetry::global::shutdown_tracer_provider(), but I'm wondering if you adapted it with tokio to run as a task, so that any other concurrent tasks can still execute. (Otherwise, it just blocks the current thread.) A long shot, but I thought I'd mention it since I've had to do this in my own code, particularly when using the TokioCurrentThread executor.

tokio::task::spawn_blocking(opentelemetry::global::shutdown_tracer_provider)
        .await


orhun commented Aug 23, 2023

I just tried out your suggestion as follows:

    generate_logs(port, deployment_id.into()).await;
    tokio::task::spawn_blocking(opentelemetry::global::shutdown_tracer_provider)
        .await
        .unwrap();

I also tried running it after log_things(deployment_id); (by making generate_logs async), but had no luck.

TommyCpp added the bug (Something isn't working) and triage:accepted (Has been triaged and accepted) labels on Aug 23, 2023
hdost added this to the Logging Bridge SDK Stable milestone on Oct 25, 2023