Missing logs when OTLP tests are run in parallel #1223

Open
orhun opened this issue Aug 23, 2023 · 2 comments
Labels: bug (Something isn't working), triage:accepted (Has been triaged and accepted)

Comments


orhun commented Aug 23, 2023

Problem Description

I have a logger server and I'm testing exporting trace data to it with an OTLP exporter.
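Here is a trimmed-down repro. The imports are omitted from the snippets below; roughly, they would look like this (the gRPC client types are generated from the project's proto definitions, so the commented-out path is only a placeholder, not the real module path):

use opentelemetry::KeyValue;
use opentelemetry_otlp::WithExportConfig; // brings .with_endpoint() into scope
use serde_json::{json, Value};
use tonic::Request;
use tracing::{error, instrument, warn};
use tracing_subscriber::prelude::*; // brings .with() and .set_default() into scope
// Generated gRPC types; the real module path lives in the repo's proto crate.
// use shuttle_proto::logger::{logger_client::LoggerClient, LogLevel, LogsRequest};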

/// Used for assertions.
#[derive(Debug, Eq, PartialEq)]
struct MinLogItem {
    level: LogLevel,
    fields: Value,
}

/// Spawns the logger server and returns the port number.
async fn spawn_server() -> u16 {
  // strip: implementation details
  // ...
}

#[tokio::test]
async fn test_get_logs1() {
    let deployment_id = "test-deployment1";
    let port = spawn_server().await;

    // Start a subscriber and generate some logs.
    generate_logs(port, deployment_id.into());

    // Get the generated logs
    let dst = format!("http://localhost:{port}");
    let mut client = LoggerClient::connect(dst).await.unwrap();
    let response = client
        .get_logs(Request::new(LogsRequest {
            deployment_id: deployment_id.into(),
        }))
        .await
        .unwrap()
        .into_inner();

    // Assert the results.
    let expected = vec![
        MinLogItem {
            level: LogLevel::Info,
            fields: json!({"message": "[span] log_things", "deployment_id": format!("\"{deployment_id}\"") }),
        },
        MinLogItem {
            level: LogLevel::Error,
            fields: json!({"message": "error"}),
        },
        MinLogItem {
            level: LogLevel::Warn,
            fields: json!({"message": "warn"}),
        },
    ];
    assert_eq!(
        response
            .log_items
            .into_iter()
            .map(MinLogItem::from)
            .collect::<Vec<_>>(),
        expected
    );
}

#[tokio::test]
async fn test_get_logs2() {
    let deployment_id = "test-deployment2";
    // the rest is identical with the test above
    // ...
}

/// Set up a tracing subscriber and produce logs.
fn generate_logs(port: u16, deployment_id: String) {
    let tracer = opentelemetry_otlp::new_pipeline()
        .tracing()
        .with_exporter(
            opentelemetry_otlp::new_exporter()
                .tonic()
                .with_endpoint(format!("http://127.0.0.1:{port}")),
        )
        .with_trace_config(opentelemetry::sdk::trace::config().with_resource(
            opentelemetry::sdk::Resource::new(vec![
                KeyValue::new("service.name", "test"),
                KeyValue::new("deployment_id", deployment_id.clone()),
            ]),
        ))
        .install_batch(opentelemetry::runtime::Tokio)
        .unwrap();
    let otel_layer = tracing_opentelemetry::layer().with_tracer(tracer);

    let _guard = tracing_subscriber::registry()
        .with(otel_layer)
        .set_default();

    log_things(deployment_id);
}

#[instrument]
fn log_things(deployment_id: String) {
    error!("error");
    warn!("warn");
}

When I run this with cargo test, some logs are missing from the second test, which is weird because the tests shouldn't affect each other:

$ cargo test

running 2 tests
test test_get_logs2 ... FAILED
test test_get_logs1 ... ok

failures:

---- test_get_logs2 stdout ----
thread 'test_get_logs2' panicked at 'assertion failed: `(left == right)`

Diff < left / right > :
<[]
>[
>    MinLogItem {
>        level: Info,
>        fields: Object {
>            "deployment_id": String("\"test-deployment2\""),
>            "message": String("[span] log_things"),
>        },
>    },
>    MinLogItem {
>        level: Error,
>        fields: Object {
>            "message": String("error"),
>        },
>    },
>    MinLogItem {
>        level: Warn,
>        fields: Object {
>            "message": String("warn"),
>        },
>    },
>]

', logger/tests/repro.rs:168:5
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

failures:
    test_get_logs2

But if I run them in order with a single thread, they work as expected:

$ cargo test -- --test-threads 1

running 2 tests
test test_get_logs1 ... ok
test test_get_logs2 ... ok

Things that I tried

  • Switching to async batch mode with .install_batch(opentelemetry::runtime::Tokio)
  • Using the multi-threaded scheduler with #[tokio::test(flavor = "multi_thread")]
  • Exporting remaining spans with opentelemetry::global::shutdown_tracer_provider

No luck with any of them; a rough sketch of how the attempts were wired follows.
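For illustration only (a minimal sketch, not the exact test code), the second and third attempts amounted to something like this:

#[tokio::test(flavor = "multi_thread")]
async fn test_get_logs2() {
    let deployment_id = "test-deployment2";
    let port = spawn_server().await;

    generate_logs(port, deployment_id.into());

    // Try to flush/export any spans still buffered in the batch processor
    // before querying the logger server.
    opentelemetry::global::shutdown_tracer_provider();

    // ... connect with LoggerClient and assert, as in test_get_logs1 ...
}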

Reproduce steps

The code is available here: https://github.com/orhun/shuttle/blob/logger_tests_repro/logger/tests/repro.rs

$ git clone https://github.com/orhun/shuttle
$ cd shuttle
$ git checkout logger_tests_repro
$ cd logger/
$ cargo test

Additional context

shaun-cox (Contributor) commented

I see you mentioned you tried opentelemetry::global::shutdown_tracer_provider(), but I'm wondering if you adapted it with tokio to run as a task, so that any other concurrent tasks can still execute. (Otherwise, it just blocks the current thread.) A long shot, but I thought I'd mention it since I've had to do this in my own code, particularly when using the TokioCurrentThread executor.

tokio::task::spawn_blocking(opentelemetry::global::shutdown_tracer_provider)
        .await


orhun commented Aug 23, 2023

I just tried out your suggestion as follows:

    generate_logs(port, deployment_id.into()).await;
    tokio::task::spawn_blocking(opentelemetry::global::shutdown_tracer_provider)
        .await
        .unwrap();

I also tried running it after log_things(deployment_id); (by making generate_logs async), but had no luck.

TommyCpp added the bug (Something isn't working) and triage:accepted (Has been triaged and accepted) labels on Aug 23, 2023
hdost added this to the Logging Bridge SDK Stable milestone on Oct 25, 2023