Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Bug]: OTLP exporter fails to send data to backend requiring TLS #2008

Open
alanwest opened this issue Aug 9, 2024 · 4 comments
Open

[Bug]: OTLP exporter fails to send data to backend requiring TLS #2008

alanwest opened this issue Aug 9, 2024 · 4 comments
Assignees
Labels
documentation/examples Improvements or additions to documentation or examples

Comments

@alanwest
Copy link
Member

alanwest commented Aug 9, 2024

What happened?

The following code worked with opentelemetry/opentelemetry_sdk 0.23.0 and opentelemetry-otlp 0.16.0

I have the following environment variables set

  • OTEL_EXPORTER_OTLP_ENDPOINT=https://my-backend
  • OTEL_EXPORTER_OTLP_HEADERS=key-required-by-my-backend=value

Cargo.toml

[package]
name = "otlp_test"
version = "0.1.0"
edition = "2021"

[dependencies]
opentelemetry = "0.23.0"
opentelemetry-otlp = { version =  "0.16.0", features = ["tls", "tls-roots"] }
opentelemetry_sdk = { version = "0.23.0", features = ["rt-tokio-current-thread"] }
tokio = { version = "1.39.2", features = ["rt-multi-thread"] }

main.rs

use opentelemetry::{global::{self, ObjectSafeSpan}, trace::{Tracer, TracerProvider}, KeyValue};
use opentelemetry_sdk::{propagation::TraceContextPropagator, runtime, trace, Resource};

#[tokio::main]
async fn main() {
    global::set_text_map_propagator(TraceContextPropagator::new());

    let resource = Resource::new(vec![KeyValue::new(
        "service.name",
        "MyService"
    )]);

    let exporter = opentelemetry_otlp::new_exporter().tonic();

    let _tracer = opentelemetry_otlp::new_pipeline()
        .tracing()
        .with_exporter(exporter)
        .with_trace_config(trace::Config::default().with_resource(resource))
        .install_batch(runtime::TokioCurrentThread)
        .expect("failed to initialize trace pipeline");

    let tracer = global::tracer_provider().tracer("MyTracer");

    let mut span = tracer.start("MySpan");
    span.end();

    global::shutdown_tracer_provider();
}

Then I upgraded to opentelemetry 0.24.0/opentelemetry_sdk 0.24.1/opentelemetry-otlp 0.17.0

Cargo.toml

[package]
name = "otlp_test"
version = "0.1.0"
edition = "2021"

[dependencies]
opentelemetry = "0.24.0"
opentelemetry-otlp = { version =  "0.17.0", features = ["tls", "tls-roots"] }
opentelemetry_sdk = { version = "0.24.1", features = ["rt-tokio-current-thread"] }
tokio = { version = "1.39.2", features = ["rt-multi-thread"] }

main.rs

use opentelemetry::{global::{self, ObjectSafeSpan}, trace::{Tracer, TracerProvider}, KeyValue};
use opentelemetry_sdk::{propagation::TraceContextPropagator, runtime, trace, Resource};

#[tokio::main]
async fn main() {
    global::set_text_map_propagator(TraceContextPropagator::new());

    let resource = Resource::new(vec![KeyValue::new(
        "service.name",
        "MyService"
    )]);

    let exporter = opentelemetry_otlp::new_exporter().tonic();

    let tracer_provider = opentelemetry_otlp::new_pipeline()
        .tracing()
        .with_exporter(exporter)
        .with_trace_config(trace::Config::default().with_resource(resource))
        .install_batch(runtime::TokioCurrentThread)
        .expect("failed to initialize trace pipeline");

    global::set_tracer_provider(tracer_provider);

    let tracer = global::tracer_provider().tracer("MyTracer");

    let mut span = tracer.start("MySpan");
    span.end();

    global::shutdown_tracer_provider();
}

I receive the following error

OpenTelemetry trace error occurred. Exporter otlp encountered the following error(s): the grpc server returns error (The service is currently unavailable): , detailed error message: Connecting to HTTPS without TLS enabled

I also tried

let exporter = opentelemetry_otlp::new_exporter().tonic().with_tls_config(tonic::transport::ClientTlsConfig::new());

But this produces a different error as it does not seem to load the system's root certs

OpenTelemetry trace error occurred. Exporter otlp encountered the following error(s): the grpc server returns error (The service is currently unavailable): , detailed error message: invalid peer certificate: UnknownIssuer

API Version

0.24.0

SDK Version

0.24.1

What Exporter(s) are you seeing the problem on?

OTLP

Relevant log output

OpenTelemetry trace error occurred. Exporter otlp encountered the following error(s): the grpc server returns error (The service is currently unavailable): , detailed error message: Connecting to HTTPS without TLS enabled
@alanwest alanwest added bug Something isn't working triage:todo Needs to be traiged. labels Aug 9, 2024
@lalitb
Copy link
Member

lalitb commented Aug 14, 2024

@alanwest I don't see any TLS config setting in the code above - are you setting any CA certificate(chain) in your example? Otherwise, the code is pretty similar to the basic-otlp example here - https://github.com/open-telemetry/opentelemetry-rust/tree/main/opentelemetry-otlp/examples/basic-otlp, and should work with new otel v24.

@lalitb lalitb self-assigned this Aug 14, 2024
@alanwest
Copy link
Member Author

I don't see any TLS config setting in the code above - are you setting any CA certificate(chain) in your example?

@lalitb Should I need to? I did not need to with v23. Is this a new requirement with v24?

My expectation is that I should not need to provide any special TLS config in the default case. That is, the system's certificates should be loaded. I assume this was what was happening prior to v24 and is also the behavior of the other language SDKs.

@lalitb
Copy link
Member

lalitb commented Aug 15, 2024

@alanwest From OpenTelemetry perspective, there is no change in how TLS config is handled. The only relevant change done in OpenTelemetry v0.24 is an upgrade to Tonic v0.12.0. There are a few changes regarding TLS in this release if it helps identify the issue:

https://github.com/hyperium/tonic/releases/tag/v0.12.0

--clip

tls: Add ability to add multiple ca certificates (hyperium/tonic#1724) (3457f92)
tls: Use rustls_pki_types::CertificateDer to describe DER encoded certificate (hyperium/tonic#1707) (96a8cbc)
tls: Remove tls roots implicit configuration (hyperium/tonic#1731) (de73617)

--

@alanwest
Copy link
Member Author

alanwest commented Aug 15, 2024

Thanks for the sleuth work!

The issue is specifically related to hyperium/tonic#1731. They acknowledge this as a breaking change in a follow up hyperium/tonic#1781. So in effect upgrading to this version also created a breaking change for opentelemetry-rust.

I was really close to a solution in my original post. This is the tracer provider config that works (I missed calling the new with_native_roots method introduced in tonic 1.12).

let tracer_provider = opentelemetry_otlp::new_pipeline()
        .tracing()
        .with_exporter(opentelemetry_otlp::new_exporter().tonic()
            .with_tls_config(tonic::transport::ClientTlsConfig::new().with_native_roots()))
        .with_trace_config(Config::default().with_resource(resource))
        .install_batch(runtime::Tokio)
        .expect("failed to initialize the trace pipeline");

Personally, I'd prefer the TonicExporterBuilder default to including native roots to achieve the same behavior prior to opentelemetry-rust upgrading to tonic 1.12. I think this would be more inline with what other languages are doing. That is, if https is used as the scheme for the endpoint, then configure a vanilla TLS config and include the native roots.

Though, at a bare minimum, I think documenting this new required usage in the examples would be helpful.

@cijothomas cijothomas added documentation/examples Improvements or additions to documentation or examples and removed bug Something isn't working triage:todo Needs to be traiged. labels Aug 16, 2024
philipcristiano added a commit to philipcristiano/rust_service_conventions that referenced this issue Sep 19, 2024
Tonic changed defaults that broke "how things worked" with otel

open-telemetry/opentelemetry-rust#2008

Now be more explicit to enable the previous behavior
philipcristiano added a commit to philipcristiano/rust_service_conventions that referenced this issue Sep 19, 2024
Tonic changed defaults that broke "how things worked" with otel

open-telemetry/opentelemetry-rust#2008

Now be more explicit to enable the previous behavior
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
documentation/examples Improvements or additions to documentation or examples
Projects
None yet
Development

No branches or pull requests

3 participants