Errors not being reported for kong plugin Opentelemetry #13776
Comments
@pawandhiman10 can you attach your config file for the OTEL plugin?
Hello @pawandhiman10, I believe Datadog might be expecting slightly different fields based on this. Are you using OTLP ingestion with the Datadog agent? In that case, I would expect the OTLP ingestion process to take care of parsing any errors and translating any fields as needed. I can confirm that errors are displayed correctly by other tools such as Jaeger and Grafana. Also, could you share an example of an error that your system reports and that you expect to be visible in the UI?
@samugi By errors I mean 5xx status codes here. Yes, ingestion currently goes through OTLP ingestion with the Datadog agent. Could you point me to some reference docs on how to get the errors (5xx) parsed?
@pawandhiman10 in that case, it might be expected. 500 response status codes are not always reported as errors. Today, errors are reported for failures (exceptions) that occur during the execution of plugin phase handlers, for errors returned by the internal HTTP client, and for DNS failures. Could you test with a 500 status code that originates from an exception in a plugin? This can be tested with a plugin that throws an explicit error like:
@samugi It seems the error tag is missing on spans with 5xx status codes. Is there any way to add this tag for requests with a status code >= 500?
@pawandhiman10 what you describe is currently expected. Status codes are reported (whether they are 2xx, 4xx, 5xx, etc.), but spans are not set to the error state when this happens. Errors are currently reserved for actual errors occurring during the execution of plugin code (exceptions) and a few other scenarios that don't include specific response codes. Would you be interested in setting an error state on your root span when your upstream returns a 5xx status code? If so, such a change should probably be configurable: for example, some users expect 4xx response codes from time to time, while others would consider them error conditions. We are always interested in improving and updating our tracing instrumentation, but I cannot guarantee that this change in particular will be made.
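To make the "configurable" idea above concrete, here is a minimal sketch of how a status code could be mapped to a span error state. It is not Kong's implementation: `span_status_for`, `SpanStatus`, and the `error_from` threshold are hypothetical names used only for illustration.

```python
from enum import Enum


class SpanStatus(Enum):
    UNSET = "unset"
    ERROR = "error"


def span_status_for(http_status: int, error_from: int = 500) -> SpanStatus:
    """Return the status a root span would get for an upstream response code.

    `error_from` is a hypothetical, user-configurable threshold:
    500 flags only 5xx responses, 400 would also flag 4xx responses.
    """
    if not 100 <= http_status <= 599:
        raise ValueError(f"not an HTTP status code: {http_status}")
    # Anything at or above the threshold is treated as an error condition.
    return SpanStatus.ERROR if http_status >= error_from else SpanStatus.UNSET
```

With the default threshold, `span_status_for(404)` leaves the span unset while `span_status_for(503)` marks it as an error; passing `error_from=400` would mark both.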
@pawandhiman10 if you have access to the collector configuration, you can use the
Here are the enums available, and the explanation why I've suggested |
@julianocosta89 This is currently managed by the Datadog agent without any collector config. @samugi Could you also help me with setting up As for the error (5xx) reporting in Datadog, I will experiment with the code. Thanks for that.
@pawandhiman10 at the moment the sampling strategy is not configurable for this plugin. What behavior in particular are you attempting to configure?
@samugi I believe this is required according to the OTEL page, which says to set the following env variables in Datadog. ref: Link We are enabling OpenTelemetry to send traces to Datadog with probabilistic sampling, and we need to control the sampling rate of these traces. For that, the following envs need to be set: link I'm not quite sure how this works, but I'm trying to reduce the number of ingested spans when using this opentelemetry plugin.
Just to give more context here: if you want to use Datadog with probabilistic sampling, you need to ensure all traces are sent to the Datadog agent, and the agent will make the sampling decision based on your configuration. The agent can only act on the traces that it receives. If different services make different sampling decisions, that will probably break some traces.
@pawandhiman10 @julianocosta89 does the following describe your scenario?
Kong's OpenTelemetry plugin only supports head sampling today so your options for sampling, depending on your architecture choices, would be:
|
So setting |
That's right @julianocosta89, 1 stands for 100%, i.e. all requests will be traced and reported.
@pawandhiman10 that's what you are looking for |
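As a rough illustration of the head sampling discussed in this thread, the sketch below makes a deterministic keep/drop decision from the trace ID and a rate between 0 and 1. It is an assumption-laden example, not Kong's or Datadog's actual algorithm: `should_sample` and `rate` are hypothetical names, and the approach is only similar in spirit to the trace-ID ratio samplers found in OpenTelemetry SDKs.

```python
TRACE_ID_LIMIT = 1 << 64  # the decision uses only the low 64 bits of the trace ID


def should_sample(trace_id: int, rate: float) -> bool:
    """Decide at the start of a trace (head sampling) whether to keep it.

    The decision depends only on the trace ID and the rate, so every
    component using the same rate reaches the same decision for a trace.
    """
    if not 0.0 <= rate <= 1.0:
        raise ValueError("rate must be between 0 and 1")
    bound = round(rate * TRACE_ID_LIMIT)
    # Keep the trace when its low 64 bits fall below the bound.
    return (trace_id & (TRACE_ID_LIMIT - 1)) < bound
```

A rate of 1 keeps every trace, so a downstream agent receives all of them and can apply its own sampling; a rate of 0 drops every trace.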
Is there an existing issue for this?
Kong version ($ kong version)
3.7.1
Current Behavior
We have installed this plugin for traces in Kong.
Plugin link
We are sending traces to Datadog with this plugin, but the traces only cover requests; errors are not captured at all.
Expected Behavior
Errors should be captured and reported in the UI.
Steps To Reproduce
Install this plugin using Helm.
We have attached this plugin to a service with the kong annotations:
konghq.com/plugins: opentelemetry
Configuration reference for the kongplugin on the plugin definition page.
This is connected to the OTel configuration exposed by the Datadog agent on HTTP port 4318.
Anything else?
No response