Is your feature request related to a problem? Please describe.
We observed that Fault/Error metrics were not being generated for spans produced by the AWS SDK instrumentation. Here are the details of the investigation by @thpierce:
Fault/Error metrics are generated based on the http.status_code attribute found in spans.
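This attribute is populated by the following call chain:
TracingExecutionInterceptor.afterExecution calls onHttpResponseAvailable, which calls HttpCommonAttributesExtractor.onEnd.
onEnd calls AwsSdkHttpAttributesGetter.getStatusCode only if the response is not null.
getStatusCode returns the status code from the response and passes it back up to onEnd, which sets the http.status_code attribute value accordingly.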
This workflow works exactly as expected when the AWS SDK calls an AWS API and gets back a 200 response status: the SDK constructs a response object with that status code and returns it, triggering the workflow.
However, if the API returns a non-200 status code (e.g. an error or fault code), the AWS SDK simply throws an exception. This means two things:
TracingExecutionInterceptor.afterExecution is not called at all. Instead, TracingExecutionInterceptor.onExecutionFailure is called, which never calls into HttpCommonAttributesExtractor.onEnd.
Even if onExecutionFailure called onEnd, the response would be null and getStatusCode would not be called.
The net result is that http.status_code is not set, so no metrics are produced. This is clearly by design, as AwsSdkHttpAttributesGetter implements the generic HttpCommonAttributesGetter, which has the following JavaDoc for getStatusCode: "This is called from Instrumenter.end(Context, Object, Object, Throwable) only when response is non-null."
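To make the two paths concrete, here is a minimal, self-contained sketch of the behavior described above. The classes SketchResponse, SketchExtractor and SketchInterceptor are hypothetical stand-ins written for this illustration; they are not the real AWS SDK or OpenTelemetry types, and the real code is considerably more involved.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for an SDK HTTP response (not a real SDK type).
final class SketchResponse {
  final int statusCode;

  SketchResponse(int statusCode) {
    this.statusCode = statusCode;
  }
}

// Hypothetical stand-in for the attributes extractor.
final class SketchExtractor {
  // Mirrors the documented contract: the status code is read only when response is non-null.
  static void onEnd(Map<String, Object> attributes, SketchResponse response, Throwable error) {
    if (response != null) {
      attributes.put("http.status_code", response.statusCode);
    }
    // The error is not inspected for a status code, matching the behavior described above.
  }
}

// Hypothetical stand-in for the execution interceptor.
final class SketchInterceptor {
  // Success path: a response object exists, so onEnd records the status code.
  static Map<String, Object> afterExecution(SketchResponse response) {
    Map<String, Object> attributes = new HashMap<>();
    SketchExtractor.onEnd(attributes, response, null);
    return attributes;
  }

  // Failure path: the SDK threw, there is no response, and onEnd is never reached.
  static Map<String, Object> onExecutionFailure(Throwable error) {
    return new HashMap<>();
  }
}
```

In this sketch, afterExecution(new SketchResponse(200)) yields a map containing http.status_code, while onExecutionFailure(...) yields an empty map. Calling onEnd directly with a null response also leaves the map empty, which is the second point in the list above.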
Looking at other implementations of HttpCommonAttributesGetter, such as AkkaHttpClientAttributesGetter, we can see that getStatusCode would fail with an NPE if it were called with a null response, so other implementations rely on this contract.
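As a rough illustration of why the non-null contract matters, the sketch below shows a getter that dereferences the response without a null check. DemoResponse, DemoGetter and DemoCaller are hypothetical names invented for this example; they only imitate the shape of such an implementation and are not copied from any real getter.

```java
// Hypothetical stand-in for a client response type.
final class DemoResponse {
  private final int status;

  DemoResponse(int status) {
    this.status = status;
  }

  int status() {
    return status;
  }
}

// Hypothetical getter that trusts the caller to honor the non-null contract.
final class DemoGetter {
  static Integer getStatusCode(DemoResponse response) {
    // No null check: a null response fails here with a NullPointerException.
    return response.status();
  }
}

final class DemoCaller {
  // Returns true if invoking the getter with a null response fails with an NPE.
  static boolean failsOnNull() {
    try {
      DemoGetter.getStatusCode(null);
      return false;
    } catch (NullPointerException expected) {
      return true;
    }
  }
}
```

This is why simply calling getStatusCode from the failure path, where no response exists, would not be a safe change for getters written against the current contract.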
Describe the solution you'd like
Fundamentally, the problem is that the common HTTP instrumentation code assumes that status codes can only be delivered via response objects, but the AWS SDK delivers status codes via exceptions.
We look forward to working with the community on a comprehensive solution to this problem.
In the short term, we have come up with a solution relying on the fact that the exception thrown by the AWS SDK is stored within the produced spans and is accessible in the AwsSpanMetricsProcessor, where we generate Fault/Error metrics.
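A minimal sketch of that short-term idea follows, under stated assumptions. FakeServiceException, FakeSpan and FakeMetricsProcessor are hypothetical stand-ins, not the real AWS SDK exception, OpenTelemetry span data, or AwsSpanMetricsProcessor. The mapping of 4xx to Error and 5xx to Fault is also an assumption made for this example.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for an SDK service exception that carries the HTTP status code.
class FakeServiceException extends RuntimeException {
  private final int statusCode;

  FakeServiceException(String message, int statusCode) {
    super(message);
    this.statusCode = statusCode;
  }

  int statusCode() {
    return statusCode;
  }
}

// Hypothetical stand-in for a finished span: attributes plus the recorded exception, if any.
final class FakeSpan {
  final Map<String, Object> attributes = new HashMap<>();
  Throwable recordedException;
}

// Hypothetical stand-in for the metrics processor logic.
final class FakeMetricsProcessor {
  // Prefer the span attribute; otherwise fall back to the exception stored on the span.
  static Integer resolveStatusCode(FakeSpan span) {
    Object attr = span.attributes.get("http.status_code");
    if (attr instanceof Integer) {
      return (Integer) attr;
    }
    if (span.recordedException instanceof FakeServiceException) {
      return ((FakeServiceException) span.recordedException).statusCode();
    }
    return null; // No status code available from either source.
  }

  // Assumed classification for this sketch: 4xx counts as Error, 5xx counts as Fault.
  static String classify(Integer statusCode) {
    if (statusCode == null) {
      return "Unknown";
    }
    if (statusCode >= 500 && statusCode <= 599) {
      return "Fault";
    }
    if (statusCode >= 400 && statusCode <= 499) {
      return "Error";
    }
    return "Ok";
  }
}
```

With this fallback, a span that has no http.status_code attribute but carries a service exception still produces a status code for metric generation, without changing the shared HTTP instrumentation contract.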
Describe alternatives you've considered
No response
Additional context
No response
Component(s)
No response