http.status_code attribute in span was not being populated when AWS APIs were returning non-200 status codes #8795

Open
scaugrated opened this issue Jun 23, 2023 · 1 comment
Labels
enhancement New feature or request

Comments

@scaugrated

Is your feature request related to a problem? Please describe.

We were attempting to implement logic based on the http.status_code span attribute and observed that it is not populated for spans coming out of the AWS SDK when the AWS APIs return non-200 status codes.

Here are the details of the investigation for this issue by @thpierce :

  • This http.status_code attribute is being populated in the following call chain:
  • The described workflow works exactly as expected when the AWS SDK calls an AWS API and gets back a response status 200 - it will construct a response object with that status code and return it, triggering the workflow.
  • However, if the API returns a non-200 status code (e.g. an error or fault code), the AWS SDK simply throws an exception. This means two things:
    • TracingExecutionInterceptor.afterExecution is not called at all. Instead, TracingExecutionInterceptor.onExecutionFailure is called, which does not call into HttpCommonAttributesExtractor.onEnd.
    • Even if onExecutionFailure called onEnd, the response would be null and getStatusCode would not be called.
  • The net result is that http.status_code is not set. This is clearly by design, as AwsSdkHttpAttributesGetter implements the generic HttpCommonAttributesGetter, which has the following JavaDoc for getStatusCode: "This is called from Instrumenter.end(Context, Object, Object, Throwable) only when response is non-null."
    • Looking at other implementations of HttpCommonAttributesGetter, such as AkkaHttpClientAttributesGetter, we can see that getStatusCode would fail with an NPE if it were called with a null response, so other implementations rely on this contract.
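
To make the gap described in the bullets above concrete, here is a minimal, self-contained sketch. Every type and method in it (StatusCodeGapSketch, FakeResponse, FakeServiceException, onEnd) is a hypothetical stand-in written for illustration. None of them are the real OpenTelemetry or AWS SDK classes, and the real extractor logic may differ in detail. The sketch only models the documented contract that the status code is read when the response is non-null.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-ins; NOT the real OpenTelemetry or AWS SDK types.
final class StatusCodeGapSketch {

  /** Stand-in for a response object that carries a status code. */
  static final class FakeResponse {
    final int statusCode;

    FakeResponse(int statusCode) {
      this.statusCode = statusCode;
    }
  }

  /** Stand-in for an SDK exception that carries the status code of a failed call. */
  static final class FakeServiceException extends RuntimeException {
    final int statusCode;

    FakeServiceException(int statusCode) {
      super("service returned " + statusCode);
      this.statusCode = statusCode;
    }
  }

  /**
   * Models the contract quoted above: the status code is only read
   * from the response, and only when the response is non-null.
   */
  static Map<String, Object> onEnd(FakeResponse response, Throwable error) {
    Map<String, Object> attributes = new HashMap<>();
    if (response != null) {
      attributes.put("http.status_code", response.statusCode);
    }
    // The error is never inspected, so a status code carried by an
    // exception is lost at this point.
    return attributes;
  }
}
```

With a FakeResponse(200) the returned map contains http.status_code. With a null response and a FakeServiceException(503), the map is empty even though the exception knows the status.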

Describe the solution you'd like

Fundamentally, the problem is that the common HTTP instrumentation code assumes that status codes can only be delivered via response objects, but the AWS SDK delivers status codes via exceptions.

We look forward to a comprehensive solution to this problem.

In the short term, we have come up with a solution relying on the fact that the exception thrown by the AWS SDK is stored within the produced spans and is accessible in the AwsSpanMetricsProcessor, where we generate Fault/Error metrics.
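
A rough sketch of the idea behind this workaround follows. It is not the actual AwsSpanMetricsProcessor code. The class, the FakeServiceException type, and both method names are hypothetical, and the mapping of 4xx to "error" and 5xx to "fault" is an assumption made for illustration. In a real processor the exception type would presumably be the AWS SDK's own service exception rather than this stand-in.

```java
import java.util.OptionalInt;

// Hypothetical sketch; names and the 4xx/5xx mapping are assumptions.
final class StatusFromExceptionSketch {

  /** Stand-in for an SDK exception that exposes the HTTP status code. */
  static final class FakeServiceException extends RuntimeException {
    final int statusCode;

    FakeServiceException(int statusCode) {
      super("service returned " + statusCode);
      this.statusCode = statusCode;
    }
  }

  /**
   * Prefers the span attribute when present; otherwise falls back to
   * the status code carried by the recorded exception, if any.
   */
  static OptionalInt resolveStatusCode(Integer attributeValue, Throwable recorded) {
    if (attributeValue != null) {
      return OptionalInt.of(attributeValue);
    }
    if (recorded instanceof FakeServiceException) {
      return OptionalInt.of(((FakeServiceException) recorded).statusCode);
    }
    return OptionalInt.empty();
  }

  /** Assumed classification: 5xx counts as a fault, 4xx as an error. */
  static String classify(OptionalInt status) {
    if (!status.isPresent()) {
      return "unknown";
    }
    int code = status.getAsInt();
    if (code >= 500 && code <= 599) {
      return "fault";
    }
    if (code >= 400 && code <= 499) {
      return "error";
    }
    return "ok";
  }
}
```

The fallback order matters: if a future fix populates http.status_code directly, the attribute takes precedence and the exception lookup becomes a no-op.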

Describe alternatives you've considered

No response

Additional context

No response

@mateuszrzeszutek
Member

This is closely related to #8453
If the AWS SDK instrumentation did not attempt to implement the HTTP semconv itself, and simply left it to the underlying HTTP instrumentation, this most likely wouldn't be a problem.
