Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Deviation from semconv exception spec? #1491

Closed
mothershipper opened this issue Jan 26, 2021 · 23 comments · Fixed by #1492
Closed

Deviation from semconv exception spec? #1491

mothershipper opened this issue Jan 26, 2021 · 23 comments · Fixed by #1492
Assignees
Labels
area:trace Part of OpenTelemetry tracing pkg:SDK Related to an SDK package question Further information is requested
Milestone

Comments

@mothershipper
Copy link
Contributor

Just noticed that the trace implementation doesn't seem to follow the spec for exceptions -- looks like this package uses error as the prefix instead of exception. Looking at the JS library, they seem to be using exception as well, so I'm guessing it's the golang lib that's out of compliance.

Are there any plans or motivation to bring this into compliance? I don't know if there's a backwards compatible way of making this change, aside of perhaps recording two events, one with the error prefix, one with the exception prefix. That said, it might be best to just rip the bandaid off, rename the event and attributes and make it a (potentially) breaking change?

Happy to contribute these changes if I can get some direction on what you think is the best path forward :)

@XSAM
Copy link
Member

XSAM commented Jan 27, 2021

@mothershipper Good catch.

From Record Exception

The signature of the method is to be determined by each language
and can be overloaded as appropriate.

I'm not sure we can keep RecordError by appropriately changing the signature of Record Exception. After all, there is no term called Exception in Go; RecordException might confuse the users from Go.

I prefer we should correct semantic conventions but keep the RecordError.

@MrAlias @Aneurysm9 What do you think?

Related issue: #1104

@mothershipper
Copy link
Contributor Author

I'm on the fence about the method name, but I'm doubtful that Exception would confuse go developers -- if this were the case, I'd think adding a span event as an exception from a RecordError method would be equally as confusing.

My gut feeling is that the terms are relatively interchangeable to developers. For instance, JavaScript doesn't have an Exception as best I can tell either. They use an Error class, but the SDK provides a recordException method. That said, the JS library uses Exception in the method signature, but their Exception type is satisfied with a JS Error.

To me, the bigger concern with changing the name is that it is part of the public API and renaming would be a breaking change. Would you consider adding RecordException and deprecating RecordError over removal?

@MrAlias MrAlias added the question Further information is requested label Jan 28, 2021
@Aneurysm9
Copy link
Member

I don't think a change in method name is required. RecordException is optional in the spec and scoped to languages that use exceptions, which Go does not.

To facilitate recording an exception languages SHOULD provide a RecordException method if the language uses exceptions.

Instead, we have errors and provide an analogous RecordError method. Obviously, this wasn't intended to preclude us from having a convenience method, but I think it also makes clear that where exceptions are not part of a language different behavior and API surface is appropriate.

It may be a weird aesthetic thing, and I'm happy to be told I'm wrong if there's overwhelming thoughts to the contrary, but I don't feel that renaming RecordError to RecordException is desirable or beneficial. I would find it odd to encounter span.RecordException(err) throughout a codebase and I'd prefer not to do anything to encourage the conflating of errors and exceptions.

@mothershipper
Copy link
Contributor Author

@Aneurysm9 @XSAM Thank you both for the reviews!

I would find it odd to encounter span.RecordException(err) throughout a codebase and I'd prefer not to do anything to encourage the conflating of errors and exceptions.

Looking for clarity here -- would you have RecordError add an exception event with exception.* attributes? Or leave it as-is?

@Aneurysm9
Copy link
Member

I think I would prefer to leave it as-is, though I can see the argument for using exception.* to align downstream with attributes produced in a heterogeneous environment.

@mothershipper
Copy link
Contributor Author

For reference the majority of opentelemetry SDKs have a RecordException with exception.* semantics, though, most of these languages do have throw/catch semantics:

The only ones without:

  • C++ - Not implemented either way
  • PHP - Not implemented either way
  • go - see above ;)

I think the strongest point I can make is that the Rust SDK supports record_exception, despite not having exceptions. This issue was brought up (I'm assuming before the Rust SDK addressed this) in the spec as well.

From a trace-visualization perspective, our end-users just want to know why an operation failed, they don't much care if it was an exception or an error. Our users want to see the type, message and source/stack-trace (if applicable) when a failure escapes a span. I don't think the spec currently provides a standard way of doing that outside of the exception event, and it's definitely easier on the various exporter(s) if golang were to conform with the other SDKs, at least by using exception.* semantics, regardless of what the method is called.

@Aneurysm9
Copy link
Member

I've added an agenda item for the SIG meeting tomorrow to discuss this as I'd like to get input from a broader set of users and approvers. My personal preference at this point is to retain the RecordError() function but have it produce exception.* attributes on an exception event. This is partly selfish as I'm leaving a team with a reasonably large body of OTel-instrumented code with broad use of RecordError(), but also because I believe it aligns better with the Go idiom.

@mothershipper
Copy link
Contributor Author

Thanks Anthony!

For what it's worth, I don't feel as strongly about the method name as I do about the behavior (producing the exception event). There is value in consistency, but also in avoiding breaking changes / not inflicting maintenance work on the community.

@MrAlias
Copy link
Contributor

MrAlias commented Feb 11, 2021

From the SIG meeting today: we will also want to update the Event name that is created when a panic is recovered from (an re-paniced):

errorEventName = "error"

@MrAlias MrAlias added release:after-ga area:trace Part of OpenTelemetry tracing pkg:SDK Related to an SDK package and removed release:after-ga labels Feb 12, 2021
@MrAlias MrAlias added this to the RC1 milestone Feb 12, 2021
@bhautikpip
Copy link
Contributor

bhautikpip commented Feb 16, 2021

Hi @MrAlias ,

what is the action item here? Are we going to keep error or follow the spec and keep it as exception? I just wanted to highlight one more point is if we are going to keep error then we might have to introduce error fields especially for golang here for our exception transformation for awsxrayexporter. I am not sure if that would be ideal. This would be the case for anyone who wants to transform error/exception to the specific backend formats at the collector level.

@MrAlias
Copy link
Contributor

MrAlias commented Feb 17, 2021

@bhautikpip the plan is to go with the strategy describe by @Aneurysm9 above: #1491 (comment) with the addition of ensuring the panic related event that is created is created with the proper name.

@mothershipper I have assigned this to you because you had an open issue to address this. Please let me know if you are still able to work on this and update the linked PR with this decided direction.

@mothershipper
Copy link
Contributor Author

@MrAlias Happy to help! My bandwidth is a little tight this week, but I'll be able to get to it either over the weekend or early next week. If I'm lucky I might get to it sooner :)

@mothershipper
Copy link
Contributor Author

@MrAlias @Aneurysm9 @XSAM I've updated and rebased the PR -- seeing some test flakiness in unrelated code (TestExactMerge or something like that IIRC). I squashed the commits after the failure and it's looking like that run will pass, but wanted to flag that if it wasn't already on your radar.

@MrAlias
Copy link
Contributor

MrAlias commented Mar 8, 2021

To facilitate recording an exception languages SHOULD provide a RecordException method if the language uses exceptions.

Based on the included conditional (Go does not use exceptions so it is not required to comply with this recommendation) and the possibility that errors might one day have their own specified method it seems like we might be putting our future selves into a hard position if we create an error with the RecordError method that records an event as an exception. Because of this, I'm not sure we should move forward with the linked PRs approach to unify the event type.

Additionally, this makes me wonder if we should be exposing a RecordError method on a Span if the specification may one day define a conflicting form of that method. This becomes further complicated if we consider not including this method and users are left to their own choosing how they want to represent an error. This would be less that ideal.

I do think panics can be thought of an exceptions. In most languages they exist as runtime or segfault exceptions. We currently capture a panic by default when a span ends and record an event (something the linked PR was updating to unify the name). It might make sense to add a RecordPanic method to the Span interface to satisfy this recommendation from the specification. E.g.

// RecordPanic returns a function that can be called to record any active panic as an Event.
RecordPanic(attrs ...attributes.KeyValue) func()

The returned func can be a closure that can be used something like this ...

// ...
ctx, span := tracer.Start(ctx, "panicingFunc")
defer span.RecordPanic(attributes.String("blame", "MrAlias"))
// ...

Conversely, I also think we could explicitly not support exceptions as they are not used in the language.

@MrAlias
Copy link
Contributor

MrAlias commented Mar 8, 2021

Added an agenda item to the 2020-03-11 SIG meeting to get feedback.

@seh
Copy link
Contributor

seh commented Mar 9, 2021

Adding to this, panics in Go aren't always errors; they are sometimes used as non-local control flow constructs, like signaled conditions in Common Lisp.

In some cases, even errors aren't really errors, per se; they're a partially out-of-band return value that communicates something that the other return values don't communicate clearly enough, such as having reached EOF before being able to fill a buffer when reading from an input stream.

@MrAlias
Copy link
Contributor

MrAlias commented Mar 9, 2021

Adding to this, panics in Go aren't always errors; they are sometimes used as non-local control flow constructs, like signaled conditions in Common Lisp.

This seems like a good reason to add the RecordPanic method directly instead of just adding an event to a span on end if a panic happens. This would provide the user control over this creation so if they are using panics like signaled conditions (i.e. marshal from the JSON package)

@mothershipper
Copy link
Contributor Author

@MrAlias I can't see the conversation from slack (not currently part of the CNCF slack), but curious if it would help me understand why OTEL would want differentiate between errors and exceptions with separate events. Was this a discussion at the spec level, or just regarding the golang SDK?

I guess I'm feeling like we're hung up on whether or not the language calls it an error or exception and not looking at what the end user is trying to do/see by calling RecordError or RecordException. As there are no standard span attributes for recording metadata for errors or exceptions (type, location, etc.), RecordException and the exception event are the only means provided by the spec for recording that additional metadata.

I don't see a benefit to having both an "error" event and "exception" event with a majority of overlapping attributes. Anybody trying to visualize it would likely handle the two events in the same way, assuming they even notice the second event type.

@ericmustin
Copy link

👋 Hi friends, just a bit confused here on this comment #1491 (comment). So as it stands now is the idea that Go will just not be recording events in a way where type/stacktrace etc will get recognized by anything else in the opentelemetry ecosystem? That feels sub optimal. Exception events are what basically all exporters look for, here's an example of a popular exporter that only looks at exception events to determine cause and add the additional useful metadata. I don't believe there's a similar error event convention? Feels like we're sacrificing quality of adoption today for future hypothetical work. https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/e2af59e6cbd5f3e384a5e85d980c7a1fab22b663/exporter/awsxrayexporter/translator/cause.go#L43-L81

@MrAlias
Copy link
Contributor

MrAlias commented Mar 23, 2021

@MrAlias I can't see the conversation from slack (not currently part of the CNCF slack), but curious if it would help me understand why OTEL would want differentiate between errors and exceptions with separate events. Was this a discussion at the spec level, or just regarding the golang SDK?

It is at the specification level.

As there are no standard span attributes for recording metadata for errors or exceptions (type, location, etc.), RecordException and the exception event are the only means provided by the spec for recording that additional metadata.

I agree there is a gap in the specification but overloading the use of the specified RecordException will paint us into a corner when the expected future improvements that were discussed in the original PR come along.

This has been an open question at the specification level from the start. It seems like we will likely need to get some guidance here from specification. I'll try and add it to the agenda for next week's specification SIG meeting (cannot add it now unfortunately given the combined nature of the agenda).

@Aneurysm9
Copy link
Member

@MrAlias given the discussion at the Spec SIG Tuesday, are we clear to move forward with retaining the RecordError() function but having it produce exception.* attributes on an exception event, with functional options holding open a path for modification of that behavior in the future if other error-type events are specified? I believe that behavior is currently implemented in #1492.

@MrAlias
Copy link
Contributor

MrAlias commented Apr 1, 2021

@Aneurysm9 yes, that is correct 👍

@ericmustin
Copy link

hooray

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:trace Part of OpenTelemetry tracing pkg:SDK Related to an SDK package question Further information is requested
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants