Improved Error format #6676
Replies: 5 comments 8 replies
-
Another change I'd like to start staging is to transition from |
Beta Was this translation helpful? Give feedback.
-
I feel like this is getting over complicated. is finding where an error occurred really that hard? Do we really need another concept? seems like trying to rebuild stack, which we will still have? The only problem i ever had finding the location of was asserts without messages in cases where a function had multiple asserts, or all the function names had been minified after the last public method I'm also not a huge fan of getting rid of error type, and i don't think it is synonymous with error.name. error.name seems more like the error code concept. rather than invest in changing this i'd rather see us use more structured error apis between our layers. then all thrown errors are just truly unknown things that should be investigated |
Beta Was this translation helpful? Give feedback.
-
I spoke with @anthony-murphy directly, and agreed to take waypoints out for now. It's an additive change that can come later, if it becomes clear that it will be valuable. As of now, it may be me overgeneralizing a pattern that only appears in a few places. |
Beta Was this translation helpful? Give feedback.
-
My main feedback would be to find an incremental way to move forward and learn on the way. Quite often these types of changes are disruptive and we realize new constraints or better ways of doing something on the way, but redoing is expensive (compat cost). I think annotate pattern is good. I agree with Tony that number of places where we catch and retrhow should be low, but there are obvious places where it happens directly or indirectly (summarization is an example where it's done indirectly without throwing). Plus that will allow us to not use string concatenation, and instead add more props indicating "what" and maybe "why" (i.e. "socket.io" + "disconnect" + "transport close", but allow consumption / aggregation in Kusto based on whole thing or parts of it). As for wrapping - I agree with you that it should be avoided. That said, there might be places where code needs to communicate more generic error code to consumers (generic error(503) -> throttling error). I think when such cases happen, we should strongly consider logging original error as is but with additional correlation ID, and then raising new error with child correlation ID pointing to original error. That way we can connect the dots from telemetry, at the same time keeping API our consumers need to deal with clean (they should not care how we arrived to this state). Basically in places where logging & API (error reporting) are not aligned, we should not try to put a square peg in a round hole. |
Beta Was this translation helpful? Give feedback.
-
I think it's more nuanced than that. We can raise the disconnect event (and mark where it was raised in our code), but some of the payload comes from server. |
Beta Was this translation helpful? Give feedback.
-
[edit: removing waypoints for now, see previous version for context on that]
On the MS Office Fluid team we've been struggling with how difficult it is to aggregate across the error messages that we log (see #5426). I've spent quite some time in both the code and the production logs to learn what we can do differently in how we instrument errors to yield quicker insight when analyzing error signals at scale.
My proposal is to standardize on the following simple interface for errors raised or handled through the Fluid Framework:
We already use
errorType
throughout, both to enable type narrowing to a specific TypeScript type, and to give a fairly low-granularity categorization of our errors.fluidErrorCode
is new, and provides a property distinct frommessage
where we can further specify the "what" of an error (or sometimes the "where") with static strings that are well-suited for aggregation.Looking at how we instrument errors today, there are a few patterns that muddy the water today, which will be cleaned up as part of this effort. Here are some examples:
message
,stack
, and possiblyerrorType
and creating a new object with a particularerrorType
. (See usages of wrapError)After considering each use of
wrapError
, I don't believe we ever actually want to properly wrap an error where we change theerrorType
- theerrorType
is inherent to what originally went wrong. Rather, we are just looking to annotate the error for logging - or possibly mixing in other properties exposed via a TypeScript type. Annotating for logging can be achieved more plainly considering theILoggingError
interface, and adding a functionannotateError
:Note that
annotateError
will return the same error object, just with additional stuff mixed in. The exception being if the providederror
is not an Object, in which case we'll use it as themessage
of a newError
object.See #6764 for PR introducing
IFluidErrorBase
, and the evolution ofannotateError
: Two functions,annotateErrorObject
andnormalizeError
, both of which yield a validIFluidErrorBase
.Execution Plan
Here's the sequence of issues we're working through here:
IFluidErrorBase
#6750Beta Was this translation helpful? Give feedback.
All reactions