fix: Updated undici instrumentation to fix crash with trying to calculate exclusive duration on a segment that no longer exists #2884
Description
v12.11.0 refactored our context manager, separating child segments from the current segment and storing them on the trace. The undici instrumentation had a bug when relying on the `undici_async_tracking` feature flag, which took advantage of `async_hooks.executionAsyncResource`. It was keeping a reference to a transaction and parent segment that existed from a previous request. I suspect this has to do with the keepAlive logic in undici, where it would return the same async id for different requests.
When the trace for the new undici segment was being finalized, it tried to calculate the exclusive time of the undici segment and crashed.
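To make the failure mode concrete, here is a minimal simulation of state that is keyed on a reused resource. It is only a sketch under stated assumptions: `stateByResource`, `pooledSocket`, `startRequest`, and `makeTransaction` are hypothetical names, a plain object stands in for the async resource, and the real agent code and the exact cause of the reuse (suspected above, not confirmed) may differ.

```javascript
// Simulation only: none of these names come from the agent's source.
const stateByResource = new WeakMap()

// Stand-in for a keep-alive resource that is reused across requests.
const pooledSocket = {}

function startRequest(resource, transaction, parentSegment) {
  // Problematic pattern: state is stored the first time the resource is
  // seen and never refreshed when the same resource serves a new request.
  if (!stateByResource.has(resource)) {
    stateByResource.set(resource, { transaction, parentSegment })
  }
  return stateByResource.get(resource)
}

function makeTransaction(id) {
  const root = { id: `${id}-root` }
  // Each transaction's trace owns its segments, keyed by segment id.
  return { id, root, segments: new Map([[root.id, root]]) }
}

const tx1 = makeTransaction('tx1')
const tx2 = makeTransaction('tx2')

const first = startRequest(pooledSocket, tx1, tx1.root)
const second = startRequest(pooledSocket, tx2, tx2.root)

// The second request resolves to the first transaction's state, so its
// parent segment cannot be found on the second transaction's trace.
const staleParent = tx2.segments.get(second.parentSegment.id)
console.log(second.transaction.id) // tx1
console.log(staleParent) // undefined
```

Any calculation that then dereferences the looked-up parent, such as an exclusive-time computation, would throw on the `undefined` value.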
That was because its transaction and parent segment were from a previous request. The `undici_async_tracking` flag was introduced as an attempt to fix an issue where, if you had concurrent undici requests within a transaction, the second request's parent was the first request rather than the shared parent of both requests. This causes a slight inaccuracy in the parent/child relationships in a trace and transaction trace. There's no good way to fix this because undici relies on diagnostics channel, and its behavior isn't the same as our traditional monkey-patching instrumentation.
I also added more assertions to existing tests around concurrent requests, with a note stating that the parentId of the second undici segment is the first request for a reason. I also added a test that reproduces the issue this PR fixes: two concurrent transactions are in flight, which is where the state conflation occurred in the old instrumentation.
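The parent/child inaccuracy for concurrent requests can be illustrated with a small sketch built on Node's `diagnostics_channel` module. This is a simplification, not the agent's implementation: the channel name `example:request:create`, the `activeSegment` variable, and the segment shape are all assumptions made for illustration.

```javascript
const diagnosticsChannel = require('node:diagnostics_channel')

// Hypothetical channel name for illustration; not one that undici publishes on.
const channel = diagnosticsChannel.channel('example:request:create')

// Simplified stand-in for "the segment that is currently active".
let activeSegment = { id: 'root', parentId: null }
const created = []

channel.subscribe((message) => {
  // The subscriber only knows which segment is active right now, so it
  // parents the new segment to it and then makes the new segment active.
  const segment = { id: message.name, parentId: activeSegment.id }
  created.push(segment)
  activeSegment = segment
})

// Two requests started back to back, before either one has finished.
channel.publish({ name: 'request-1' })
channel.publish({ name: 'request-2' })

console.log(created.map((s) => `${s.id} <- ${s.parentId}`))
// [ 'request-1 <- root', 'request-2 <- request-1' ]
```

In this sketch the second request is parented to the first rather than to `root`, which mirrors the relationship the updated tests assert on.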
Lastly, I "released" the feature flag for `undici_async_tracking`, as there is no longer any code wired up to support it.
Related Issues
Closes #2881