chore: fixes the logic in convertJaegerTraceToProfile function #1679
Conversation
Codecov Report: Base 66.72%, Head 66.56% (project coverage decreases by 0.15%).
Additional details and impacted files:
@@ Coverage Diff @@
## main #1679 +/- ##
==========================================
- Coverage 66.72% 66.56% -0.15%
==========================================
Files 156 156
Lines 5260 5274 +14
Branches 1202 1206 +4
==========================================
+ Hits 3509 3510 +1
- Misses 1745 1758 +13
Partials 6 6
☔ View full report at Codecov.
In my example the whole request took 162ms, but your top flamegraph slab is 268ms. That doesn't match your description, and it doesn't seem right.
@bobrik did you see the note at the bottom?
This is why there is a discrepancy between the request length and what the flamegraph shows. While the full request only took 162ms, the flamegraph shows 268ms because there was 268ms of "work" being done (some of it just happened in parallel). Since flamegraphs don't actually have a concept of "parallel" execution, this is the most accurate way to represent the situation that we could think of. What do you think the expected behavior should be here? We'd love to hear how you plan to use this flamegraph and what the expected output would be in your opinion.
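The 162ms-vs-268ms discrepancy can be illustrated with a small sketch. The span shape and the exact child durations below are illustrative assumptions (chosen so the numbers mirror the example), not values from the actual trace: total "work" is the plain sum of child durations, while wall time is the extent from the earliest start to the latest end, so overlapping spans make work exceed wall time.

```typescript
// Hypothetical span shape; not the real Jaeger span type.
interface Span {
  startMs: number;
  durationMs: number;
}

// Total "work": sum of durations, ignoring overlap.
function totalWork(spans: Span[]): number {
  return spans.reduce((acc, s) => acc + s.durationMs, 0);
}

// Wall time: earliest start to latest end.
function wallTime(spans: Span[]): number {
  const start = Math.min(...spans.map((s) => s.startMs));
  const end = Math.max(...spans.map((s) => s.startMs + s.durationMs));
  return end - start;
}

// Two parallel upstream calls (illustrative numbers).
const children: Span[] = [
  { startMs: 0, durationMs: 162 },
  { startMs: 30, durationMs: 106 }, // overlaps the first span entirely
];

console.log(totalWork(children)); // 268 — what the flamegraph sums up
console.log(wallTime(children));  // 162 — what the request actually took
```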
Yes, I quoted from it. I don't agree with the "that's how much wall time was spent processing a request" framing.
Tracing is generally used to gauge latency (wall time), not work (which is usually a proxy for CPU time, the way flamegraphs are). I don't know if there's a good way to translate concurrent spans into flamegraphs. It might be acceptable to represent them visually the way your change does, but show the actual timing values on the slabs. As a concrete example from my trace:
Here we scale the width of […]. If you can add some sort of warning to […]. Let me know if this makes sense.
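The suggested compromise (inflated widths for layout, true span timings on the labels) could be sketched roughly like this. The node shape and field names are hypothetical, not the actual flamegraph component's API:

```typescript
// Hypothetical node carrying both values: the true span duration for
// display, and the widened (children-sum) value used only for layout.
interface FlamegraphNode {
  name: string;
  trueDurationMs: number;   // what the slab label / tooltip shows
  renderDurationMs: number; // inflated value that drives slab width
}

// Label always reports the real timing.
function slabLabel(node: FlamegraphNode): string {
  return `${node.name} (${node.trueDurationMs}ms)`;
}

// Width is proportional to the inflated value so children still fit.
function slabWidthPx(
  node: FlamegraphNode,
  rootRenderMs: number,
  canvasPx: number
): number {
  return (node.renderDurationMs / rootRenderMs) * canvasPx;
}

const request: FlamegraphNode = {
  name: 'eyeball-facing-service: request',
  trueDurationMs: 162,   // actual request latency
  renderDurationMs: 268, // sum of parallel children
};

console.log(slabLabel(request));             // "eyeball-facing-service: request (162ms)"
console.log(slabWidthPx(request, 268, 800)); // 800 — root spans the full canvas
```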
Hey @bobrik, yes, this makes sense and might be a good compromise. We'll look into what changes we'd need to make to allow the flamegraph component to "overwrite" the true values, but this may take a little while to dig into.
@petethepig how can I try this PR on my project?
I published this PR as […]
Hey @cescoferraro, even though the PR was already published manually, we have a CI step that publishes the flamegraph (https://github.com/pyroscope-io/pyroscope/actions/runs/3397437411/jobs/5649595841). You can point to […]
Addresses #1671 / jaegertracing/jaeger-ui#1012
Trace
Flamegraph Before
Flamegraph After
Notes
One of the characteristics of the trace from the original issue was that it had parallel executions, particularly in
logic-proxy: upstream
node. That's why it looked so strange before. This PR makes it so that in such cases, when the sum of the children nodes' durations exceeds the parent's duration, the parent duration is set to that sum.
One unfortunate consequence of this is that durations of the top nodes are now inflated and don't always match the durations in the span; e.g. look at the top
eyeball-facing-service: request
. While this is correct in the sense that it reflects how much wall time of work was spent processing the request, it might be confusing to some. @bobrik What do you think about this?
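The duration adjustment described in the notes above can be sketched as a bottom-up pass over the trace tree. The node shape below is an assumption for illustration, not the actual tree type used by `convertJaegerTraceToProfile`:

```typescript
// Hypothetical tree node; the real converter's type may differ.
interface TreeNode {
  name: string;
  duration: number;
  children: TreeNode[];
}

// Recursively widen each parent whose children (run in parallel)
// sum to more than the parent's own recorded duration, so the
// children still fit under it in the flamegraph.
function adjustDurations(node: TreeNode): number {
  const childSum = node.children
    .map(adjustDurations)
    .reduce((a, b) => a + b, 0);
  node.duration = Math.max(node.duration, childSum);
  return node.duration;
}

// Illustrative numbers mirroring the example: a 162ms request with
// two parallel upstream calls totaling 268ms of work.
const root: TreeNode = {
  name: 'eyeball-facing-service: request',
  duration: 162,
  children: [
    { name: 'logic-proxy: upstream', duration: 162, children: [] },
    { name: 'logic-proxy: upstream', duration: 106, children: [] },
  ],
};

adjustDurations(root);
console.log(root.duration); // 268 — inflated to cover the parallel children
```

This is why the top slab reads 268ms even though the span itself recorded 162ms.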