Improve ftime-trace implementation. #3797
JohanEngelen commented Jul 27, 2021
- Rewrite ftime-trace as our own implementation instead of using LLVM's time trace code. The disadvantage is that this removes LLVM's work (optimization) from the trace, but it has the large benefit of letting us tailor the tracing output to our needs.
- Add memory tracing to ftime-trace (not possible with LLVM's implementation)
- Do not output the sum for each "category"/named string. That sum makes the LLVM output very long, because we put more information in each time segment name. Tooling that processes the time trace output (e.g. Tracy) can do this summing itself, and omitting it makes the time trace much more pleasant to view in trace viewers.
- Add source file location info to time trace (Tracy understands it) (not possible with LLVM's implementation)
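For illustration, here is a minimal sketch of how one such event could be rendered, following the event layout quoted in the code comment further down this thread (a `ph: "X"` event with `ts`, `dur`, and a `loc` string). The `TraceEvent` struct and `toJson` function are hypothetical names, not the PR's actual code, and the sketch assumes the strings need no JSON escaping.

```d
import std.format : format;

/// Hypothetical event record; the field names are illustrative, not LDC's.
struct TraceEvent
{
    string name;     // e.g. "Sema1: somename"
    string detail;   // free-form detail shown in the viewer
    string loc;      // source location as "filename.d:123"
    long tsMicros;   // start timestamp in microseconds
    long durMicros;  // duration in microseconds
}

/// Renders one event in the layout quoted in the review thread.
/// Assumption: no JSON escaping is done, so inputs must not contain
/// quotes or backslashes.
string toJson(const TraceEvent e)
{
    return format(
        `{"ph":"X","name":"%s","ts":%d,"dur":%d,"loc":"%s",` ~
        `"args":{"detail":"%s","loc":"%s"},"pid":0,"tid":0}`,
        e.name, e.tsMicros, e.durMicros, e.loc, e.detail, e.loc);
}
```

Emitting only per-scope events like this, without per-name totals, leaves the summing to the trace viewer.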
Sounds nice! Excluding the LLVM stats seems fine to me, after all it's mostly about introspecting time spent in frontend and glue layer. Wrt. memory stats, AFAICT you're tracking frontend memory consumption only. Not sure if glue layer and LLVM consumption is really relevant, but at least with mimalloc, we could probably get the overall heap stats via mi_stats_print_out() or so.
Memory stats definitely needs more work, I'll leave that for the future. I'm actually not sure if I'm interpreting the GC stats correctly. It's like you say, this does not track the C++ memory usage. But I think the frontend usage is a lot more (CTFE...) than that.
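As a rough sketch of what "GC stats" can mean here: druntime exposes `GC.stats()`, whose `usedSize` field reports bytes in use on the D GC heap only. The helper names below are hypothetical, and whether the PR samples exactly this field is an assumption. Memory allocated by C++ code (glue layer, LLVM) is not visible through this API.

```d
import core.memory : GC;

/// Bytes currently in use on the D GC heap, as reported by druntime.
/// C/C++ allocations (e.g. LLVM's) are not included in this number.
size_t gcUsedBytes()
{
    return GC.stats().usedSize;
}

/// Hypothetical helper: change in GC heap usage while `work` runs.
/// The result can be negative if a collection happens in between.
long gcGrowthDuring(scope void delegate() work)
{
    const before = gcUsedBytes();
    work();
    return cast(long) gcUsedBytes() - cast(long) before;
}
```

Sampling before and after a scope gives a delta that a trace event could carry alongside its duration.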
Apparently |
[GDC doesn't support DMD-style inline asm (which that predefined version stands for).] |
Edit: no longer needed after using |
this.beginningOfTime = time / time_scale;
}

void beginScope(const(char)[] name, const(char)[] details, Loc loc)
const ref Loc?
Loc is small (size of two pointers), so I chose to copy it, because it's probably faster.
It doesn't really matter, hence the "?"; it was primarily for consistency with the frontend, which mostly uses const ref Loc but isn't totally consistent either.
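To make the two options concrete, here is a small sketch using a hypothetical stand-in struct (`FakeLoc`, not the frontend's real `Loc`) with a pointer plus two 32-bit fields, which is two pointers wide on a 64-bit target. Both signatures are called the same way with an lvalue. One practical difference is that a `ref` parameter does not accept rvalues by default.

```d
/// Hypothetical stand-in for the frontend's Loc; the layout is illustrative.
struct FakeLoc
{
    const(char)* filename;
    uint linnum;
    uint charnum;
}

/// Pass by value: copies the whole (small) struct.
uint lineByValue(FakeLoc loc)
{
    return loc.linnum;
}

/// Pass by const reference: passes an address, needs an lvalue argument.
uint lineByRef(const ref FakeLoc loc)
{
    return loc.linnum;
}

unittest
{
    auto loc = FakeLoc(null, 123, 4);
    assert(lineByValue(loc) == 123);
    assert(lineByRef(loc) == 123);
    assert(lineByValue(FakeLoc(null, 7, 1)) == 7); // rvalue is fine by value
}
```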
{
// {"ph":"X","name": "Sema1: somename","ts":111,"dur":222,"loc":"filename.d:123","args": {"detail": "something", "loc":"filename.d:123"},"pid":0,"tid":0}

void writeLocation(Loc loc)
const ref?
Loc is small (size of two pointers), so I chose to copy it, because it's probably faster.
Good to merge?
LGTM apart from the missing const ref on the Loc, which is only bad because the frontend source uses const ref Loc everywhere, so not using it is surprising.
…surement' stage uses ticks as unit - Fix crash on `ldc2 -ftime-trace` without files passed.
Conversion to core.time.MonoTime complete, ready now.
LGTM, thanks, also nice to have less integer divisions now. I guess the test will remain a bit brittle (OS context switches etc.), but we'll see.
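For readers unfamiliar with `core.time.MonoTime`, a minimal sketch of tick-based measurement follows: timestamps are kept as monotonic ticks and converted to microseconds only when needed. The function names are hypothetical and this is not the PR's code. `MonoTime.currTime`, `MonoTime.ticksPerSecond`, `convClockFreq` and `Duration.total` are the druntime calls being illustrated.

```d
import core.time : MonoTime, convClockFreq;

/// Converts a raw MonoTime tick count to microseconds.
long ticksToMicros(long ticks)
{
    return convClockFreq(ticks, MonoTime.ticksPerSecond, 1_000_000);
}

/// Microseconds elapsed since `start`, using the monotonic clock.
long elapsedMicros(MonoTime start)
{
    return (MonoTime.currTime - start).total!"usecs";
}

unittest
{
    const start = MonoTime.currTime;
    // ... the work being traced would run here ...
    assert(elapsedMicros(start) >= 0);
    // One second's worth of ticks is exactly one million microseconds.
    assert(ticksToMicros(MonoTime.ticksPerSecond) == 1_000_000);
}
```

Because the clock is monotonic, durations cannot go negative, although measured values still vary with OS scheduling.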
@JohanEngelen: I'm about to merge v2.097.2 and release LDC v1.27.1, mainly to re-add an AArch64 Linux package. Including this might be nice, so please merge if you agree.
Merged. Risk of introducing badness to 1.27 is very low.