execution traces (eg for code coverage, debugging, introspection, profiling) #15827

timotheecour · 2020-11-02T21:22:31Z

not ready for review, but putting it out there since there was interest in the forum refs https://forum.nim-lang.org/t/7024#44128
see linked forum post for details
works but needs rebase and removing of (small) private dependencies used mainly for debugging
analog to dtrace; works by instrumenting codegen in much same way as done with --stacktrace:on
already pretty useful and customizable thanks to user callbacks and scopable tracing
shows information that's complementary to a stacktrace, very useful when a stacktrace isn't good enough because the problem you're looking at occurred some time before, not up in the callstack; in particular allows to find the 1st point where 2 programs diverged (eg 1 program with 2 different inputs, or 2 versions of the same program with same input)

links

Generating code coverage with Nim 1.4.0 - Nim forum

cooldome · 2020-11-02T22:22:27Z

My idea is to upgrade existing https://nim-lang.github.io/Nim/manual_experimental.html#term-rewriting-macros functionality to make things like coverage, extra debug reports possible. Effectively term rewriting macro should be able to find and rewrite any proc, if statement, case statement, for loop and etc. Up for discussion.

timotheecour · 2020-11-07T03:19:18Z

Effectively term rewriting macro should be able to find and rewrite any proc, if statement, case statement, for loop and etc. Up for discussion.

@cooldome TRW are IMO the wrong approach for this, because they're not flexible enough: they're trying to cram everything into a succinct DSL, which has the big downside of being cryptic and too limiting, including for this task. There are other downsides too (also due to their inflexibilty because of its limiting DSL), eg the fact that it recursively applies unconditionally.
A better idea would be something like timotheecour#75 which would allow using the full power of nim macros to rewrite code in a more flexible way (and IMO easier to grok). We can discuss how these would work separately.

That said, in the meantime the feature implemented in this PR is IMO very useful for a variety of use cases so it's justified on its own. If/when something like whole-program AST transformation (like timotheecour#75) is implemented, we can always revisit the implementation of nim --exectrace to benefit from it instead

disruptek · 2020-11-23T19:39:47Z

Where is the RFC?

timotheecour · 2020-11-23T19:43:16Z

Where is the RFC?

I can make one once I'm satisfied with the API and add missing functionality. But this is already usable (modulo commits I haven't pushed yet)

ringabout · 2021-03-25T07:49:52Z

The forum post updated(https://forum.nim-lang.org/t/7024)

fwiw, rr gives the possibility to replay any program execution in a debugger: https://rr-project.org/ - this gives access to all interactive features of gdb effectively, without requiring any changes to nim - breakpoints, memory inspection etc etc work out of the box. We've used it to successfully debug difficult garbage collector issues for example.

ditto with code coverage - gcc already supports generating code coverage information which can be integrated trivially with lots of existing tools - we're using it for code coverage reports in nim-libp2p for example.

rr for replay, gcov for coverage - we have code coverage set up in CI for example: https://github.com/status-im/nim-libp2p/blob/master/.github/workflows/codecov.yml

timotheecour · 2021-03-25T09:43:48Z

Ya i saw that, rr is a great tool, but in no way a replacement for execution traces as provided by this PR:

rr is only available on linux (and no ARM support IIUC; no OSX, no windows) (eg see https://joneschrisg.wordpress.com/2015/01/29/rr-on-os-x-can-it-be-ported/)
execution traces can provide more accurate information (eg in presence of templates etc) since it's integrated in nim compiler
execution traces has use cases not covered by rr can't do, eg compare 2 execution traces (eg 2 versions of a program), eg to see the 1st place where they differ

likewise with tools like gcov, see binhonglee/coco#13 and other issues

When it comes to debugging, you often need more than 1 tool to get the job done, it's not an either/or.

And yes, I do intend to revive this PR.

I can register events, but haven't figured out how to get compiled function names (suggestions from https://stackoverflow.com/questions/11731229/dladdr-doesnt-return-the-function-name) dont work at all. Maybe this is fixable, though I would much rather prefer to use nim-lang/Nim#15827 anyway.

- https://hookrace.net/blog/nim-code-coverage/ is outdated, `--nimcache:.` is also needed, because `.gcda` files are generated (at least in my case), where binary was compiled, not where it was executed. At least passing this flag solved the issue - Also need to `--remove "<temporary test dir>/*'` from code coverage, idk why. - As expected cove coverage report is awful, makes almost no sense. - Another example why we need adequate code coverage that is aware of the nim semantics. nim-lang/Nim#15827 - `exec` is horrible, nim-lang/nimble#895 is needed. I wanted to break arguments into something more manageable, and ended up with `&"\"{dir}/*\""`. Super nice.

haxscramper · 2021-08-26T16:51:18Z

Just want to share my experience trying other "alternative" options for code coverage and profiling, maybe this will be taken in account in later pros/cons discussion (that of course will happen).

`gprof`

gprof profiling - somewhat working solution that requires extensive workarounds, such as using .ndi [1] mapping files. They have to be manually parsed in order to figure out something, then I also have to parse gprof output (because this wall of monochrome text with mangled names is not exactly a fun thing to look it), then merge the two things together. I can't get access to gcc raw profile data [2] at the program runtime, so I have to (1) execute the program, (2) write to file, (3) unparse ad-hoc "pretty-printed format", (4) unparse another ad-hoc (thankfully not pretty-printed) format of the .ndi files, (5) merge these two together and get some kind of meaningful output.

[1] Which I learned only thanks to leorize, I haven't seen them documented anywhere

[2]: Unless someone will figure out a way to sanely access hidden local symbols from dynamic library, in which case __bb_head can be read from the libc.so runtime. Otherwise, the best solution to getting structured data is to reimplement gprof file reading.

`__cyg` procedure tracing

I could not get it to work, but considering it only works for C, and I can't just use nim code in execution hooks (because none of the nim standard library procs are declared with codegendecl: "__attribute__((no_instrument_function)) ... etc"). There is no exact and simple example (maybe I missed something, I mean obviously I have to have excellent google skills to just be able to trace how my code is executed) to execute a program and store tracing data in a global variable. There are external tools that help to work with this, but this would get me to the gprof route again (steps 1 .. ?? might involve any number of random ad-hoc formats that I have to merge together to get at least something.)

Macro-based code coverage

After having to reimplement almost all functionality of the https://github.com/yglukhov/coverage due to some missing functionality and incorrect handling of the (seemingly obvious cases like multiline expressions) I can say this method is extremely annoying to use and correctly implement (getting correct annotations requires at least doubling each proc size due to line tracking, which affects compilation times).

`lcov` etc.

Absolutely abysmal API that requires to keep track of multiple commands, having to execute coverage binary in one directory but collecting info from another .gcda is written into --nimcache directory, although maybe I did something wrong (obviously, it is not hard to break something). And the final result had little to no correlation to the actual tests. Well, there were some matches somewhere, but ultimately it missed so much that I didn't even bother with thorough inspection.

Debugging in general

Just recently, I had to spend a couple of days on the very annoying bug that I had to painstakingly localize using dozens of separate examples and echo-ing everything in the execution path. Again, maybe I'm just using the wrong tools (though I completely fail to understand how to use debugger in this case), but if I could just take two tests, and see where execution paths were different, it would make my life so much easier.

haxscramper · 2021-08-26T16:52:56Z

So, to summarize this relatively long list - absolutely all options I've tried are horrible, ill-suited for anything, including C itself, based on ad-hoc formats, require hacking around just to get something meaningful while still providing abysmal output.

timotheecour · 2021-08-26T17:03:18Z

thanks, this is useful.
can any of the options you've tried do this:

report self-time, recursive time, and exact number of calls correctly for programs with recursive function invocations
report correct times (including self-time and recursive-time) and with --gc:arc|refc even if exceptions are being raised/caught

I kind of doubt it because they'd need to understand how nim does exception handling in particular (eg with jmp etc depending on which gc is used), but I'd be curious.

this PR can (pending pushing latest commits), and has relatively low overhead (~2x), ~10 nanosecond precision (thanks to #18743 via RDTSC), and can be easily customized to choose which functions to trace without modifying source code nor recompiling nim compiler (via inclusion/exclusion rules so you can easily pick everything except, say, APIs in gc module for example); just need to clean things up

haxscramper · 2021-08-26T17:51:53Z

gprof seems to have some kind of self/total separation, but I really doubt it gives any kind of precise information regardless.

Although I'm personally not that interested in performance metrics, so I didn't investigate it really deeply (just get an overall picture of what is going on in my code).

What I'm really interested in is a code coverage and especially! execution traces, as they, in my opinion, provide the ability to get insights into code execution in a way that no other tool could. Debugging, behavior detection (user can detect changes in control flow (if there is any) by simply running the program under two library (dependency) versions) and just general understanding of the code (for example, I could run nim c test.nim, look at the execution trace of the compiler and get a clear picture of how things work with each other. What part of code gets executed in which order, and so on). It is possible to get a lot of static data from the nim code (as my tests with haxdoc and different ways of code analysis has shown), but runtime data is also extremely valuable.

haxscramper · 2021-08-26T18:20:35Z

Getting a little ahead, but - current API does not provide access to the information where proc has been called, correct?

proc nimExecTraceEnter*(s: PFrame) {.prag.} = is executed when proc is entered
proc nimExecTraceLine*(s: PFrame, line: int16) {.prag.} = - for each line [1] [2]
proc nimExecTraceExit* {.prag.} = - when proc is finished

where PFrame is a std/system.PFrame = ref TFrame

TFrame {...} = object
  prev*: PFrame              ## Previous frame; used for chaining the call stack.
  procname*: cstring         ## Name of the proc that is currently executing. [3]
  line*: int                 ## Line number of the proc that is currently executing.
  filename*: cstring         ## Filename of the proc that is currently executing.
  len*: int16                ## Length of the inspectable slots.
  calldepth*: int16          ## Used for max call depth checking.
  when NimStackTraceMsgs:
      frameMsgLen*: int      ## end position in frameMsgBuf for this frame.

[1] what about if cond: proc1() else: proc2()? lcov and other solutions seem to happily report this line as "fully covered" in all scenarios, but it would be nice if structures like this were taken into account as well. [4]

[2] If column information was also taken into account, it would probably be enough to retrieve "who called" information using secondary code processing.

[3] procname reports the same name for all overloads. This can also be solved using secondary processing (- usually (but not always due to {.line: etc.) there is only one proc per file+line:column combination).

proc main(a: int) = 
  for e in getStackTraceEntries():
    echo e
    
proc main(b: string) = main(1)
proc main(q: char) = main("srrt")
proc main() = main('1')
main()

(procname: "prog", line: 8, filename: "/home/jail/prog.nim")
(procname: "main", line: 7, filename: "/home/jail/prog.nim")
(procname: "main", line: 6, filename: "/home/jail/prog.nim")
(procname: "main", line: 5, filename: "/home/jail/prog.nim")
(procname: "main", line: 2, filename: "/home/jail/prog.nim")

[4] If we assume there is always one node per file+line:column combination (which is not always the case, but usually it is) execution of each node can be tracked. Also, when defined(useNodeIds): can be used. I understand that this probably won't be accepted as it is a bit too hacky, but decided it might be worth to write down the idea anyway.

stale · 2022-09-08T23:52:07Z

This pull request has been automatically marked as stale because it has not had recent activity. If you think it is still a valid PR, please rebase it on the latest devel; otherwise it will be closed. Thank you for your contributions.

timotheecour force-pushed the pr_tracing branch from a05a0cf to 1d036ab Compare November 7, 2020 03:00

timotheecour mentioned this pull request Nov 7, 2020

RFC: --patchast to patch all modules via AST transformations timotheecour/Nim#75

Open

timotheecour force-pushed the pr_tracing branch from 1d036ab to 4d42f48 Compare November 7, 2020 03:51

timotheecour mentioned this pull request Nov 19, 2020

repr missing pointer when applied to seq/string with arc #16052

Closed

This was referenced Nov 24, 2020

nim needs a way to find deadcode timotheecour/Nim#245

Open

misc high priority timotheecour/Nim#377

Open

timotheecour mentioned this pull request Feb 21, 2021

misc trace timotheecour/Nim#609

Open

timotheecour mentioned this pull request Jul 16, 2021

docs: add debugging page describing various flags / strategies to help debugging compiler or a program timotheecour/Nim#785

Open

timotheecour added 16 commits July 29, 2021 18:04

_

71efa8c

_

eba635c

_

3df2551

_

c5d9785

_

cc3b173

cleanup

a81b1f6

_

335842e

works w line trace

d8229f7

cleanups

c99ad50

move exectraceimpl inline

43cadd7

cleanup

3768583

cleanups

b91bcb2

_

1efb16b

_

93933aa

_

e4de85c

_

8a412be

timotheecour added 2 commits July 29, 2021 18:52

_

2720817

_

61e0177

timotheecour force-pushed the pr_tracing branch from 15c7df3 to 61e0177 Compare July 30, 2021 01:57

timotheecour mentioned this pull request Aug 8, 2021

{.compilerproc, importc.} doesn't honor importc with nim cpp #18655

Open

This was referenced Aug 23, 2021

initRand now uses strict monotonic counter to guarantee uniqueness #18149

Closed

std/cputicks: high resolution low overhead strictly monotonic cpu counter nim-lang/RFCs#411

Open

timotheecour changed the title ~~execution traces (eg for code coverage, debugging, introspection)~~ execution traces (eg for code coverage, debugging, introspection, profiling) Aug 25, 2021

stale bot added the stale label Sep 8, 2022

stale bot closed this Oct 15, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

execution traces (eg for code coverage, debugging, introspection, profiling) #15827

execution traces (eg for code coverage, debugging, introspection, profiling) #15827

timotheecour commented Nov 2, 2020 •

edited

Loading

cooldome commented Nov 2, 2020 •

edited

Loading

timotheecour commented Nov 7, 2020 •

edited

Loading

disruptek commented Nov 23, 2020

timotheecour commented Nov 23, 2020

ringabout commented Mar 25, 2021

timotheecour commented Mar 25, 2021 •

edited

Loading

haxscramper commented Aug 26, 2021 •

edited

Loading

haxscramper commented Aug 26, 2021

timotheecour commented Aug 26, 2021 •

edited

Loading

haxscramper commented Aug 26, 2021 •

edited

Loading

haxscramper commented Aug 26, 2021 •

edited

Loading

stale bot commented Sep 8, 2022

execution traces (eg for code coverage, debugging, introspection, profiling) #15827

execution traces (eg for code coverage, debugging, introspection, profiling) #15827

Conversation

timotheecour commented Nov 2, 2020 • edited Loading

links

cooldome commented Nov 2, 2020 • edited Loading

timotheecour commented Nov 7, 2020 • edited Loading

disruptek commented Nov 23, 2020

timotheecour commented Nov 23, 2020

ringabout commented Mar 25, 2021

timotheecour commented Mar 25, 2021 • edited Loading

haxscramper commented Aug 26, 2021 • edited Loading

gprof

__cyg procedure tracing

Macro-based code coverage

lcov etc.

Debugging in general

haxscramper commented Aug 26, 2021

timotheecour commented Aug 26, 2021 • edited Loading

haxscramper commented Aug 26, 2021 • edited Loading

haxscramper commented Aug 26, 2021 • edited Loading

stale bot commented Sep 8, 2022

timotheecour commented Nov 2, 2020 •

edited

Loading

cooldome commented Nov 2, 2020 •

edited

Loading

timotheecour commented Nov 7, 2020 •

edited

Loading

timotheecour commented Mar 25, 2021 •

edited

Loading

haxscramper commented Aug 26, 2021 •

edited

Loading

`gprof`

`__cyg` procedure tracing

`lcov` etc.

timotheecour commented Aug 26, 2021 •

edited

Loading

haxscramper commented Aug 26, 2021 •

edited

Loading

haxscramper commented Aug 26, 2021 •

edited

Loading