-
-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[RFC] --stacktrace:noinline: up to 3X speedups with typically same stacktraces #13582
[RFC] --stacktrace:noinline: up to 3X speedups with typically same stacktraces #13582
Conversation
But why? Isn't the point of turning If you want speed wouldn't you just push stacktrace off wherever needed? |
thanks! fixed
makes default debug builds faster (or any build with --stacktrace:on), offering a better compromise between speed and debugging-ability you loose nothing with this PR:
|
Sure. We loose sane defaults IMO. |
cmon, it's just a matter of flipping the roles of the --stacktrace flags eg this could also be done with very little change from current state of this PR:
so your "sane" defaults are preserved that way. I prefer the way I wrote it but willing to change to these if there's consensus |
But this new behaviour is a step back. If you want performance at the cost of stacktraces use |
I've changed the default so that nothing changes except if you pass the new flag
there's nothing surprising or complex, this matches how native stacktraces work!
Here's an example: you get a 3X speedup compared to --stacktrace:on, and you get the exact same stacktrace here (but this is common case; the things we're inlining are hotspots, well debugged, and unlikely to be where crashes or anything interesting happens). import tables, times
var tab: Table[int, int]
proc main() =
let n = 1000
for i in 0..<n: tab[i] = i # doAssert i < n-1 # insert this to crash and show stacktrace
let t = cpuTime()
for i in 0..<100000: main()
echo cpuTime() - t nim c -d:danger -r --stacktrace:off main 0.490725 s likewise, we get a 3X speedup between as for the stacktraces: insert the doAssert line:
in the rare cases where the thing that crashed is inside a {.inline.} proc, just re-run with --stacktrace:on
as I already explained, this is a worse solution: you'll end up having to find where bottlenecks are, modify sources, rerun, and modify sources back, and it'll affect everything inside those push/pop instead of just the inline bits. It's the exact analog as why we have both -d:danger and -d:release:
|
We already have |
3892088
to
ac0e77d
Compare
This pull request has been automatically marked as stale because it has not had recent activity. If you think it is still a valid PR, please rebase it on the latest devel; otherwise it will be closed. Thank you for your contributions. |
So I'm a bit confused about the objections here - aside from what I see to be an internal optimization and a slight CLI change, what is the downside to this? Yes, one can turn stack traces off or inline functions using a pragma, but one can also write their own stack traces manually and inline functions by hand. To me (at least, currently), this seems like an improvement with little in the way of a downside. |
Advantages:
Disadvantages:
Personally I would rather see the effort spend on nlvm, which has the potential to give us really decent debugging support. |
@Araq Regarding "yet another switch", couldn't this just be made the default implementation? I also can't find any mention of libUnwind in the repository. If it isn't currently implemented, then (presumably) an implementation would require a switch anyway. |
This pull request has been automatically marked as stale because it has not had recent activity. If you think it is still a valid PR, please rebase it on the latest devel; otherwise it will be closed. Thank you for your contributions. |
refs [RFC] [performance] {.inline.} guarantees,
--stacktrace:inline
, and--ignoreInline
RFCs#198 (does not close because more items left)nim is now faster with
--stacktrace:onthe new option --stacktrace:noinlineexample 1:
see example #13582 (comment) showing 3X speedup compared to default stacktraces, while producing identical stacktraces
example 2:
nim compiling itself. Less drastic speedup than example 1 because it relies less on inlined procs (unlike math heavy code), but still an appreciatable speedup than anyone working on debugging compiler may benefit from.
before PR:
after building nim with
--stacktrace:on --opt:speed
:./bin/nim c -o:/tmp/z01 --hints:off -f compiler/nim.nim: 26.5.s
after building nim with
--stacktrace:on
:./bin/nim c -o:/tmp/z01 --hints:off -f compiler/nim.nim: 52.s
after building nim with
-d:release --stacktrace:on
:./bin/nim c -o:/tmp/z01 --hints:off -f compiler/nim.nim: 18.s
after building nim with
-d:danger --stacktrace:on
:./bin/nim c -o:/tmp/z01 --hints:off -f compiler/nim.nim: 17.7s
after PR:
after building nim with
--stacktrace:noinline --opt:speed
:./bin/nim c -o:/tmp/z01 --hints:off -f compiler/nim.nim: 19.5.s (1.35X faster)
after building nim with
--stacktrace:noinline
:./bin/nim c -o:/tmp/z01 --hints:off -f compiler/nim.nim: 46.9.s (1.09X faster)
after building nim with
-d:release --stacktrace:noinline
:./bin/nim c -o:/tmp/z01 --hints:off -f compiler/nim.nim: 14.7.s (1.24X faster)
after building nim with
-d:danger --stacktrace:noinline
:./bin/nim c -o:/tmp/z01 --hints:off -f compiler/nim.nim: 14.3.s (1.23X faster)
and for comparison (unchanged before/after this PR):
after building nim with
-d:danger --stacktrace:off
:./bin/nim c -o:/tmp/z01 --hints:off -f compiler/nim.nim: 8.5
links
-Og
, refs https://forum.nim-lang.org/t/7848#49991-d:nimStackTraceOverride --import:libbacktrace --debugger:native
+--passc:"-fno-omit-frame-pointer -fno-optimize-sibling-calls"
, refs fix #9 stacktrace works if triggered by a signal, eg SIGSEGV by timotheecour · Pull Request #10 · status-im/nim-libbacktrace