move PNode.comment to a side channel, reducing memory usage during compilation by a factor 1.25x #18760
Just to make sure I understand how this works: this change saves space because not all AST nodes have comments?
Yes, as you can see above, only about 1 in 260 PNodes has a non-empty comment field, which makes sense. The problem right now is that the lookup costs more than I expected, so this ends up being a space vs. speed tradeoff unless someone has concrete ideas for how to improve it.
compiler/ast.nim
var gconfig {.threadvar.}: Gconfig

proc comment*(n: PNode): string =
  gconfig.comments.getOrDefault(n.nodeId)
Educated guess: if you add another node flag you can skip the getOrDefault step in 99% of all cases and bring back the speed of the old approach.
Good idea. I've now implemented this and it works, giving the same memory improvement without incurring a performance cost.
There was one subtlety, but it all works fine; see the comments in the PR. TLDR: there is a small leak in the comments table that is entirely justified for performance reasons. Future work can improve this if needed, but it probably won't be needed.
Note: adding a `proc =destroy(a: var TNode)` would not work well: it would incur a large performance overhead, negating any gains (interestingly, it would also increase the peakmem metric; design bug?), and furthermore it wouldn't work with --gc:arc.
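The flag-based fast path discussed in this thread can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: `Node`, `NodeFlag`, `hasComment`, and the global `comments` table are hypothetical names chosen for the example, and the real compiler stores its table elsewhere and uses its own flag. The idea is that a one-bit check on the node filters out the vast majority of lookups, so the table is hashed only for the few nodes that really carry a comment.

```nim
import std/tables

type
  NodeFlag = enum
    hasComment              # hypothetical flag name for this sketch
  Node = ref object
    id: int
    flags: set[NodeFlag]    # no inline `comment: string` field anymore

# Side table: only nodes that actually carry a comment get an entry.
var comments = initTable[int, string]()

proc comment(n: Node): string =
  ## Fast path: most nodes have no comment, so the flag test skips hashing.
  if hasComment in n.flags:
    result = comments[n.id]
  else:
    result = ""

proc `comment=`(n: Node, text: string) =
  ## Keeps the flag and the side table in sync.
  if text.len > 0:
    comments[n.id] = text
    n.flags.incl(hasComment)
  elif hasComment in n.flags:
    comments.del(n.id)
    n.flags.excl(hasComment)

when isMainModule:
  let a = Node(id: 1)
  let b = Node(id: 2)
  a.comment = "doc comment"
  doAssert a.comment == "doc comment"
  doAssert b.comment == ""        # never touches the table
  doAssert comments.len == 1
```

Note that in this sketch an entry is removed only when a comment is explicitly cleared; nothing removes the entry when a node itself is discarded, which is one way a small leak of the kind mentioned above could arise.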
@Araq PTAL, see the comments in the last commit
- CHANGED ::
  - Implementation improvements for the code pretty printer - more input nodes supported.
  - Clean up the implementation of the tree-sitter wrapper generator.
  - More predictable `treeRepr` output - nodes with comments are no longer split apart at random places.
  - Make a ton of `func` nodes into `proc`, because accessing the comment field is no longer a side-effect-free operation, most likely due to nim-lang/Nim#18760
- ADDED ::
  - Tree-sitter wrapper generator can now produce library wrappers that do not depend on hmisc for operation.
  - `addPragma` for enum declarations
- REMOVED ::
  - `nimble_aux.nim` and the dependency on nimble - I no longer work on nim-lang/RFCs#398 and I see no reason to try to reverse-engineer the dependency management solutions; `nimph` provides a much better approach in this case (edit `nim.cfg`, then the user can simply dump it as needed).
…mpilation by a factor 1.25x (nim-lang#18760)
* move PNode.comment so a side channel, reducing memory usage
* fix a bug
* fixup
* use sfHasComment to speedup comment lookups
* fix for IC
* Update compiler/parser.nim
Co-authored-by: Andreas Rumpf <rumpf_a@web.de>
the comment field is rarely used yet each PNode has to pay the price, until this PR.
TNode.sizeof reduces from 40 to 32; interestingly, the peakmem usage drops by a very similar factor:
nim --eval:'echo 40/32'
1.25
nim --eval:'echo 847 / 670'
1.264179104477612
which can be attributed to the fact that TNode allocations dwarf all other gc allocations, as demonstrated in other PRs (e.g. #13067)
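As a rough illustration of why every node pays for an inline comment field, the sketch below compares two hypothetical object types (`SlimNode` and `FatNode` are stand-ins invented for this example, not the compiler's `TNode`). It only assumes that an inline `string` field adds `sizeof(string)` bytes to each instance, a size that depends on the memory management mode; the two ratios at the end are simply recomputed from the numbers quoted in this description.

```nim
type
  SlimNode = object       # hypothetical stand-in, not the compiler's TNode
    id: int
  FatNode = object
    id: int
    comment: string       # paid by every node, even when the comment is empty

# An inline string field costs sizeof(string) bytes per node.
doAssert sizeof(FatNode) - sizeof(SlimNode) == sizeof(string)

echo "per-node overhead: ", sizeof(FatNode) - sizeof(SlimNode), " bytes"
echo "TNode size ratio:  ", 40 / 32
echo "peakmem ratio:     ", 847.062 / 670.09
```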
result with --hint:GCStats:
before
Hint: gc: refc; opt: speed; options: -d:release
187965 lines; 23.781s; 847.062MiB peakmem; proj: /Users/timothee/git_clone/nim/Nim_temp6/compiler/nim.nim; out: /Users/timothee/git_clone/nim/Nim_temp6/bin/nim.devel.d4 [SuccessX]
[GC] total memory: 888209408
[GC] occupied memory: 718308016
[GC] stack scans: 50083
[GC] stack cells: 4230
[GC] cycle collections: 0
[GC] max threshold: 0
[GC] zct capacity: 2304
[GC] max cycle table size: 0
[GC] max pause time [ms]: 0
[GC] max stack size: 95952
after
Hint: gc: refc; opt: speed; options: -d:release
187985 lines; 8.223s; 670.09MiB peakmem; proj: /Users/timothee/git_clone/nim/Nim_prs/compiler/nim.nim; out: /Users/timothee/git_clone/nim/Nim_prs/bin/nim.pr_PNode_comment_sidechannel.d1 [SuccessX]
[GC] total memory: 702640128
[GC] occupied memory: 653233632
[GC] stack scans: 50194
[GC] stack cells: 4201
[GC] cycle collections: 0
[GC] max threshold: 0
[GC] zct capacity: 2304
[GC] max cycle table size: 0
[GC] max pause time [ms]: 0
[GC] max stack size: 95024
links
comment field out of TNode => sizeof(TNode) = 32 instead of 40 #10054 (much simpler than that prior attempt)
EDIT 1
unfortunately this involves a tradeoff between compile times and memory consumption:
with -d:danger:
new:
188426 lines; 7.677s; 673.766MiB peakmem;
old:
187965 lines; 7.011s; 847.379MiB peakmem
(-d:release shows similar ratios for the time)
the bulk of the performance difference lies in the table accesses; more precisely:
(count1, count2, count3, count3b, count4)
(8831908, 33607, 23779703, 1278, 32330)
=> biggest cost is
if id in gconfig.comments:
better table insertion/deletion or hashing algorithms might help here (hashWangYi1 is expensive!)
EDIT 2
solved via #18760 (comment)
future work