move PNode.comment to a side channel, reducing memory usage during compilation by a factor 1.25x #18760

timotheecour · 2021-08-27T18:04:43Z

the comment field is rarely used, yet before this PR every PNode had to pay the price for it.

TNode.sizeof reduces from 40 to 32; interestingly, the peakmem usage drops by a very similar factor:

nim --eval:'echo 40/32'
1.25

nim --eval:'echo 847 / 670'
1.264179104477612

which can be attributed to the fact that TNode allocations dwarf all other gc allocations, as demonstrated in other PRs (e.g. #13067)
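To illustrate why dropping one field shrinks every node, here is a minimal sketch using two hypothetical, heavily simplified stand-ins for TNode (the real type has more fields and a variant part). The field names and types are assumptions made for illustration; the exact sizes printed depend on the platform and the memory management mode.

```nim
# Hypothetical, simplified stand-ins for the compiler's TNode.
type
  NodeWithComment = object
    kind: uint8
    flags: uint16
    info: uint64      # stand-in for line information
    typ: pointer      # stand-in for the type reference
    payload: pointer  # stand-in for the variant part (sons, symbol, literal, ...)
    comment: string   # stored inline in every node, even when empty
  NodeWithoutComment = object
    kind: uint8
    flags: uint16
    info: uint64
    typ: pointer
    payload: pointer

# The difference is paid once per allocated node.
let saved = sizeof(NodeWithComment) - sizeof(NodeWithoutComment)
echo "with comment:    ", sizeof(NodeWithComment), " bytes"
echo "without comment: ", sizeof(NodeWithoutComment), " bytes"
echo "saved per node:  ", saved, " bytes"
```

Because the saving applies to every node, it scales with the number of node allocations, which is consistent with the peakmem ratio tracking the sizeof ratio so closely.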

result with --hint:GCStats:

before

Hint: gc: refc; opt: speed; options: -d:release
187965 lines; 23.781s; 847.062MiB peakmem; proj: /Users/timothee/git_clone/nim/Nim_temp6/compiler/nim.nim; out: /Users/timothee/git_clone/nim/Nim_temp6/bin/nim.devel.d4 [SuccessX]
[GC] total memory: 888209408
[GC] occupied memory: 718308016
[GC] stack scans: 50083
[GC] stack cells: 4230
[GC] cycle collections: 0
[GC] max threshold: 0
[GC] zct capacity: 2304
[GC] max cycle table size: 0
[GC] max pause time [ms]: 0
[GC] max stack size: 95952

after

Hint: gc: refc; opt: speed; options: -d:release
187985 lines; 8.223s; 670.09MiB peakmem; proj: /Users/timothee/git_clone/nim/Nim_prs/compiler/nim.nim; out: /Users/timothee/git_clone/nim/Nim_prs/bin/nim.pr_PNode_comment_sidechannel.d1 [SuccessX]
[GC] total memory: 702640128
[GC] occupied memory: 653233632
[GC] stack scans: 50194
[GC] stack cells: 4201
[GC] cycle collections: 0
[GC] max threshold: 0
[GC] zct capacity: 2304
[GC] max cycle table size: 0
[GC] max pause time [ms]: 0
[GC] max stack size: 95024

links

- supersedes [superseded] move comment field out of TNode => sizeof(TNode) = 32 instead of 40 #10054 (much simpler than that prior attempt)
- will possibly help with RFC: AST comments should be properly exposed RFCs#150 (see also [TODO] add macros.comment(NimNode) magic to get node doc comment #8903 to expose comments in macros), but that's entirely optional

EDIT 1

unfortunately this involves a tradeoff between compile times and memory consumption:
with -d:danger:
new:
188426 lines; 7.677s; 673.766MiB peakmem;
old:
187965 lines; 7.011s; 847.379MiB peakmem

(-d:release shows similar ratios for the time)
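As a rough worked example of the tradeoff, the sketch below only re-computes the ratios from the -d:danger figures quoted above; the variable names are made up for illustration. It comes out to roughly 9.5% more compile time in exchange for about 1.26x less peak memory.

```nim
let
  newTime = 7.677    # seconds, side-channel version
  oldTime = 7.011    # seconds, comment stored inline
  newMem = 673.766   # MiB peakmem, side-channel version
  oldMem = 847.379   # MiB peakmem, comment stored inline

let slowdown = newTime / oldTime   # > 1 means the new version is slower
let memRatio = oldMem / newMem     # > 1 means the new version uses less memory
echo "compile time ratio (new/old): ", slowdown
echo "peak memory ratio (old/new):  ", memRatio
```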

the bulk of the performance difference lies in the table accesses; more precisely:

proc comment*(n: PNode): string {.inline.} =
  count1.inc
  count4 = max(gconfig.comments.len, count4)
  gconfig.comments.getOrDefault(n.nodeId)

proc `comment=`*(n: PNode, a: string) {.inline.} =
  let id = n.nodeId
  if a.len > 0:
    count2.inc
    gconfig.comments[id] = a
  else:
    count3.inc
    if id in gconfig.comments:
      count3b.inc
      gconfig.comments.del(id)
    # gconfig.comments.del(id)
  count4 = max(gconfig.comments.len, count4)

(count1, count2, count3, count3b, count4)
(8831908, 33607, 23779703, 1278, 32330)

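=> the biggest cost is the `if id in gconfig.comments:` check: it runs for every empty assignment (count3 = 23779703), yet only 1278 of those (count3b) actually find an entry to delete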

better table insertion/deletion or hashing algorithms might help here (hashWangYi1 is expensive!)
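For readers who want to try the idea outside the compiler, here is a minimal, self-contained sketch of the side-channel approach. `Node`, `sideTable`, `newNode` and `nextId` are hypothetical stand-ins (for PNode, gconfig.comments and node id allocation); this is not the PR's code, only an illustration of why every access costs a hash lookup.

```nim
import std/tables

type
  Node = ref object   # hypothetical stand-in for PNode
    id: int           # stand-in for nodeId

var sideTable: Table[int, string]   # stand-in for gconfig.comments
var nextId = 0

proc newNode(): Node =
  inc nextId
  Node(id: nextId)

proc comment(n: Node): string =
  # Hash lookup on every read, even for nodes that never had a comment.
  sideTable.getOrDefault(n.id)

proc `comment=`(n: Node, a: string) =
  if a.len > 0:
    sideTable[n.id] = a
  else:
    # `del` does nothing when the key is absent, but it still hashes and probes.
    sideTable.del(n.id)

let a = newNode()
let b = newNode()
a.comment = "## docs for a"
b.comment = ""          # common case: clearing a comment that was never set
echo a.comment
echo sideTable.len
a.comment = ""          # removes the only entry again
echo sideTable.len
```

In this sketch the empty-assignment path always touches the table, which mirrors the hot path identified by the counters above.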

EDIT 2

solved via #18760 (comment)

future work

revive [TODO] add macros.comment(NimNode) magic to get node doc comment #8903

Varriount · 2021-08-28T00:44:35Z

Just to make sure I understand how this works, this change saves space because not all AST nodes have comments?

timotheecour · 2021-08-28T01:33:24Z

> Just to make sure I understand how this works, this change saves space because not all AST nodes have comments?

yes, as you can see above, roughly 1 in 260 PNodes has a non-empty comment field, which makes sense.

The problem right now is that the lookup costs more than I expected, so it ends up being a space vs speed tradeoff unless someone has concrete ideas for improving it

Araq · 2021-08-28T07:19:31Z

compiler/ast.nim

+var gconfig {.threadvar.}: Gconfig
+
+proc comment*(n: PNode): string =
+  gconfig.comments.getOrDefault(n.nodeId)


Educated guess: if you add another node flag you can skip the getOrDefault step in 99% of all cases and bring back the speed of the old approach.

good idea, I've now implemented this and it works, giving the same memory improvement without incurring a performance cost;
there was one subtlety, but it all works fine, see the comments in the PR; TLDR: there's a small leak in the comments table that is entirely justified for performance reasons; future work can improve this if needed, but it probably won't be needed.
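A minimal sketch of the flag-guarded variant follows, assuming a hypothetical `hasComment` flag and the same hypothetical `Node` and `sideTable` stand-ins as before; the flag name and layout used in the compiler may differ. The point is that nodes without a comment are handled by a cheap set-membership test and never reach the table.

```nim
import std/tables

type
  NodeFlag = enum
    hasComment        # hypothetical flag name
  Node = ref object   # hypothetical stand-in for PNode
    id: int
    flags: set[NodeFlag]

var sideTable: Table[int, string]   # stand-in for gconfig.comments

proc comment(n: Node): string =
  # Fast path: test a bit on the node itself; no hashing for uncommented nodes.
  if hasComment in n.flags:
    result = sideTable.getOrDefault(n.id)
  else:
    result = ""

proc `comment=`(n: Node, a: string) =
  if a.len > 0:
    n.flags.incl hasComment
    sideTable[n.id] = a
  elif hasComment in n.flags:
    # Only nodes that really own an entry pay for the table deletion.
    n.flags.excl hasComment
    sideTable.del(n.id)

let plain = Node(id: 1)
let documented = Node(id: 2)
documented.comment = "## docs"
plain.comment = ""            # no table access at all
echo documented.comment
documented.comment = ""       # flag cleared, entry removed
echo sideTable.len
```

Note that in this sketch an entry is removed only when a comment is explicitly cleared; if a node is dropped while it still carries a comment, its entry stays in the table. That may be the kind of small leak mentioned above, but the sketch does not claim to reproduce the PR's exact behaviour.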

Note:

adding a `=destroy(a: var TNode)` hook would not work well: it would incur a large performance overhead, negating any gains (interestingly, it would also increase the peakmem metric; design bug?), and furthermore it wouldn't work with --gc:arc

timotheecour · 2021-08-28T21:23:12Z

@Araq PTAL, see the comments in last commit

compiler/parser.nim

- CHANGED ::
  - Implementation improvements for the code pretty printer - more input nodes supported.
  - Clean up the implementation of the tree-sitter wrapper generator.
  - More predictable `treeRepr` output - nodes with comments are no longer split apart at random places.
  - Make a ton of `func` nodes into `proc`, because accessing the comment field is no longer a side-effect-free operation, most likely due to nim-lang/Nim#18760
- ADDED ::
  - Tree-sitter wrapper generator can now produce library wrappers that do not depend on hmisc for operation.
  - `addPragma` for enum declarations
- REMOVED ::
  - `nimble_aux.nim` and the dependency on nimble - I no longer work on nim-lang/RFCs#398 and I see no reason to try and reverse-engineer the dependency management solutions; `nimph` provides a much better approach in this case (edit `nim.cfg`, then the user can simply dump it as needed).

…mpilation by a factor 1.25x (nim-lang#18760)
* move PNode.comment so a side channel, reducing memory usage
* fix a bug
* fixup
* use sfHasComment to speedup comment lookups
* fix for IC
* Update compiler/parser.nim

Co-authored-by: Andreas Rumpf <rumpf_a@web.de>

timotheecour force-pushed the pr_PNode_comment_sidechannel branch from 4133106 to 9ca09df Compare August 27, 2021 18:15

timotheecour changed the title ~~move PNode.comment so a side channel, reducing memory usage by a factor 1.25x~~ move PNode.comment so a side channel, reducing memory usage during compilation by a factor 1.25x Aug 27, 2021

timotheecour marked this pull request as draft August 27, 2021 19:46

Araq reviewed Aug 28, 2021

timotheecour force-pushed the pr_PNode_comment_sidechannel branch from 9a664fc to 3a7cae5 Compare August 28, 2021 21:17

timotheecour changed the title ~~move PNode.comment so a side channel, reducing memory usage during compilation by a factor 1.25x~~ move PNode.comment to a side channel, reducing memory usage during compilation by a factor 1.25x Aug 28, 2021

timotheecour marked this pull request as ready for review August 28, 2021 21:24

timotheecour mentioned this pull request Aug 28, 2021

regression: building nim from devel (resp 1.4) uses 846 (rep 680) peakmem #18765

Closed

timotheecour added 5 commits August 28, 2021 16:36

move PNode.comment so a side channel, reducing memory usage

eb970f3

fix a bug

6499405

fixup

94c81c1

use sfHasComment to speedup comment lookups

0be877f

fix for IC

6f5acd2

timotheecour force-pushed the pr_PNode_comment_sidechannel branch from 3a7cae5 to 6f5acd2 Compare August 29, 2021 00:41

Araq reviewed Aug 29, 2021

Update compiler/parser.nim

2411600

Araq added the merge_when_passes_CI mergeable once green label Aug 29, 2021

Araq merged commit fa7c1aa into nim-lang:devel Aug 29, 2021

timotheecour deleted the pr_PNode_comment_sidechannel branch August 29, 2021 17:07

timotheecour mentioned this pull request Aug 29, 2021

[superseded] move comment field out of TNode => sizeof(TNode) = 32 instead of 40 #10054

Closed

timotheecour added the TODO: followup needed remove tag once fixed or tracked elsewhere label Aug 29, 2021
