
forceValue: make pos mandatory #5676

Merged 5 commits on Feb 3, 2022
Conversation

kamadorueda

@kamadorueda kamadorueda commented Nov 27, 2021

  • Make passing the position to forceValue mandatory;
    this way we remind people that the position is
    important for better error messages
  • Add pos to all forceValue calls

Mentions #3505

Member

@thufschmitt thufschmitt left a comment


Great!

I didn’t check whether all the calls to determinePos were justified, but in any case that’s a big improvement and a good thing for the future

@@ -53,7 +53,7 @@ void EvalState::forceValue(Value & v, const Pos & pos)

 inline void EvalState::forceAttrs(Value & v)
 {
-    forceValue(v);
+    forceValue(v, v.determinePos(noPos));
Member

What's the performance impact of these non-lazy calls to determinePos()?

BTW determinePos() is currently defined in eval.cc, but if we're going to call it everywhere (including on some hot paths), it should probably be moved to value.hh so that it can be inlined.

Member

@kamadorueda it would be really cool to have this merged before it gets conflicts 😊

Member Author

Here is what I did:

  1. I tried to inline determinePos, but it uses struct Pos, class Bindings, and others that are not yet defined at the point where determinePos would have to be declared for the compiler to inline it, so the compiler complains that their sizes are unknown:

[screenshot: compiler errors about incomplete types]

which means that we cannot inline it.

  2. So I benchmarked this PR (top) against nix-master (bottom), and there is certainly a difference:

[screenshot: benchmark comparison, 2022-01-21]

Apparently about 0.05 seconds of slowdown every 6 or 7 seconds.

Member Author

@edolstra
I added a few commits to slim down the critical path,
closed my browser and code editor, leaving only the terminal open,
and increased the benchmarking rounds to 100 (~10 minutes of running) to get more data points.

This is the result:

[screenshot: benchmark results]

Where there is a difference in execution time (rejected), the difference is negative (i.e. faster).

I don't think it should actually be getting faster, even if the data says so,
so I would be happy if someone could confirm the results on another machine:

nix-master $ clear; PAGER=cat benchmark -t 100 -f ../prev -- ./outputs/out/bin/nix-env -qaf ../nixpkgs
nix-pr $ clear; PAGER=cat benchmark -t 100 -f ../new -c ../prev -- ./outputs/out/bin/nix-env -qaf ../nixpkgs

Contributor

I changed the flake to include it but I'm still seeing the same behavior.

$ git diff
diff --git a/flake.lock b/flake.lock
index 4f89d76..623d5c8 100644
--- a/flake.lock
+++ b/flake.lock
@@ -17,11 +17,11 @@
     },
     "nixpkgs": {
       "locked": {
-        "lastModified": 1642786796,
-        "narHash": "sha256-FsKuowd1SZGNhsX+YeNPlodp3mecyVbLPxCcUMbkLU8=",
+        "lastModified": 1643948526,
+        "narHash": "sha256-XmRNI2zmR0iTX7LUJy93MG6GZoMHy1nlMNcp31XHi8w=",
         "owner": "nixos",
         "repo": "nixpkgs",
-        "rev": "1281c5be8ae76b231e181cde812b7b6cdc3028f8",
+        "rev": "d67ad28fc301305baeeb364a04f0565c5f5118c8",
         "type": "github"
       },
       "original": {
diff --git a/flake.nix b/flake.nix
index a55646b..4a0de74 100644
--- a/flake.nix
+++ b/flake.nix
@@ -22,6 +22,7 @@
               StatisticsTTest
               UnixGetrusage
               FileSlurp
+              pkgs.util-linux
             ];
 
             unpackPhase = "true";

$ ./result/bin/benchmark -t 3 -- sleep 1
run 0...
taskset: failed to set pid 6481's affinity: Invalid argument
run 1...
taskset: failed to set pid 6483's affinity: Invalid argument
run 2...
taskset: failed to set pid 6485's affinity: Invalid argument
note: discarding 10% highest outliers
maximum RSS:        median =  10546.0000  mean =  10546.0000  stddev =      2.8284  min =  10544.0000  max =  10548.0000
soft page faults:   median =    103.0000  mean =    103.0000  stddev =      2.8284  min =    101.0000  max =    105.0000
system CPU time:    median =      0.0007  mean =      0.0007  stddev =      0.0000  min =      0.0007  max =      0.0008
elapsed time:       median =      0.0009  mean =      0.0009  stddev =      0.0000  min =      0.0009  max =      0.0010

Member Author

@ConnorBaker I'm able to reproduce it now: benchmark is trying to pin itself to CPU 6, but you may not have that many CPUs on the cloud instance.

(My home computer has 16.)

[kamadorueda@nixos:/data]$ taskset -c 16 sleep 1
taskset: failed to set pid 3509's affinity: Invalid argument

[kamadorueda@nixos:/data]$ taskset -c 15 sleep 1
(ok)

[kamadorueda@nixos:/data]$ taskset -c 0 sleep 1
(ok)

Contributor

Thank you for sorting that out for me!

I used an m6g.metal instance and wasn't able to reproduce the taskset error (as you said, I think because of the core count).

$ lscpu
Architecture:           aarch64
  CPU op-mode(s):       32-bit, 64-bit
  Byte Order:           Little Endian
CPU(s):                 64
  On-line CPU(s) list:  0-63
Vendor ID:              ARM
  Model name:           Neoverse-N1
    Model:              1
    Thread(s) per core: 1
    Core(s) per socket: 64
    Socket(s):          1
    Stepping:           r3p1
    BogoMIPS:           243.75
    Flags:              fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm lrcpc dcpop asimddp ssbs
Caches (sum of all):    
  L1d:                  4 MiB (64 instances)
  L1i:                  4 MiB (64 instances)
  L2:                   64 MiB (64 instances)
  L3:                   32 MiB (1 instance)
NUMA:                   
  NUMA node(s):         1
  NUMA node0 CPU(s):    0-63
Vulnerabilities:        
  Itlb multihit:        Not affected
  L1tf:                 Not affected
  Mds:                  Not affected
  Meltdown:             Not affected
  Spec store bypass:    Mitigation; Speculative Store Bypass disabled via prctl
  Spectre v1:           Mitigation; __user pointer sanitization
  Spectre v2:           Not affected
  Srbds:                Not affected
  Tsx async abort:      Not affected

$ lsmem
RANGE                                  SIZE  STATE REMOVABLE   BLOCK
0x0000000000000000-0x000000007fffffff    2G online       yes     0-1
0x0000000100000000-0x0000003fffffffff  252G online       yes   4-255
0x0000008100000000-0x000000817fffffff    2G online       yes 516-517

Memory block size:         1G
Total online memory:     256G
Total offline memory:      0B

I think part of the issue was that when I specified the path to clone Nixpkgs to, I forgot to make it a relative path, so it was cloned to the root directory (/nixpkgs instead of /root/nixpkgs). It was able to do that because I haven't set up any users on the EC2 instance, so it defaults to root.

Anyway, probably not relevant any more, but here's what I observed on the m6g.metal instance:

$ PAGER=cat ./benchmark/bin/benchmark -t 20 -f ./prev -- ./nix-pr/bin/nix-env -qaf ./nixpkgs
# Output snipped
maximum RSS:        median = 2023162.0000  mean = 1873273.7778  stddev = 436250.4782  min = 674012.0000  max = 2023432.0000
soft page faults:   median = 506947.0000  mean = 469244.7778  stddev = 109796.7131  min = 167440.0000  max = 507034.0000
system CPU time:    median =      0.9361  mean =      0.9208  stddev =      0.0556  min =      0.8147  max =      0.9861
user CPU time:      median =     10.3891  mean =      9.5182  stddev =      2.5623  min =      2.4753  max =     10.4820
elapsed time:       median =     11.3407  mean =     10.8687  stddev =      1.3863  min =      7.0558  max =     11.4138
$ PAGER=cat ./benchmark/bin/benchmark -t 20 -f ./new -c ./prev -- ./nix-master/bin/nix-env -qaf ./nixpkgs
# Output snipped
maximum RSS:        median = 2023060.0000  mean = 1873229.1111  stddev = 436261.1472  min = 674060.0000  max = 2023512.0000  [not rejected, p=0.99976, Δ=-44.66667±436546.64994]
soft page faults:   median = 506948.5000  mean = 469238.2222  stddev = 109793.4156  min = 167443.0000  max = 507037.0000  [not rejected, p=0.99986, Δ=-6.55556±109868.26108]
system CPU time:    median =      0.9401  mean =      0.9402  stddev =      0.0398  min =      0.8621  max =      1.0020  [not rejected, p=0.23556, Δ=0.01946±0.04839]
user CPU time:      median =     10.3588  mean =      9.4907  stddev =      2.5674  min =      2.4120  max =     10.4415  [not rejected, p=0.97448, Δ=-0.02755±2.56658]
elapsed time:       median =     11.3278  mean =     10.8647  stddev =      1.3801  min =      7.0591  max =     11.3926  [not rejected, p=0.99318, Δ=-0.00397±1.38413]

Member Author

That is very nice. Those 5 "not rejected" results mean this PR did not introduce any relevant performance impact, at least from the point of view of evaluating nixpkgs.

Cool! Thank you

@edolstra edolstra added this to the nix-2.7 milestone Feb 3, 2022
@edolstra edolstra self-assigned this Feb 3, 2022
@edolstra edolstra merged commit 4c755c3 into NixOS:master Feb 3, 2022
@edolstra (Member) commented Feb 3, 2022

Thanks, merged it. Because I did still observe a small slowdown, I've made most calls to determinePos() lazy by passing them as a lambda to forceValue() (bd383d1).
