process: slightly simplify next tick execution #16888

apapirovski · 2017-11-08T16:01:52Z

This PR gets rid of a separate function to execute the callback from _tickCallback as it no longer yields a performance benefit.

process/next-tick-breadth-args.js millions=2     -0.33 %      0.4171216
process/next-tick-breadth.js millions=2          -0.33 %      0.4386435
process/next-tick-depth-args.js millions=12       0.29 %      0.3878064
process/next-tick-depth.js millions=12            0.09 %      0.7962429

Unlike with emit this doesn't yield an improvement on any of our benchmarks but since it also doesn't make them worse, I feel this is probably a worthwhile change (simpler code and less unnecessary internals in stack traces).

Unfortunately we still can't switch to using spread operator in nextTick instead of the manual arguments copying. We can follow up on that in the next few V8 versions.

Checklist

make -j4 test (UNIX), or vcbuild test (Windows) passes
commit message follows commit guidelines

Affected core subsystem(s)

process

apapirovski · 2017-11-08T16:02:50Z

CI: https://ci.nodejs.org/job/node-test-pull-request/11313/
Benchmark CI: https://ci.nodejs.org/view/Node.js%20benchmark/job/benchmark-node-micro-benchmarks/24/

apapirovski · 2017-11-08T16:05:30Z

Note: this will need adjustment to test/message/stdin_messages.out once #16869 lands.

apapirovski · 2017-11-08T16:38:54Z

Ok, so it seems like the CI benchmark turned out worse than my local testing:

 process/next-tick-breadth-args.js millions=2      -4.39 %            0.054449966
 process/next-tick-breadth.js millions=2           -0.39 %            0.757371770
 process/next-tick-depth-args.js millions=12       -4.27 %         ** 0.002636105
 process/next-tick-depth.js millions=12            -1.81 %            0.237992442

Doing a second run just to be sure:
https://ci.nodejs.org/view/Node.js%20benchmark/job/benchmark-node-micro-benchmarks/25/

apapirovski · 2017-11-08T17:11:52Z

Ok, I'm pretty sure that wasn't necessarily reflective. I think there is likely a slight performance dip but it's not consistent and very minor.

 process/bench-env.js n=100000                     -0.48 %            0.8152647
 process/bench-hrtime.js type="diff" n=1000000     -2.34 %            0.2048529
 process/bench-hrtime.js type="raw" n=1000000       1.11 %            0.5847665
 process/memoryUsage.js n=100000                   -0.15 %            0.8392653
 process/next-tick-breadth-args.js millions=2       0.95 %            0.6382626
 process/next-tick-breadth.js millions=2           -0.19 %            0.8283266
 process/next-tick-depth-args.js millions=12       -1.54 %            0.2947018
 process/next-tick-depth.js millions=12            -0.13 %            0.9288224

Third run: https://ci.nodejs.org/view/Node.js%20benchmark/job/benchmark-node-micro-benchmarks/26/

apapirovski · 2017-11-08T17:42:06Z

Yep, the results are worse after all. Unless we're ok with 2-4% decline on the depth benchmark I'll probably have to close this.

 process/bench-env.js n=100000                      1.07 %            0.338564916
 process/bench-hrtime.js type="diff" n=1000000      0.69 %            0.692105626
 process/bench-hrtime.js type="raw" n=1000000       0.32 %            0.838770747
 process/memoryUsage.js n=100000                    0.54 %            0.340006562
 process/next-tick-breadth-args.js millions=2      -2.65 %            0.204050747
 process/next-tick-breadth.js millions=2           -0.64 %            0.706830203
 process/next-tick-depth-args.js millions=12       -4.14 %        *** 0.000258298
 process/next-tick-depth.js millions=12            -0.19 %            0.908582943

Fishrock123 · 2017-11-08T17:45:28Z

Seems like this should probably not be backported, or only to 8.x?

apapirovski · 2017-11-08T17:46:19Z

@Fishrock123 Definitely shouldn't. I don't even know if this can go on 9.x to be honest given the slight decline in performance.

apapirovski · 2017-11-09T00:02:58Z

Ok, I think there's no performance regression after bumping up the iterations to remove any noise.

process/next-tick-breadth-args.js millions=8     -1.39 %            0.4175407
process/next-tick-breadth.js millions=8           0.78 %            0.3573891
process/next-tick-depth-args.js millions=50      -0.69 %            0.5951104
process/next-tick-depth.js millions=50           -0.76 %            0.6042670

refack · 2017-11-09T00:13:22Z

@apapirovski the benchmarks don't show any significance (huge p-value, except for next-tick-depth-args.js) so it should be considered as if there's no difference in performance.

apapirovski · 2017-11-09T00:14:29Z

@refack yeah, I was just worried about

 process/next-tick-depth-args.js millions=12       -4.14 %        *** 0.000258298

but it seems like it was noise after bumping up iterations and running it more times.

refack · 2017-11-09T00:18:03Z

but it seems like it was noise after bumping up iterations and running it more times.

So we got:

process/next-tick-depth-args.js millions=50      -0.69 %            0.5951104

I'm trying to think about way to improve the benchmarks to see if there is a significant pattern after all.

starkwang · 2017-11-09T06:47:39Z

Unfortunately we still can't switch to using spread operator in nextTick instead of the manual arguments copying. We can follow up on that in the next few V8 versions.

Sorry, I don't quite understand this. Why we can't do that? It seems that the nextTick function should be of great significance for process.nextTick

update:
I may understand it after running a benchmark : )

 process/next-tick-breadth-args.js millions=2     -23.57 %        *** 2.046862e-50

refack · 2017-11-09T14:45:49Z

[refack: I pushed a change, then pushed it out, because I wanted to use the benchmark machine]

apapirovski · 2017-11-09T14:47:38Z

A couple more runs:

 process/next-tick-breadth-args.js millions=8     -0.26 %            0.820344067
 process/next-tick-breadth.js millions=8          -0.03 %            0.921366382
 process/next-tick-depth-args.js millions=50      -0.90 %            0.179757893
 process/next-tick-depth.js millions=50           -2.77 %         ** 0.003797703

 process/next-tick-breadth-args.js millions=8      0.07 %            0.9475671
 process/next-tick-breadth.js millions=8           0.05 %            0.8747827
 process/next-tick-depth-args.js millions=50       0.08 %            0.9172713
 process/next-tick-depth.js millions=50           -0.89 %            0.3620226

Just looking at the raw data, I can see that both the old code and the new code seem to end up in situations where they're either deoptimized or just not optimized by V8.

apapirovski · 2017-11-09T14:51:40Z

I'll probably spend some more time looking the nextTick code. The fact that there's such huge peaks and valleys indicates that there's something off in both versions.

apapirovski · 2017-11-09T15:10:47Z

Here's the results from @refack's experiment with many runs (and low n value):

 process/next-tick-breadth-args.js millions=1     -0.70 %            0.0869824873
 process/next-tick-breadth.js millions=1          -0.19 %            0.5451670018
 process/next-tick-depth-args.js millions=1        1.29 %        *** 0.0007451827
 process/next-tick-depth.js millions=1            -0.99 %            0.0669916919

refack · 2017-11-09T15:46:10Z

and this is the distribution (for next-tick-depth-args.js), it's still way to variable, also it's not a normal
(I think GC runs cause the semi-discrete groups)

tl;dr IMHO this benchmark is not informative

apapirovski · 2017-11-09T17:32:27Z

I think the real issue here is basically that those benchmarks aren't really testing these particular changes, or at least not in a very representative way. Most of the difference seems to have to do with GC kicking in or not. On my local system I get none of the extreme dips and it's all pretty even.

I created two benchmarks that should do a better job of testing the actual execution performance within _tickCallback.

Benchmark CI: https://ci.nodejs.org/view/Node.js%20benchmark/job/benchmark-node-micro-benchmarks/39/

apapirovski · 2017-11-09T19:40:30Z

Ok, I think that got us more reliable info...

1st run:

process/next-tick-exec-args.js millions=5     -0.45 %          6.589957e-01
process/next-tick-exec.js millions=5          -2.85 %      *** 1.436700e-08

2nd run:

process/next-tick-exec-args.js millions=5     -2.22 %        *** 4.588545e-04
process/next-tick-exec.js millions=5          -3.11 %        *** 6.527721e-17

AndreasMadsen · 2017-11-10T10:20:06Z

I think at least some of the confusion here has to do with how you use statistics. In simple words:
statistics can never tell you that there is no difference. It can only tell you there is a difference, with some risk of that difference only existing because of randomness. How much risk you tollerate is something you need to descide on before you run the benchmarks.

This means that the only conculsion (and this concusion will likely be wrong) that you can arrive at by running the benchmarks multiplie times, is that there is a difference. But you only conculude this because of the randomness, not because the difference is actually there.

If you think there is a garbage collection issue that interfers with the results (and there may very well be). Then you need to analyse this issue without using the benchmark suite. You can use the trace_events or --expose-gc to debug this. Once you have made a benchmark that you belive accurately represents what you are trying to measure, then you can use the benchmark suite.

Statisticians (especially frequentistic statisticians which is what is the base for our benchmark suite) don't like to admit that there are belifes in statestics, but the unfortunete truth is that there is :(

and this is the distribution it's still way to variable

Since you are trying to prove that there is no difference, it will always have too high a variance. I discussed a bit in #8139 what consequences a high variance has. It is especially important that you understand that a high variance is fine if there is enogth runs.

Here are some outputs of that:

cv	im = 1%	im = 10%
n = 10	`cv < 0.011`	`cv < 0.111`
n = 30	`cv < 0.019`	`cv < 0.193`
n = 100	`cv < 0.035`	`cv < 0.35`
n = 200	`cv < 0.05`	`cv < 0.5`
n = 500	`cv < 0.079`	`cv < 0.79`

Let's be super specific. Let's say we are looking for at least a 1% difference and we have 100 runs then we should have a coefficient of variation (cv) less than 0.035. In this case we have (data from https://ci.nodejs.org/view/Node.js%20benchmark/job/benchmark-node-micro-benchmarks/39/) the coefficient of variation (cv.unbiased) to be:

                        filename configuration     mean   std.dev  cv.biased cv.unbiased   conf.int
  process/next-tick-exec-args.js    millions=5 1.677843 0.1206786 0.07192487  0.07201478 0.01682723
       process/next-tick-exec.js    millions=5 2.818856 0.1051820 0.03731372  0.03736036 0.01466641

As you can see the variance for process/next-tick-exec.js is only slightly too high. And for process/next-tick-exec-args.js it is differently too high to find a difference of 1%.

An important assumption in these calculations, is that they assume the variance is the same. But this does appear to be somewhat so. You can look at the F-test (var.test in R) if you like.

also it's not a normal

Don't expect it to be normal. Time is typically gamma distributed, because of the central limit theorm it will approach a normal distribution as you increase the number of iterations. But it will never become normally distributed. This is theoretically unfortunete, as the t-test assumes a normal distribution, however in practise it is not a huge deal.

BridgeAR

LGTM but it needs a rebase

apapirovski · 2017-11-25T18:31:19Z

I've resolved all performance regressions in the latest version.

 process/next-tick-breadth-args.js millions=4      0.12 %            9.122406e-01
 process/next-tick-breadth.js millions=4           2.10 %        *** 4.341579e-06
 process/next-tick-depth-args.js millions=12       1.31 %         ** 3.382566e-03
 process/next-tick-depth.js millions=12            1.52 %          * 3.364892e-02
 process/next-tick-exec-args.js millions=5        -1.32 %            8.180421e-02
 process/next-tick-exec.js millions=5              0.84 %            1.020348e-01

PTAL @addaleax, @refack, @cjihrig and others. This should be ready to land after the weekend.

Get rid of separate function to call callback from _tickCallback as it no longer yields worthwhile performance improvement. Move some code from nextTick & internalNextTick into TickObject constructor to minimize duplication.

Get rid of separate function to call callback from _tickCallback as it no longer yields worthwhile performance improvement. Move some code from nextTick & internalNextTick into TickObject constructor to minimize duplication. PR-URL: #16888 Reviewed-By: Anna Henningsen <anna@addaleax.net> Reviewed-By: Colin Ihrig <cjihrig@gmail.com> Reviewed-By: Refael Ackermann <refack@gmail.com> Reviewed-By: Luigi Pinca <luigipinca@gmail.com> Reviewed-By: James M Snell <jasnell@gmail.com> Reviewed-By: Ruben Bridgewater <ruben@bridgewater.de> Reviewed-By: Timothy Gu <timothygu99@gmail.com>

apapirovski · 2017-11-28T13:49:58Z

Landed in cbaf59c. Thanks everyone for the reviews & patience in getting this code tightened up.

Get rid of separate function to call callback from _tickCallback as it no longer yields worthwhile performance improvement. Move some code from nextTick & internalNextTick into TickObject constructor to minimize duplication. PR-URL: #16888 Reviewed-By: Anna Henningsen <anna@addaleax.net> Reviewed-By: Colin Ihrig <cjihrig@gmail.com> Reviewed-By: Refael Ackermann <refack@gmail.com> Reviewed-By: Luigi Pinca <luigipinca@gmail.com> Reviewed-By: James M Snell <jasnell@gmail.com> Reviewed-By: Ruben Bridgewater <ruben@bridgewater.de> Reviewed-By: Timothy Gu <timothygu99@gmail.com>

apapirovski added the process Issues and PRs related to the process subsystem. label Nov 8, 2017

addaleax approved these changes Nov 8, 2017

View reviewed changes

cjihrig approved these changes Nov 8, 2017

View reviewed changes

apapirovski added dont-land-on-v4.x labels Nov 8, 2017

refack approved these changes Nov 9, 2017

View reviewed changes

lpinca approved these changes Nov 9, 2017

View reviewed changes

refack mentioned this pull request Nov 9, 2017

Status of running node core benchmarks nodejs/benchmarking#127

Open

refack force-pushed the patch-process-next-tick branch from 158071c to 39ec9e6 Compare November 9, 2017 14:39

apapirovski force-pushed the patch-process-next-tick branch 2 times, most recently from a2f43ef to b749fa1 Compare November 9, 2017 17:29

apapirovski force-pushed the patch-process-next-tick branch from b749fa1 to 9b12434 Compare November 9, 2017 17:38

jasnell approved these changes Nov 10, 2017

View reviewed changes

BridgeAR approved these changes Nov 23, 2017

View reviewed changes

apapirovski force-pushed the patch-process-next-tick branch 2 times, most recently from e710e21 to a40a73c Compare November 25, 2017 16:59

process: slightly simplify next tick execution

c9513e1

Get rid of separate function to call callback from _tickCallback as it no longer yields worthwhile performance improvement. Move some code from nextTick & internalNextTick into TickObject constructor to minimize duplication.

apapirovski force-pushed the patch-process-next-tick branch from a40a73c to c9513e1 Compare November 26, 2017 15:03

AndreasMadsen mentioned this pull request Nov 26, 2017

[WIP] events: use Map for events and limit copying #17074

Closed

4 tasks

TimothyGu approved these changes Nov 27, 2017

View reviewed changes

addaleax approved these changes Nov 27, 2017

View reviewed changes

refack approved these changes Nov 27, 2017

View reviewed changes

apapirovski closed this Nov 28, 2017

apapirovski deleted the patch-process-next-tick branch November 28, 2017 13:50

MylesBorins mentioned this pull request Dec 12, 2017

v9.3.0 proposal #17631

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

process: slightly simplify next tick execution #16888

process: slightly simplify next tick execution #16888

apapirovski commented Nov 8, 2017

apapirovski commented Nov 8, 2017

apapirovski commented Nov 8, 2017

apapirovski commented Nov 8, 2017

apapirovski commented Nov 8, 2017

apapirovski commented Nov 8, 2017 •

edited

Loading

Fishrock123 commented Nov 8, 2017

apapirovski commented Nov 8, 2017

apapirovski commented Nov 9, 2017

refack commented Nov 9, 2017

apapirovski commented Nov 9, 2017

refack commented Nov 9, 2017

starkwang commented Nov 9, 2017 •

edited

Loading

refack commented Nov 9, 2017

apapirovski commented Nov 9, 2017

apapirovski commented Nov 9, 2017

apapirovski commented Nov 9, 2017 •

edited

Loading

refack commented Nov 9, 2017

apapirovski commented Nov 9, 2017

apapirovski commented Nov 9, 2017

AndreasMadsen commented Nov 10, 2017

BridgeAR left a comment

apapirovski commented Nov 25, 2017 •

edited

Loading

apapirovski commented Nov 28, 2017

process: slightly simplify next tick execution #16888

process: slightly simplify next tick execution #16888

Conversation

apapirovski commented Nov 8, 2017

Checklist

Affected core subsystem(s)

apapirovski commented Nov 8, 2017

apapirovski commented Nov 8, 2017

apapirovski commented Nov 8, 2017

apapirovski commented Nov 8, 2017

apapirovski commented Nov 8, 2017 • edited Loading

Fishrock123 commented Nov 8, 2017

apapirovski commented Nov 8, 2017

apapirovski commented Nov 9, 2017

refack commented Nov 9, 2017

apapirovski commented Nov 9, 2017

refack commented Nov 9, 2017

starkwang commented Nov 9, 2017 • edited Loading

refack commented Nov 9, 2017

apapirovski commented Nov 9, 2017

apapirovski commented Nov 9, 2017

apapirovski commented Nov 9, 2017 • edited Loading

refack commented Nov 9, 2017

apapirovski commented Nov 9, 2017

apapirovski commented Nov 9, 2017

AndreasMadsen commented Nov 10, 2017

BridgeAR left a comment

Choose a reason for hiding this comment

apapirovski commented Nov 25, 2017 • edited Loading

apapirovski commented Nov 28, 2017

apapirovski commented Nov 8, 2017 •

edited

Loading

starkwang commented Nov 9, 2017 •

edited

Loading

apapirovski commented Nov 9, 2017 •

edited

Loading

apapirovski commented Nov 25, 2017 •

edited

Loading