Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

process: slightly simplify next tick execution #16888

Closed

Conversation

apapirovski
Copy link
Member

This PR gets rid of a separate function to execute the callback from _tickCallback as it no longer yields a performance benefit.

process/next-tick-breadth-args.js millions=2     -0.33 %      0.4171216
process/next-tick-breadth.js millions=2          -0.33 %      0.4386435
process/next-tick-depth-args.js millions=12       0.29 %      0.3878064
process/next-tick-depth.js millions=12            0.09 %      0.7962429

Unlike with emit this doesn't yield an improvement on any of our benchmarks but since it also doesn't make them worse, I feel this is probably a worthwhile change (simpler code and less unnecessary internals in stack traces).

Unfortunately we still can't switch to using spread operator in nextTick instead of the manual arguments copying. We can follow up on that in the next few V8 versions.

Checklist
  • make -j4 test (UNIX), or vcbuild test (Windows) passes
  • commit message follows commit guidelines
Affected core subsystem(s)

process

@apapirovski apapirovski added the process Issues and PRs related to the process subsystem. label Nov 8, 2017
@apapirovski
Copy link
Member Author

Note: this will need adjustment to test/message/stdin_messages.out once #16869 lands.

@apapirovski
Copy link
Member Author

Ok, so it seems like the CI benchmark turned out worse than my local testing:

 process/next-tick-breadth-args.js millions=2      -4.39 %            0.054449966
 process/next-tick-breadth.js millions=2           -0.39 %            0.757371770
 process/next-tick-depth-args.js millions=12       -4.27 %         ** 0.002636105
 process/next-tick-depth.js millions=12            -1.81 %            0.237992442

Doing a second run just to be sure:
https://ci.nodejs.org/view/Node.js%20benchmark/job/benchmark-node-micro-benchmarks/25/

@apapirovski
Copy link
Member Author

Ok, I'm pretty sure that wasn't necessarily reflective. I think there is likely a slight performance dip but it's not consistent and very minor.

 process/bench-env.js n=100000                     -0.48 %            0.8152647
 process/bench-hrtime.js type="diff" n=1000000     -2.34 %            0.2048529
 process/bench-hrtime.js type="raw" n=1000000       1.11 %            0.5847665
 process/memoryUsage.js n=100000                   -0.15 %            0.8392653
 process/next-tick-breadth-args.js millions=2       0.95 %            0.6382626
 process/next-tick-breadth.js millions=2           -0.19 %            0.8283266
 process/next-tick-depth-args.js millions=12       -1.54 %            0.2947018
 process/next-tick-depth.js millions=12            -0.13 %            0.9288224

Third run: https://ci.nodejs.org/view/Node.js%20benchmark/job/benchmark-node-micro-benchmarks/26/

@apapirovski
Copy link
Member Author

apapirovski commented Nov 8, 2017

Yep, the results are worse after all. Unless we're ok with 2-4% decline on the depth benchmark I'll probably have to close this.

 process/bench-env.js n=100000                      1.07 %            0.338564916
 process/bench-hrtime.js type="diff" n=1000000      0.69 %            0.692105626
 process/bench-hrtime.js type="raw" n=1000000       0.32 %            0.838770747
 process/memoryUsage.js n=100000                    0.54 %            0.340006562
 process/next-tick-breadth-args.js millions=2      -2.65 %            0.204050747
 process/next-tick-breadth.js millions=2           -0.64 %            0.706830203
 process/next-tick-depth-args.js millions=12       -4.14 %        *** 0.000258298
 process/next-tick-depth.js millions=12            -0.19 %            0.908582943

@Fishrock123
Copy link
Contributor

Seems like this should probably not be backported, or only to 8.x?

@apapirovski
Copy link
Member Author

@Fishrock123 Definitely shouldn't. I don't even know if this can go on 9.x to be honest given the slight decline in performance.

@apapirovski
Copy link
Member Author

Ok, I think there's no performance regression after bumping up the iterations to remove any noise.

process/next-tick-breadth-args.js millions=8     -1.39 %            0.4175407
process/next-tick-breadth.js millions=8           0.78 %            0.3573891
process/next-tick-depth-args.js millions=50      -0.69 %            0.5951104
process/next-tick-depth.js millions=50           -0.76 %            0.6042670

@refack
Copy link
Contributor

refack commented Nov 9, 2017

@apapirovski the benchmarks don't show any significance (huge p-value, except for next-tick-depth-args.js) so it should be considered as if there's no difference in performance.

@apapirovski
Copy link
Member Author

@refack yeah, I was just worried about

 process/next-tick-depth-args.js millions=12       -4.14 %        *** 0.000258298

but it seems like it was noise after bumping up iterations and running it more times.

@refack
Copy link
Contributor

refack commented Nov 9, 2017

but it seems like it was noise after bumping up iterations and running it more times.

So we got:

process/next-tick-depth-args.js millions=50      -0.69 %            0.5951104

I'm trying to think about way to improve the benchmarks to see if there is a significant pattern after all.

@starkwang
Copy link
Contributor

starkwang commented Nov 9, 2017

Unfortunately we still can't switch to using spread operator in nextTick instead of the manual arguments copying. We can follow up on that in the next few V8 versions.

Sorry, I don't quite understand this. Why we can't do that? It seems that the nextTick function should be of great significance for process.nextTick


update:
I may understand it after running a benchmark : )

 process/next-tick-breadth-args.js millions=2     -23.57 %        *** 2.046862e-50

@refack
Copy link
Contributor

refack commented Nov 9, 2017

[refack: I pushed a change, then pushed it out, because I wanted to use the benchmark machine]

@apapirovski
Copy link
Member Author

A couple more runs:

 process/next-tick-breadth-args.js millions=8     -0.26 %            0.820344067
 process/next-tick-breadth.js millions=8          -0.03 %            0.921366382
 process/next-tick-depth-args.js millions=50      -0.90 %            0.179757893
 process/next-tick-depth.js millions=50           -2.77 %         ** 0.003797703
 process/next-tick-breadth-args.js millions=8      0.07 %            0.9475671
 process/next-tick-breadth.js millions=8           0.05 %            0.8747827
 process/next-tick-depth-args.js millions=50       0.08 %            0.9172713
 process/next-tick-depth.js millions=50           -0.89 %            0.3620226

Just looking at the raw data, I can see that both the old code and the new code seem to end up in situations where they're either deoptimized or just not optimized by V8.

@apapirovski
Copy link
Member Author

I'll probably spend some more time looking the nextTick code. The fact that there's such huge peaks and valleys indicates that there's something off in both versions.

@apapirovski
Copy link
Member Author

apapirovski commented Nov 9, 2017

Here's the results from @refack's experiment with many runs (and low n value):

 process/next-tick-breadth-args.js millions=1     -0.70 %            0.0869824873
 process/next-tick-breadth.js millions=1          -0.19 %            0.5451670018
 process/next-tick-depth-args.js millions=1        1.29 %        *** 0.0007451827
 process/next-tick-depth.js millions=1            -0.99 %            0.0669916919

@refack
Copy link
Contributor

refack commented Nov 9, 2017

and this is the distribution (for next-tick-depth-args.js), it's still way to variable, also it's not a normal
(I think GC runs cause the semi-discrete groups)
histogram of unpaired t test data 2
unpaired t test data 2

tl;dr IMHO this benchmark is not informative

@apapirovski apapirovski force-pushed the patch-process-next-tick branch 2 times, most recently from a2f43ef to b749fa1 Compare November 9, 2017 17:29
@apapirovski
Copy link
Member Author

I think the real issue here is basically that those benchmarks aren't really testing these particular changes, or at least not in a very representative way. Most of the difference seems to have to do with GC kicking in or not. On my local system I get none of the extreme dips and it's all pretty even.

I created two benchmarks that should do a better job of testing the actual execution performance within _tickCallback.

Benchmark CI: https://ci.nodejs.org/view/Node.js%20benchmark/job/benchmark-node-micro-benchmarks/39/

@apapirovski apapirovski force-pushed the patch-process-next-tick branch from b749fa1 to 9b12434 Compare November 9, 2017 17:38
@apapirovski
Copy link
Member Author

Ok, I think that got us more reliable info...

1st run:

process/next-tick-exec-args.js millions=5     -0.45 %          6.589957e-01
process/next-tick-exec.js millions=5          -2.85 %      *** 1.436700e-08

2nd run:

process/next-tick-exec-args.js millions=5     -2.22 %        *** 4.588545e-04
process/next-tick-exec.js millions=5          -3.11 %        *** 6.527721e-17

@AndreasMadsen
Copy link
Member

I think at least some of the confusion here has to do with how you use statistics. In simple words:
statistics can never tell you that there is no difference. It can only tell you there is a difference, with some risk of that difference only existing because of randomness. How much risk you tollerate is something you need to descide on before you run the benchmarks.

This means that the only conculsion (and this concusion will likely be wrong) that you can arrive at by running the benchmarks multiplie times, is that there is a difference. But you only conculude this because of the randomness, not because the difference is actually there.

If you think there is a garbage collection issue that interfers with the results (and there may very well be). Then you need to analyse this issue without using the benchmark suite. You can use the trace_events or --expose-gc to debug this. Once you have made a benchmark that you belive accurately represents what you are trying to measure, then you can use the benchmark suite.

Statisticians (especially frequentistic statisticians which is what is the base for our benchmark suite) don't like to admit that there are belifes in statestics, but the unfortunete truth is that there is :(


and this is the distribution it's still way to variable

Since you are trying to prove that there is no difference, it will always have too high a variance. I discussed a bit in #8139 what consequences a high variance has. It is especially important that you understand that a high variance is fine if there is enogth runs.

latex-image-1

Here are some outputs of that:

cv im = 1% im = 10%
n = 10 cv < 0.011 cv < 0.111
n = 30 cv < 0.019 cv < 0.193
n = 100 cv < 0.035 cv < 0.35
n = 200 cv < 0.05 cv < 0.5
n = 500 cv < 0.079 cv < 0.79

Let's be super specific. Let's say we are looking for at least a 1% difference and we have 100 runs then we should have a coefficient of variation (cv) less than 0.035. In this case we have (data from https://ci.nodejs.org/view/Node.js%20benchmark/job/benchmark-node-micro-benchmarks/39/) the coefficient of variation (cv.unbiased) to be:

                        filename configuration     mean   std.dev  cv.biased cv.unbiased   conf.int
  process/next-tick-exec-args.js    millions=5 1.677843 0.1206786 0.07192487  0.07201478 0.01682723
       process/next-tick-exec.js    millions=5 2.818856 0.1051820 0.03731372  0.03736036 0.01466641

As you can see the variance for process/next-tick-exec.js is only slightly too high. And for process/next-tick-exec-args.js it is differently too high to find a difference of 1%.

An important assumption in these calculations, is that they assume the variance is the same. But this does appear to be somewhat so. You can look at the F-test (var.test in R) if you like.

rplot

also it's not a normal

Don't expect it to be normal. Time is typically gamma distributed, because of the central limit theorm it will approach a normal distribution as you increase the number of iterations. But it will never become normally distributed. This is theoretically unfortunete, as the t-test assumes a normal distribution, however in practise it is not a huge deal.

Copy link
Member

@BridgeAR BridgeAR left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM but it needs a rebase

@apapirovski apapirovski force-pushed the patch-process-next-tick branch 2 times, most recently from e710e21 to a40a73c Compare November 25, 2017 16:59
@apapirovski
Copy link
Member Author

apapirovski commented Nov 25, 2017

I've resolved all performance regressions in the latest version.

 process/next-tick-breadth-args.js millions=4      0.12 %            9.122406e-01
 process/next-tick-breadth.js millions=4           2.10 %        *** 4.341579e-06
 process/next-tick-depth-args.js millions=12       1.31 %         ** 3.382566e-03
 process/next-tick-depth.js millions=12            1.52 %          * 3.364892e-02
 process/next-tick-exec-args.js millions=5        -1.32 %            8.180421e-02
 process/next-tick-exec.js millions=5              0.84 %            1.020348e-01

PTAL @addaleax, @refack, @cjihrig and others. This should be ready to land after the weekend.

Get rid of separate function to call callback from _tickCallback as
it no longer yields worthwhile performance improvement.

Move some code from nextTick & internalNextTick into TickObject
constructor to minimize duplication.
apapirovski added a commit that referenced this pull request Nov 28, 2017
Get rid of separate function to call callback from _tickCallback as
it no longer yields worthwhile performance improvement.

Move some code from nextTick & internalNextTick into TickObject
constructor to minimize duplication.

PR-URL: #16888
Reviewed-By: Anna Henningsen <anna@addaleax.net>
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Reviewed-By: Refael Ackermann <refack@gmail.com>
Reviewed-By: Luigi Pinca <luigipinca@gmail.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Ruben Bridgewater <ruben@bridgewater.de>
Reviewed-By: Timothy Gu <timothygu99@gmail.com>
@apapirovski
Copy link
Member Author

Landed in cbaf59c. Thanks everyone for the reviews & patience in getting this code tightened up.

@apapirovski apapirovski deleted the patch-process-next-tick branch November 28, 2017 13:50
MylesBorins pushed a commit that referenced this pull request Dec 12, 2017
Get rid of separate function to call callback from _tickCallback as
it no longer yields worthwhile performance improvement.

Move some code from nextTick & internalNextTick into TickObject
constructor to minimize duplication.

PR-URL: #16888
Reviewed-By: Anna Henningsen <anna@addaleax.net>
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Reviewed-By: Refael Ackermann <refack@gmail.com>
Reviewed-By: Luigi Pinca <luigipinca@gmail.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Ruben Bridgewater <ruben@bridgewater.de>
Reviewed-By: Timothy Gu <timothygu99@gmail.com>
MylesBorins pushed a commit that referenced this pull request Dec 12, 2017
Get rid of separate function to call callback from _tickCallback as
it no longer yields worthwhile performance improvement.

Move some code from nextTick & internalNextTick into TickObject
constructor to minimize duplication.

PR-URL: #16888
Reviewed-By: Anna Henningsen <anna@addaleax.net>
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Reviewed-By: Refael Ackermann <refack@gmail.com>
Reviewed-By: Luigi Pinca <luigipinca@gmail.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Ruben Bridgewater <ruben@bridgewater.de>
Reviewed-By: Timothy Gu <timothygu99@gmail.com>
MylesBorins pushed a commit that referenced this pull request Dec 12, 2017
Get rid of separate function to call callback from _tickCallback as
it no longer yields worthwhile performance improvement.

Move some code from nextTick & internalNextTick into TickObject
constructor to minimize duplication.

PR-URL: #16888
Reviewed-By: Anna Henningsen <anna@addaleax.net>
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Reviewed-By: Refael Ackermann <refack@gmail.com>
Reviewed-By: Luigi Pinca <luigipinca@gmail.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Ruben Bridgewater <ruben@bridgewater.de>
Reviewed-By: Timothy Gu <timothygu99@gmail.com>
@MylesBorins MylesBorins mentioned this pull request Dec 12, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
process Issues and PRs related to the process subsystem.
Projects
None yet
Development

Successfully merging this pull request may close these issues.