Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

tls: replace forEach with for #15053

Closed
wants to merge 1 commit into from
Closed

Conversation

mscdex
Copy link
Contributor

@mscdex mscdex commented Aug 27, 2017

CI: https://ci.nodejs.org/job/node-test-pull-request/9855/

Checklist
  • make -j4 test (UNIX), or vcbuild test (Windows) passes
  • commit message follows commit guidelines
Affected core subsystem(s)
  • tls

@mscdex mscdex added the tls Issues and PRs related to the tls subsystem. label Aug 27, 2017
@mscdex mscdex force-pushed the tls-replace-foreach branch from d0bd60b to ab9ca09 Compare August 27, 2017 15:31
Copy link
Contributor

@refack refack left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMHO this is CrackshaftScript...
Could you show a benchmark

@refack
Copy link
Contributor

refack commented Aug 27, 2017

I run a micro benchmark:

> function A(arg) { global.arg = arg }
undefined
> var a = [1,2,3]
undefined
> console.time('of'); for (let i = 0; i < 1e8; ++i) {for (const i of a) A(i)};console.timeEnd('of')
of: 1808.680ms
undefined
> console.time('in'); for (let i = 0; i < 1e8; ++i) {for (const i in a) A(a[i])};console.timeEnd('in')
in: 36703.466ms
undefined
> console.time('in2'); for (let i = 0; i < 1e8; ++i) {for (const i in a) {const v = a[i]; A(v);} };console.timeEnd('in2')
in2: 36739.227ms
undefined
> console.time('forEach'); for (let i = 0; i < 1e8; ++i) { a.forEach((i) => A(i)) };console.timeEnd('forEach')
forEach: 10563.082ms

That show a significant advantage for for ... of.
BTW: even forEach performs better than for ... in.

tl;dr

I would love to see some benchmarks

@refack
Copy link
Contributor

refack commented Aug 27, 2017

another data point (removing the allocation of the iterator)

> var j = 0
undefined
> console.time('in3'); for (let i = 0; i < 1e7; ++i) {for (const j in a) A(a[j]); };console.timeEnd('in3')
in3: 3683.538ms
undefined
> console.time('in3'); for (let i = 0; i < 1e7; ++i) {for (j in a) A(a[j]); };console.timeEnd('in3')
in3: 3748.282ms
undefined
> console.time('of'); for (let i = 0; i < 1e7; ++i) {for (v of a) A(v); };console.timeEnd('of')
of: 199.984ms
undefined

TF&I seems like a crazy new world

@mscdex
Copy link
Contributor Author

mscdex commented Aug 27, 2017

@refack Using a simple, ordinary for-loop is not "crankshaftscript." forEach() has overhead because of at least the closure. Maybe V8 can improve that a bit with some optimizations (such as detecting if this or the callback arguments are never used, etc. in order to implicitly convert it to an ordinary for-loop), but right now this is not the case AFAIK.

Secondly, we have an existing benchmark that compares these loops. Here is one run to give you some idea of how things currently perform in master (separated for better readability):

es/foreach-bench.js millions=5 count=5 method="for": 126.6353306696689
es/foreach-bench.js millions=5 count=10 method="for": 76.45481236941615
es/foreach-bench.js millions=5 count=20 method="for": 35.55623115555305
es/foreach-bench.js millions=5 count=100 method="for": 9.040663842782768

es/foreach-bench.js millions=5 count=5 method="for-of": 71.68430518732721
es/foreach-bench.js millions=5 count=10 method="for-of": 48.1283227372895
es/foreach-bench.js millions=5 count=20 method="for-of": 29.183578307099143
es/foreach-bench.js millions=5 count=100 method="for-of": 5.120690805932898

es/foreach-bench.js millions=5 count=5 method="for-in": 3.205664307275511
es/foreach-bench.js millions=5 count=10 method="for-in": 1.9687642285666993
es/foreach-bench.js millions=5 count=20 method="for-in": 1.1150559886482698
es/foreach-bench.js millions=5 count=100 method="for-in": 0.2549428554729159

es/foreach-bench.js millions=5 count=5 method="forEach": 11.527265610800029
es/foreach-bench.js millions=5 count=10 method="forEach": 6.6736901560677016
es/foreach-bench.js millions=5 count=20 method="forEach": 4.143783238393817
es/foreach-bench.js millions=5 count=100 method="forEach": 0.9785879944277384

@refack
Copy link
Contributor

refack commented Aug 27, 2017

@bmeurer does this ^^^ makes sense, or are my micro-benchmarks over simplistic?

@mscdex
Copy link
Contributor Author

mscdex commented Aug 27, 2017

FWIW these loops had been plain for-loops up until #14807, which was landed just recently, so it can be seen as a partial revert of that PR.

@refack
Copy link
Contributor

refack commented Aug 27, 2017

@refack Using a simple, ordinary for-loop is not "crankshaftscript."

I'm so sorry, I was sure I saw for ... in loops 🤦‍♂️ .

But I did add add the following benchmark:

function forWithAssign(n, items, count) {
  bench.start();
  for (let i = 0; i < n; i++) {
    for (let j = 0; j < count; j++) {
      /* eslint-disable no-unused-vars */
      var item = items[j];
      /* esline-enable no-unused-vars */
    }
  }
  bench.end(n / 1e6);
}

to get:

"filename", "configuration", "rate", "time"
"es\foreach-bench.js", "millions=5 count=5 method=""for""", 89.88626044285091, 0.055625854
"es\foreach-bench.js", "millions=5 count=10 method=""for""", 54.98455198015207, 0.090934632
"es\foreach-bench.js", "millions=5 count=20 method=""for""", 30.213640409627352, 0.165488168
"es\foreach-bench.js", "millions=5 count=100 method=""for""", 6.588766433648529, 0.758867392

"es\foreach-bench.js", "millions=5 count=5 method=""forWithAssign""", 88.47490028126703, 0.056513203
"es\foreach-bench.js", "millions=5 count=10 method=""forWithAssign""", 55.57996565891778, 0.089960473
"es\foreach-bench.js", "millions=5 count=20 method=""forWithAssign""", 30.502123579195068, 0.163923013
"es\foreach-bench.js", "millions=5 count=100 method=""forWithAssign""", 6.498356250533962, 0.769425345

So IMHO the variable assignment should be lowered into the loops.

var ca = options.ca;
if (ca !== undefined) {
if (Array.isArray(ca)) {
for (i = 0; i < ca.length; ++i) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: i and val should be lowered into this scope.

var cert = options.cert;
if (cert !== undefined) {
if (Array.isArray(cert)) {
for (i = 0; i < cert.length; ++i) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: i and val should be lowered into this scope.

var passphrase = options.passphrase;
if (key !== undefined) {
if (Array.isArray(key)) {
for (i = 0; i < key.length; ++i) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: i and val should be lowered into this scope.

@mscdex
Copy link
Contributor Author

mscdex commented Aug 27, 2017

@refack

So IMHO the variable assignment should be lowered into the loops.

  • i is already being used in multiple places and we aren't generally using let in node core yet. The definition was already at the top of the function because prior to the previously mentioned PR, ordinary for-loops were being used and were sharing the same increment variable.

  • It doesn't make much sense to put var val = ... inside any of the loops because it's already hoisted to the top of the function and I decided to use the same variable name in all of the loops for consistency (and the linter would most likely complain about re-redefinitions if I used var val = ... in each loop). Also, one of the loops has a conditional when re-assigning val, so I cannot consistently use const in each loop.

@refack
Copy link
Contributor

refack commented Aug 27, 2017

@mscdex thanks for the reply.
I removed my erroneous objection, and approved this PR. I left style nits which are up to you decide if you want to adopt.
On the one hand I see that this is a partial revert, so it makes sense to return the code to the way it was.
On the other hand it's a chance to clean out CrankshaftScript (like pre-declaring i and val in the outer scope), my last data point indicated it might not have any performance benefit. IMHO unnecessarily hoisted variables are error prone.
AFAICT the three vals could be defined as scoped consts, but I may be wrong.

I gave my POV, and for this PR I believe that final style decisions should be made by the OP.

Copy link
Member

@benjamingr benjamingr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for...of is optimized in TurboFan, can we use that instead @mscdex?

@mscdex
Copy link
Contributor Author

mscdex commented Aug 28, 2017

@benjamingr Did you see the benchmark results I posted above?

@benjamingr
Copy link
Member

for... of with numeric keys is literally as fast as the for loop above, it gets translated directly to the same code. I'm finding a reference (don't want to leave you hanging, will update in a few minutes)

@benjamingr
Copy link
Member

Found it https://github.com/v8/v8/blob/master/src/compiler/js-builtin-reducer.cc#L208-L326

@bmeurer can you comment on why the benchmarks @mscdex is seeing (comparing for.. of vs. for) are showing slower results for for.. of in #15053 (comment)

@benjamingr
Copy link
Member

Come to think of it, even if for... of is optimized I see no harm in this sort of optimization to for until optimization of for... of is stable enough.

@refack
Copy link
Contributor

refack commented Aug 28, 2017

for... of with numeric keys is literally as fast as the for loop above, it gets translated directly to the same code. I'm finding a reference (don't want to leave you hanging, will update in a few minutes)

@benjamingr our current microbenchmarks (#15053 (comment)) show that for ... of are 40-50% slower than numeric for (although both are an order of magnitude faster than the alternatives), for such a tight construct it might even be even an extra instruction or two.
Personally I think it's worth it because it allows the iterator to be const, but for such a marginal change IMHO it should be the OP's decision.

@benjamingr
Copy link
Member

@benjamingr our current microbenchmarks (#15053 (comment)) show that for ... of are 40-50% slower than numeric for (although both are an order of magnitude faster than the alternatives)

I'll clarify - I don't believe these benchmarks and am convinced for.. of is fast (look at that V8 code) but I think for.. is still more easily optimizable than for... of so I don't see harm in core doing that.

@jasnell
Copy link
Member

jasnell commented Aug 29, 2017

@benjamingr ... it's entirely possible that the microbenchmarks are emphasizing the wrong thing. What would be helpful, however, is an improved benchmark that would give a more realistic, real world picture. I'll see if I can come up with one, but if you have a suggestion on that, it would be helpful!

@BridgeAR
Copy link
Member

Landed in 5723c4c

@BridgeAR BridgeAR closed this Aug 30, 2017
BridgeAR pushed a commit that referenced this pull request Aug 30, 2017
PR-URL: #15053
Reviewed-By: Ruben Bridgewater <ruben@bridgewater.de>
Reviewed-By: Refael Ackermann <refack@gmail.com>
Reviewed-By: Benjamin Gruenbaum <benjamingr@gmail.com>
Reviewed-By: Luigi Pinca <luigipinca@gmail.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
@mscdex mscdex deleted the tls-replace-foreach branch August 30, 2017 18:30
@refack
Copy link
Contributor

refack commented Aug 30, 2017

I'll clarify - I don't believe these benchmarks and am convinced for.. of is fast (look at that V8 code) but I think for.. is still more easily optimizable than for... of so I don't see harm in core doing that.

[academic discussion]
I agree that for ... of are fast. Very fast (10 time faster than for ... in or simple forEach).
for ... ++i is only a tiny bit faster.
From "nano"-benchmarks I run in the REPL it seems the difference it equivalent to 1 added non trivial operation.
for ... of loops also has the added benefit that they are more succinct hence more readable, and the iterator can be a const variable. I'd say on the face of it I prefer them.

> function A(arg) { global.arg = arg }
> var a = [1,2,3,4,5]
> console.time('of'); for (let i = 0; i < 1e8; ++i) {
  for (const j of a) A(j);
};console.timeEnd('of')
of: 19053.156ms
> console.time('i+pow2'); for (let i = 0; i < 1e8; ++i) {
  for (let j = 0; j < a.length; ++j) { const val = a[j]; A(val**2);  }
};console.timeEnd('i+assign')
i+assign: 18272.780ms

cjihrig pushed a commit to cjihrig/node that referenced this pull request Aug 31, 2017
PR-URL: nodejs#15053
Reviewed-By: Ruben Bridgewater <ruben@bridgewater.de>
Reviewed-By: Refael Ackermann <refack@gmail.com>
Reviewed-By: Benjamin Gruenbaum <benjamingr@gmail.com>
Reviewed-By: Luigi Pinca <luigipinca@gmail.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
addaleax added a commit to addaleax/node that referenced this pull request Sep 1, 2017
5723c4c was an unintentional breaking change in that it changed
the behaviour of `tls.createSecureContext()` to throw on false-y input
rather than ignoring it. This breaks real-world applications like `npm`.

This restores the previous behaviour.

Ref: nodejs#15053
@addaleax addaleax mentioned this pull request Sep 1, 2017
3 tasks
@addaleax
Copy link
Member

addaleax commented Sep 1, 2017

Marking as don’t land for now since this is breaking, see #15131

addaleax pushed a commit to addaleax/ayo that referenced this pull request Sep 5, 2017
PR-URL: nodejs/node#15053
Reviewed-By: Ruben Bridgewater <ruben@bridgewater.de>
Reviewed-By: Refael Ackermann <refack@gmail.com>
Reviewed-By: Benjamin Gruenbaum <benjamingr@gmail.com>
Reviewed-By: Luigi Pinca <luigipinca@gmail.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
mhdawson pushed a commit that referenced this pull request Sep 5, 2017
5723c4c was an unintentional breaking change in that it changed
the behaviour of `tls.createSecureContext()` to throw on false-y input
rather than ignoring it. This breaks real-world applications like `npm`.

This restores the previous behaviour.

PR-URL: #15131
Ref: #15053
Reviewed-By: Refael Ackermann <refack@gmail.com>
Reviewed-By: Benjamin Gruenbaum <benjamingr@gmail.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Ruben Bridgewater <ruben@bridgewater.de>
Reviewed-By: Luigi Pinca <luigipinca@gmail.com>
Reviewed-By: Alexey Orlenko <eaglexrlnk@gmail.com>
Reviewed-By: MichaëZasso <targos@protonmail.com>
Reviewed-By: Michael Dawson <michael_dawson@ca.ibm.com>
Reviewed-By: Timothy Gu <timothygu99@gmail.com>
Reviewed-By: Brian White <mscdex@mscdex.net>
addaleax added a commit to addaleax/node that referenced this pull request Sep 13, 2017
5723c4c was an unintentional breaking change in that it changed
the behaviour of `tls.createSecureContext()` to throw on false-y input
rather than ignoring it. This breaks real-world applications like `npm`.

This restores the previous behaviour.

PR-URL: nodejs#15131
Ref: nodejs#15053
Reviewed-By: Refael Ackermann <refack@gmail.com>
Reviewed-By: Benjamin Gruenbaum <benjamingr@gmail.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Ruben Bridgewater <ruben@bridgewater.de>
Reviewed-By: Luigi Pinca <luigipinca@gmail.com>
Reviewed-By: Alexey Orlenko <eaglexrlnk@gmail.com>
Reviewed-By: MichaëZasso <targos@protonmail.com>
Reviewed-By: Michael Dawson <michael_dawson@ca.ibm.com>
Reviewed-By: Timothy Gu <timothygu99@gmail.com>
Reviewed-By: Brian White <mscdex@mscdex.net>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
tls Issues and PRs related to the tls subsystem.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants