
Speed up token generation #14

Open · wants to merge 2 commits into master
Conversation

@ai (Author) commented Oct 13, 2018

This PR reduces the number of dependencies (from 3 to 1) and should slightly improve performance, according to the benchmark below, with the same security (nanoid uses the same hardware random generator; it is popular and well maintained).

nanoid                    413,579 ops/sec
uid.sync                  354,882 ops/sec

nanoid/async               85,168 ops/sec
uid                        78,426 ops/sec

nanoid/non-secure       2,718,186 ops/sec
rndm                    2,544,612 ops/sec

@dougwilson what do you think? Should I rename nanoidFast to nanoidNonSecure to be clearer?
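
For reference, a minimal sketch of the swap on the secret-generation side (assuming the nanoid 1.x API where require('nanoid') exports the generator directly, and an assumed byte-to-character length conversion; this is not the exact PR diff):

  var nanoid = require('nanoid')             // secure: random bytes from crypto
  var nanoidAsync = require('nanoid/async')  // same, but asynchronous

  // before (uid-safe): uid.sync(secretLength)  -> base64url string from N random bytes
  // after (nanoid):    nanoid(size)            -> URL-friendly string of `size` characters
  Tokens.prototype.secretSync = function secretSync () {
    // secretLength is a byte count here; nanoid takes a character count, so convert
    // so the secret carries at least as many bits of entropy (6 bits per character).
    return nanoid(Math.ceil(this.secretLength * 4 / 3))
  }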

@dougwilson (Contributor)

Needs to at least support Node.js 0.8. What makes it quicker? Is it not possible to speed up our existing dependencies? I have not had a great experience with third-party dependencies, especially when they very quickly discard support for old Node.js versions. That is their call, of course, but I try to support as far down as is reasonable with the CI infrastructure (which is 0.8 currently).

@dougwilson added the pr label Oct 13, 2018
@ai (Author) commented Oct 13, 2018

Needs to at least support Node.js 0.8

Yes, it supports Promise-less environments and is written in plain ES5. I made a few changes to Nano ID specifically to pass the csrf tests (great tests, BTW).

What makes it quicker?

Thanks for answering =^_^= In the secure generator it comes from handling Buffers in a more optimized way. In the non-secure generator we use a trick with Math.floor and old V8 optimizations for asm.js-style code.
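
Roughly, the non-secure trick looks like this (a sketch assuming a 64-character alphabet, not nanoid's exact source):

  // Hypothetical 64-character URL-friendly alphabet (62 alphanumerics plus '-' and '~').
  var alphabet = '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ-~'

  function nonSecureId (size) {
    var id = ''
    while (size--) {
      // Math.random() * 64 is a small float; Math.floor (or `| 0`) turns it into an
      // integer index in a way old V8 optimizes very well (asm.js-style code).
      id += alphabet[Math.floor(Math.random() * alphabet.length)]
    }
    return id
  }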

Is it not possible to speed up our existing dependencies?

It is possible in theory. But do you want to spend time maintaining multiple projects and copying ideas back and forth when there is a well-maintained project that is always looking for tricky optimizations? =^_^= (we focus not only on performance but also on client-side size)

@dougwilson (Contributor)

Yes, it supports Promise-less environments and is written in plain ES5. I made a few changes to Nano ID specifically to pass the csrf tests (great tests, BTW).

Ah, sorry, I assumed it didn't because the Node.js 0.8 CI failed for this PR, and when I looked at the module it doesn't have any CI for Node.js 0.8, so I assumed there was no support. If there is no CI, what is the guarantee that 0.8 is a supported platform in an ongoing fashion?

It is possible in theory. But do you want to spend time maintaining multiple projects and copying ideas back and forth when there is a well-maintained project that is always looking for tricky optimizations? =^_^= (we focus not only on performance but also on client-side size)

Right, I definitely would rather use external deps when possible if they exist and they seem to be aligned with the same target support environments.

@ai (Author) commented Oct 13, 2018

If there is no CI, what is the guarantee that 0.8 is a supported platform in an ongoing fashion?

Jest doesn't support Node.js 0.8, so I cannot run the tests for it on CI :(. I think most test runners have dropped even Node.js 4 support by now.

Maybe I can write a script to test critical things manually. Or maybe you have a better idea?

@dougwilson (Contributor)

Yea, I don't have a better idea besides just using a testing framework that does support 0.8 (which is what this module does). In my opinion testing frameworks should be supporting pretty much every possible version since they are going to influence what environments real modules can support. But maybe those test frameworks want to use their power to dictate the versions of Node.js people will support, idk.

Coming from the Perl ecosystem, Node.js feels very fractured to me, with constant major Node.js releases and module authors quickly discarding support for older versions :(

@ai (Author) commented Oct 13, 2018

In my opinion testing frameworks should be supporting pretty much every possible version since they are going to influence what environments real modules can support.

Totally agree. I fought with the Jest community for a few months over Node.js 4 support, even while it was still LTS.

Fair point about CI and Node.js 0.8. I will try to fix it soon.

@ai (Author) commented Oct 13, 2018

Done. I added Node.js 4, 0.12 and 0.8 to CI:

https://travis-ci.org/ai/nanoid

It was tricky, since yarn install doesn't work there for several reasons. So I used a custom installer and a custom task runner that manually tests the core features on Node.js 0.8–4.
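
For the curious, the manual check boils down to something like this (a plain-ES5 sketch with assert only and hypothetical test cases, not the actual script in the nanoid repo):

  // smoke-test.js -- no test framework, runs on Node.js 0.8+
  var assert = require('assert')
  var nanoid = require('./')              // hypothetical path to the package root

  var id = nanoid()
  assert.equal(typeof id, 'string')       // generates a string
  assert.equal(id.length, 21)             // default length
  assert.notEqual(nanoid(), nanoid())     // two calls should not collide
  assert.equal(nanoid(10).length, 10)     // honours a custom size

  console.log('ok')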

@ai (Author) commented Oct 15, 2018

@dougwilson I thought about how to fix the npm error in the csrf CI and do not have a solution right now.

But maybe it is just a cache issue, and if we restart the build it will install?

@ai (Author) commented Oct 20, 2018

We released Nano ID 1.3.1. It contains a client-side optimization (not important for the Node.js version), but you might like the approach: we wrote a brute-forcer to find the most gzip-friendly ordering of the alphabet symbols.

Also, your suggestion to test Node.js 0.8 uncovered an issue in the Travis CI setup. Now it is fixed.

@ai (Author) commented Oct 30, 2018

@dougwilson if you don't want this speedup for some reason, just say so. It is not a big deal for me, but then I will be able to remove the Node.js 0.8–4 support code (it was added only to fit csrf's Node.js support policy).

@dougwilson (Contributor)

I just need the CI to pass here to merge...

@ai (Author) commented Oct 30, 2018

@dougwilson sure. I wrote to npm support to ask about the registry behaviour with the npm that ships with Node.js 0.8.

@ai (Author) commented Oct 31, 2018

@dougwilson done, I fixed CI 🎉

@dougwilson (Contributor)

Hi @ai, awesome, just saw that 🎉 . But this looks like a breaking change: it would no longer install on machines that it used to install on. Our current dependencies don't have any issue with the version of npm that is included with Node.js 0.8. Is there any particular reason why this new module does not support the same npm version?

At minimum, if a higher version of npm is required, can you update the engines field to add this requirement? If merging this is going to require a new major version, at that point it would likely just be time to remove Node.js 0.8 support and make Node.js 0.10 the minimum version, though. I'll then need to cascade that support drop through the modules I have that use this one, too, of course.

Circling back around, is there any way that this new module can just run under the same version of npm the rest do? Or should this just become a larger discussion around it being a major breaking change?

@ai (Author) commented Oct 31, 2018

Circling back around, is there any way that this new module can just run under the same version of npm the rest do?

I asked npm support.

I think the npm bundled with Node.js 0.8 is simply too old, and the registry serves older data to such an old npm.

@dougwilson (Contributor)

Strange. Almost all of my modules are Node.js 0.8+, and even the ones I released last week install on Node.js 0.8 without issue, with the npm that is included out of the box (npm 1.2.30).

@dougwilson (Contributor)

This is a fresh build of master on Node.js 0.8 and npm 1.2.30: https://travis-ci.org/pillarjs/csrf/jobs/448623371 . It didn't seem to have any issues downloading and installing modules from the npm registry. It's just using the standard registry https://registry.npmjs.org/

@ai (Author) commented Oct 31, 2018

@dougwilson I agree that it is strange. Let's wait for the answer from npm support.

@dougwilson (Contributor)

👍 there is an unreleased change to master, so I'm going to release a simple patch version of this module in the meantime, don't panic if you see it. I still think this module is a better method. I may also work to add a simple benchmark to this module in the meantime as well, so (from the usage of the two libs through this module) we can add what the perf improvement is to the changelog too 🎉

@dougwilson (Contributor)

So speaking of Node.js 0.8, I was playing around with this PR on Node.js 0.8 on my machine and I encountered a bug I basically forgot about: #10 (comment)

Basically in Node.js 0.8 (and 0.10, perhaps some more), the underlying crypto engine in OpenSSL will sometimes throw an error that it's not seeded when the process starts and you try to grab random bytes too fast. To solve that issue, I ended up creating the module https://github.com/crypto-utils/random-bytes and that is used in turn by uid-safe. Since this PR is replacing uid-safe, but it ends up just calling crypto.randomBytes directly (https://github.com/ai/nanoid/blob/master/random.js#L14) this error resurfaces.

Is it possible to get this work-around added to the nanoid module as well?
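
In case it helps, the work-around amounts to retrying the call (a minimal sketch of the idea, not random-bytes' actual code):

  var crypto = require('crypto')

  function randomBytesRetry (size, callback, attempts) {
    attempts = attempts || 0
    crypto.randomBytes(size, function (err, buf) {
      if (!err) return callback(null, buf)
      // On Node.js 0.8/0.10 the error is typically a transient "not seeded"
      // from OpenSSL right after process start, so back off briefly and retry.
      if (attempts >= 3) return callback(err)
      setTimeout(function () {
        randomBytesRetry(size, callback, attempts + 1)
      }, 10)
    })
  }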

@ai (Author) commented Oct 31, 2018

Basically in Node.js 0.8 (and 0.10, perhaps some more), the underlying crypto engine in OpenSSL will sometimes throw an error that it's not seeded when the process starts and you try to grab random bytes too fast.

Sure, I can add it.

@ai (Author) commented Oct 31, 2018

@dougwilson I found a way to fix the npm issue on Node.js 0.8 without updating npm (npm 1.x does not understand the ^ version selector).

@ai (Author) commented Oct 31, 2018

I will release a new Nano ID with the OpenSSL issue fix within the next hour.

@dougwilson (Contributor)

Oh, nice, I didn't even realize that was there in the PR. I always pin dependency versions anyway to specific versions such that I can see what the changes are before users installing these modules suddenly encounter changes. It has helped a lot in the past especially when a dependency accidentally releases a breaking change to a subset of users but it takes a long time to get it resolved, and users usually are not sure where to report the issue to.

@ai (Author) commented Oct 31, 2018

@dougwilson done:

  1. I added a workaround for the RNG not-seeded error.
  2. I replaced the ~ range with an exact version in dependencies.

@dougwilson (Contributor)

Awesome, @ai ! Let me just finish up this release of csrf and then I'll test out this PR again on the same machine to double-check the error is gone 👍

@dougwilson (Contributor)

Hi @ai , sorry to come back again 😢. So I tested the updated PR (and yes, it installed nanoid 1.3.2), and I still had the same issue, the seed error. I hooked up a debugger and stepping through the code, it looks like there is still no handling to retry for the error in the code.

The very high-level code that triggers the code path is node -pe 'require("csrf")().secretSync()'. When I walked it through, it seems like it's just calling crypto.randomBytes directly.

What I'm finding is that secretSync is calling this function: https://github.com/ai/nanoid/blob/master/index.js#L21

That in turn calls the random function, defined at https://github.com/ai/nanoid/blob/master/index.js#L1

This just ends up being a direct reference to crypto.randomBytes here: https://github.com/ai/nanoid/blob/master/random.js#L14

I hope that helps! I also finished the simple benchmarks I promised and pushed them to master. They are runnable with just npm run bench. I quickly ran them on 8.12.0 against the current master and then again against this PR branch (rebased onto master so the benchmarks are there). This is what I found:

verify is basically the same speed (that's fine)
secretSync is much faster with nanoid, that is awesome!
create is a bit slower with nanoid, which is weird

A typical application embedding this module has the flow of (1) call secret* once per user to initialize their session (2) call create on every page load that needs a token (but a lot of people through csurf just end up calling it on every request) and then (3) call verify on a page submit with a token.
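
In code, that flow maps onto this module's public API roughly like this (a sketch of typical usage, not code from the PR):

  var Tokens = require('csrf')
  var tokens = new Tokens()

  var secret = tokens.secretSync()          // (1) once per user session
  var token = tokens.create(secret)         // (2) hot path: every page that renders a form
  var valid = tokens.verify(secret, token)  // (3) on form submit; true for a valid token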

What this means is that the typical hot path in an app is the create call, as that is typically on every page load that needs a token (though there are apps out there that incorrectly call it on every request). In contrast, the secret* call happens only once per user session.

Ideally the hot path is the best one to get a speed boost, so I was kind of surprised (at least on my machine) to see the performance go down in the spot that is going to be the hottest path.

I'm not sure yet what is causing the decrease, but the increase in secret* is awesome. It's possible we can just sub nanoid in the secret* functions and keep the other module in the create function, to get the fastest path on all of them. Does that sound like a good solution, or no?

For reference these are the comparisons on my machine:

master
> csrf@3.0.6 bench ./csrf
> node benchmark/index.js

  http_parser@2.8.0
  node@8.12.0
  v8@6.2.414.66
  uv@1.19.2
  zlib@1.2.11
  ares@1.10.1-DEV
  modules@57
  nghttp2@1.32.0
  napi@3
  openssl@1.0.2p
  icu@62.1
  unicode@11.0
  cldr@33.1
  tz@2018e

> node benchmark/create.js

  create

  1 test completed.

  create x 405,126 ops/sec ±0.48% (191 runs sampled)

> node benchmark/secret.js

  secret

  3 tests completed.

  secretSync        x 334,375 ops/sec ±0.76% (183 runs sampled)
  secret - callback x  53,006 ops/sec ±0.73% (177 runs sampled)
  secret - promise  x  49,499 ops/sec ±1.00% (176 runs sampled)

> node benchmark/verify.js

  verify

  2 tests completed.

  verify - valid   x 79,736 ops/sec ±1.07% (177 runs sampled)
  verify - invalid x 78,411 ops/sec ±1.30% (180 runs sampled)
pr-14
> csrf@3.0.6 bench ./csrf
> node benchmark/index.js

  http_parser@2.8.0
  node@8.12.0
  v8@6.2.414.66
  uv@1.19.2
  zlib@1.2.11
  ares@1.10.1-DEV
  modules@57
  nghttp2@1.32.0
  napi@3
  openssl@1.0.2p
  icu@62.1
  unicode@11.0
  cldr@33.1
  tz@2018e

> node benchmark/create.js

  create

  1 test completed.

  create x 388,440 ops/sec ±0.43% (190 runs sampled)

> node benchmark/secret.js

  secret

  3 tests completed.

  secretSync        x 454,359 ops/sec ±0.48% (190 runs sampled)
  secret - callback x  58,837 ops/sec ±0.64% (178 runs sampled)
  secret - promise  x  56,138 ops/sec ±0.60% (178 runs sampled)

> node benchmark/verify.js

  verify

  2 tests completed.

  verify - valid   x 78,578 ops/sec ±1.15% (182 runs sampled)
  verify - invalid x 77,687 ops/sec ±1.32% (181 runs sampled)

P.S. thanks for being so responsive. You rock!

@ai (Author) commented Oct 31, 2018

What I'm finding is that secretSync is calling this function: https://github.com/ai/nanoid/blob/master/index.js#L21

Oh, yes. I added the not-seeded fix only for the async version. I will add it to the sync version tomorrow.

I will also think about the verify performance drop.

@ai (Author) commented Oct 31, 2018

@dougwilson which Node.js version do you use? Some optimizations (like that asm.js-style optimization for nanoid/non-secure) only work on modern Node.js versions (it is a good question what to do in this case).

@dougwilson (Contributor)

Hi @ai, I was using Node.js 8.12.0, which was the version "recommended for most users" as of just last week, though they have rolled the versions over now. I was just testing on what most of my production projects run on, as a quick check.

@ai (Author) commented Oct 31, 2018

I added the not-seeded fix in 1.3.3 and updated the PR. I am still investigating the performance problems with create.

@ai (Author) commented Oct 31, 2018

I found that if we remove hash() from Token#create (so that we compare the pure ID generators), Nano ID is faster. But with hash() the results are very unstable (although rndm gets the better result more often).

Can I ask you to run both branches 5 times to get a more stable result?

If not, we can keep rndm and replace only uid-safe.

@dougwilson (Contributor)

Yes, I will run again another day; it's quite late here and I already signed off and am just on the phone. Just didn't want to leave you hanging when I saw your message.

@ai (Author) commented Nov 2, 2018

@dougwilson I found the reason :D. rndm uses a 62-symbol alphabet, while nanoid/non-secure uses the same 64-symbol alphabet as nanoid. As a result, those extra characters somehow affect the hash function and make it slower. When I changed the alphabet I started getting pretty similar results (nanoid/non-secure is faster, but in create 99% of the time is spent in hashing).

As a result, I switched back from nanoid/non-secure to rndm and kept nanoid for the hardware random generation.
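
For context on why hashing dominates: create is essentially cheap salt generation plus a SHA-1 over the salt and secret, along these lines (a simplified sketch of the hashing step, not this module's exact source):

  var crypto = require('crypto')
  var rndm = require('rndm')

  function hash (str) {
    return crypto.createHash('sha1')
      .update(str, 'ascii')
      .digest('base64')
      .replace(/\+/g, '-').replace(/\//g, '_').replace(/=/g, '')  // URL-safe base64
  }

  // The SHA-1 call is the hot part; the salt generator contributes very little.
  function createToken (secret, saltLength) {
    var salt = rndm(saltLength)
    return salt + '-' + hash(salt + '-' + secret)
  }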

@dougwilson (Contributor)

Ah, I guess that makes sense. Sorry I was gone for a few days, I apologize. I am taking a final look over this today and was just testing it in some apps. I found a case that is not covered by the tests in this module; we had accidentally relied on the uid-safe tests to catch it: an invalid callback argument value. I just added that test to the master branch.

I also noticed in an app that the character set seems to have changed and there is now a ~ as part of the secret. It seems to just replace the previous - character.

Would it be OK if I pushed to this PR the following changes?

  1. Add extra guards to this module to catch non-function callback arguments
  2. Add extra Promise guard to keep the TypeError the same (a sketch of (1) and (2) follows below)
  3. Change it to use the custom charset function in nanoid to keep the same character set

I already started on those changes, just since I see you've been squashing your commits together, I didn't want to surprise with a change showing up on your branch the next time you went to force push.
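
Roughly what (1) and (2) mean in code (a sketch with assumed error messages, not the exact commit):

  Tokens.prototype.secret = function secret (callback) {
    // (1) reject non-function callback values up front
    if (callback !== undefined && typeof callback !== 'function') {
      throw new TypeError('argument callback must be a function')
    }
    // (2) keep throwing the same TypeError when neither a callback nor a Promise exists
    if (!callback && !global.Promise) {
      throw new TypeError('argument callback is required')
    }
    // ... generate the secret asynchronously here ...
  }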

@ai (Author) commented Nov 4, 2018

Add extra guards to this module to catch non-function callback arguments

Done

Add extra Promise guard to keep the TypeError the same

Done

Change it to use the custom charset function in nanoid to keep the same character set

We used ~ instead of - because we found that - works badly as the last symbol of a URL on old Android. But then we found that ~ is no better.

Maybe I should just release 2.0 with - instead of ~. It seems to cause only problems.

@ai (Author) commented Nov 4, 2018

@dougwilson I released Nano ID 2.0 with - instead of ~.

@dougwilson (Contributor)

Awesome thanks. I just pushed up the change for (1) and (2) as well to this PR so now all three points are hit. I'll be back at work tomorrow to test those apps again before merge :D

@dougwilson (Contributor)

P.S. I know you said the reason rndm was slightly faster was that nanoid non-secure used two extra chars. I see that nanoid has a custom formatter so I wonder if that can be used to remove the rndm dependency and still get that difference. It's just an out-loud thought, and not even saying that is needed for this PR (maybe a follow up PR after this is merged, even). Just thinking since the OP started off with the reduced dependency count as one of the benefits.

@ai (Author) commented Nov 4, 2018

P.S. I know you said the reason rndm was slightly faster was that nanoid non-secure used two extra chars. I see that nanoid has a custom formatter so I wonder if that can be used to remove the rndm dependency and still get that difference.

Yes, I added nanoid/non-secure/generate for exactly this purpose. But on Node.js 11 the benchmark results are unstable; I am not sure about the performance here.
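
A sketch of what that could look like (assuming the nanoid 2.x entry points and a generate(alphabet, size) signature):

  var generate = require('nanoid/generate')                      // secure, custom alphabet
  var generateNonSecure = require('nanoid/non-secure/generate')  // non-secure, custom alphabet

  // rndm's default 62-character alphabet, so the output character set stays the same
  var BASE62 = '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'

  var salt = generateNonSecure(BASE62, 8)  // e.g. an 8-character salt, like rndm(8)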

@dougwilson (Contributor)

Also (sorry, trying to just get as much done tonight to hurry this PR out tomorrow), I'm curious about this part: it seems like there are more bytes getting pulled from the OpenSSL random pool with nanoid than there was from uid-safe, even though the secretLength option is set to 18 for both tests. I would have expected 18 bytes from the pool (which happens in uid-safe), but it seems like nanoid is pulling 28 bytes from the pool. Is this expected behavior / should the readme be updated to reflect this? I guess this has something to do with the multiplication / floor happening before the call to nanoid, though I didn't expect that to change the underlying byte length.

@ai (Author) commented Nov 4, 2018

but it seems like nanoid is pulling 28 bytes from the pool

It happens because Nano ID maps each random byte to one output character. Yes, it is a good candidate for optimization. I created an issue.

Let's make it faster a little later.

@dougwilson (Contributor) commented Nov 4, 2018

I saw it first on a custom Node.js build that uses Windows Crypto instead of OpenSSL, so I then wanted to quickly validate it by switching to a standard Node.js linked against OpenSSL and added the following simple code to see the requested size:

diff --git a/test/test.js b/test/test.js
index a9161f4..de8529a 100644
--- a/test/test.js
+++ b/test/test.js
@@ -1,6 +1,17 @@
 var assert = require('assert')
 var crypto = require('crypto')
 
+var _randomBytes = crypto.randomBytes
+var _randomFillSync = crypto.randomFillSync
+crypto.randomBytes = function () {
+  console.dir(arguments[0])
+  return _randomBytes.apply(this, arguments)
+}
+crypto.randomFillSync = function () {
+  console.dir(arguments[0].length)
+  return _randomFillSync.apply(this, arguments)
+}
+
 var Promise = global.Promise || require('bluebird')
 var Tokens = require('..')

Running npm test on master vs this PR is how I got the final numbers (18 bytes in master goes to 28 in this PR).

@dougwilson (Contributor)

I just tried the following patch, which forces nanoid to fallback to crypto.randomBytes and it still shows 28 there:

diff --git a/test/test.js b/test/test.js
index a9161f4..b66a30a 100644
--- a/test/test.js
+++ b/test/test.js
@@ -1,6 +1,13 @@
 var assert = require('assert')
 var crypto = require('crypto')
 
+var _randomBytes = crypto.randomBytes
+crypto.randomBytes = function () {
+  console.dir(arguments[0])
+  return _randomBytes.apply(this, arguments)
+}
+crypto.randomFillSync = null
+
 var Promise = global.Promise || require('bluebird')
 var Tokens = require('..')

@ai (Author) commented Nov 4, 2018

I just tried the following patch, which forces nanoid to fallback to crypto.randomBytes and it still shows 28 there:

Yes, it happens because of a different stringification algorithm. I will optimize it after my vacation. I think it could give us a few extra percent of performance.

@ai (Author) commented Nov 8, 2018

Do we have anything else to fix?

@ai (Author) commented Nov 21, 2018

@dougwilson I tried the buffer.toString('base64').replace(… approach and it reduced performance:

old                    339,466 ops/sec
new                    317,253 ops/sec

It seems that taking a few extra random bytes is the better trade-off.
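
For reference, the approach tried was presumably along these lines (a uid-safe-style base64url conversion; a sketch, not the exact benchmarked code):

  var crypto = require('crypto')

  // Draw exactly `size` random bytes, then make the base64 output URL-safe.
  // The extra string passes are presumably where the time goes.
  function base64urlId (size) {
    return crypto.randomBytes(size)
      .toString('base64')
      .replace(/\+/g, '-')
      .replace(/\//g, '_')
      .replace(/=+$/, '')
  }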

@ai (Author) commented Nov 26, 2018

@dougwilson if the extra random bytes are a deal breaker for you, please close the PR.

I do not have a solution right now (the old way slows down generation). Maybe I will find one later. But this PR is already taking too much energy from me. Keeping it open is bad for both of us; it will just frustrate us and feed a dangerous fear of any changes.

@ai (Author) commented Dec 1, 2018

@dougwilson I will remove the csrf-specific changes in the middle of December (since there has been no answer for a month).

@ai (Author) commented Dec 10, 2018

@dougwilson ping

@ai (Author) commented Jan 3, 2019

@dougwilson Hi. It seems you have reduced your GitHub activity. I hope everything is OK.

I keep pinging you because Nano ID contains changes that matter only for csrf; removing them would reduce the npm package size for everyone else. If you do not want to merge this PR, I need to clean them up.
