Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

buffer: stricter Buffer.isBuffer #11961

Closed
wants to merge 3 commits into from

Conversation

TimothyGu
Copy link
Member

@TimothyGu TimothyGu commented Mar 21, 2017

Make "fake" Buffer subclasses whose instances are not valid Uint8Arrays fail the test.

Performance-wise, there are two options for this PR: one is to simply add an isUint8Array check and be done with it (w/o 88cf96b). It makes non-Buffers a lot faster, but Buffers slower:

                                                        improvement confidence      p.value
 buffers/buffer-isbuffer.js n=5000000 type="fake"          554.78 %        *** 3.206274e-20
 buffers/buffer-isbuffer.js n=5000000 type="fastbuffer"    -17.47 %        *** 1.807310e-11
 buffers/buffer-isbuffer.js n=5000000 type="object"        554.77 %        *** 1.516898e-38
 buffers/buffer-isbuffer.js n=5000000 type="primitive"     551.40 %        *** 2.627161e-22
 buffers/buffer-isbuffer.js n=5000000 type="slowbuffer"    -18.48 %        *** 1.807566e-06
 buffers/buffer-isbuffer.js n=5000000 type="uint8array"    -18.54 %        *** 4.267087e-11

Because of the decrease in performance of actual buffers, I exploited the fact that currently subclassing Buffer is not possible due to #4701, and only checked [[Prototype]] of the value itself (rather than going up the prototype chain as instanceof does). I can remove 88cf96b if it is deemed to be less future-proof but it does help out the performance a lot.

                                                        improvement confidence      p.value
 buffers/buffer-isbuffer.js n=5000000 type="fake"          556.85 %        *** 3.050148e-14
 buffers/buffer-isbuffer.js n=5000000 type="fastbuffer"    193.17 %        *** 7.663717e-17
 buffers/buffer-isbuffer.js n=5000000 type="object"        542.87 %        *** 2.297773e-15
 buffers/buffer-isbuffer.js n=5000000 type="primitive"     569.15 %        *** 1.922435e-17
 buffers/buffer-isbuffer.js n=5000000 type="slowbuffer"    191.37 %        *** 1.020681e-20
 buffers/buffer-isbuffer.js n=5000000 type="uint8array"    186.11 %        *** 3.797325e-13

Fixes: #11954

Checklist
  • make -j4 test (UNIX), or vcbuild test (Windows) passes
  • tests and/or benchmarks are included
  • documentation is changed or added
  • commit message follows commit guidelines
Affected core subsystem(s)

buffer

@nodejs-github-bot nodejs-github-bot added the lib / src Issues and PRs related to general changes in the lib or src directory. label Mar 21, 2017
t(null, false);
t(undefined, false);
t(() => {}, false);

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why the blank line here?

'use strict';
require('../common');
const assert = require('assert');
const { Buffer } = require('buffer');
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need to require this, right?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cjihrig Our linter forbids using Buffer via the global

Copy link
Member

@Trott Trott Mar 22, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Our linter forbids using Buffer via the global

That's only in lib. Tests like this (as well as benchmarks and tools) can use the Buffer global and the linter won't complain.

@addaleax
Copy link
Member

addaleax commented Mar 21, 2017

A couple of things:

  • I would tend to say that this is semver-major?
  • As we’ve figured out in buffer: discuss future direction of Buffer constructor API #9531 (buffer: discuss future direction of Buffer constructor API #9531 (comment)), you can subclass Buffer. The fact that it has taken us a while to figure this out ourselves, and that it’s a slow and iffy solution, probably implies that nobody actually does that in practice, though. ;)
  • Imho a better¹ fix would be to just add Uint8Array support to dgram, like this. There aren’t so terribly many places in our API where that is missing, and I’ve had that on my to-do list for a while… I kinda wanted to wait to see what came out of the streams/string-decoder PRs I have open before going on, but I am beginning to feel that those are going to take a while. ;)

¹ edit: That’s orthogonal to this PR, I’m not fundamentally opposed to a change like this or anything

@jasnell jasnell requested a review from trevnorris March 21, 2017 15:10
@mscdex mscdex added buffer Issues and PRs related to the buffer subsystem. and removed lib / src Issues and PRs related to general changes in the lib or src directory. labels Mar 21, 2017
@TimothyGu
Copy link
Member Author

  • I would tend to say that this is semver-major?

It can be taken both ways, but I think it is a semver-patch since an object with merely Buffer's prototype is fundamentally not a buffer. Just as Array.isArray(Object.create(Array.prototype)) is false, Buffer.isBuffer(Object.create(Buffer.prototype)) should have been false from the beginning, and the failure to do that is a bug.

😕 That does mean that we have to walk the prototype chain instead of simply relying on the prototype being the same as Buffer's…

  • Imho a better¹ fix would be to just add Uint8Array support to dgram

As you have remarked it's orthogonal to this change, but independent with it.

@jasnell
Copy link
Member

jasnell commented Mar 21, 2017

Buffer is so broadly used that I would not feel safe at all with anything less than semver-major on this. At this point I am not entirely comfortable with this change. We need to make sure that this is in alignment with the direction we want to take with Buffer.

@addaleax
Copy link
Member

Just as Array.isArray(Object.create(Array.prototype)) is false, Buffer.isBuffer(Object.create(Buffer.prototype)) should have been false from the beginning, and the failure to do that is a bug.

Agreed! But still, sometimes bugfixes are semver-major and we have to live with that… the example from the issue is constructed enough to not worry about that, I’d say.

addaleax added a commit to addaleax/node that referenced this pull request Mar 22, 2017
@jasnell jasnell added the semver-major PRs that contain breaking changes and should be released in the next major version. label Mar 24, 2017
@jasnell
Copy link
Member

jasnell commented Mar 24, 2017

We likely need to hold off on this until we decide what we're doing with Buffer in general.

addaleax added a commit that referenced this pull request Mar 27, 2017
Fixes: #11954
Refs: #11961
PR-URL: #11985
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Reviewed-By: Joyee Cheung <joyeec9h3@gmail.com>
Reviewed-By: Luigi Pinca <luigipinca@gmail.com>
Reviewed-By: Timothy Gu <timothygu99@gmail.com>
@TimothyGu
Copy link
Member Author

I still would like to advance this – if only the Buffer.isBuffer change. I don't see any valid case where isUint8Array() is false but Buffer.isBuffer() should return true. I will discard the other parts of the PR replacing instanceof Buffer with Buffer.isBuffer() though, as that would be invalidated by the effort to support Uint8Array in all core modules.

@jasnell
Copy link
Member

jasnell commented Jul 31, 2017

Ping @nodejs/buffer

@trevnorris
Copy link
Contributor

I think that the JS Buffer check is used enough that performance is a concern, and simply checking whether the argument is obviously not a Buffer is acceptable. Buffers always end up being passed to C++ anyway where they need to be checked again. Or we could go the cheap route and set something like:

const is_buffer_symbol = Symbol('this is a Buffer fool!');
function Buffer() {
  this[is_buffer_symbol] = true;
}

Seems ridiculous, and of course the user could use Object.getOwnPropertySymbols to get is_buffer_symbol and make the object look like a Buffer, but that's just an edge case that will always be caught on the C++ side.

While checking the performance of the call, I'd prioritize for when isBuffer() === true. If there's no performance regression then I'm okay with just about anything.

@RReverser
Copy link
Member

(removed my previous comment about .constructor since, as @addaleax mentioned above, they can be actually subclassed)

@RReverser
Copy link
Member

@trevnorris The getOwnPropertySymbols case can be avoided by just using a WeakSet instead, but I'm not sure if it's more efficient than current prototype check.

@BridgeAR
Copy link
Member

I think the idea from @trevnorris to use a Symbol is good. It should probably yield the best performance result and it is safe for inheritance.

@TimothyGu would you mind rebasing and giving that a try?

@BridgeAR
Copy link
Member

Ping @TimothyGu once more.

@TimothyGu
Copy link
Member Author

@trevnorris

While checking the performance of the call, I'd prioritize for when isBuffer() === true. If there's no performance regression then I'm okay with just about anything.

Last time I checked, this PR actually increases the performance of isBuffer() === true.

I'll try to address the rest of the questions and get this up to date tonight.

Make "fake" Buffer subclasses whose instances are not valid Uint8Arrays
fail the test.

Fixes: nodejs#11954
@TimothyGu
Copy link
Member Author

Unfortunately the performance of isUint8Array (or any C++ methods) on TurboFan seems to be very much subpar... I have rebased this but will temporarily put this on hold.

@TimothyGu TimothyGu added the wip Issues and PRs that are still a work in progress. label Sep 22, 2017
@BridgeAR
Copy link
Member

@TimothyGu that is not totally correct. I just benchmarked this on my own and I do not see any way to improve this a lot currently anymore. The reason is not that isUint8Array is slow but because v8 6.0 improved instanceof significantly.

Comparing Node.js 8.2 to 8.3 (only three runs)

 buffers/buffer-isbuffer.js n=100000000 type="fake"          887.97 %        *** 1.347711e-07
 buffers/buffer-isbuffer.js n=100000000 type="fastbuffer"    873.74 %        *** 2.783180e-05
 buffers/buffer-isbuffer.js n=100000000 type="object"       1206.00 %        *** 1.969215e-07
 buffers/buffer-isbuffer.js n=100000000 type="primitive"    1169.38 %        *** 1.843636e-04
 buffers/buffer-isbuffer.js n=100000000 type="slowbuffer"    874.26 %        *** 1.130297e-06
 buffers/buffer-isbuffer.js n=100000000 type="subclassed"    880.14 %        *** 2.332352e-06
 buffers/buffer-isbuffer.js n=100000000 type="uint8array"   1195.64 %        *** 1.292662e-05

Comparing Node.js master with a symbol check

 buffers/buffer-isbuffer.js n=100000000 type="fake"           -6.71 %        *** 8.285148e-38
 buffers/buffer-isbuffer.js n=100000000 type="fastbuffer"    -22.44 %        *** 3.377269e-50
 buffers/buffer-isbuffer.js n=100000000 type="object"         14.22 %        *** 4.954863e-16
 buffers/buffer-isbuffer.js n=100000000 type="primitive"      -7.59 %        *** 1.682501e-21
 buffers/buffer-isbuffer.js n=100000000 type="slowbuffer"    -22.09 %        *** 5.647517e-55
 buffers/buffer-isbuffer.js n=100000000 type="subclassed"     -7.13 %        *** 2.762112e-23
 buffers/buffer-isbuffer.js n=100000000 type="uint8array"     88.69 %        *** 8.339753e-47

So using a symbol check can indeed improve the performance in some circumstances but it is a slowdown for the average case. As instanceof might actually get improved further by v8 in the future, I think it is safe to stick to the instanceof, especially as we now know that it got insanely fast (between 250-350 million operations per second in the benchmark. These numbers are so high that I am not even sure if the benchmark is indeed still testing what it should).

I am closing this as I doubt that we can really improve any instanceof checks from now on much further. If anyone disagrees, please reopen.

@BridgeAR BridgeAR closed this Sep 23, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
buffer Issues and PRs related to the buffer subsystem. semver-major PRs that contain breaking changes and should be released in the next major version. wip Issues and PRs that are still a work in progress.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants