-
Notifications
You must be signed in to change notification settings - Fork 29.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Buffer(number) is unsafe #4660
Comments
For those interested in getting the new behavior we propose without a change to node core, we published a user-land package: |
Now, the new Buffer(number, encoding) will throw exception, detail is here: 3b27dd5 . I introduce a new API Buffer.encode() for encoding case. |
You could make similar arguments about It should be noted that the primary reason for not zero-filling by default is performance. So |
This discussion has been had several times in the last few weeks. If a "high-risk functionality" application doesn't have proper type checking in place then it already has a security issue. This problem can't be dished into Buffer as if it's at fault. Not zero filling the Buffer is completely a performance issue, but the impact on performance is more than trivial. Even at allocations as low as 1KB. And while this type of impact probably wouldn't harm your standard web app, it can noticeably affect performance of a node process. The concern is when a number is passed when something else is expected. Again this is a bug in the developers code, and TBH it doesn't make much sense to impact the performance of the many modules out there today and force them to update their code to use new syntax because of that. And let's be honest, in the years that Buffer has worked exactly this way how many times has this been reportedly seen as an issue in the wild. The only reason you're here now is because it affected modules you were directly involved with. |
@rvagg @trevnorris You both raise good points. Node is not the browser, and performance is critical. The difference with modules like
You can pass in:
The API seems like it's trying to lay a trap for the user. Look at the contributors on each of those repos. Some of the most talented node.js developers looked at this code extensively and didn't notice this issue for years. |
To be clear, I'm not suggesting that node core should use zeroed-out buffers. It would continue using unallocated buffers, but that functionality would move to a different API away from the one that looks and behaves (and now actually is) a |
-1 to changing |
That would not fix anything, there is already a |
I see around 20-25% perf regression using
|
Nice. Remember too, this is not a 20-25% perf regression across the board. It's only going to apply when all of these conditions are true:
Node.js core would not be slower. Uses of |
I think I'm in support of this, but I'm not aware of all the use cases for Instead of introducing new API, how about an opt-in? Buffer(100); // safe
Buffer.unsafe = true;
Buffer(100); // unsafe |
Thanks feross for explaining this well, I learned something tonight! I am relatively new to Node - I gained my open source chops in Python land. In Python, slow performance is sort of a given. If you want to make a Python program more performant, you can optimize it in certain ways, which are well known. Performance then is an advanced problem. The old adage "make it work, then make it fast" works well for these kinds of dynamic, interpreted languages. After all, we are not writing in C here, and the beauty of languages like Node and Python is the ease with which external, more performant packages can be accessed when necessary. 20-25% is actually quite a lot slower, true -- but how many calls to Buffer are using Buffer(Number) and require high performance? It seems as though this being fixed in core will bring great gratitude from newcomers and old hats alike. As a relatively new node programmer, I am certainly grateful - my memory won't be stolen! |
I generally agree that the API is problematic. I just want to point out, that fixing var someEntropy = (new Buffer(256)).toString('hex') Updating to a new node major version with the proposed new Buffer API would not break the code, but would introduce (an even higher) security risk to that code, without the user noticing right away. I just wanted to point that out for the sake of argument. |
@finnp that API is so insecure that I would flag it as a security vulnerability, for the same assumption of this PR. So, I'm 👎 on anything that make node (or apps that runs on node) slower. However, I see the security vulnerability. My gut tells me: let's benchmark this change in node. Let's benchmark this change in userland (all the popular modules have benchmarks), and let's see where we stand. Does express gets slower? or Hapi? or socket.io? To what degree? I think the problem is that |
I think anyone depending on that kind of buffer entropy is already walking in the land of the undocumented, and the security risk represented by the issue versus the pretty modest performance ding make this one a no brainer. To put this back into a C context, returning memory without a memset in response to a 'client' request is effectively insanity. |
+1 on making uninitialized Buffer creation a separate function in the API, and removing it from the "type-guessing" constructor, or replacing it with a 'safe' variant. This kind of potentially dangerous functionality should be explicitly opt-in. I'm not sure why there's any discussion about it at all, to be honest. It's well understood that having safe defaults considerably decreases the chance of vulnerabilities, and moving uninitialized creation to a separate function would not incur any performance issues for maintained code (because they can simply use the new function if they know what they are doing, and actually intend uninitialized creation). The unmaintained code would not receive security patches anyway, and in those cases a slowdown is better than an outright vulnerability. Node.js, as a project that follows semantic versioning, is in a fairly unique position to actually be able to make such patches without widespread ecosystem breakage.
I would vote against a global opt-in. It doesn't really fix the problem (as a single intended opt-in means your entire codebase is now unprotected), while still introducing the same kind of break. That, plus global state gets messy in general.
It absolutely does. Safe defaults are the single most important thing a language/runtime can possibly provide for fostering a secure ecosystem around it. No matter how competent people are, they make mistakes, and the amount of possibly footguns should be reduced to an absolute minimum. The code change would be trivial, and provide significant security gains.
Observance of security issues does not correlate to severity or how badly they need to be solved. Not to mention that you have no idea whether this has been discovered (and exploited) in the wild or not, nor any way to check, nor would the perpetrators have any reason to disclose as such. The security community has spent several decades now understanding how to assess the severity of a security issue and how to prevent common issues, and it would probably be advisable to work from that knowledge, rather than repeating that entire process again in the Node.js community and exposing many users to danger in the process.
Secure opt-ins don't work. See also this and the general principle of people following the path of least resistance. |
While I first thought that making Making Documenting that in a way «zero-fills since 6.x, zero-fill it twice if you use a lower version» would not help, Alice could think «I don't need 5.x» and publish a lib that does not manually zero-fill Alternate proposal:
Thoughts? This will also solve unintentional calls to |
@finnp Anyone who uses the code you provided is already doomed. |
At this point it's easier just to soft-deprecate |
I think this is even harder than changing Buffer api, as Buffer is part of the Stream API, and also a lot of C++ code. |
@vkurchatkin |
it shouldn't be that hard to change, since Buffers ARE typed arrays now.
The only concern is instantiation. We can still provide a function to create uninitialized typed array or array buffer.
I mean not really deprecating them, but discouraging and maybe eventually removing from docs. |
I'm +1 on the general approach @ChALkeR suggests in #4660 (comment) By introducing two new alternative factory functions (one safe, one unsafe) and deprecating the current unsafe constructor, we ensure that existing code isn't adversely affected while giving new code an appropriate path forward. Changing the expected behavior of the Buffer constructor or even deprecating Buffer altogether does not address all of the requirements here. One tweak I would make to @ChALkeR's suggestion is: instead of |
Some comments, in no particular order:
@vkurchatkin Except that
@Karissa Core's are probably most important, but if you are, say, running a heavy realtime websocket application, and
Python is a tricky comparison because the language just comes with everything. Sure, we expect JavaScript to not execute as fast as C, or even Java, but in Node, being able to achieve high throughput of I/O operations is pretty darn important. That's what we do in a nutshell after all.
I would generally agree with this.
I think this the most sound proposal here, although I wouldn't mind tweaking it like @jasnell pointed out. |
-1
+1
-1
+2
-1 |
#4682 has landed adding safe constructors for v6. |
Wow, epic 🚀 – amazing job! |
Excellent! |
Related: #5799. |
Hey folks! If you followed this issue, then you're probably like me and want to start using There's a better way! You can use
Hope someone finds this useful! |
commit 3827da752d3b2eb86613b92ffb6715806d799fdc Author: Marius <maerious@gmail.com> Date: Tue May 30 21:31:26 2017 +0200 Transpile changes commit 025aac69f827e6c0f51215609070bab6b76cc491 Author: Marius <maerious@gmail.com> Date: Tue May 30 21:30:41 2017 +0200 Add test for number metadata commit 35b470af8386ff96230e09408a3ba83c6831b3a0 Merge: 902e4bf 74e4500 Author: Marius <maerious@gmail.com> Date: Tue May 30 21:26:52 2017 +0200 Merge branch 'fix/buffer-leak' of https://github.com/goto-bus-stop/tus-js-client into buffer commit 74e4500 Author: Renée Kooi <renee@kooi.me> Date: Tue May 30 17:18:44 2017 +0200 fix lint commit a420570 Author: Renée Kooi <renee@kooi.me> Date: Tue May 30 15:43:13 2017 +0200 fix buffer initialization in base64 encoding If a number was passed to `encode()`, the buffer would be created with uninitialised memory. This patch casts anything that's passed in to a string first and then uses the safe `Buffer.from` API. `Buffer.from` was added in Node v4 but the `buffer-from` module ponyfills it for older Node versions. See [nodejs/node#4660](nodejs/node#4660)
tl;dr
This issue proposes:
new Buffer(number)
to return safe, zeroed-out memoryBuffer.alloc(number)
Update: Jan 15, 2016
Upon further consideration, I think that returning zeroed out memory is a separate issue. The core issue is: unsafe buffer allocation should be in a different API.
I now support adding two APIs:
Buffer.from(value)
- convert from any type to a bufferBuffer.alloc(size)
- create an uninitialized buffer with given sizeThis solves the core problem that affected
ws
andbittorrent-dht
which isBuffer(variable)
getting tricked into taking a number argument.Why is
Buffer
unsafe?Today, the node.js
Buffer
constructor is overloaded to handle many different argument types likeString
,Array
,Object
,TypedArrayView
(Uint8Array
, etc.),ArrayBuffer
, and alsoNumber
.The API is optimized for convenience: you can throw any type at it, and it will try to do what you want.
Because the Buffer constructor is so powerful, you often see code like this:
_But what happens if
toHex
is called with aNumber
argument?_Remote Memory Disclosure
If an attacker can make your program call the
Buffer
constructor with aNumber
argument, then they can make it allocate uninitialized memory from the node.js process. This could potentially disclose TLS private keys, user data, or database passwords.When the
Buffer
constructor is passed aNumber
argument, it returns an UNINITIALIZED block of memory of the specifiedsize
. When you create aBuffer
like this, you MUST overwrite the contents before returning it to the user.Would this ever be a problem in real code?
Yes. It's surprisingly common to forget to check the type of your variables in a dynamically-typed language like JavaScript.
Usually the consequences of assuming the wrong type is that your program crashes with an uncaught exception. But the failure mode for forgetting to check the type of arguments to the
Buffer
constructor is more catastrophic.Here's an example of a vulnerable service that takes a JSON payload and converts it to hex:
In this example, an http client just has to send:
and it will get back 1,000 bytes of uninitialized memory from the server.
This is a very serious bug. It's similar in severity to the the Heartbleed bug that allowed disclosure of OpenSSL process memory by remote attackers.
Which real-world packages were vulnerable?
bittorrent-dht
@mafintosh and I found this issue in one of our own packages,
bittorrent-dht
. The bug would allow anyone on the internet to send a series of messages to a user ofbittorrent-dht
and get them to reveal 20 bytes at a time of uninitialized memory from the node.js process.Here's the commit that fixed it. We released a new fixed version, created a Node Security Project disclosure, and deprecated all vulnerable versions on npm so users will get a warning to upgrade to a newer version.
ws
That got us wondering if there were other vulnerable packages. Sure enough, within a short period of time, we found the same issue in
ws
, the most popular WebSocket implementation in node.js.If certain APIs were called with
Number
parameters instead ofString
orBuffer
as expected, then uninitialized server memory would be disclosed to the remote peer.These were the vulnerable methods:
Here's a vulnerable socket server with some echo functionality:
socket.send(number)
called on the server, will disclose server memory.Here's the release where the issue was fixed, with a more detailed explanation. Props to @3rd-Eden for the quick fix. Here's the Node Security Project disclosure.
What's the solution?
It's important that node.js offers a fast way to get memory otherwise performance-critical applications would needlessly get a lot slower.
But we need a better way to signal our intent as programmers. When we want uninitialized memory, we should request it explicitly.
Sensitive functionality should not be packed into a developer-friendly API that loosely accepts many different types. This type of API encourages the lazy practice of passing variables in without checking the type very carefully.
Buffer.alloc(number)
The functionality of creating buffers with uninitialized memory should be part of another API. We propose
Buffer.alloc(number)
. This way, it's not part of an API that frequently gets user input of all sorts of different types passed into it.How do we fix node.js core?
We sent a PR (merged as
semver-major
) which defends against one case:In this situation, it's implied that the programmer intended the first argument to be a string, since they passed an encoding as a second argument. Today, node.js will allocate uninitialized memory in the case of
new Buffer(number, encoding)
, which is probably not what the programmer intended.But this is only a partial solution, since if the programmer does
new Buffer(variable)
(without anencoding
parameter) there's no way to know what they intended. Ifvariable
is sometimes a number, then uninitialized memory will sometimes be returned.What's the real long-term fix?
We could deprecate and remove
new Buffer(number)
and useBuffer.alloc(number)
when we need uninitialized memory. But that would break 1000s of packages. So that's a no-go.Instead, we believe the best solution is to:
new Buffer(number)
to return safe, zeroed-out memoryBuffer.alloc(number)
This way, existing code continues working and the impact on the npm ecosystem will be minimal. Over time, npm maintainers can migrate performance-critical code to use
Buffer.alloc(number)
instead ofnew Buffer(number)
.Conclusion
We think there's a serious design issue with the
Buffer
API as it exists today. It promotes insecure software by putting high-risk functionality into a convenient API with friendly "developer ergonomics".This wasn't merely a theoretical exercise because we found the issue in some of the most popular npm packages.
Eventually, we hope that node.js core can switch to this new, safer behavior. We believe the impact on the ecosystem would be minimal since it's not a breaking change. Well-maintained, popular packages would be updated to use
Buffer.alloc
quickly, while older, insecure packages would magically become safe from this attack vector.The text was updated successfully, but these errors were encountered: