buffer: remove error for malformatted hex string #12012
Rather than attempting to adjust the behavior of our code to match the error message (as in #3773), this takes the approach of adjusting the error message to more accurately report the issue that is found. (Of course, we can always implement more complete and robust error checking like in #3773 at a later date if deemed desirable and performant.)
Maybe a RangeError is appropriate?
test/parallel/test-buffer-alloc.js
@@ -512,7 +512,8 @@ assert.strictEqual(Buffer.from('=bad'.repeat(1e4), 'base64').length, 0);

// Test single hex character throws TypeError
nit: comment should be updated
Updated, thanks!
src/node_buffer.cc
@@ -616,7 +616,7 @@ void Fill(const FunctionCallbackInfo<Value>& args) {
       enc == UCS2 ? str_obj->Length() * sizeof(uint16_t) : str_obj->Length();

   if (enc == HEX && str_length % 2 != 0)
-    return env->ThrowTypeError("Invalid hex string");
+    return env->ThrowError("Invalid hex string length");
Maybe we can be a bit more explicit about it? Something like String length must be a multiple of 2 if encoding is "hex"?
Perhaps hex string length must be a multiple of 2
I'm starting to think this check isn't really worthwhile and the right thing to do is remove it. This error is still misleading in that if you get it, one would reasonably expect there's more validation going on for hex strings than just a length check. One would reasonably expect that we check for valid hex chars. But we don't. We happily allow out-of-range characters just as long as the string length is divisible by two.
The upshot is puzzling behavior like this:
> Buffer.from('xx', 'hex')
<Buffer >
> Buffer.from('x', 'hex')
TypeError: Invalid hex string
at Buffer.write (buffer.js:769:21)
at fromString (buffer.js:213:18)
at Function.Buffer.from (buffer.js:105:12)
at repl:1:8
at ContextifyScript.Script.runInThisContext (vm.js:23:33)
at REPLServer.defaultEval (repl.js:339:29)
at bound (domain.js:280:14)
at REPLServer.runBound [as eval] (domain.js:293:12)
at REPLServer.onLine (repl.js:536:10)
at emitOne (events.js:101:20)
> Buffer.from('abxx', 'hex')
<Buffer ab>
> Buffer.from('abx', 'hex')
TypeError: Invalid hex string
at Buffer.write (buffer.js:769:21)
at fromString (buffer.js:213:18)
at Function.Buffer.from (buffer.js:105:12)
at repl:1:8
at ContextifyScript.Script.runInThisContext (vm.js:23:33)
at REPLServer.defaultEval (repl.js:339:29)
at bound (domain.js:280:14)
at REPLServer.runBound [as eval] (domain.js:293:12)
at REPLServer.onLine (repl.js:536:10)
at emitOne (events.js:101:20)
>
It would be a whole lot less puzzling if this happened:
> Buffer.from('xx', 'hex')
<Buffer >
> Buffer.from('x', 'hex')
<Buffer >
> Buffer.from('abxx', 'hex')
<Buffer ab>
> Buffer.from('abx', 'hex')
<Buffer ab>
>
Principle Of Least Astonishment to me says we either accept anything and do the best we can or else we do reasonably rigorous error checking.
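The "accept anything and do the best we can" option can be sketched in plain JavaScript. This is only an illustration of the proposed behavior, not the code in src/node_buffer.cc: the helper name lenientHexDecode is hypothetical, and stopping at the first bad pair is an assumption chosen to reproduce the four REPL results listed above.

```javascript
// Hypothetical helper, not Node.js source. Decodes a hex string two
// characters at a time and stops at the first pair that is incomplete
// or contains a non-hex character, keeping whatever was decoded so far.
function lenientHexDecode(str) {
  const bytes = [];
  for (let i = 0; i + 1 < str.length; i += 2) {
    const pair = str.slice(i, i + 2);
    if (!/^[0-9a-fA-F]{2}$/.test(pair)) break; // first bad pair ends decoding
    bytes.push(parseInt(pair, 16));
  }
  return Buffer.from(bytes);
}

console.log(lenientHexDecode('xx'));   // <Buffer >
console.log(lenientHexDecode('x'));    // <Buffer >
console.log(lenientHexDecode('abxx')); // <Buffer ab>
console.log(lenientHexDecode('abx'));  // <Buffer ab>
```

With this approach an odd-length string is never an error; the unpaired trailing character is simply dropped, the same way an invalid pair is.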
If I got a RangeError, I would expect it to be something like the string contained characters that were invalid hex digits (
That’s mostly what we do for
Done! PTAL
 // - https://github.com/nodejs/node/issues/6770
-assert.throws(() => Buffer.from('A', 'hex'), TypeError);
+// Test single hex character is discarded.
+assert.strictEqual(Buffer.from('A', 'hex').length, 0);
Can you add a test where the resulting length is > 0? Like Buffer.from('abx', 'hex')
@targos Good idea. Added!
Remove error message when a hex string of an incorrect length is sent to .write() or .fill(). Fixes: nodejs#3770
(Just chiming in as this relates to #4877)
It's clear that there's precedent here with base64 (it clearly just swallows any invalid characters):
> Buffer.from('not !**# base64 at all_... !!!!', 'base64').toString('base64')
'notbase64atall//'
Would it make sense to also add "strict mode" where invalid values are rejected? In other words, I'd guess there is a not-insignificant group of folks out there who would prefer to be told when garbage goes into creating a new Buffer (since they're saying what the format should be).
A strict mode wouldn't hurt those of us who just want the buffer to do "just do the best it can", and is super helpful to those of us that want to be notified when we accidentally send invalid values into the Buffer.
A few options that seem reasonable:
Not sure if it matters also, but short-cut validation methods that check whether the
@jgeewax I definitely see that sort of error-checking safety as useful behavior and while I wish it was what Node.js did, it does seem like there's no reason that validating input like that can't be implemented as a userland module. EDIT: And if a userland module can show a way to do it in a performant way, that might even be a path into Node.js core. My sense right now is that most of the objections/hesitance about expanding the error checking is the performance hit incurred.
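As a rough illustration of the userland route, a strict wrapper could validate the input before handing it to Buffer.from. The function name strictHexToBuffer, the error types and the messages are all assumptions made for this sketch; nothing like it exists in Node.js core, and its performance has not been measured.

```javascript
// Hypothetical userland helper: rejects anything that is not a clean,
// even-length hex string instead of silently truncating it.
function strictHexToBuffer(str) {
  if (typeof str !== 'string') {
    throw new TypeError('Expected a string');
  }
  if (str.length % 2 !== 0) {
    throw new RangeError('Hex string length must be a multiple of 2');
  }
  if (!/^[0-9a-fA-F]*$/.test(str)) {
    throw new TypeError('String contains non-hex characters');
  }
  // Input is known to be valid here, so the built-in decoder is safe to use.
  return Buffer.from(str, 'hex');
}

console.log(strictHexToBuffer('abcd')); // <Buffer ab cd>
```

Callers who want the lenient behavior keep calling Buffer.from directly; only code that opts into the wrapper pays for the extra regular-expression pass.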
@jgeewax If we were designing Node from scratch, we’d probably implement one of your suggestions (which all make sense from an API perspective), but I think they all come with some kind of negative performance impact that we would want to avoid now.
Landed in 682573c
Remove error message when a hex string of an incorrect length is sent to .write() or .fill().
PR-URL: nodejs#12012
Fixes: nodejs#3770
Reviewed-By: Anna Henningsen <anna@addaleax.net>
Reviewed-By: Michaël Zasso <targos@protonmail.com>
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Reviewed-By: Luigi Pinca <luigipinca@gmail.com>
Reviewed-By: Sakthipriyan Vairamani <thechargingvolcano@gmail.com>
buffer behaviour change caused by: nodejs/node#12012 landed in: nodejs/node@682573c11
Improve accuracy of the error message when a hex string of an incorrect
length is sent to .write() or .fill().
Fixes: #3770
Checklist
make -j4 test (UNIX), or vcbuild test (Windows) passes
Affected core subsystem(s)
buffer