Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

buffer: remove error for malformatted hex string #12012

Closed
wants to merge 1 commit into from

Conversation

Trott
Copy link
Member

@Trott Trott commented Mar 23, 2017

Improve accuracy of the error message when a hex string of an incorrect
length is send to .write() or .fill().

Fixes: #3770

Checklist
  • make -j4 test (UNIX), or vcbuild test (Windows) passes
  • commit message follows commit guidelines
Affected core subsystem(s)

buffer

@Trott Trott added buffer Issues and PRs related to the buffer subsystem. semver-major PRs that contain breaking changes and should be released in the next major version. labels Mar 23, 2017
@Trott Trott added this to the 8.0.0 milestone Mar 23, 2017
@nodejs-github-bot nodejs-github-bot added buffer Issues and PRs related to the buffer subsystem. c++ Issues and PRs that require attention from people who are familiar with C++. labels Mar 23, 2017
@Trott
Copy link
Member Author

Trott commented Mar 23, 2017

Rather than attempting to adjust the behavior of our code to match the error message (as in #3773) this takes the approach of adjusting the error message to more accurately report the issue that is found. (Of course, we can always implement more complete and robust error checking like in #3773 at a later date if deemed desirable and performant.)

@Trott
Copy link
Member Author

Trott commented Mar 24, 2017

Copy link
Member

@targos targos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe a RangeError is appropriate?

@@ -512,7 +512,8 @@ assert.strictEqual(Buffer.from('=bad'.repeat(1e4), 'base64').length, 0);

// Test single hex character throws TypeError
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: comment should be updated

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated, thanks!

@@ -616,7 +616,7 @@ void Fill(const FunctionCallbackInfo<Value>& args) {
enc == UCS2 ? str_obj->Length() * sizeof(uint16_t) : str_obj->Length();

if (enc == HEX && str_length % 2 != 0)
return env->ThrowTypeError("Invalid hex string");
return env->ThrowError("Invalid hex string length");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we can be a bit more explicit about it? Something like String length must be a multiple of 2 if encoding is "hex"?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps hex string length must be a multiple of 2

Copy link
Member Author

@Trott Trott Mar 24, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm starting to think this check isn't really worthwhile and the right thing to do is remove it. This error is still misleading in that if you get it, one would reasonably expect there's more validation going on for hex strings than just a length check. One would reasonably expect that we check for valid hex chars. But we don't. We happily allow out-of-range characters just as long as the string length is divisible by two.

The upshot is puzzling behavior like this:

> Buffer.from('xx', 'hex')
<Buffer >
> Buffer.from('x', 'hex')
TypeError: Invalid hex string
    at Buffer.write (buffer.js:769:21)
    at fromString (buffer.js:213:18)
    at Function.Buffer.from (buffer.js:105:12)
    at repl:1:8
    at ContextifyScript.Script.runInThisContext (vm.js:23:33)
    at REPLServer.defaultEval (repl.js:339:29)
    at bound (domain.js:280:14)
    at REPLServer.runBound [as eval] (domain.js:293:12)
    at REPLServer.onLine (repl.js:536:10)
    at emitOne (events.js:101:20)
> Buffer.from('abxx', 'hex')
<Buffer ab>
> Buffer.from('abx', 'hex')
TypeError: Invalid hex string
    at Buffer.write (buffer.js:769:21)
    at fromString (buffer.js:213:18)
    at Function.Buffer.from (buffer.js:105:12)
    at repl:1:8
    at ContextifyScript.Script.runInThisContext (vm.js:23:33)
    at REPLServer.defaultEval (repl.js:339:29)
    at bound (domain.js:280:14)
    at REPLServer.runBound [as eval] (domain.js:293:12)
    at REPLServer.onLine (repl.js:536:10)
    at emitOne (events.js:101:20)
> 

It would be a whole lot less puzzling if this happened:

> Buffer.from('xx', 'hex')
<Buffer >
> Buffer.from('x', 'hex')
<Buffer >
> Buffer.from('abxx', 'hex')
<Buffer ab>
> Buffer.from('abx', 'hex')
<Buffer ab>
>

Principle Of Least Astonishment to me says we either accept anything and do the best we can or else we do reasonably rigorous error checking.

@Trott
Copy link
Member Author

Trott commented Mar 24, 2017

Maybe a RangeError is appropriate?

If I got a RangeError, I would expect it to be something like the string contained characters that were invalid hex digits (/[^a-fA-F0-9]/). This error we're talking about here seems more like "invalid formatting" than "out of acceptable range" to me...

@Trott
Copy link
Member Author

Trott commented Mar 24, 2017

Reprinting an above comment down here for more visibility:

I'm starting to think this check isn't really worthwhile and the right thing to do is remove it. This error is still misleading in that if you get it, one would reasonably expect there's more validation going on for hex strings than just a length check. One would reasonably expect that we check for valid hex chars. But we don't. We happily allow out-of-range characters just as long as the string length is divisible by two.

The upshot is puzzling behavior like this:

> Buffer.from('xx', 'hex')
<Buffer >
> Buffer.from('x', 'hex')
TypeError: Invalid hex string
    at Buffer.write (buffer.js:769:21)
    at fromString (buffer.js:213:18)
    at Function.Buffer.from (buffer.js:105:12)
    at repl:1:8
    at ContextifyScript.Script.runInThisContext (vm.js:23:33)
    at REPLServer.defaultEval (repl.js:339:29)
    at bound (domain.js:280:14)
    at REPLServer.runBound [as eval] (domain.js:293:12)
    at REPLServer.onLine (repl.js:536:10)
    at emitOne (events.js:101:20)
> Buffer.from('abxx', 'hex')
<Buffer ab>
> Buffer.from('abx', 'hex')
TypeError: Invalid hex string
    at Buffer.write (buffer.js:769:21)
    at fromString (buffer.js:213:18)
    at Function.Buffer.from (buffer.js:105:12)
    at repl:1:8
    at ContextifyScript.Script.runInThisContext (vm.js:23:33)
    at REPLServer.defaultEval (repl.js:339:29)
    at bound (domain.js:280:14)
    at REPLServer.runBound [as eval] (domain.js:293:12)
    at REPLServer.onLine (repl.js:536:10)
    at emitOne (events.js:101:20)
> 

It would be a whole lot less puzzling if this happened:

> Buffer.from('xx', 'hex')
<Buffer >
> Buffer.from('x', 'hex')
<Buffer >
> Buffer.from('abxx', 'hex')
<Buffer ab>
> Buffer.from('abx', 'hex')
<Buffer ab>
>

Principle Of Least Astonishment to me says we either accept anything and do the best we can or else we do reasonably rigorous error checking.

@addaleax
Copy link
Member

accept anything and do the best we can

That’s mostly what we do for base64, so doing it for hex too sounds good to me

@Trott
Copy link
Member Author

Trott commented Mar 25, 2017

That’s mostly what we do for base64, so doing it for hex too sounds good to me

Done! PTAL

@Trott Trott changed the title buffer: better error for malformated hex string buffer: remove error for malformated hex string Mar 25, 2017
@Trott Trott changed the title buffer: remove error for malformated hex string buffer: remove error for malformatted hex string Mar 25, 2017
@addaleax
Copy link
Member

@targos @cjihrig You probably want to re-review too now that this PR does something quite different :)

// - https://github.com/nodejs/node/issues/6770
assert.throws(() => Buffer.from('A', 'hex'), TypeError);
// Test single hex character is discarded.
assert.strictEqual(Buffer.from('A', 'hex').length, 0);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a test where the resulting length is >0 ?
Like Buffer.from('abx', 'hex')

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@targos Good idea. Added!

Remove error message when a hex string of an incorrect length is sent
to .write() or .fill().

Fixes: nodejs#3770
@jgeewax
Copy link

jgeewax commented Mar 26, 2017

(Just chiming in as this relates to #4877)

It's clear that there's precedent here with base64 (it clearly just swallows any invalid characters):

> Buffer.from('not !**# base64 at all_... !!!!', 'base64').toString('base64')
'notbase64atall//'

Would it make sense to also add "strict mode" where invalid values are rejected?

In other words, I'd guess there is a not-insignificant group of folks out there who would prefer to be told when garbage goes into creating a new Buffer (since they're saying what the format should be). A strict mode wouldn't hurt those of us who just want the buffer to do "just do the best it can", and is super helpful to those of us that want to be notified when we accidentally send invalid values into the Buffer.

A few options that seem reasonable:

  • Flags like RegExp: Buffer.from(input, 'hex', 's')
  • Boolean for strict-mode: Buffer.from(input, 'base64', true)
  • A separate encoding type: Buffer.from(input, 'hex-strict')

Not sure if it matters also, but short-cut validation methods that check whether the Buffer.from(input, 'base64').toString('base64') == input could be useful (this might be out of scope of the Buffer class though...)

@Trott
Copy link
Member Author

Trott commented Mar 26, 2017

@jgeewax I definitely see that sort of error-checking safety as useful behavior and while I wish it was what Node.js did, it does seem like there's no reason that validating input like that can't be implemented as a userland module. EDIT: And if a userland module can show a way to do it in a performant way, that might even be a path into Node.js core. My sense right now is that most of the objections/hesitance about expanding the error checking is the performance hit incurred.

@addaleax
Copy link
Member

@jgeewax If we were designing Node from scratch, we’d probably implement one of your suggestions (which all make sense from an API perspective), but I think they all come with some kind of negative performance impact that we would want to avoid now.

@Trott
Copy link
Member Author

Trott commented Mar 28, 2017

@Trott
Copy link
Member Author

Trott commented Mar 28, 2017

Landed in 682573c

@Trott Trott closed this Mar 28, 2017
Trott added a commit to Trott/io.js that referenced this pull request Mar 28, 2017
Remove error message when a hex string of an incorrect length is sent
to .write() or .fill().

PR-URL: nodejs#12012
Fixes: nodejs#3770
Reviewed-By: Anna Henningsen <anna@addaleax.net>
Reviewed-By: Michaël Zasso <targos@protonmail.com>
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Reviewed-By: Luigi Pinca <luigipinca@gmail.com>
Reviewed-By: Sakthipriyan Vairamani <thechargingvolcano@gmail.com>
@jasnell jasnell mentioned this pull request Apr 4, 2017
leonardodino added a commit to leonardodino/basic-crypto that referenced this pull request Oct 31, 2017
buffer behaviour change
caused by: nodejs/node#12012
landed in: nodejs/node@682573c11
@Trott Trott deleted the hex-fix branch January 13, 2022 22:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
buffer Issues and PRs related to the buffer subsystem. c++ Issues and PRs that require attention from people who are familiar with C++. semver-major PRs that contain breaking changes and should be released in the next major version.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

buffer: inconsistent string validation
10 participants