-
Notifications
You must be signed in to change notification settings - Fork 30.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
test: add test to verify ErrnoException path #13958
Conversation
I don’t think this is a good idea. Always interpreting it as Latin-1 instead of always interpreting it as UTF-8 isn’t going to help anyone; and at least on Unices, it’s somewhat reasonable to assume UTF-8 as the default, even if there is no guarantee that the path is actually UTF-8-encoded. (Not sure about Windows but I think for consistency there would have to be something (C runtime?) that transcodes to UTF-8 for us as well.) I’d suggest dropping the /cc @bnoordhuis |
@addaleax Thanks, I was not sure about this and like you suggested perhaps dropping FIXME and commenting would be good here. |
@danbev It might also be good to have a test for this – if tests worked with your current patch, we don’t have one. Something like checking the error thrown by |
I tried that and that succeeds as it looks like that would go through the UVException function which uses UTF-8. I'll look into how a test for this can be written. |
src/node.cc
Outdated
@@ -877,8 +877,7 @@ Local<Value> ErrnoException(Isolate* isolate, | |||
|
|||
Local<String> path_string; | |||
if (path != nullptr) { | |||
// FIXME(bnoordhuis) It's questionable to interpret the file path as UTF-8. | |||
path_string = String::NewFromUtf8(env->isolate(), path); | |||
path_string = FIXED_ONE_BYTE_STRING(env->isolate(), path); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW, there are edge cases where even this is going to be inadequate. Those are the same edge cases that led me to add the ability to pass Buffer
instances to fs
APIs... specifically, on posix systems, it's possible for a path string to contain multiple encodings... for instance, parent directory in ASCII, directory in Windows-1250, UTF-8 filename, etc. Unfortunately it's not theoretical either... actually had this happen.
May be worthwhile deprecating path_string
and moving to a Buffer
instance.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's still an improvement of sorts. Any byte string is a valid Latin-1 string (although not per se a meaningful Latin-1 string) whereas conversion to UTF-8 is lossy - it's not bidirectional when the byte string contains invalid UTF-8 character sequences because those get replaced with U+FFFD.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@bnoordhuis I still think this change would be more confusing than helpful, given that Node defaults to UTF-8 in all places by now, encountering Latin-1 encoded paths is already pretty rare and won’t get more frequent over time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't disagree and to be clear, I don't advocate making changes haphazardly, that wouldn't help.
Aside (and for background): node's use of UTF-8 for file paths is accidental, it wasn't a design choice. V8 originally only supported UTF-8, one-byte strings weren't added until much later.
0592a8a
to
2fc1bb7
Compare
test/addons/errno-exception/test.js
Outdated
const err = binding.errno(); | ||
assert.strictEqual(err.syscall, 'syscall'); | ||
assert.strictEqual(err.errno, 10); | ||
assert.strictEqual(err.path, 'päth'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As a tiny suggestion, maybe check err.message
as well :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, yes :) Thanks
b83d30e
to
8de6e76
Compare
test/arm failure looks unrelatedout/Release/cctest --gtest_output=tap:cctest.tap
[==========] Running 48 tests from 6 test cases.
[----------] Global test environment set-up.
[----------] 2 tests from Base64Test
[ RUN ] Base64Test.Encode
[ OK ] Base64Test.Encode (0 ms)
[ RUN ] Base64Test.Decode
[ OK ] Base64Test.Decode (1 ms)
[----------] 2 tests from Base64Test (1 ms total)
[----------] 2 tests from EnvironmentTest
[ RUN ] EnvironmentTest.AtExitWithEnvironment
[ OK ] EnvironmentTest.AtExitWithEnvironment (31 ms)
[ RUN ] EnvironmentTest.AtExitWithArgument
Received signal 11 SEGV_MAPERR 000007e966dc
==== C stack trace ===============================
[end of stack trace] |
@addaleax Just wanted to ask if your approve still holds? I've rebased and reworded the commit message to reflect that this commit is now only adding a test and nothing else. Regarding the failure arm failure I'm going to take a closer look at it now. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The code itself LGTM.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM modulo whitespace/comment nit.
test/addons/errno-exception/test.js
Outdated
const assert = require('assert'); | ||
const binding = require(`./build/${common.buildType}/binding`); | ||
const err = binding.errno(); | ||
assert.strictEqual(err.syscall, 'syscall'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit time (from the guide)...
'use strict';
const common = require('../../common');
// Verify that the path argument to node::ErrnoException() can contain UTF-8
// characters.
const assert = require('assert');
const binding = require(`./build/${common.buildType}/binding`);
const err = binding.errno();
assert.strictEqual(err.syscall, 'syscall');
assert.strictEqual(err.errno, 10);
assert.strictEqual(err.path, 'päth');
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, I'll update this.
This commit adds a test to verify that the path argument to ErrnoException can contain UTF-8 characters.
2248040
to
32a3027
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perfect, thanks
This commit adds a test to verify that the path argument to ErrnoException can contain UTF-8 characters. PR-URL: nodejs#13958 Reviewed-By: Anna Henningsen <anna@addaleax.net> Reviewed-By: Tobias Nießen <tniessen@tnie.de> Reviewed-By: Gibson Fahnestock <gibfahn@gmail.com>
Landed in 95c8df1 |
This commit adds a test to verify that the path argument to
ErrnoException can contain UTF-8 characters.
Checklist
make -j4 test
(UNIX), orvcbuild test
(Windows) passesAffected core subsystem(s)
src