test: add tests for invalid UTF-8 #40351
0995716 to 18827d1
// Run with --expose-internals flag
const assert = require('assert');
const { toUSVString } = require('internal/util');
const decoder = new TextDecoder();
const chunk = Buffer.from([0x66, 0x6f, 0x6f, 0xed, 0xa0, 0x80]); // foo + U+D800
const str = decoder.decode(chunk);
assert.strictEqual(toUSVString(str), 'foo\ufffd');
`TextDecoder()` already replaces each byte of the surrogate code point with U+FFFD.
Hmm, both Chrome and Safari work like this PR:
new Blob([new Uint8Array([0x66, 0x6f, 0x6f, 0xed, 0xa0, 0x80])]).text()
  .then((str) => str === 'foo\ufffd\ufffd\ufffd')
  .then(console.log);
but this is not consistent with
const { toUSVString } = require('internal/util');
toUSVString('foo\ud800'); // returns 'foo\ufffd'
Do your tests pass without the fix?
Yes. I ran test cases with
Then I don't think this needs fixing. I would propose you remove the changes but keep the tests. We should still merge the tests.
@lpinca @ronag I tried with
All surrogates are 3 bytes and in this range: byte 1 = ED
That's because
18827d1 to 5fa492a
Since the readable web streams already return a USVString, the changes made to blob.js and consumers.js are reverted. Test cases are added to check surrogate-to-USVString conversion for readable web streams, TextDecoder and Blob.
PR-URL: nodejs#40351
Fixes: nodejs#39804
{
  const passthrough = new PassThrough();

  text(passthrough).then(common.mustCall(async (str) => {
    assert.strictEqual(str.length, 11);
    assert.deepStrictEqual(str, 'hellothere\ufffd');
  }));

  passthrough.write('hello');
  setTimeout(() => passthrough.end('there\ud801'), 10);
}
This does not make much sense because `passthrough.write()` and `passthrough.end()` will call `Buffer.from()` when the chunk is written. If anything, chunks should be `Buffer`s or `Uint8Array`s.
{
  const decoder = new TextDecoder();
  const chunk = Buffer.from('\ud807');
  const str = decoder.decode(chunk);
  assert.strictEqual(str, '\ufffd');
}
Similar to the one above, this does not make much sense to me. When `decoder.decode(chunk)` is called, `chunk` is already U+FFFD (valid UTF-8).
4fb8c8a to e5857ba
@git-srinivas if you can please
Sure @lpinca, I'll make the changes.
22de91f to 50b943b
@git-srinivas can I suggest something like this?
I find the current commit message a bit misleading because
Thank you.
Verify that `Blob.prototype.text()`, `streamConsumers.text()` and `TextDecoder.prototype.decode()` work as expected with invalid UTF-8.
Fixes: nodejs#39804
50b943b to 75c69af
Verify that `Blob.prototype.text()`, `streamConsumers.text()` and `TextDecoder.prototype.decode()` work as expected with invalid UTF-8.
PR-URL: #40351
Fixes: #39804
Reviewed-By: Luigi Pinca <luigipinca@gmail.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Reviewed-By: Robert Nagy <ronagy@icloud.com>
Landed in dc35aef.