Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

client.request throws Error with setEncoding #1125

Closed
koh110 opened this issue Dec 6, 2021 · 7 comments · Fixed by #3505
Closed

client.request throws Error with setEncoding #1125

koh110 opened this issue Dec 6, 2021 · 7 comments · Fixed by #3505
Labels
bug Something isn't working Status: help-wanted This issue/pr is open for contributions

Comments

@koh110
Copy link
Contributor

koh110 commented Dec 6, 2021

Bug Description

client.request throws an error when body.json() & body.text() are called with setEncoding.

Reproducible By

import { createServer } from 'http'
import { Client } from 'undici'
import { once } from 'events'
import assert from 'assert'

const data = 'a'.repeat(100)

const server = createServer((request, response) => {
  response.end(JSON.stringify({ data }))
}).listen()

await once(server, 'listening')

const client = new Client(`http://localhost:${server.address().port}`)

try {
  const { body, headers, statusCode } = await client.request({
    path: '/',
    method: 'GET'
  })
  console.log(`response received ${statusCode}`)
  console.log('headers', headers)
  body.setEncoding('utf8')
  const json = await body.json()
  assert.strictEqual(json.data, data, 'multi byte')
} catch (error) {
  console.error('error!!:', error)
} finally {
  client.close()
  server.close()
}
$ node index.mjs
error!!: TypeError [ERR_INVALID_ARG_TYPE]: The "list[0]" argument must be an instance of Buffer or Uint8Array. Received type string ('{"data":"aaaaaaaaaaaaaaa...)
    at new NodeError (node:internal/errors:371:5)
    at Function.concat (node:buffer:559:13)
    at consumeEnd (/Users/kohtaito/dev/tmp/test_undici/node_modules/undici/lib/api/readable.js:238:33)
    at BodyReadable.<anonymous> (/Users/kohtaito/dev/tmp/test_undici/node_modules/undici/lib/api/readable.js:220:7)
    at BodyReadable.emit (node:events:390:28)
    at BodyReadable.emit (/Users/kohtaito/dev/tmp/test_undici/node_modules/undici/lib/api/readable.js:66:18)
    at endReadableNT (node:internal/streams/readable:1343:12)
    at processTicksAndRejections (node:internal/process/task_queues:83:21) {
  code: 'ERR_INVALID_ARG_TYPE'
}

Expected Behavior

Logs & Screenshots

Environment

Node.js v16.13.1
undici v4.11.0

Additional context

chunk is converted to a string by StringDecoder only when consumeStart is executed.

$ git diff
diff --git a/lib/api/readable.js b/lib/api/readable.js
index 3bb454e..8ea758e 100644
--- a/lib/api/readable.js
+++ b/lib/api/readable.js
@@ -94,6 +94,7 @@ module.exports = class BodyReadable extends Readable {

   push (chunk) {
     if (this[kConsume] && chunk !== null && !this[kReading]) {
+      console.log('BodyReadable: push', typeof chunk, this.readableEncoding)
       consumePush(this[kConsume], chunk)
       return true
     } else {
@@ -196,6 +197,7 @@ function consumeStart (consume) {
   const { _readableState: state } = consume.stream

   for (const chunk of state.buffer) {
+    console.log('consumeStart', typeof chunk, consume.stream.readableEncoding)
     consumePush(consume, chunk)
   }
$ node index.mjs
response received 200
headers {
  date: 'Mon, 06 Dec 2021 09:08:48 GMT',
  connection: 'keep-alive',
  'keep-alive': 'timeout=5',
  'content-length': '10000011'
}
consumeStart string utf8
BodyReadable: push object utf8
BodyReadable: push object utf8
BodyReadable: push object utf8
BodyReadable: push object utf8
BodyReadable: push object utf8

I think that this code needs to use consume.stream._readableState.decoder.end() in consumeEnd instead of Buffer.concat(body) if the value is in readableEncoding.
Is this the right way to fix it?

I also considered a method to overwrite StringDecoder so that it is not created in setEncoding.
However, it seems that it will be difficult to upgrade because it depends too much on the structure of Stream.

Ref: #1119

CC: @mcollina

@koh110 koh110 added the bug Something isn't working label Dec 6, 2021
@ronag ronag added the Status: help-wanted This issue/pr is open for contributions label Dec 6, 2021
@ronag
Copy link
Member

ronag commented Dec 6, 2021

I think that this code needs to use consume.stream._readableState.decoder.end() in consumeEnd instead of Buffer.concat(body) if the value is in readableEncoding.
Is this the right way to fix it?

Just check if elements in body are string or Buffer and concat accordingly.

@koh110
Copy link
Contributor Author

koh110 commented Dec 6, 2021

I think that mixing strings and buffers could result in data loss at chunk boundaries.
Isn't this a problem?

Then it seems that it can be made simple.

if (typeof chunk === 'str')  {
  chunk = Buffer.from(chunk)
} 

@ronag
Copy link
Member

ronag commented Dec 6, 2021

I think that would work.

@koh110
Copy link
Contributor Author

koh110 commented Dec 6, 2021

okey. I'll do test & create PR.
Thanks.

@koh110
Copy link
Contributor Author

koh110 commented Dec 7, 2021

@ronag
I tested this code.

diff --git a/lib/api/readable.js b/lib/api/readable.js
index 3bb454e..c1a80db 100644
--- a/lib/api/readable.js
+++ b/lib/api/readable.js
@@ -196,6 +196,10 @@ function consumeStart (consume) {
   const { _readableState: state } = consume.stream

   for (const chunk of state.buffer) {
+    if (consume.stream.readableEncoding) {
+      consumePush(consume, Buffer.from(chunk, consume.stream.readableEncoding))
+      continue
+    }
     consumePush(consume, chunk)
   }

The first reproduction code now works with this fix.

But this reproduction code doesn't work.

import { createServer } from 'http'
import { Client } from 'undici'
import { once } from 'events'
import assert from 'assert'

const data = Buffer.from('あいうえお')

const server = createServer((request, response) => {
  response.write(data.slice(0, 1))
  setTimeout(() => {
    response.write(data.slice(1))
    response.end()
  }, 100)
}).listen()

await once(server, 'listening')

const client = new Client(`http://localhost:${server.address().port}`)

try {
  const { body, headers, statusCode } = await client.request({
    path: '/',
    method: 'GET'
  })
  console.log(`response received ${statusCode}`)
  console.log('headers', headers)
  body.setEncoding('utf8')
  const text = await body.text()
  console.log(Buffer.from(text))
  console.log(Buffer.from(data))
  assert.strictEqual(text, data, 'multi byte')
} catch (error) {
  console.error('error!!:', error)
} finally {
  client.close()
  server.close()
}
$ node test-first-byte.mjs
response received 200
headers {
  date: 'Tue, 07 Dec 2021 05:31:38 GMT',
  connection: 'keep-alive',
  'keep-alive': 'timeout=5',
  'transfer-encoding': 'chunked'
}
<Buffer ef bf bd ef bf bd e3 81 84 e3 81 86 e3 81 88 e3 81 8a>
<Buffer e3 81 82 e3 81 84 e3 81 86 e3 81 88 e3 81 8a>
error!!: AssertionError [ERR_ASSERTION]: multi byte
    at file:///tmp/test_undici/index2.mjs:31:10
    at processTicksAndRejections (node:internal/process/task_queues:96:5) {
  generatedMessage: false,
  code: 'ERR_ASSERTION',
  actual: '��いうえお',
  expected: [Buffer [Uint8Array]],
  operator: 'strictEqual'
}

It looks like the first few bytes are lost. So I think we need another way.

@ronag
Copy link
Member

ronag commented Dec 7, 2021

I you open a PR with the tests then we can collaborate on it.

@koh110
Copy link
Contributor Author

koh110 commented Dec 7, 2021

OK. Thanks.
I'll open PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working Status: help-wanted This issue/pr is open for contributions
Projects
None yet
2 participants