doc: improve Buffer's encoding documentation #37945

Closed
wants to merge 1 commit into from
30 changes: 17 additions & 13 deletions doc/api/buffer.md
@@ -79,17 +79,21 @@ console.log(Buffer.from('fhqwhgads', 'utf16le'));
// Prints: <Buffer 66 00 68 00 71 00 77 00 68 00 67 00 61 00 64 00 73 00>
```

Node.js buffers accept all case variations of encoding strings that they
receive. For example, UTF-8 can be specified as `'utf8'`, `'UTF8'` or `'uTf8'`.
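
As a minimal sketch of the case-insensitivity described above, the snippet below encodes one string under several spellings of the UTF-8 encoding name and checks that the resulting bytes match. The sample string and the `spellings` list are illustrative assumptions, not part of the original documentation.

```js
const { Buffer } = require('buffer');

// Hypothetical sample values, chosen only for illustration.
const text = 'h\u00e9llo';
const spellings = ['utf8', 'UTF8', 'uTf8', 'utf-8', 'UTF-8'];

// Encode the same string once per spelling and keep the bytes as hex.
const encoded = spellings.map((name) => Buffer.from(text, name).toString('hex'));

// Every spelling should produce identical bytes.
const allEqual = encoded.every((hex) => hex === encoded[0]);

console.log(allEqual);
// Prints: true
```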

The character encodings currently supported by Node.js are the following:

* `'utf8'`: Multi-byte encoded Unicode characters. Many web pages and other
document formats use [UTF-8][]. This is the default character encoding.
When decoding a `Buffer` into a string that does not exclusively contain
valid UTF-8 data, the Unicode replacement character `U+FFFD` � will be used
to represent those errors.
* `'utf8'` (alias: `'utf-8'`): Multi-byte encoded Unicode characters. Many web
pages and other document formats use [UTF-8][]. This is the default character
encoding. When decoding a `Buffer` into a string that does not exclusively
contain valid UTF-8 data, the Unicode replacement character `U+FFFD` � will be
used to represent those errors.

* `'utf16le'`: Multi-byte encoded Unicode characters. Unlike `'utf8'`, each
character in the string will be encoded using either 2 or 4 bytes.
Node.js only supports the [little-endian][endianness] variant of [UTF-16][].
* `'utf16le'` (alias: `'utf-16le'`): Multi-byte encoded Unicode characters.
Unlike `'utf8'`, each character in the string will be encoded using either 2
or 4 bytes. Node.js only supports the [little-endian][endianness] variant of
[UTF-16][].

* `'latin1'`: Latin-1 stands for [ISO-8859-1][]. This character encoding only
supports the Unicode characters from `U+0000` to `U+00FF`. Each character is
@@ -132,11 +136,11 @@ The following legacy character encodings are also supported:
* `'binary'`: Alias for `'latin1'`. See [binary strings][] for more background
on this topic. The name of this encoding can be very misleading, as all of the
encodings listed here convert between strings and binary data. For converting
between strings and `Buffer`s, typically `'utf-8'` is the right choice.
between strings and `Buffer`s, typically `'utf8'` is the right choice.

* `'ucs2'`: Alias of `'utf16le'`. UCS-2 used to refer to a variant of UTF-16
that did not support characters that had code points larger than U+FFFF.
In Node.js, these code points are always supported.
* `'ucs2'`, `'ucs-2'`: Aliases of `'utf16le'`. UCS-2 used to refer to a variant
of UTF-16 that did not support characters that had code points larger than
U+FFFF. In Node.js, these code points are always supported.
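
The alias behavior can be sketched as follows, assuming a single sample character above U+FFFF (U+1F600, chosen here only for illustration). The example checks that `'ucs2'` and `'ucs-2'` produce the same bytes as `'utf16le'` and that the character survives a round trip.

```js
const { Buffer } = require('buffer');

// U+1F600 lies above U+FFFF, so UTF-16 stores it as a surrogate pair (4 bytes).
const emoji = '\u{1F600}';

const viaUtf16le = Buffer.from(emoji, 'utf16le');
const viaUcs2 = Buffer.from(emoji, 'ucs2');
const viaUcs2Dash = Buffer.from(emoji, 'ucs-2');

console.log(viaUtf16le);
// Prints: <Buffer 3d d8 00 de>

// Both aliases yield the same bytes as 'utf16le'.
const sameBytes = viaUcs2.equals(viaUtf16le) && viaUcs2Dash.equals(viaUtf16le);
console.log(sameBytes);
// Prints: true

// Decoding with the alias restores the original character.
const roundTrip = viaUcs2.toString('ucs2') === emoji;
console.log(roundTrip);
// Prints: true
```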

```js
Buffer.from('1ag', 'hex');
@@ -900,7 +904,7 @@ Returns `true` if `encoding` is the name of a supported character encoding,
or `false` otherwise.

```js
console.log(Buffer.isEncoding('utf-8'));
console.log(Buffer.isEncoding('utf8'));
// Prints: true

console.log(Buffer.isEncoding('hex'));
4 changes: 2 additions & 2 deletions doc/api/http.md
@@ -2347,7 +2347,7 @@ changes:
-->

* `chunk` {string | Buffer}
* `encoding` {string} Optional, **Default**: `utf-8`
* `encoding` {string} Optional, **Default**: `utf8`
* `callback` {Function} Optional
* Returns: {this}

@@ -2589,7 +2589,7 @@ changes:
-->

* `chunk` {string | Buffer}
* `encoding` {string} **Default**: `utf-8`
* `encoding` {string} **Default**: `utf8`
* `callback` {Function}
* Returns {boolean}

2 changes: 1 addition & 1 deletion doc/api/stream.md
@@ -2759,7 +2759,7 @@ const fs = require('fs');

pipeline(
fs.createReadStream('object.json')
.setEncoding('utf-8'),
.setEncoding('utf8'),
new Transform({
decodeStrings: false, // Accept string input rather than Buffers
construct(callback) {