Commit e419b63

Alan Shaw authored
feat: custom length encoding and decoding (#8)
This PR allows a custom length encoding/decoding function to be passed to `encode`/`decode`. There's also an int32BE fixed-length encoder/decoder.

e.g.

```js
const pipe = require('it-pipe')
const lp = require('it-length-prefixed')
const { int32BEEncode, int32BEDecode } = require('it-length-prefixed')

await pipe(
  [Buffer.from('hello world')],
  lp.encode({ lengthEncoder: int32BEEncode }),
  lp.decode({ lengthDecoder: int32BEDecode }),
  async source => {
    for await (const chunk of source) {
      console.log(chunk.toString())
    }
  }
)
```

See updated README for more info.

BREAKING CHANGE: Additional validation now checks for messages with a length that is too long to prevent a possible DoS attack. The error code `ERR_MSG_TOO_LONG` has changed to `ERR_MSG_DATA_TOO_LONG` and the error code `ERR_MSG_LENGTH_TOO_LONG` has been added.

License: MIT
Signed-off-by: Alan Shaw <alan.shaw@protocol.ai>
1 parent b651ba4 commit e419b63

15 files changed, +216 -65 lines changed

.travis.yml

Lines changed: 1 addition & 0 deletions
@@ -1,6 +1,7 @@
 language: node_js
 node_js:
 - 10
+- 12

 script:
 - npm run lint

README.md

Lines changed: 16 additions & 6 deletions
@@ -61,23 +61,31 @@ console.log(decoded)

 - `opts: Object`, optional
   - `poolSize: 10 * 1024`: Buffer pool size to allocate up front
+  - `minPoolSize: 8`: The minimum size the pool can be before it is re-allocated. Note: it is important this value is greater than the maximum value that can be encoded by the `lengthEncoder` (see the next option). Since encoded lengths are written into a buffer pool, there needs to be enough space to hold the encoded value.
+  - `lengthEncoder: Function`: A function that encodes the length that will prefix each message. By default this is a [`varint`](https://www.npmjs.com/package/varint) encoder. It is passed a `value` to encode, an (optional) `target` buffer to write to and an (optional) `offset` to start writing from. The function should encode the `value` into the `target` (or alloc a new Buffer if not specified), set the `lengthEncoder.bytes` value (the number of bytes written) and return the `target`.
+    - The following additional length encoders are available:
+      - **int32BE** - `const { int32BEEncode } = require('it-length-prefixed')`

-All messages will be prefixed with a varint.
+Returns a [transform](https://gist.github.com/alanshaw/591dc7dd54e4f99338a347ef568d6ee9#transform-it) that yields [`BufferList`](https://www.npmjs.com/package/bl) objects. All messages will be prefixed with a length, determined by the `lengthEncoder` function.

-Returns a [transform](https://gist.github.com/alanshaw/591dc7dd54e4f99338a347ef568d6ee9#transform-it) that yields [`BufferList`](https://www.npmjs.com/package/bl) objects.
-
-### `encode.single(chunk)`
+### `encode.single(chunk, [opts])`

 - `chunk: Buffer|BufferList` chunk to encode
+- `opts: Object`, optional
+  - `lengthEncoder: Function`: See description above. Note that this encoder will _not_ be passed a `target` or `offset` and so will need to allocate a buffer to write to.

 Returns a `BufferList` containing the encoded chunk.

 ### `decode([opts])`

 - `opts: Object`, optional
-  - `maxDataLength`: If provided, will not decode messages longer than the size specified, if omitted will use the current default of 4MB.
+  - `maxLengthLength`: If provided, will not decode messages whose length section exceeds the size specified, if omitted will use the default of 147 bytes.
+  - `maxDataLength`: If provided, will not decode messages whose data section exceeds the size specified, if omitted will use the default of 4MB.
   - `onLength(len: Number)`: Called for every length prefix that is decoded from the stream
   - `onData(data: BufferList)`: Called for every chunk of data that is decoded from the stream
+  - `lengthDecoder: Function`: A function that decodes the length that prefixes each message. By default this is a [`varint`](https://www.npmjs.com/package/varint) decoder. It is passed some `data` to decode which is a [`BufferList`](https://www.npmjs.com/package/bl). The function should decode the length, set the `lengthDecoder.bytes` value (the number of bytes read) and return the length. If the length cannot be decoded, the function should throw a `RangeError`.
+    - The following additional length decoders are available:
+      - **int32BE** - `const { int32BEDecode } = require('it-length-prefixed')`

 Returns a [transform](https://gist.github.com/alanshaw/591dc7dd54e4f99338a347ef568d6ee9#transform-it) that yields [`BufferList`](https://www.npmjs.com/package/bl) objects.

@@ -87,8 +95,10 @@ Behaves like `decode` except it only reads the exact number of bytes needed for

 - `reader: Reader`: An [it-reader](https://github.com/alanshaw/it-reader)
 - `opts: Object`, optional
-  - `maxDataLength`: If provided, will not decode messages longer than the size specified, if omitted will use the current default of 4MB.
+  - `maxLengthLength`: If provided, will not decode messages whose length section exceeds the size specified, if omitted will use the default of 147 bytes.
+  - `maxDataLength`: If provided, will not decode messages whose data section exceeds the size specified, if omitted will use the default of 4MB.
   - `onData(data: BufferList)`: Called for every chunk of data that is decoded from the stream
+  - `lengthEncoder: Function`: See description above.

 Returns a [transform](https://gist.github.com/alanshaw/591dc7dd54e4f99338a347ef568d6ee9#transform-it) that yields [`BufferList`](https://www.npmjs.com/package/bl) objects.
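The `lengthEncoder`/`lengthDecoder` contract described in the README hunk above can be illustrated with a custom pair. This is a minimal sketch, not part of the commit: `uint16BEEncode` and `uint16BEDecode` are hypothetical names, and a plain `Buffer` stands in for the `BufferList` that the real decoder would receive (it relies only on `length` and `readUInt16BE`).

```js
'use strict'

const { Buffer } = require('buffer')

// Hypothetical encoder: writes the length as an unsigned 16-bit big-endian integer.
// Contract from the README: (value, [target], [offset]) => target, with `.bytes` set.
const uint16BEEncode = (value, target, offset) => {
  target = target || Buffer.allocUnsafe(2)
  target.writeUInt16BE(value, offset || 0)
  return target
}
uint16BEEncode.bytes = 2 // fixed width, so this never changes

// Hypothetical decoder: throws RangeError while too few bytes are buffered
const uint16BEDecode = data => {
  if (data.length < 2) throw new RangeError('Could not decode uint16BE')
  return data.readUInt16BE(0)
}
uint16BEDecode.bytes = 2

const prefix = uint16BEEncode(11)
console.log(prefix) // <Buffer 00 0b>
console.log(uint16BEDecode(prefix)) // 11
```

With a 2-byte prefix like this one, messages longer than 65535 bytes cannot be framed, so a matching `maxDataLength` would presumably be set alongside it.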

src/decode.js

Lines changed: 20 additions & 27 deletions
@@ -1,49 +1,39 @@
 'use strict'

 const BufferList = require('bl/BufferList')
-const Varint = require('varint')
+const varintDecode = require('./varint-decode')

-const MSB = 0x80
-const isEndByte = byte => !(byte & MSB)
+// Maximum length of the length section of the message
+const MAX_LENGTH_LENGTH = 8 // Varint.encode(Number.MAX_SAFE_INTEGER).length
+// Maximum length of the data section of the message
 const MAX_DATA_LENGTH = 1024 * 1024 * 4

-const toBufferProxy = bl => new Proxy({}, {
-  get: (_, prop) => prop[0] === 'l' ? bl[prop] : bl.get(parseInt(prop))
-})
-
 const Empty = Buffer.alloc(0)
-
 const ReadModes = { LENGTH: 'readLength', DATA: 'readData' }

 const ReadHandlers = {
   [ReadModes.LENGTH]: (chunk, buffer, state, options) => {
     // console.log(ReadModes.LENGTH, chunk.length)
-    let endByteIndex = -1
-
-    // BufferList bytes must be accessed via .get
-    const getByte = chunk.get ? i => chunk.get(i) : i => chunk[i]
+    buffer = buffer.append(chunk)

-    for (let i = 0; i < chunk.length; i++) {
-      if (isEndByte(getByte(i))) {
-        endByteIndex = i
-        break
+    let dataLength
+    try {
+      dataLength = options.lengthDecoder(buffer)
+    } catch (err) {
+      if (buffer.length > options.maxLengthLength) {
+        throw Object.assign(err, { message: 'message length too long', code: 'ERR_MSG_LENGTH_TOO_LONG' })
       }
+      if (err instanceof RangeError) {
+        return { mode: ReadModes.LENGTH, buffer }
+      }
+      throw err
     }

-    if (endByteIndex === -1) {
-      return { mode: ReadModes.LENGTH, buffer: buffer.append(chunk) }
-    }
-
-    endByteIndex = buffer.length + endByteIndex
-    buffer = buffer.append(chunk)
-
-    const dataLength = Varint.decode(toBufferProxy(buffer.shallowSlice(0, endByteIndex + 1)))
-
     if (dataLength > options.maxDataLength) {
-      throw Object.assign(new Error('message too long'), { code: 'ERR_MSG_TOO_LONG' })
+      throw Object.assign(new Error('message data too long'), { code: 'ERR_MSG_DATA_TOO_LONG' })
     }

-    chunk = buffer.shallowSlice(endByteIndex + 1)
+    chunk = buffer.shallowSlice(options.lengthDecoder.bytes)
     buffer = new BufferList()

     if (options.onLength) options.onLength(dataLength)
@@ -77,6 +67,8 @@ const ReadHandlers = {

 function decode (options) {
   options = options || {}
+  options.lengthDecoder = options.lengthDecoder || varintDecode
+  options.maxLengthLength = options.maxLengthLength || MAX_LENGTH_LENGTH
   options.maxDataLength = options.maxDataLength || MAX_DATA_LENGTH

   return source => (async function * () {
@@ -127,4 +119,5 @@ decode.fromReader = (reader, options) => {
 }

 module.exports = decode
+module.exports.MAX_LENGTH_LENGTH = MAX_LENGTH_LENGTH
 module.exports.MAX_DATA_LENGTH = MAX_DATA_LENGTH
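The new `catch` branch in the length handler distinguishes three outcomes: the prefix decoded, more bytes are needed (`RangeError`), or the prefix has grown past `maxLengthLength`. The sketch below isolates that decision under stated assumptions: `tryReadLength` is a hypothetical helper, `decodeVarint` is a simplified stand-in for the `varint` package, and plain `Buffer`s replace `BufferList`.

```js
'use strict'

const { Buffer } = require('buffer')

// Stand-in varint decoder (assumption: not the `varint` package itself).
// Throws RangeError when the buffered bytes end before the final byte.
const decodeVarint = buf => {
  let value = 0
  for (let i = 0; i < buf.length; i++) {
    value += (buf[i] & 0x7f) * Math.pow(2, 7 * i)
    if (!(buf[i] & 0x80)) {
      decodeVarint.bytes = i + 1
      return value
    }
  }
  throw new RangeError('Could not decode varint')
}

// Hypothetical helper mirroring the catch logic of the LENGTH read handler
function tryReadLength (buffer, lengthDecoder, maxLengthLength) {
  try {
    return { done: true, length: lengthDecoder(buffer), bytes: lengthDecoder.bytes }
  } catch (err) {
    if (buffer.length > maxLengthLength) {
      throw Object.assign(err, { message: 'message length too long', code: 'ERR_MSG_LENGTH_TOO_LONG' })
    }
    if (err instanceof RangeError) return { done: false } // wait for more bytes
    throw err
  }
}

console.log(tryReadLength(Buffer.from([0x80]), decodeVarint, 8))
// { done: false }
console.log(tryReadLength(Buffer.from([0xac, 0x02]), decodeVarint, 8))
// { done: true, length: 300, bytes: 2 }
```

A caller can tell the two failure modes named in the commit message apart by checking `err.code` for `ERR_MSG_LENGTH_TOO_LONG` or `ERR_MSG_DATA_TOO_LONG`.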

src/encode.js

Lines changed: 15 additions & 9 deletions
@@ -1,27 +1,29 @@
 'use strict'

-const Varint = require('varint')
 const { Buffer } = require('buffer')
 const BufferList = require('bl/BufferList')
+const varintEncode = require('./varint-encode')

-const MIN_POOL_SIZE = 147 // Varint.encode(Number.MAX_VALUE).length
+const MIN_POOL_SIZE = 8 // Varint.encode(Number.MAX_SAFE_INTEGER).length
 const DEFAULT_POOL_SIZE = 10 * 1024

 function encode (options) {
   options = options || {}
-  options.poolSize = Math.max(options.poolSize || DEFAULT_POOL_SIZE, MIN_POOL_SIZE)
+
+  const poolSize = Math.max(options.poolSize || DEFAULT_POOL_SIZE, options.minPoolSize || MIN_POOL_SIZE)
+  const encodeLength = options.lengthEncoder || varintEncode

   return source => (async function * () {
-    let pool = Buffer.alloc(options.poolSize)
+    let pool = Buffer.alloc(poolSize)
     let poolOffset = 0

     for await (const chunk of source) {
-      Varint.encode(chunk.length, pool, poolOffset)
-      poolOffset += Varint.encode.bytes
-      const encodedLength = pool.slice(poolOffset - Varint.encode.bytes, poolOffset)
+      encodeLength(chunk.length, pool, poolOffset)
+      const encodedLength = pool.slice(poolOffset, poolOffset + encodeLength.bytes)
+      poolOffset += encodeLength.bytes

       if (pool.length - poolOffset < MIN_POOL_SIZE) {
-        pool = Buffer.alloc(options.poolSize)
+        pool = Buffer.alloc(poolSize)
         poolOffset = 0
       }

@@ -31,7 +33,11 @@ function encode (options) {
   })()
 }

-encode.single = c => new BufferList([Buffer.from(Varint.encode(c.length)), c])
+encode.single = (chunk, options) => {
+  options = options || {}
+  const encodeLength = options.lengthEncoder || varintEncode
+  return new BufferList([encodeLength(chunk.length), chunk])
+}

 module.exports = encode
 module.exports.MIN_POOL_SIZE = MIN_POOL_SIZE
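The pooling loop in `encode` writes each length prefix into a shared buffer and hands out slices of it, re-allocating only when the remaining space is too small. A rough sketch of that technique in isolation follows. `prefixesFor` is a hypothetical helper, the encoder has the same shape as `int32BEEncode` from this commit, and, unlike the diff (which compares against the `MIN_POOL_SIZE` constant), the sketch compares against its `minPoolSize` argument.

```js
'use strict'

const { Buffer } = require('buffer')

// Fixed-width encoder with the same shape as int32BEEncode in this commit
const encodeLength = (value, target, offset) => {
  target.writeInt32BE(value, offset)
  return target
}
encodeLength.bytes = 4

// Hypothetical helper: returns one length-prefix slice per chunk, backed by shared pools
function prefixesFor (chunks, poolSize, minPoolSize) {
  let pool = Buffer.alloc(poolSize)
  let poolOffset = 0
  const prefixes = []

  for (const chunk of chunks) {
    encodeLength(chunk.length, pool, poolOffset)
    // slice() shares memory with the pool, so no per-message allocation happens
    prefixes.push(pool.slice(poolOffset, poolOffset + encodeLength.bytes))
    poolOffset += encodeLength.bytes

    // Re-allocate once the remaining space could not hold another prefix
    if (pool.length - poolOffset < minPoolSize) {
      pool = Buffer.alloc(poolSize)
      poolOffset = 0
    }
  }
  return prefixes
}

const prefixes = prefixesFor([Buffer.from('hello'), Buffer.from('hi')], 8, 4)
console.log(prefixes.map(p => p.readInt32BE(0))) // [ 5, 2 ]
```

Earlier slices stay valid after a re-allocation because they still reference the old pool.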

src/index.js

Lines changed: 6 additions & 0 deletions
@@ -2,3 +2,9 @@

 exports.encode = require('./encode')
 exports.decode = require('./decode')
+
+exports.varintEncode = require('./varint-encode')
+exports.varintDecode = require('./varint-decode')
+
+exports.int32BEEncode = require('./int32BE-encode')
+exports.int32BEDecode = require('./int32BE-decode')

src/int32BE-decode.js

Lines changed: 10 additions & 0 deletions
@@ -0,0 +1,10 @@
+'use strict'
+
+const int32BEDecode = data => {
+  if (data.length < 4) throw RangeError('Could not decode int32BE')
+  return data.readInt32BE(0)
+}
+
+int32BEDecode.bytes = 4 // Always because fixed length
+
+module.exports = int32BEDecode

src/int32BE-encode.js

Lines changed: 13 additions & 0 deletions
@@ -0,0 +1,13 @@
+'use strict'
+
+const { Buffer } = require('buffer')
+
+const int32BEEncode = (value, target, offset) => {
+  target = target || Buffer.allocUnsafe(4)
+  target.writeInt32BE(value, offset)
+  return target
+}
+
+int32BEEncode.bytes = 4 // Always because fixed length
+
+module.exports = int32BEEncode

src/varint-decode.js

Lines changed: 15 additions & 0 deletions
@@ -0,0 +1,15 @@
+'use strict'
+
+const Varint = require('varint')
+
+const toBufferProxy = bl => new Proxy({}, {
+  get: (_, prop) => prop[0] === 'l' ? bl[prop] : bl.get(parseInt(prop))
+})
+
+const varintDecode = data => {
+  const len = Varint.decode(Buffer.isBuffer(data) ? data : toBufferProxy(data))
+  varintDecode.bytes = Varint.decode.bytes
+  return len
+}
+
+module.exports = varintDecode
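`toBufferProxy` lets a decoder that indexes with `buf[i]` read from a `BufferList`, whose bytes are only reachable through `.get(i)`. The following sketch shows the same Proxy technique without any dependencies. `ChunkedBytes`, `toIndexable`, and `decodeVarint` are hypothetical stand-ins for `BufferList`, `toBufferProxy`, and the `varint` decoder respectively.

```js
'use strict'

// Stand-in for a BufferList: bytes live in separate chunks and are read via .get(i)
class ChunkedBytes {
  constructor (chunks) {
    this.chunks = chunks
    this.length = chunks.reduce((n, c) => n + c.length, 0)
  }

  get (index) {
    for (const chunk of this.chunks) {
      if (index < chunk.length) return chunk[index]
      index -= chunk.length
    }
    return undefined
  }
}

// Same idea as toBufferProxy in the diff: forward `length`, map numeric keys to .get()
const toIndexable = list => new Proxy({}, {
  get: (_, prop) => prop === 'length' ? list.length : list.get(parseInt(prop, 10))
})

// Minimal varint decoder that only touches buf[i] and buf.length
function decodeVarint (buf) {
  let value = 0
  for (let i = 0; i < buf.length; i++) {
    value += (buf[i] & 0x7f) * Math.pow(2, 7 * i)
    if (!(buf[i] & 0x80)) return value
  }
  throw new RangeError('Could not decode varint')
}

// The two varint bytes sit in different chunks, yet the decoder sees one array
const list = new ChunkedBytes([[0xac], [0x02, 0xff]])
console.log(decodeVarint(toIndexable(list))) // 300
```

The Proxy avoids copying the chunks into one contiguous buffer just to decode a few prefix bytes.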

src/varint-encode.js

Lines changed: 14 additions & 0 deletions
@@ -0,0 +1,14 @@
+'use strict'
+
+const Varint = require('varint')
+const { Buffer } = require('buffer')
+
+// Encode the passed length `value` to the `target` buffer at the given `offset`
+const varintEncode = (value, target, offset) => {
+  const ret = Varint.encode(value, target, offset)
+  varintEncode.bytes = Varint.encode.bytes
+  // If no target, create Buffer from returned array
+  return target || Buffer.from(ret)
+}
+
+module.exports = varintEncode
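The code comments in this commit give `8` as the varint length of `Number.MAX_SAFE_INTEGER` and the old `147` as the varint length of `Number.MAX_VALUE`; the README text in this commit still quotes 147 as the `maxLengthLength` default. Both figures can be checked with a small calculation, since a varint spends one byte per 7 bits of the value. `varintLength` below is a hypothetical helper that only counts bytes and does not call the `varint` package.

```js
'use strict'

// Number of bytes a varint needs: one byte per 7 bits of the value
function varintLength (value) {
  let bytes = 1
  while (value >= 128) {
    value = Math.floor(value / 128)
    bytes++
  }
  return bytes
}

console.log(varintLength(127)) // 1
console.log(varintLength(128)) // 2
console.log(varintLength(Number.MAX_SAFE_INTEGER)) // 8
console.log(varintLength(Number.MAX_VALUE)) // 147
```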

test/_helpers.js

Lines changed: 9 additions & 0 deletions
@@ -0,0 +1,9 @@
+'use strict'
+
+const { map } = require('streaming-iterables')
+const randomInt = require('random-int')
+const randomBytes = require('random-bytes')
+
+module.exports.toBuffer = map(c => c.slice())
+module.exports.times = (n, fn) => Array.from(Array(n), fn)
+module.exports.someBytes = n => randomBytes(randomInt(1, n || 32))
