Commit e419b63
Author: Alan Shaw

feat: custom length encoding and decoding (#8)

This PR allows a custom length encoding/decoding function to be passed to `encode`/`decode`. There is also an int32BE fixed-length encoder/decoder, e.g.

```js
const pipe = require('it-pipe')
const lp = require('it-length-prefixed')
const { int32BEEncode, int32BEDecode } = require('it-length-prefixed')

await pipe(
  [Buffer.from('hello world')],
  lp.encode({ lengthEncoder: int32BEEncode }),
  lp.decode({ lengthDecoder: int32BEDecode }),
  async source => {
    for await (const chunk of source) {
      console.log(chunk.toString())
    }
  }
)
```

See the updated README for more info.

BREAKING CHANGE: Additional validation now checks for messages with a length that is too long, to prevent a possible DoS attack. The error code `ERR_MSG_TOO_LONG` has changed to `ERR_MSG_DATA_TOO_LONG` and the error code `ERR_MSG_LENGTH_TOO_LONG` has been added.

License: MIT
Signed-off-by: Alan Shaw <alan.shaw@protocol.ai>
1 parent b651ba4 commit e419b63

15 files changed: +216, -65 lines changed

.travis.yml

Lines changed: 1 addition & 0 deletions

@@ -1,6 +1,7 @@
 language: node_js
 node_js:
   - 10
+  - 12
 
 script:
   - npm run lint

README.md

Lines changed: 16 additions & 6 deletions

@@ -61,23 +61,31 @@ console.log(decoded)
 
 - `opts: Object`, optional
   - `poolSize: 10 * 1024`: Buffer pool size to allocate up front
+  - `minPoolSize: 8`: The minimum size the pool can be before it is re-allocated. Note: it is important this value is greater than the maximum number of bytes the `lengthEncoder` can write (see the next option). Since encoded lengths are written into a buffer pool, there needs to be enough space to hold the encoded value.
+  - `lengthEncoder: Function`: A function that encodes the length that will prefix each message. By default this is a [`varint`](https://www.npmjs.com/package/varint) encoder. It is passed a `value` to encode, an (optional) `target` buffer to write to and an (optional) `offset` to start writing from. The function should encode the `value` into the `target` (or alloc a new Buffer if not specified), set the `lengthEncoder.bytes` value (the number of bytes written) and return the `target`.
+    - The following additional length encoders are available:
+      - **int32BE** - `const { int32BEEncode } = require('it-length-prefixed')`
 
-All messages will be prefixed with a varint.
+Returns a [transform](https://gist.github.com/alanshaw/591dc7dd54e4f99338a347ef568d6ee9#transform-it) that yields [`BufferList`](https://www.npmjs.com/package/bl) objects. All messages will be prefixed with a length, determined by the `lengthEncoder` function.
 
-Returns a [transform](https://gist.github.com/alanshaw/591dc7dd54e4f99338a347ef568d6ee9#transform-it) that yields [`BufferList`](https://www.npmjs.com/package/bl) objects.
-
-### `encode.single(chunk)`
+### `encode.single(chunk, [opts])`
 
 - `chunk: Buffer|BufferList` chunk to encode
+- `opts: Object`, optional
+  - `lengthEncoder: Function`: See description above. Note that this encoder will _not_ be passed a `target` or `offset` and so will need to allocate a buffer to write to.
 
 Returns a `BufferList` containing the encoded chunk.
 
 ### `decode([opts])`
 
 - `opts: Object`, optional
-  - `maxDataLength`: If provided, will not decode messages longer than the size specified, if omitted will use the current default of 4MB.
+  - `maxLengthLength`: If provided, will not decode messages whose length section exceeds the size specified, if omitted will use the default of 8 bytes.
+  - `maxDataLength`: If provided, will not decode messages whose data section exceeds the size specified, if omitted will use the default of 4MB.
   - `onLength(len: Number)`: Called for every length prefix that is decoded from the stream
   - `onData(data: BufferList)`: Called for every chunk of data that is decoded from the stream
+  - `lengthDecoder: Function`: A function that decodes the length that prefixes each message. By default this is a [`varint`](https://www.npmjs.com/package/varint) decoder. It is passed some `data` to decode, which is a [`BufferList`](https://www.npmjs.com/package/bl). The function should decode the length, set the `lengthDecoder.bytes` value (the number of bytes read) and return the length. If the length cannot be decoded, the function should throw a `RangeError`.
+    - The following additional length decoders are available:
+      - **int32BE** - `const { int32BEDecode } = require('it-length-prefixed')`
 
 Returns a [transform](https://gist.github.com/alanshaw/591dc7dd54e4f99338a347ef568d6ee9#transform-it) that yields [`BufferList`](https://www.npmjs.com/package/bl) objects.
 
@@ -87,8 +95,10 @@ Behaves like `decode` except it only reads the exact number of bytes needed for
 
 - `reader: Reader`: An [it-reader](https://github.com/alanshaw/it-reader)
 - `opts: Object`, optional
-  - `maxDataLength`: If provided, will not decode messages longer than the size specified, if omitted will use the current default of 4MB.
+  - `maxLengthLength`: If provided, will not decode messages whose length section exceeds the size specified, if omitted will use the default of 8 bytes.
+  - `maxDataLength`: If provided, will not decode messages whose data section exceeds the size specified, if omitted will use the default of 4MB.
   - `onData(data: BufferList)`: Called for every chunk of data that is decoded from the stream
+  - `lengthDecoder: Function`: See description above.
 
 Returns a [transform](https://gist.github.com/alanshaw/591dc7dd54e4f99338a347ef568d6ee9#transform-it) that yields [`BufferList`](https://www.npmjs.com/package/bl) objects.
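The `lengthEncoder`/`lengthDecoder` contract documented above is small but easy to get wrong, so here is a minimal sketch of a custom fixed-length codec pair that satisfies it. This example is not part of the commit; `int16BEEncode`/`int16BEDecode` are hypothetical names used only for illustration.

```js
const { Buffer } = require('buffer')

// Encoder: write `value` into `target` at `offset` (or allocate), set `.bytes`, return `target`
const int16BEEncode = (value, target, offset) => {
  target = target || Buffer.allocUnsafe(2)
  target.writeUInt16BE(value, offset || 0)
  return target
}
int16BEEncode.bytes = 2 // fixed length

// Decoder: read the length from `data` (Buffer or BufferList), set `.bytes`,
// and throw a RangeError when more bytes are needed
const int16BEDecode = data => {
  if (data.length < 2) throw RangeError('Could not decode int16BE')
  return data.readUInt16BE(0)
}
int16BEDecode.bytes = 2 // fixed length

// Usage: lp.encode({ lengthEncoder: int16BEEncode }), lp.decode({ lengthDecoder: int16BEDecode })
```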

src/decode.js

Lines changed: 20 additions & 27 deletions

@@ -1,49 +1,39 @@
 'use strict'
 
 const BufferList = require('bl/BufferList')
-const Varint = require('varint')
+const varintDecode = require('./varint-decode')
 
-const MSB = 0x80
-const isEndByte = byte => !(byte & MSB)
+// Maximum length of the length section of the message
+const MAX_LENGTH_LENGTH = 8 // Varint.encode(Number.MAX_SAFE_INTEGER).length
+// Maximum length of the data section of the message
 const MAX_DATA_LENGTH = 1024 * 1024 * 4
 
-const toBufferProxy = bl => new Proxy({}, {
-  get: (_, prop) => prop[0] === 'l' ? bl[prop] : bl.get(parseInt(prop))
-})
-
 const Empty = Buffer.alloc(0)
-
 const ReadModes = { LENGTH: 'readLength', DATA: 'readData' }
 
 const ReadHandlers = {
   [ReadModes.LENGTH]: (chunk, buffer, state, options) => {
     // console.log(ReadModes.LENGTH, chunk.length)
-    let endByteIndex = -1
-
-    // BufferList bytes must be accessed via .get
-    const getByte = chunk.get ? i => chunk.get(i) : i => chunk[i]
+    buffer = buffer.append(chunk)
 
-    for (let i = 0; i < chunk.length; i++) {
-      if (isEndByte(getByte(i))) {
-        endByteIndex = i
-        break
+    let dataLength
+    try {
+      dataLength = options.lengthDecoder(buffer)
+    } catch (err) {
+      if (buffer.length > options.maxLengthLength) {
+        throw Object.assign(err, { message: 'message length too long', code: 'ERR_MSG_LENGTH_TOO_LONG' })
       }
+      if (err instanceof RangeError) {
+        return { mode: ReadModes.LENGTH, buffer }
+      }
+      throw err
     }
 
-    if (endByteIndex === -1) {
-      return { mode: ReadModes.LENGTH, buffer: buffer.append(chunk) }
-    }
-
-    endByteIndex = buffer.length + endByteIndex
-    buffer = buffer.append(chunk)
-
-    const dataLength = Varint.decode(toBufferProxy(buffer.shallowSlice(0, endByteIndex + 1)))
-
     if (dataLength > options.maxDataLength) {
-      throw Object.assign(new Error('message too long'), { code: 'ERR_MSG_TOO_LONG' })
+      throw Object.assign(new Error('message data too long'), { code: 'ERR_MSG_DATA_TOO_LONG' })
     }
 
-    chunk = buffer.shallowSlice(endByteIndex + 1)
+    chunk = buffer.shallowSlice(options.lengthDecoder.bytes)
     buffer = new BufferList()
 
     if (options.onLength) options.onLength(dataLength)
@@ -77,6 +67,8 @@ const ReadHandlers = {
 
 function decode (options) {
   options = options || {}
+  options.lengthDecoder = options.lengthDecoder || varintDecode
+  options.maxLengthLength = options.maxLengthLength || MAX_LENGTH_LENGTH
   options.maxDataLength = options.maxDataLength || MAX_DATA_LENGTH
 
   return source => (async function * () {
@@ -127,4 +119,5 @@ decode.fromReader = (reader, options) => {
 }
 
 module.exports = decode
+module.exports.MAX_LENGTH_LENGTH = MAX_LENGTH_LENGTH
 module.exports.MAX_DATA_LENGTH = MAX_DATA_LENGTH

src/encode.js

Lines changed: 15 additions & 9 deletions

@@ -1,27 +1,29 @@
 'use strict'
 
-const Varint = require('varint')
 const { Buffer } = require('buffer')
 const BufferList = require('bl/BufferList')
+const varintEncode = require('./varint-encode')
 
-const MIN_POOL_SIZE = 147 // Varint.encode(Number.MAX_VALUE).length
+const MIN_POOL_SIZE = 8 // Varint.encode(Number.MAX_SAFE_INTEGER).length
 const DEFAULT_POOL_SIZE = 10 * 1024
 
 function encode (options) {
   options = options || {}
-  options.poolSize = Math.max(options.poolSize || DEFAULT_POOL_SIZE, MIN_POOL_SIZE)
+
+  const poolSize = Math.max(options.poolSize || DEFAULT_POOL_SIZE, options.minPoolSize || MIN_POOL_SIZE)
+  const encodeLength = options.lengthEncoder || varintEncode
 
   return source => (async function * () {
-    let pool = Buffer.alloc(options.poolSize)
+    let pool = Buffer.alloc(poolSize)
     let poolOffset = 0
 
    for await (const chunk of source) {
-      Varint.encode(chunk.length, pool, poolOffset)
-      poolOffset += Varint.encode.bytes
-      const encodedLength = pool.slice(poolOffset - Varint.encode.bytes, poolOffset)
+      encodeLength(chunk.length, pool, poolOffset)
+      const encodedLength = pool.slice(poolOffset, poolOffset + encodeLength.bytes)
+      poolOffset += encodeLength.bytes
 
       if (pool.length - poolOffset < MIN_POOL_SIZE) {
-        pool = Buffer.alloc(options.poolSize)
+        pool = Buffer.alloc(poolSize)
        poolOffset = 0
      }
 
@@ -31,7 +33,11 @@ function encode (options) {
   })()
 }
 
-encode.single = c => new BufferList([Buffer.from(Varint.encode(c.length)), c])
+encode.single = (chunk, options) => {
+  options = options || {}
+  const encodeLength = options.lengthEncoder || varintEncode
+  return new BufferList([encodeLength(chunk.length), chunk])
+}
 
 module.exports = encode
 module.exports.MIN_POOL_SIZE = MIN_POOL_SIZE
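Since `encode.single` now accepts an options object, a fixed-length prefix can be produced for a one-off chunk. A minimal sketch (not part of the diff), using the `int32BEEncode` export added in this commit:

```js
const lp = require('it-length-prefixed')
const { int32BEEncode } = require('it-length-prefixed')

const bl = lp.encode.single(Buffer.from('hello world'), { lengthEncoder: int32BEEncode })
console.log(bl.slice()) // 4-byte big-endian length (11) followed by the 11 payload bytes
```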

src/index.js

Lines changed: 6 additions & 0 deletions

@@ -2,3 +2,9 @@
 
 exports.encode = require('./encode')
 exports.decode = require('./decode')
+
+exports.varintEncode = require('./varint-encode')
+exports.varintDecode = require('./varint-decode')
+
+exports.int32BEEncode = require('./int32BE-encode')
+exports.int32BEDecode = require('./int32BE-decode')

src/int32BE-decode.js

Lines changed: 10 additions & 0 deletions

@@ -0,0 +1,10 @@
+'use strict'
+
+const int32BEDecode = data => {
+  if (data.length < 4) throw RangeError('Could not decode int32BE')
+  return data.readInt32BE(0)
+}
+
+int32BEDecode.bytes = 4 // Always because fixed length
+
+module.exports = int32BEDecode

src/int32BE-encode.js

Lines changed: 13 additions & 0 deletions

@@ -0,0 +1,13 @@
+'use strict'
+
+const { Buffer } = require('buffer')
+
+const int32BEEncode = (value, target, offset) => {
+  target = target || Buffer.allocUnsafe(4)
+  target.writeInt32BE(value, offset)
+  return target
+}
+
+int32BEEncode.bytes = 4 // Always because fixed length
+
+module.exports = int32BEEncode
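A quick round trip (not part of the diff) showing the contract these two fixed-length modules implement, using the named exports added in `src/index.js`:

```js
const { int32BEEncode, int32BEDecode } = require('it-length-prefixed')

const prefix = int32BEEncode(1024)     // <Buffer 00 00 04 00>
console.log(int32BEDecode(prefix))     // 1024
console.log(int32BEDecode.bytes)       // 4 — always, because fixed length

try {
  int32BEDecode(Buffer.alloc(2))       // fewer than 4 bytes available
} catch (err) {
  console.log(err instanceof RangeError) // true — tells decode() to wait for more data
}
```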

src/varint-decode.js

Lines changed: 15 additions & 0 deletions

@@ -0,0 +1,15 @@
+'use strict'
+
+const Varint = require('varint')
+
+const toBufferProxy = bl => new Proxy({}, {
+  get: (_, prop) => prop[0] === 'l' ? bl[prop] : bl.get(parseInt(prop))
+})
+
+const varintDecode = data => {
+  const len = Varint.decode(Buffer.isBuffer(data) ? data : toBufferProxy(data))
+  varintDecode.bytes = Varint.decode.bytes
+  return len
+}
+
+module.exports = varintDecode

src/varint-encode.js

Lines changed: 14 additions & 0 deletions

@@ -0,0 +1,14 @@
+'use strict'
+
+const Varint = require('varint')
+const { Buffer } = require('buffer')
+
+// Encode the passed length `value` to the `target` buffer at the given `offset`
+const varintEncode = (value, target, offset) => {
+  const ret = Varint.encode(value, target, offset)
+  varintEncode.bytes = Varint.encode.bytes
+  // If no target, create Buffer from returned array
+  return target || Buffer.from(ret)
+}
+
+module.exports = varintEncode
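For comparison, the default varint pair follows the same contract but writes a variable number of bytes, reported via `.bytes` after each call. A minimal sketch (not part of the diff):

```js
const { varintEncode, varintDecode } = require('it-length-prefixed')

const prefix = varintEncode(300)   // <Buffer ac 02> — two bytes for this value
console.log(varintEncode.bytes)    // 2
console.log(varintDecode(prefix))  // 300
console.log(varintDecode.bytes)    // 2
```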

test/_helpers.js

Lines changed: 9 additions & 0 deletions

@@ -0,0 +1,9 @@
+'use strict'
+
+const { map } = require('streaming-iterables')
+const randomInt = require('random-int')
+const randomBytes = require('random-bytes')
+
+module.exports.toBuffer = map(c => c.slice())
+module.exports.times = (n, fn) => Array.from(Array(n), fn)
+module.exports.someBytes = n => randomBytes(randomInt(1, n || 32))

test/decode.from-reader.spec.js

Lines changed: 3 additions & 6 deletions

@@ -4,15 +4,12 @@
 const pipe = require('it-pipe')
 const Reader = require('it-reader')
 const { expect } = require('chai')
-const randomInt = require('random-int')
 const randomBytes = require('random-bytes')
-const { map, collect } = require('streaming-iterables')
+const { collect } = require('streaming-iterables')
 const Varint = require('varint')
+const { toBuffer, times, someBytes } = require('./_helpers')
 
 const lp = require('../')
-const toBuffer = map(c => c.slice())
-const times = (n, fn) => Array.from(Array(n), fn)
-const someBytes = n => randomBytes(randomInt(1, n || 32))
 
 describe('decode from reader', () => {
   it('should be able to decode from an it-reader', async () => {
@@ -47,7 +44,7 @@ describe('decode from reader', () => {
         collect
       )
     } catch (err) {
-      expect(err.code).to.equal('ERR_MSG_TOO_LONG')
+      expect(err.code).to.equal('ERR_MSG_DATA_TOO_LONG')
      return
    }
 
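For context, a minimal sketch (not part of the diff) of the `decode.fromReader` flow this spec exercises, assuming it returns an async iterable of `BufferList` messages when given an [it-reader](https://github.com/alanshaw/it-reader):

```js
const pipe = require('it-pipe')
const Reader = require('it-reader')
const lp = require('it-length-prefixed')

;(async () => {
  const stream = pipe([Buffer.from('hello'), Buffer.from(' world')], lp.encode())
  const reader = Reader(stream)

  for await (const msg of lp.decode.fromReader(reader, { maxDataLength: 1024 })) {
    console.log(msg.toString()) // 'hello' then ' world'
  }
})()
```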

test/decode.spec.js

Lines changed: 51 additions & 5 deletions

@@ -5,14 +5,15 @@ const pipe = require('it-pipe')
 const { expect } = require('chai')
 const randomInt = require('random-int')
 const randomBytes = require('random-bytes')
-const { map, collect } = require('streaming-iterables')
+const { collect } = require('streaming-iterables')
 const Varint = require('varint')
 const BufferList = require('bl/BufferList')
 const defer = require('p-defer')
+const { toBuffer, times } = require('./_helpers')
 
 const lp = require('../')
-const { MAX_DATA_LENGTH } = lp.decode
-const toBuffer = map(c => c.slice())
+const { MAX_LENGTH_LENGTH, MAX_DATA_LENGTH } = lp.decode
+const { int32BEDecode } = lp
 
 describe('decode', () => {
   it('should decode single message', async () => {
@@ -83,7 +84,24 @@ describe('decode', () => {
     expect(output.slice(-byteLength)).to.deep.equal(bytes)
   })
 
-  it('should not decode a message that is too long', async () => {
+  it('should not decode message length that is too long', async () => {
+    // A value < 0x80 signifies end of varint so pass buffers of >= 0x80
+    // so that it will keep throwing a RangeError until we reach the max length
+    const lengths = times(5, () => Buffer.alloc(MAX_LENGTH_LENGTH / 4).fill(0x80))
+    const bytes = await randomBytes(randomInt(2, 64))
+
+    const input = [...lengths, bytes]
+
+    try {
+      await pipe(input, lp.decode(), toBuffer, collect)
+    } catch (err) {
+      expect(err.code).to.equal('ERR_MSG_LENGTH_TOO_LONG')
+      return
+    }
+    throw new Error('did not throw for too long message')
+  })
+
+  it('should not decode message data that is too long', async () => {
     const byteLength = MAX_DATA_LENGTH + 1
     const bytes = await randomBytes(byteLength)
 
@@ -95,7 +113,7 @@ describe('decode', () => {
     try {
       await pipe(input, lp.decode(), toBuffer, collect)
     } catch (err) {
-      expect(err.code).to.equal('ERR_MSG_TOO_LONG')
+      expect(err.code).to.equal('ERR_MSG_DATA_TOO_LONG')
       return
     }
     throw new Error('did not throw for too long message')
@@ -179,4 +197,32 @@ describe('decode', () => {
 
     await Promise.all([lengthDeferred.promise, dataDeferred.promise])
   })
+
+  it('should decode with custom length decoder (int32BE)', async () => {
+    const byteLength0 = randomInt(2, 64)
+    const encodedByteLength0 = Buffer.allocUnsafe(4)
+    encodedByteLength0.writeInt32BE(byteLength0)
+    const bytes0 = await randomBytes(byteLength0)
+
+    const byteLength1 = randomInt(1, 64)
+    const encodedByteLength1 = Buffer.allocUnsafe(4)
+    encodedByteLength1.writeInt32BE(byteLength1)
+    const bytes1 = await randomBytes(byteLength1)
+
+    const input = [
+      Buffer.concat([
+        encodedByteLength0,
+        bytes0.slice(0, 1)
+      ]),
+      Buffer.concat([
+        bytes0.slice(1),
+        encodedByteLength1,
+        bytes1
+      ])
+    ]
+
+    const output = await pipe(input, lp.decode({ lengthDecoder: int32BEDecode }), toBuffer, collect)
+    expect(output[0].slice(-byteLength0)).to.deep.equal(bytes0)
+    expect(output[1].slice(-byteLength1)).to.deep.equal(bytes1)
+  })
 })
