base64 decoder could be 2x faster when decoding wrapped base64

Node's base64 decoder currently uses a fast decoder and a slow decoder.

The fast decoder decodes 32-bit words at a time. If it sees a line-break or whitespace or garbage, then it switches permanently to the slow decoder which decodes a byte at a time, with a conditional branch per byte, instead of per 32-bit word.

I did some rough benchmarking to compare decoding a 4mb random buffer encoded as base64, and decoding the same base64 but with CRLFs added every 76 chars as per MIME base64:

```
Decode Fast: 8ms
Decode Fast: 9ms
Decode Fast: 9ms
Decode Slow (wrapped every 76 chars): 30ms
Decode Slow (wrapped every 76 chars): 20ms
Decode Slow (wrapped every 76 chars): 22ms
```

As far as I can see, there's no reason to switch permanently to the slow decoder. If the fast decoder detects that the 32-bit word contains an invalid character, it could just decode the next few bytes byte-by-byte, and then switch back to fast mode as soon as it has consumed 4 valid base64 characters and outputted 3 bytes. This could all be a sub-branch after the 32-bit word check so it should not affect the performance of the fast decoder in any way.

For base64 decoding MIME data, this should nearly double the throughput since the slow case is triggered only every 76 bytes.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

base64 decoder could be 2x faster when decoding wrapped base64 #12114

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Uh oh!

base64 decoder could be 2x faster when decoding wrapped base64 #12114

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions