Buffers

Billy Bunn edited this page Mar 22, 2019 · 4 revisions

Largely based on a reading of "Do you want a better understanding of Buffer in Node.js? Check this out." by Justice Mba.

Summary of Buffers in Node.js

Topics like buffers can be intimidating if you don't come from a computer science background. Even the official Node.js docs can be confusing if you aren't familiar with the terminology used.

What's a Binary?

The article referenced above breaks down this definition from the docs:

…[A buffer is a] mechanism for reading or manipulating streams of binary data. The Buffer class was introduced as part of the Node.js API to make it possible to interact with octet streams in the context of things like TCP streams and file system operations.

Down to this more succinct statement:

The Buffer class was introduced as part of the Node.js API to make it possible to manipulate or interact with streams of binary data.

There's still a term in that statement you may not be familiar with: binary data.

In short, binary data is a collection of 1's and 0's. Here's a list of 5 different binaries (just 1's and 0's): 10, 01, 001, 1110, 00101011.

Each digit in a binary (each 1 or 0) is called a bit, which is short for Binary digIT.

To store data, computers have to convert the data into a binary representation, meaning it all gets turned into 1's and 0's. But there are a few steps to convert a character (like the single letter "L") into a binary representation that can be saved.

  1. Each character has a character code. This is a whole number (not just 1's and 0's).
    • Check this out by entering "L".charCodeAt(0) to get the number 76. 76 is "L"'s character code or code point.
    • So we got a character to be a number. But you'll notice that 76 isn't binary yet.
  2. Next, we need to google some stuff about Character Encoding. Specifically, UTF-8.
    • Character encoding deals with how many bits are used to represent the character code number.
    • UTF-8 is a type of character encoding that can encode any Unicode character using one to four 8-bit bytes.
    • UTF-8 encodes characters in whole bytes. A byte is a set of eight bits (eight 1s and 0s). Characters in the basic ASCII range, like "L", fit in a single byte, so eight 1s and 0s represent their code point. Other characters need two to four bytes.
    • So if you convert "L" to a character code, you get 76. Convert 76 to binary and you'll get 1001100. But note that this is only 7 bits, and a UTF-8 byte needs 8. So add a 0 to the front and you'll get 01001100 as the binary representation of the letter "L".
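The steps above can be tried out in Node.js. This is a minimal sketch: `toBinaryByte` is a made-up helper name (it is not part of Node.js), and it only makes sense for ASCII characters that fit in a single byte. The last few lines use `Buffer.byteLength` to show that characters outside the ASCII range take more than one byte in UTF-8.

```javascript
// Hypothetical helper (not part of Node.js): character -> 8-bit binary string.
// Only meaningful for ASCII characters (code points 0-127), which UTF-8
// stores in a single byte.
function toBinaryByte(char) {
  const codePoint = char.codePointAt(0);           // "L" -> 76
  return codePoint.toString(2).padStart(8, "0");   // 76 -> "1001100" -> "01001100"
}

console.log("L".charCodeAt(0)); // 76
console.log(toBinaryByte("L")); // 01001100

// Characters outside ASCII need more than one byte in UTF-8.
console.log(Buffer.byteLength("L", "utf8"));      // 1
console.log(Buffer.byteLength("\u00e9", "utf8")); // 2 (é)
console.log(Buffer.byteLength("\u20ac", "utf8")); // 3 (€)
```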

What's "streams of binary data" mean?

A stream in Node.js is simply a sequence of data being moved from one point to another over time. The whole concept is that you have a huge amount of data to process, but you don't need to wait for all of it to be available before you start processing it.

So a stream of data means data moving from one point to another.
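Here's a rough sketch of that idea, assuming a Node.js version that has `Readable.from`. The three hard-coded chunks are made up for illustration and stand in for data that would normally arrive over time from a file or a network connection. Each chunk is handled as soon as it shows up, without waiting for the rest.

```javascript
const { Readable } = require("stream");

// Made-up data source: three chunks of binary data that arrive one at a time.
const source = Readable.from([
  Buffer.from("Hel"),
  Buffer.from("lo "),
  Buffer.from("Bob"),
]);

const received = [];
let totalBytes = 0;

// Runs once per chunk, as soon as that chunk is available.
source.on("data", (chunk) => {
  totalBytes += chunk.length;
  received.push(chunk);
  console.log(`got ${chunk.length} bytes:`, chunk);
});

// Runs once, after the last chunk has been delivered.
source.on("end", () => {
  console.log("all done:", Buffer.concat(received).toString("utf8"));
});
```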

At the end of the day, buffers look kinda like arrays of binary data, where each item is one byte. If you convert text to a buffer and log it, you'll see each byte printed as a two-digit hexadecimal number (for example, 4c for "L").
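To see this for yourself, here's a small sketch using `Buffer.from`. The variable names are just for illustration.

```javascript
// Convert text to a buffer (UTF-8 is the default encoding).
const letter = Buffer.from("L", "utf8");

console.log(letter);                 // <Buffer 4c>
console.log(letter[0]);              // 76 (the same byte as a decimal number)
console.log(letter.toString("hex")); // 4c

// A longer string is just more bytes, indexed like an array.
const hello = Buffer.from("Hello");

console.log(hello);                  // <Buffer 48 65 6c 6c 6f>
console.log(hello.length);           // 5
console.log(Array.from(hello));      // [ 72, 101, 108, 108, 111 ]
console.log(hello.toString("utf8")); // Hello
```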

Resources

Read

  • node buffer basics

Skim

  • node buffer api docs

Watch

  • endian and little endian

Bookmark

  • ascii chart

  • bmp spec (full)
  • bmp spec (simplified)
  • bmp spec (pdf)