diff --git a/tutorial.html b/tutorial.html
new file mode 100644
index 0000000..9166c41
--- /dev/null
+++ b/tutorial.html
@@ -0,0 +1,399 @@
+---
+layout: default
+title: Tutorial
+---
+ + +
+

Tutorial

+

+ We're going to look at some CBOR-encoded information a byte at a time. +

+
+
+
+ + + + + + + +
Consider the following CBOR data stream, where each pair of hex digits represents one byte:

a3687374616e64617264f56352464319
1b89697075626c6973686564c11a5268
4517
+ +

+ +

That's a little scary, but it just corresponds to the following JavaScript object:

+ +
{
+  "standard": true,
+  "RFC": 7049,
+  "published": new Date(1382565143000)
+}
+ +

There is nothing special about JavaScript with respect to CBOR. CBOR should be usable from almost any programming language. JavaScript is merely a convenient way to describe the encoded objects in a more human-readable form.
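To follow along in code, the hex dump above first has to become an array of bytes. The snippet below is a minimal sketch of that step; the helper name `hexToBytes` is made up for this illustration and is not part of any CBOR library.

```javascript
// Hex dump of the example stream from this tutorial.
const hex =
  "a3687374616e64617264f56352464319" +
  "1b89697075626c6973686564c11a5268" +
  "4517";

// Hypothetical helper: convert a string of hex digit pairs into a
// Uint8Array, one byte per pair of digits.
function hexToBytes(hexString) {
  const bytes = new Uint8Array(hexString.length / 2);
  for (let i = 0; i < bytes.length; i++) {
    bytes[i] = parseInt(hexString.slice(i * 2, i * 2 + 2), 16);
  }
  return bytes;
}

const stream = hexToBytes(hex);
console.log(stream.length);          // 34
console.log(stream[0].toString(16)); // a3
```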

+ +

We'll walk you through how decoding works. First, read a single byte from the input, and look at the three most significant bits. They tell you the "Major Type" of the data item we're reading. These three bits can signal one of eight Major Types:

+ + + + + + + + + + + + + + + + + + + + + + +
Top 3 Bits | Major Type | Meaning | Examples
000 | 0 | Positive integer | 123
001 | 1 | Negative integer | -124
010 | 2 | Block of bytes |
011 | 3 | String | "Hello!"
100 | 4 | Array | [1,2]
101 | 5 | Map | {"foo": 6}
110 | 6 | Tag | new Date("2013-10-23T21:52:23Z")
111 | 7 | Constant or floating point | null, 1.234
+ +

+

The lower 5 bits are the "Additional Information". The Additional Information either encodes a small integer value directly (if it is less than 24), or tells us to read more bytes (if it is 24 or higher). Let's look at the first byte in the stream we're decoding:

+ +

Start

+ + + + + + +
a3687374616e64617264f56352464319.hstandard.cRFC.
1b89697075626c6973686564c11a5268..ipublished..Rh
4517E.
+ +
+
Byte (hex)
+
a3
+
Byte (binary)
+
1010 0011
+
Major Type
+
101 = 5 = Map
+
Additional Information
+
0 0011 = 3
+
+ +

The first byte says that what follows is a Map of name/value pairs. For the Map major type, the additional information tells us how many pairs of items there will be in the map. In this case, there will be 3 pairs, so we're going to have to read 6 more items: a name, a value, a name, a value, a name, and a value.
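The split of the first byte into Major Type and Additional Information comes down to one shift and one mask. Here is a small sketch; `splitInitialByte` and `MAJOR_TYPES` are names chosen for this example only, and the type names are the ones from the table earlier in this tutorial.

```javascript
// Major Type names, indexed by the value of the top three bits.
const MAJOR_TYPES = [
  "Positive integer",
  "Negative integer",
  "Block of bytes",
  "String",
  "Array",
  "Map",
  "Tag",
  "Constant or floating point",
];

// Hypothetical helper: split the first byte of a data item.
function splitInitialByte(byte) {
  return {
    majorType: byte >> 5,          // top 3 bits
    additionalInfo: byte & 0x1f,   // lower 5 bits
  };
}

const first = splitInitialByte(0xa3);
console.log(first.majorType, MAJOR_TYPES[first.majorType]); // 5 Map
console.log(first.additionalInfo);                          // 3
```

For a Map, the additional information counts pairs, so a decoder would go on to read `2 * additionalInfo` items, six in this case.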

+ +

Map: 6 items to go

+

Let's examine the next byte:

+ + + + + + + +
a3687374616e64617264f56352464319.hstandard.cRFC.
1b89697075626c6973686564c11a5268..ipublished..Rh
4517E.
+ +
+
Byte (hex)
+
68
+
Byte (binary)
+
0110 1000
+
Major Type
+
011 = 3 = UTF-8 String
+
Additional Information
+
0 1000 = 8
+
+ + + + + + + +
a3687374616e64617264f56352464319.hstandard.cRFC.
1b89697075626c6973686564c11a5268..ipublished..Rh
4517E.
+ +

We see that this is an eight-byte string; the next eight bytes in the input are that string:

+ +

Since the string is length-counted, we didn't have to perform any further escape decoding. The string standard is the first name in the map we're reading.
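Because the length comes first, reading a string is just a matter of slicing out the right number of bytes and decoding them as UTF-8. The sketch below assumes a short string whose length is below 24, so the length sits directly in the additional information; `readTextString` is a hypothetical helper name.

```javascript
// The initial byte 0x68 followed by the eight bytes of the string.
const nameBytes = Uint8Array.from([
  0x68, 0x73, 0x74, 0x61, 0x6e, 0x64, 0x61, 0x72, 0x64,
]);

// Hypothetical helper: read a short (length < 24) UTF-8 string item
// starting at `offset`. Returns the text and the offset of the next item.
function readTextString(buf, offset) {
  const majorType = buf[offset] >> 5;
  const length = buf[offset] & 0x1f;
  if (majorType !== 3 || length >= 24) {
    throw new Error("this sketch only handles short UTF-8 strings");
  }
  const start = offset + 1;
  const text = new TextDecoder("utf-8").decode(buf.subarray(start, start + length));
  return { text, next: start + length };
}

const result = readTextString(nameBytes, 0);
console.log(result.text); // standard
console.log(result.next); // 9
```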

+ +

Map: 5 items to go

+ + + + + + + +
a3687374616e64617264f56352464319.hstandard.cRFC.
1b89697075626c6973686564c11a5268..ipublished..Rh
4517E.
+ +

Let's keep going by reading another item, starting with its first byte:

+ +
+
Byte (hex)
+
f5
+
Byte (binary)
+
1111 0101
+
Major Type
+
111 = 7 = Constant or floating point
+
Additional Information
+
1 0101 = 21 (true)
+
+ +

If a Major Type 7 item has additional information less than 24, it's a constant (additional information 24 means the constant's number is in the next byte). Here are the currently-defined constants (or, as CBOR calls them, "simple values"):

+ + + + + + + + + + + + + + + + + +
CBOR encoded (hex) | CBOR encoded (binary) | Additional Information | Meaning
f4 | 1111 0100 | 20 | False
f5 | 1111 0101 | 21 | True
f6 | 1111 0110 | 22 | Null
f7 | 1111 0111 | 23 | Undefined
+ +

Other values might be defined in the future. If you receive one you don't understand, feel free to throw an error, ignore it, turn it into an integer, or apply whatever rule works best for your use case.
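The table of constants translates naturally into a lookup. The following sketch picks one of the policies mentioned above, throwing an error for anything it does not recognize; that choice, and the name `decodeSimple`, are assumptions made for this example.

```javascript
// The four constants from the table, keyed by additional information.
const SIMPLE_VALUES = new Map([
  [20, false],
  [21, true],
  [22, null],
  [23, undefined],
]);

// Hypothetical helper: decode a one-byte Major Type 7 constant.
// Unknown values raise an error; ignoring them would be equally valid.
function decodeSimple(byte) {
  const majorType = byte >> 5;
  const info = byte & 0x1f;
  if (majorType !== 7) {
    throw new Error("not a Major Type 7 item");
  }
  if (!SIMPLE_VALUES.has(info)) {
    throw new Error(`additional information ${info} is not handled in this sketch`);
  }
  return SIMPLE_VALUES.get(info);
}

console.log(decodeSimple(0xf5)); // true
console.log(decodeSimple(0xf6)); // null
```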

+ +

Here, the value in the first name/value pair of the map we're reading is true.

+ +

Map: 4 items to go

+ + + + + + + +
a3687374616e64617264f56352464319.hstandard.cRFC.
1b89697075626c6973686564c11a5268..ipublished..Rh
4517E.
+ +

On to the next item! This should be getting a little more familiar now.

+ +
+
Byte (hex)
+
63
+
Byte (binary)
+
0110 0011
+
Major Type
+
011 = 3 = UTF-8 String
+
Additional Information
+
0 0011 = 3
+
+ + + + + + + +
a3687374616e64617264f56352464319.hstandard.cRFC.
1b89697075626c6973686564c11a5268..ipublished..Rh
4517E.
+ +

We're going to need to read three more bytes, and treat them as a UTF-8 string. The string RFC is the name for the next name/value pair in the map.

+ +

Map: 3 items to go

+ + + + + + + +
a3687374616e64617264f56352464319.hstandard.cRFC.
1b89697075626c6973686564c11a5268..ipublished..Rh
4517E.
+ +

What's the value associated with the RFC name in the map? It's a positive integer, with additional information of 25:

+ +
+
Byte (hex)
+
19
+
Byte (binary)
+
0001 1001
+
Major Type
+
000 = 0 = Positive integer
+
Additional Information
+
1 1001 = 25
+
+ +

This is the first time we've seen additional information greater than 23.

+ + + + + + + + + + + + + + + +
Additional Information | Meaning
0..23 | The corresponding number, 0-23
24 | Read one more byte, use the value of that byte
25 | Read two more bytes, treat them as a network-order 16-bit integer
26 | Read four more bytes, treat them as a network-order 32-bit integer
27 | Read eight more bytes, treat them as a network-order 64-bit integer
28 | RESERVED: throw an error
29 | RESERVED: throw an error
30 | RESERVED: throw an error
31 | Indefinite length
+ + + + + + + +
a3687374616e64617264f56352464319.hstandard.cRFC.
1b89697075626c6973686564c11a5268..ipublished..Rh
4517E.
+ +

Since the additional information is 25, we read two more bytes: 1b 89. When we interpret them in network byte order as an integer, they decode to 7049.

+ +
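The additional-information table can be sketched as a single function that returns the resolved number together with the position of whatever follows it. `readArgument` is a hypothetical name. This sketch rejects the reserved values and does not handle indefinite length (31). 64-bit values come back as a `BigInt`, since they may not fit in a regular JavaScript number.

```javascript
// Hypothetical helper: resolve the additional information of the item at
// `offset` into a number, reading extra bytes in network (big-endian) order.
function readArgument(buf, offset) {
  const info = buf[offset] & 0x1f;
  // DataView getters read big-endian by default, which is network order.
  const view = new DataView(buf.buffer, buf.byteOffset, buf.byteLength);
  if (info < 24) {
    return { value: info, next: offset + 1 };
  }
  switch (info) {
    case 24:
      return { value: view.getUint8(offset + 1), next: offset + 2 };
    case 25:
      return { value: view.getUint16(offset + 1), next: offset + 3 };
    case 26:
      return { value: view.getUint32(offset + 1), next: offset + 5 };
    case 27:
      return { value: view.getBigUint64(offset + 1), next: offset + 9 };
    default:
      // 28-30 are reserved; 31 (indefinite length) is left out of this sketch.
      throw new Error(`additional information ${info} is not handled`);
  }
}

const rfcNumber = readArgument(Uint8Array.from([0x19, 0x1b, 0x89]), 0);
console.log(rfcNumber.value); // 7049
console.log(rfcNumber.next);  // 3
```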

Map: 2 items to go

+ + + + + + + +
a3687374616e64617264f56352464319.hstandard.cRFC.
1b89697075626c6973686564c11a5268..ipublished..Rh
4517E.
+ +

The last name in the map is a 9-byte string.

+ +
+
Byte (hex)
+
69
+
Byte (binary)
+
0110 1001
+
Major Type
+
011 = 3 = UTF-8 String
+
Additional Information
+
0 1001 = 9
+
+ + + + + + + +
a3687374616e64617264f56352464319.hstandard.cRFC.
1b89697075626c6973686564c11a5268..ipublished..Rh
4517E.
+ +

Here we see the name is published:

+ +

Map: 1 item to go

+ + + + + + + +
a3687374616e64617264f56352464319.hstandard.cRFC.
1b89697075626c6973686564c11a5268..ipublished..Rh
4517E.
+ +

The last value is identified by a "tag". Tags give semantic meaning to the following item. Even if your code doesn't support a particular tag number, it can still safely parse the entire input byte stream if it chooses.

+ +
+
Byte (hex)
+
c1
+
Byte (binary)
+
1100 0001
+
Major Type
+
110 = 6 = Tag
+
Additional Information
+
0 0001 = 1 = Date
+
+ +

Tag 1 corresponds to a Date. The item after the tag is an integer or floating point number of seconds since the epoch. Some other tag values that have been defined include:

+ + + + + + + + + + + + + + + + +
Tag | Item Types | Meaning
0 | UTF-8 string | Date/Time as string
1 | float, integer | Date/Time from epoch
32 | UTF-8 string | URI
35 | UTF-8 string | Regular expression
+ +

Tag: 1 item to go

+ + + + + + + +
a3687374616e64617264f56352464319.hstandard.cRFC.
1b89697075626c6973686564c11a5268..ipublished..Rh
4517E.
+ +

Let's read the tagged item, which starts with the byte 1a:

+ +
+
Byte (hex)
+
1a
+
Byte (binary)
+
0001 1010
+
Major Type
+
000 = 0 = Positive integer
+
Additional Information
+
1 1010 = 0x1a = 26 = Read 4 more bytes
+
+ + + + + + + +
a3687374616e64617264f56352464319.hstandard.cRFC.
1b89697075626c6973686564c11a5268..ipublished..Rh
4517E.
+ +

The tagged item is the four-byte integer 0x52684517, which with the Date/Time tag indicates the point in time 1382565143 seconds since the epoch, or Wed, 23 Oct 2013 21:52:23 GMT, the time when RFC 7049 was announced.
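In JavaScript, turning the tagged number into a date takes one extra step: `Date` counts milliseconds, while tag 1 counts seconds. A minimal sketch, with `tagOneToDate` as a made-up helper name:

```javascript
// The four bytes that followed the 0x1a byte, written as one integer.
const epochSeconds = 0x52684517;

// Hypothetical helper: apply the meaning of tag 1 to a number of seconds.
// JavaScript dates are based on milliseconds, hence the factor of 1000.
function tagOneToDate(seconds) {
  return new Date(seconds * 1000);
}

const published = tagOneToDate(epochSeconds);
console.log(epochSeconds);            // 1382565143
console.log(published.toISOString()); // 2013-10-23T21:52:23.000Z
console.log(published.toUTCString()); // Wed, 23 Oct 2013 21:52:23 GMT
```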

+

+ +

Assembling the map

+ +

We have now successfully read 3 pairs of items that followed the original map code. Let's look at it all at once, in the diagnostic text mode defined for CBOR:

+ +
{"standard": true, "RFC": 7049, "published": 1(1382565143)}
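Putting the steps of this walk-through together gives a small recursive decoder. What follows is only a sketch under several assumptions: it handles just the Major Types met in the example (0, 3, 5, 6, and the four constants of type 7), it assumes map names are strings so a plain object can hold them, and it silently drops tags other than 1. The name `decodeItem` is hypothetical.

```javascript
// Minimal decoder sketch for the example stream. Each call decodes one data
// item and returns its value plus the offset of the item that follows.
function decodeItem(buf, offset = 0) {
  const view = new DataView(buf.buffer, buf.byteOffset, buf.byteLength);
  const majorType = buf[offset] >> 5;
  const info = buf[offset] & 0x1f;
  let pos = offset + 1;

  // Major Type 7: in this sketch, only the four one-byte constants.
  if (majorType === 7) {
    const simple = { 20: false, 21: true, 22: null, 23: undefined };
    if (!(info in simple)) throw new Error(`unsupported simple value ${info}`);
    return { value: simple[info], next: pos };
  }

  // Other types: resolve the additional information into a number.
  let arg;
  if (info < 24) {
    arg = info;
  } else if (info === 24) {
    arg = view.getUint8(pos); pos += 1;
  } else if (info === 25) {
    arg = view.getUint16(pos); pos += 2;
  } else if (info === 26) {
    arg = view.getUint32(pos); pos += 4;
  } else {
    throw new Error(`unsupported additional information ${info}`);
  }

  switch (majorType) {
    case 0: // positive integer: the number is the value
      return { value: arg, next: pos };
    case 3: { // UTF-8 string: the number is the byte length
      const text = new TextDecoder("utf-8").decode(buf.subarray(pos, pos + arg));
      return { value: text, next: pos + arg };
    }
    case 5: { // map: the number is the count of name/value pairs
      const map = {};
      for (let i = 0; i < arg; i++) {
        const name = decodeItem(buf, pos);
        const value = decodeItem(buf, name.next);
        map[name.value] = value.value;
        pos = value.next;
      }
      return { value: map, next: pos };
    }
    case 6: { // tag: the number is the tag, one item follows
      const tagged = decodeItem(buf, pos);
      const value = arg === 1 ? new Date(tagged.value * 1000) : tagged.value;
      return { value, next: tagged.next };
    }
    default:
      throw new Error(`unsupported major type ${majorType}`);
  }
}

const hexStream =
  "a3687374616e64617264f563524643191b89697075626c6973686564c11a52684517";
const data = Uint8Array.from(hexStream.match(/../g), (pair) => parseInt(pair, 16));
const { value: decoded, next: consumed } = decodeItem(data);

console.log(decoded.standard);                // true
console.log(decoded.RFC);                     // 7049
console.log(decoded.published.toISOString()); // 2013-10-23T21:52:23.000Z
console.log(consumed);                        // 34
```

Returning the next offset from every call keeps the decoder free of shared state, which makes the recursion for maps and tags easy to follow.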
+ +

What else?

+ +

CBOR also supports arrays (Major Type 4, followed by Additional Information number of items), floating point numbers (Major Type 7 followed by 2, 4, or 8 bytes of network-order IEEE 754 additional information), and byte strings (Major Type 2 followed by Additional Information number of bytes), all of which should be straightforward for you to figure out now.
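As a starting point for those three cases, here is a hedged sketch with one tiny helper per type. The helper names are invented for this example, each one handles only the simplest form (counts and lengths below 24, array items that are small positive integers, and 4- or 8-byte floats, not the 2-byte form), and the sample byte sequences were chosen for illustration.

```javascript
// Array (Major Type 4) of small positive integers, e.g. 83 01 02 03.
function decodeSmallIntArray(buf) {
  if ((buf[0] >> 5) !== 4) throw new Error("not an array");
  const count = buf[0] & 0x1f; // assumes fewer than 24 items
  const items = [];
  for (let i = 1; i <= count; i++) {
    items.push(buf[i] & 0x1f); // assumes each item is a Major Type 0 value below 24
  }
  return items;
}

// Floating point (Major Type 7, additional information 26 or 27).
function decodeFloat(buf) {
  if ((buf[0] >> 5) !== 7) throw new Error("not a Major Type 7 item");
  const view = new DataView(buf.buffer, buf.byteOffset, buf.byteLength);
  const info = buf[0] & 0x1f;
  if (info === 26) return view.getFloat32(1); // 4 bytes, network order
  if (info === 27) return view.getFloat64(1); // 8 bytes, network order
  throw new Error("2-byte floats are not handled in this sketch");
}

// Byte string (Major Type 2) with a length below 24, e.g. 44 01 02 03 04.
function decodeShortByteString(buf) {
  if ((buf[0] >> 5) !== 2) throw new Error("not a byte string");
  const length = buf[0] & 0x1f;
  return buf.slice(1, 1 + length);
}

console.log(decodeSmallIntArray(Uint8Array.from([0x83, 0x01, 0x02, 0x03]))); // [ 1, 2, 3 ]
console.log(decodeFloat(Uint8Array.from([0xfa, 0x47, 0xc3, 0x50, 0x00])));   // 100000
console.log(decodeFloat(Uint8Array.from([0xfb, 0x3f, 0xf8, 0, 0, 0, 0, 0, 0]))); // 1.5
console.log(decodeShortByteString(Uint8Array.from([0x44, 1, 2, 3, 4])).length);  // 4
```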