Nightcrawler edited this page Jan 9, 2024 · 14 revisions


Bois Binary Format

The Bois binary format is pretty straightforward. BOIS stands for Binary Object Indexed Serializer. Even though the overall structure doesn't follow any specific rule, it can still be categorized as an indexed sequential data format, hence the word "indexed" in the name. Being indexed means that there is an index byte before every object. This index byte contains information about the data that comes after it. It can even contain the data itself. To learn how, continue reading.

The Specs

There are several types of index byte. Which one is used depends on the type of data that is going to be stored.

Index Bytes

IB1 - Nullable: Generally used if the object/number is nullable.

index byte: [0_{null-flag}_0_0_0_0_0_0]
embeddable integer: none

IB2 - Embeddable Nullable: Generally used if the object/number is nullable and is small enough to be embedded.

index byte: [{embedded-flag}_{null-flag}_0_0_0_0_0_0]
followed by optional data: [0_0_0_0_0_0_0_0]
embeddable integer: 0..63

IB3 - Embeddable Nullable Signed Number: Used for signed numbers that are nullable and are small enough to be embedded.

index byte: [{embedded-flag}_{null-flag}_{negative-flag}_0_0_0_0_0]
followed by optional data: [0_0_0_0_0_0_0_0]
embeddable integer: 0..31

IB4 - Embeddable Not-Null Signed Number: Used for signed numbers that cannot be null and are small enough to be embedded.

index byte: [{embedded-flag}_{negative-flag}_0_0_0_0_0_0]
followed by optional data: [0_0_0_0_0_0_0_0]
embeddable integer: 0..63

IB5 - Embeddable Nullable Unsigned Number: Used for unsigned numbers that can be null and are small enough to be embedded.

index byte: [{embedded-flag}_{null-flag}_0_0_0_0_0_0]
followed by optional data: [0_0_0_0_0_0_0_0]
embeddable integer: 0..63

IB6 - Embeddable Not-Null Unsigned Number: Used for unsigned numbers that cannot be null and are small enough to be embedded.

index byte: [{embedded-flag}_0_0_0_0_0_0_0]
followed by optional data: [0_0_0_0_0_0_0_0]
embeddable integer: 0..127
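
The embeddable ranges listed above follow directly from how many bits are left once the flags are accounted for. The sketch below derives them from the bit diagrams. It is written in Python purely for illustration and is not the library's code; the names LAYOUTS, flag_masks and max_embeddable are hypothetical, and it assumes the leftmost position in each diagram is the most significant bit.

```python
# Flag names per category, listed from the most significant bit down,
# as read off the bit diagrams above. None marks a bit that is always 0.
# (Hypothetical table for illustration; not taken from the library source.)
LAYOUTS = {
    "IB1": (None, "null"),
    "IB2": ("embedded", "null"),
    "IB3": ("embedded", "null", "negative"),
    "IB4": ("embedded", "negative"),
    "IB5": ("embedded", "null"),
    "IB6": ("embedded",),
}


def flag_masks(category):
    """Map each flag name to its bit mask within the index byte."""
    return {
        name: 0x80 >> position
        for position, name in enumerate(LAYOUTS[category])
        if name is not None
    }


def max_embeddable(category):
    """Largest integer that fits in the bits left over after the flags."""
    layout = LAYOUTS[category]
    if "embedded" not in layout:
        return None  # IB1 has no embedded flag, so nothing can be embedded
    return (1 << (8 - len(layout))) - 1
```

With three flags, IB3 leaves five bits for the value (0..31), while IB6 has a single flag and leaves seven bits (0..127).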

More on Index Bytes

You may have noticed that some of these index bytes share the same structure. I've done this to simplify the implementation. We still need one more piece of information about these bytes: the amount of data that can be embedded. Before that, let's see how to embed data in the index byte.

Embedding In Index Byte

If the number to be stored is small enough, it can be stored in the index byte by merging the number with the flags. The flags must be preserved at all times; any misuse of the embedded flag may lead to invalid data. First we have to know how much data can be stored. For example, Int32 belongs to IB4, which can embed any number in the 0..63 range.

As an example of an unsigned integer, imagine we want to store the number 50. Since the data type is uint and is not nullable, it falls into the IB6 category. Because 50 is within the IB6 embeddable range (0..127), it can be stored in the index byte. Finally, because the number is embedded, we have to set the embedded flag.
50 decimal = [00110010] byte
IB6 Embedded flag = [10000000]
Final byte = [10110010]
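
The IB6 example above can be reproduced with a single bitwise OR. This is a minimal Python sketch for illustration only, not the library's implementation; embed_ib6 and the two constants are hypothetical names, and only the embedded case is covered.

```python
IB6_EMBEDDED_FLAG = 0b1000_0000  # leftmost bit of the IB6 index byte
IB6_MAX_EMBEDDED = 127           # seven bits remain for the value


def embed_ib6(value):
    """Merge a small unsigned number with the IB6 embedded flag."""
    if not 0 <= value <= IB6_MAX_EMBEDDED:
        raise ValueError("value does not fit in the IB6 index byte")
    return IB6_EMBEDDED_FLAG | value


print(format(embed_ib6(50), "08b"))  # prints 10110010
```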

Now imagine that we want to save the same number 50, but this time the data type is a nullable signed integer, int?. This type falls into the IB3 category, whose largest embeddable number is 31, so we cannot embed 50 in the index byte. This is how it is stored:
50 decimal = [00110010] byte
IB3 index byte (not null, not embedded) = [00000000]
Final bytes = [00000000][00110010]
Here the first byte is the index byte, with none of its flags set, and the second byte is the number itself.
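
A rough Python sketch of the IB3 write path, for illustration only and not the library's code. The function name write_ib3 is hypothetical. It covers only the cases this page pins down or strongly implies: a null value (assumed here to be just the null flag with all other bits zero), a non-negative number small enough to embed, and a non-negative number that fits in one trailing byte as in the example above. Negative numbers and larger values are left out because their layout is not described here.

```python
IB3_EMBEDDED = 0b1000_0000
IB3_NULL = 0b0100_0000
IB3_MAX_EMBEDDED = 31  # five bits remain after the three IB3 flags


def write_ib3(value):
    """Encode None or a small non-negative int using the IB3 index byte."""
    if value is None:
        # Assumption: a null is the null flag alone, no data byte follows.
        return bytes([IB3_NULL])
    if 0 <= value <= IB3_MAX_EMBEDDED:
        return bytes([IB3_EMBEDDED | value])
    if IB3_MAX_EMBEDDED < value <= 0xFF:
        # Flags off, followed by the number itself (the 50 example).
        return bytes([0x00, value])
    raise NotImplementedError("layout for this value is not covered by the sketch")


print([format(b, "08b") for b in write_ib3(50)])  # prints ['00000000', '00110010']
```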

The same process applies when reading data. First determine the data type from the schema, then decide which index byte category it belongs to, and finally check the flags, read the data, and separate it from the flags.
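
The reading steps can be sketched as follows. This is an illustrative Python sketch under the assumption that the leftmost bit in each diagram is the most significant bit; FLAGS and read_index_byte are hypothetical names, not part of the library. It only splits an index byte into its flags and embedded value; reading any data bytes that follow is left to the caller.

```python
# Flag order per category, most significant bit first (from the diagrams).
FLAGS = {
    "IB2": ("embedded", "null"),
    "IB3": ("embedded", "null", "negative"),
    "IB4": ("embedded", "negative"),
    "IB5": ("embedded", "null"),
    "IB6": ("embedded",),
}


def read_index_byte(category, byte):
    """Return (flags, embedded_value) for one index byte.

    embedded_value is None when the embedded flag is not set.
    """
    names = FLAGS[category]
    flags = {name: bool(byte & (0x80 >> i)) for i, name in enumerate(names)}
    value_mask = 0xFF >> len(names)  # the bits not used by flags
    value = byte & value_mask if flags["embedded"] else None
    return flags, value
```

For instance, read_index_byte("IB6", 0b10110010) reports the embedded flag as set and recovers the value 50.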

Simple Data Types

This section describes the category and the structure of the simple data types supported by the serializer.

byte or unsigned byte

Category: None
Structure: None.

byte? or nullable unsigned byte

Category: IB5
Structure: None.

sbyte or signed byte

Category: None
Structure: None.

sbyte? or nullable signed byte

Category: IB3
Structure: None.

int or signed integer

Category: IB4
Structure: None.

int? or nullable signed integer

uint or unsigned integer

uint? or nullable unsigned integer

long or signed big integer

long? or nullable signed big integer

ulong or unsigned big integer

ulong? or nullable unsigned big integer

int16 or short or signed small integer

int16? or nullable short or nullable signed small integer

uint16 or ushort or unsigned small integer

uint16? or nullable ushort or nullable unsigned small integer

bool or boolean

Category: None
Structure: byte.

bool? or nullable boolean

Category: IB2
Structure: byte?.

char or character

Category: IB6
Structure: ushort.

char? or nullable character

Category: IB2
Structure: ushort?.

Primitive Data Types

This section describes the types that require a simple structure in addition to the category.

string

   Structure: [data-length : uint?][string-data-encoded : byte-array]
   String Data: Byte-Array.
   Note: Strings are encoded to a byte array with a UTF-8 encoder by default.
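
A minimal Python sketch of the string structure above, for illustration only and not the library's code. It rests on several assumptions: write_string is a hypothetical name, data-length is taken to be the number of encoded bytes, the uint? length uses the IB5 index byte, and a null string is written as the null flag alone. Only lengths that fit in the index byte (0..63) are handled, since the layout for longer lengths is not spelled out here.

```python
IB5_EMBEDDED = 0b1000_0000
IB5_NULL = 0b0100_0000
IB5_MAX_EMBEDDED = 63


def write_string(text):
    """Encode None or a short string as [data-length][UTF-8 bytes]."""
    if text is None:
        # Assumption: a null string is just the null flag, with no data.
        return bytes([IB5_NULL])
    data = text.encode("utf-8")
    if len(data) > IB5_MAX_EMBEDDED:
        raise NotImplementedError("non-embedded lengths are not covered by the sketch")
    # The length is embedded in the index byte, then the encoded data follows.
    return bytes([IB5_EMBEDDED | len(data)]) + data
```

Under these assumptions, the two-character string "hi" becomes three bytes: the index byte [10000010] followed by the two UTF-8 bytes.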

double or 64-bit floating-point

   Structure: [data-length : uint][double-variable-data : byte-array]
   Data Format: The double value is converted to 8 bytes and only the low values with actual data are stored.
   TODO: explain.

double? or nullable 64-bit floating-point

   Structure: [data-length : uint?][double-variable-data : byte-array]
   Data Format: Same as double

decimal or 128-bit floating-point

   Structure: [data-length : uint][decimal-variable-data : byte-array]
   Data Format: The decimal value is converted to 16 bytes and only the low values with actual data are stored.
   TODO: explain.

decimal? or nullable 128-bit floating-point

   Structure: [data-length : uint?][decimal-variable-data : byte-array]
   Data Format: Same as decimal.
   TODO: explain.

