This repository has been archived by the owner on Dec 12, 2024. It is now read-only.

feat: standard representation of a cell and a slight re-order of things (#331)
novusnota authored Jul 25, 2024
1 parent c3a166c commit f4b9a10
Showing 2 changed files with 29 additions and 17 deletions.
44 changes: 28 additions & 16 deletions pages/book/cells.mdx
@@ -6,7 +6,7 @@ import { Callout } from 'nextra/components'

## Cells

`Cell{:tact}` is a [primitive][p] and a data structure, which [ordinarily](#cells-kinds) consists of up to $1023$ continuously laid out bits and up to $4$ references to other cells. Circular references are forbidden and cannot be created by means of [TVM][tvm], which means cells can be viewed as [quadtrees][quadtree] or [directed acyclic graphs (DAGs)](https://en.wikipedia.org/wiki/Directed_acyclic_graph) of themselves. Contract code itself is represented by a tree of cells.
`Cell{:tact}` is a [primitive][p] and a data structure, which [ordinarily](#cells-kinds) consists of up to $1023$ continuously laid out bits and up to $4$ references (refs) to other cells. Circular references are forbidden and cannot be created by means of [TVM][tvm], which means cells can be viewed as [quadtrees][quadtree] or [directed acyclic graphs (DAGs)](https://en.wikipedia.org/wiki/Directed_acyclic_graph) of themselves. Contract code itself is represented by a tree of cells.

Cells and [cell primitives](#cells-immutability) are bit-oriented, not byte-oriented: [TVM][tvm] regards data kept in cells as sequences (strings or streams) of up to $1023$ bits, not bytes. If necessary, contracts are free to use, say, $21$-bit integer fields serialized into [TVM][tvm] cells, thus using fewer persistent storage bytes to represent the same data.
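
As a rough illustration of the bit-oriented layout, the Python sketch below packs unsigned integer fields of an arbitrary bit width into one bit string. The helper `pack_fields` is hypothetical and is not part of Tact or [TVM][tvm]; it only models the idea that fields need not be aligned to byte boundaries.

```python
from typing import Iterable


def pack_fields(values: Iterable[int], width: int) -> str:
    """Concatenate unsigned integers as fixed-width big-endian bit fields.

    Hypothetical helper for illustration only: returns a string of '0'/'1'.
    """
    chunks = []
    for value in values:
        if not 0 <= value < 2 ** width:
            raise ValueError(f"{value} does not fit into {width} bits")
        chunks.append(format(value, f"0{width}b"))
    bits = "".join(chunks)
    if len(bits) > 1023:
        raise ValueError("a single cell holds at most 1023 data bits")
    return bits


# Three 21-bit fields occupy 63 bits, which fit into 8 bytes,
# whereas three byte-aligned 32-bit fields would need 12 bytes.
packed = pack_fields([1, 2, 3], 21)
bytes_needed = (len(packed) + 7) // 8
```

The saving comes purely from dropping the unused high bits of each field; the values themselves are unchanged.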

@@ -47,11 +47,34 @@ Every cell, being a [quadtree][quadtree], has an attribute called _level_, which

[Exotic](#cells-kinds) cells have different rules for determining their level, which are described [on this page in TON Docs](https://docs.ton.org/develop/data-formats/exotic-cells).

### Standard representation [#cells-representation]
### Serialization [#cells-serialization]

Before a cell can be transferred over the network or stored on disk, it must be serialized. There are several common formats, such as [standard `Cell{:tact}` representation](#cells-representation) and [BoC](#cells-boc).

#### Standard representation [#cells-representation]

Standard [`Cell{:tact}`](#cells) representation is a common serialization format for cells, first described in [tvm.pdf](https://docs.ton.org/tvm.pdf). The algorithm represents a cell as a sequence of octets (bytes) and begins by serializing the first $2$ bytes, called descriptors:

* _Refs descriptor_ is calculated according to this formula: $r + 8 \times k + 32 \times l$, where $r$ is the number of references contained in the cell (between $0$ and $4$), $k$ is a flag for the cell kind ($0$ for [ordinary](#cells-kinds) and $1$ for [exotic](#cells-kinds)), and $l$ is the [level](#cells-levels) of the cell (between $0$ and $3$).
* _Bits descriptor_ is calculated according to this formula: $\lfloor\frac{b}{8}\rfloor + \lceil\frac{b}{8}\rceil$, where $b$ is the number of bits in the cell (between $0$ and $1023$).
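
The two descriptor formulas can be sketched in a few lines of Python. The function names `refs_descriptor` and `bits_descriptor` are made up for this illustration; only the arithmetic comes from the formulas above.

```python
def refs_descriptor(refs: int, exotic: bool, level: int) -> int:
    """Compute r + 8 * k + 32 * l for a cell (illustrative helper)."""
    if not 0 <= refs <= 4:
        raise ValueError("a cell has between 0 and 4 references")
    if not 0 <= level <= 3:
        raise ValueError("a cell level is between 0 and 3")
    kind_flag = 1 if exotic else 0  # k: 0 for ordinary, 1 for exotic
    return refs + 8 * kind_flag + 32 * level


def bits_descriptor(bits: int) -> int:
    """Compute floor(b / 8) + ceil(b / 8) for a cell (illustrative helper)."""
    if not 0 <= bits <= 1023:
        raise ValueError("a cell has between 0 and 1023 data bits")
    return bits // 8 + (bits + 7) // 8
```

Both results always fit into a single byte: the largest refs descriptor is $4 + 8 + 96 = 108$, and the largest bits descriptor is $127 + 128 = 255$. An odd bits descriptor signals that the number of bits is not a multiple of eight.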

Then, the data bits of the cell themselves are serialized as $\lceil\frac{b}{8}\rceil$ $8$-bit octets (bytes). If $b$ is not a multiple of eight, a binary $1$ and up to six binary $0$s are appended to the data bits.
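
A minimal sketch of this padding rule follows, assuming the data bits are given as a Python string of `'0'` and `'1'` characters. The helper name `serialize_data_bits` is hypothetical and chosen only for this example.

```python
def serialize_data_bits(bits: str) -> bytes:
    """Pad a bit string to whole octets and return it as bytes.

    Illustrative helper: if the length is not a multiple of eight,
    append a single '1' and then as many '0's as needed (at most six).
    """
    if len(bits) > 1023 or set(bits) - {"0", "1"}:
        raise ValueError("expected at most 1023 characters of '0' or '1'")
    if len(bits) % 8 != 0:
        bits += "1"
        bits += "0" * (-len(bits) % 8)
    # Each group of eight bits becomes one unsigned big-endian octet.
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
```

For instance, the three bits `101` become the single octet `10110000`, while a bit string whose length is already a multiple of eight is stored unchanged.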

Next, $2$ bytes store the depth of the refs, i.e. the number of cells between the root of the cell tree (the current cell) and the deepest of its references, including that deepest cell. For example, a cell containing only one reference and no further references would have a depth of $1$, while the referenced cell would have a depth of $0$.

Finally, for every referenced cell, the [SHA-256][sha-2] hash of its standard representation is stored, occupying $32$ bytes per cell, and the algorithm is repeated recursively for it. Note that cyclic cell references are not allowed, so this recursion always ends in a well-defined manner.

To compute the hash of the standard representation of a cell, all the bytes from the steps above are concatenated and then hashed using [SHA-256][sha-2]. This is the algorithm behind the [`HASHCU` and `HASHSU` instructions](https://docs.ton.org/learn/tvm-instructions/instructions) of [TVM][tvm] and the respective [`Cell.hash(){:tact}`](/ref/core-cells#cellhash) and [`Slice.hash(){:tact}`](/ref/core-cells#slicehash) functions of Tact.
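
The whole procedure can be put together in a small Python model. This is a sketch under stated assumptions, not the reference implementation: the `Cell` class below is hypothetical, handles only ordinary cells of level $0$, and assumes that each referenced cell contributes its own depth as a $2$-byte big-endian integer, with all depths written before all hashes.

```python
import hashlib
from dataclasses import dataclass, field
from typing import List


@dataclass
class Cell:
    """Toy model of an ordinary, level-0 cell (illustration only)."""
    bits: str = ""  # data bits as a string of '0'/'1'
    refs: List["Cell"] = field(default_factory=list)

    def depth(self) -> int:
        # A cell without references has depth 0.
        return 0 if not self.refs else 1 + max(r.depth() for r in self.refs)

    def representation(self) -> bytes:
        b = len(self.bits)
        d1 = len(self.refs)           # r + 8 * 0 + 32 * 0
        d2 = b // 8 + (b + 7) // 8    # floor(b / 8) + ceil(b / 8)
        padded = self.bits
        if b % 8 != 0:
            padded += "1"
            padded += "0" * (-len(padded) % 8)
        data = bytes(int(padded[i:i + 8], 2) for i in range(0, len(padded), 8))
        # Assumption: per-reference depths first, then per-reference hashes.
        depths = b"".join(r.depth().to_bytes(2, "big") for r in self.refs)
        hashes = b"".join(r.hash() for r in self.refs)
        return bytes([d1, d2]) + data + depths + hashes

    def hash(self) -> bytes:
        return hashlib.sha256(self.representation()).digest()


leaf = Cell("101")
root = Cell("", [leaf])
```

Here `root.representation()` consists of the two descriptor bytes, no data bytes, the $2$-byte depth of `leaf`, and the $32$-byte hash of `leaf`, for $36$ bytes in total. Because hashing a cell requires the hashes of its references, the recursion walks the whole tree, which terminates because cyclic references cannot exist.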

#### Bag of Cells [#cells-boc]

Bag of Cells, or _BoC_ for short, is a format for serializing and de-serializing cells into byte arrays as described in [boc.tlb](https://github.com/ton-blockchain/ton/blob/24dc184a2ea67f9c47042b4104bbb4d82289fac1/crypto/tl/boc.tlb#L25) [TL-B schema][tlb].

Read more about BoC in TON Docs: [Bag of Cells](https://docs.ton.org/develop/data-formats/cell-boc#bag-of-cells).

<Callout>

To be resolved by [#280](https://github.com/tact-lang/tact-docs/issues/280).
Advanced information on [`Cell{:tact}`](#cells) serialization: [Canonical `Cell{:tact}` Serialization](https://docs.ton.org/develop/research-and-development/boc).

</Callout>

@@ -98,7 +121,7 @@ While you may use them for [manual construction](#cnp-manually) of the cells, it

While you may use them for [manual parsing](#cnp-manually) of the cells, it's strongly recommended to use [Structs][struct] instead: [Parsing of cells with Structs](#cnp-structs).

## Serialization
## Serialization types

Similar to the serialization options of the [`Int{:tact}`](/book/integers) type, `Cell{:tact}`, `Builder{:tact}` and `Slice{:tact}` also have various representations for encoding their values in the following cases:

@@ -137,18 +160,6 @@ contract SerializationExample {

</Callout>

### Bag of Cells [#cells-boc]

Bag of Cells, or _BoC_ for short, is a format for serializing and deserializing cells into byte arrays as described in [boc.tlb](https://github.com/ton-blockchain/ton/blob/24dc184a2ea67f9c47042b4104bbb4d82289fac1/crypto/tl/boc.tlb#L25) [TL-B schema][tlb].

Read more about BoC in TON Docs: [Bag of Cells](https://docs.ton.org/develop/data-formats/cell-boc#bag-of-cells).

<Callout>

Advanced information on [`Cell{:tact}`](#cells) serialization: [Canonical `Cell{:tact}` Serialization](https://docs.ton.org/develop/research-and-development/boc).

</Callout>

## Operations

### Construct and parse [#operations-cnp]
@@ -365,6 +376,7 @@ let areSlicesNotEqual = aSlice.hash() != bSlice.hash(); // false

[tvm]: https://docs.ton.org/learn/tvm-instructions/tvm-overview
[tlb]: https://docs.ton.org/develop/data-formats/tl-b-language
[sha-2]: https://en.wikipedia.org/wiki/SHA-2#Hash_standard

[quadtree]: https://en.wikipedia.org/wiki/Quadtree
[bin-eq]: /book/operators#binary-equality
2 changes: 1 addition & 1 deletion pages/ref/core-strings.mdx
@@ -334,7 +334,7 @@ extends fun toString(self: Address): String;

Extension function for the [`Address{:tact}`][p].

Returns a [`String{:tact}`] from an [`Address{:tact}`][p].
Returns a [`String{:tact}`][p] from an [`Address{:tact}`][p].

Usage example:
