This repository has been archived by the owner on Jun 29, 2022. It is now read-only.

fixup! dag-pb: specify sorting order of Links array
rvagg committed Sep 26, 2020
1 parent 4e8f689 commit f5228fb
Showing 1 changed file with 13 additions and 8 deletions.
21 changes: 13 additions & 8 deletions block-layer/codecs/dag-pb.md
@@ -36,6 +36,12 @@ message PBNode {
The objects link names are specified in the 'Name' field of the PBLink object.
All link names in an object must either be omitted or unique within the object.

+DAG-PB aims to have a canonical form for any given set of data. Therefore, in addition to the standard Protobuf parsing rules, DAG-PB decoders should enforce additional constraints to ensure canonical forms (where possible):
+
+1. Fields must appear in their correct order as defined by the Protobuf schema above, blocks with out-of-order fields should be rejected. It is common for Protobuf decoders to accept out-of-order field entries.
+2. Duplicate entries in the binary form are invalid, blocks with duplicate field values should be rejected. It is common for Protobuf decoders to accept _updates_ to fields that have already been set.
+3. Fields and wire types other than those that appear in the Protobuf schema above are invalid and blocks containing these should be rejected. It is common for Protobuf decoders to skip data in each message type that does not match the fields in the schema.
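
The three constraints added above can be sketched as a check over the sequence of `(field number, wire type)` pairs a decoder encounters while reading one message. This is a minimal illustration in Python and not code from any DAG-PB implementation: the function name `canonical_violation`, the schema tuple layout, and the `PBLINK_FIELDS` table are hypothetical, and the field numbers in that table are an assumption about the `PBLink` schema, which this diff does not reproduce.

```python
def canonical_violation(seen, schema):
    """Return a description of the first canonical-form violation, or None.

    seen:   (field_number, wire_type) pairs in the order they were decoded.
    schema: (field_number, wire_type, repeated) tuples in canonical order.
    """
    # Map each allowed (field, wire type) pair to its canonical position.
    position = {(num, wire): i for i, (num, wire, _) in enumerate(schema)}
    last = -1
    for entry in seen:
        if entry not in position:
            # Constraint 3: unknown field number or unexpected wire type.
            return f"unknown field or wire type {entry}"
        index = position[entry]
        if index < last:
            # Constraint 1: a field appeared after one that should follow it.
            return f"out-of-order field {entry}"
        if index == last and not schema[index][2]:
            # Constraint 2: a non-repeated field appeared twice in a row.
            return f"duplicate field {entry}"
        last = index
    return None


# Assumed PBLink layout (hypothetical table for illustration):
# Hash = 1 (length-delimited), Name = 2 (length-delimited), Tsize = 3 (varint).
PBLINK_FIELDS = [(1, 2, False), (2, 2, False), (3, 0, False)]

ok = canonical_violation([(1, 2), (3, 0)], PBLINK_FIELDS)       # Name omitted
swapped = canonical_violation([(2, 2), (1, 2)], PBLINK_FIELDS)  # Name before Hash
```

A duplicate that is separated from its first occurrence by another field is reported by this sketch as out-of-order rather than as a duplicate; the block is rejected either way.
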

## Logical Format

When we handle DAG-PB content at the Data Model level, we treat these objects as maps.
@@ -44,7 +50,7 @@ This layout can be expressed with [IPLD Schemas](../../schemas/README.md) as:

```ipldsch
type PBNode struct {
-Links optional [PBLink]
+Links [PBLink]
Data optional Bytes
}
@@ -60,16 +66,15 @@ Constraints:
* The first node in a block of DAG-PB data will match the `PBNode` type.
* `Data` may be omitted or a byte array with a length of zero or more.
* `Links`:
-* an empty array is not allowed, the value should be omitted in this case (the binary form makes no distinction between an empty array and an omitted value)
-* array elements must be sorted in ascending order by their `Name` values, which are compared by bytes rather than as strings
-* an empty `PBLink` (with all values omitted) is a valid form
+* must be present, even if empty; the binary form makes no distinction between an empty array and an omitted value, in the Data Model we always instantiate an array.
+* elements must be sorted in ascending order by their `Name` values, which are compared by bytes rather than as strings.
* `Hash`:
-* even though `Hash` is `optional` in the PB encoding, it should not be treated as optional when creating new blocks or decoding existing ones, an omitted `Hash` should be interpreted as a bad block
-* the bytes in the encoding format is interpreted as the bytes of a CID, if the bytes cannot be converted to a CID then it should be treated as a bad block
-* the data is encoded in the binary form as a byte array, it is therefore possible for a decoder to read a correct binary form but fail to convert a `Hash` to a CID and therefore treat it as a bad block
+* even though `Hash` is `optional` in the Protobuf encoding, it should not be treated as optional when creating new blocks or decoding existing ones, an omitted `Hash` should be interpreted as a bad block
+* the bytes in the encoding format is interpreted as the bytes of a CID, if the bytes cannot be converted to a CID then it should be treated as a bad block.
+* the data is encoded in the binary form as a byte array, it is therefore possible for a decoder to read a correct binary form but fail to convert a `Hash` to a CID and therefore treat it as a bad block.
* When creating data, you can create maps using the standard Data Model concepts, and as long as they have exactly these fields. If additional fields are present, the DAG-PB codec will error, because there is no way to encode them.
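
The byte-wise ordering of `Name` values required by the constraints above can be sketched as follows. This is a minimal illustration in Python under stated assumptions: links are represented as plain dictionaries, the helper names `name_sort_key`, `sort_links`, and `links_are_sorted` are hypothetical, and treating an omitted `Name` as empty bytes is an assumption that the text above does not specify.

```python
def name_sort_key(link):
    # Compare by the UTF-8 bytes of Name, not by any locale-aware or
    # case-insensitive string ordering.
    # Assumption: an omitted Name is treated as empty bytes.
    return (link.get("Name") or "").encode("utf-8")


def sort_links(links):
    # sorted() is stable, so links with equal keys keep their relative order.
    return sorted(links, key=name_sort_key)


def links_are_sorted(links):
    # A decoder-side check: every key must be >= the key before it.
    keys = [name_sort_key(link) for link in links]
    return all(a <= b for a, b in zip(keys, keys[1:]))


links = [{"Name": "a"}, {"Name": "B"}, {"Name": "aa"}]
sorted_links = sort_links(links)
```

In this example `"B"` (byte `0x42`) sorts before `"a"` (byte `0x61`), which differs from a case-insensitive alphabetical ordering, and `"a"` sorts before `"aa"` because a shorter byte string precedes any longer string it prefixes.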

-The most recent [Go](https://github.com/ipld/go-ipld-prime-proto) and [JavaScript](https://github.com/ipld/js-dag-pb) implementations strictly expose this logical format via the Data Model and do not support alternative means of resolving paths via named links as the legacy implementations do (see below).
+The most recent [JavaScript](https://github.com/ipld/js-dag-pb) implementation strictly exposes this logical format via the Data Model and does not support alternative means of resolving paths via named links as the legacy implementations do (see below). The most recent [Go](https://github.com/ipld/go-ipld-prime-proto) implementation also avoids alternate pathing mechanisms but does not yet support the strict logical format.

## Alternative (Legacy) Pathing
