diff --git a/src/http-gateways/trustless-gateway.md b/src/http-gateways/trustless-gateway.md
index 946538b05..1dcb73a92 100644
--- a/src/http-gateways/trustless-gateway.md
+++ b/src/http-gateways/trustless-gateway.md
@@ -81,7 +81,7 @@ Below response types SHOULD be supported:
   - Disables IPLD/IPFS deserialization, requests a verifiable CAR stream to be returned,
     implementations MAY support optional CAR content type parameters (:cite[ipip-0412]),
     the explicit [CAR format signaling in HTTP Request](#car-format-signaling-in-request)
-    and the optional [CAR metadata block](#car-meta-content-type-parameter).
+    and the optional [metadata block](#meta-content-type-parameter).
 
 - [application/vnd.ipfs.ipns-record](https://www.iana.org/assignments/media-types/application/vnd.ipfs.ipns-record)
   - A verifiable :cite[ipns-record] (multicodec `0x0300`).
@@ -302,31 +302,103 @@ of their presence in the DAG or the value assigned to the "dups" parameter, as
 the raw data is already present in the parent block that links to the identity CID.
 
-## CAR `meta` (content type parameter)
+## `meta` (content type parameter)
 
-The `meta=eof` parameter allows clients to request the server to include additional metadata about the
-CAR to be included at the end of the response body.
+The `meta` parameter allows clients to request that the server include additional metadata about the CAR along with the response body.
 
-This parameter SHOULD only be used with CAR `version=1`.
-Values other than `eof` SHOULD be ignored.
+The value of this parameter includes both the location where the metadata is given (e.g. `eof`) and the type of data received (e.g. `json`), separated by a `+`, giving a value such as `meta=eof+json`.
 
-When the parameter is not set, the server must not add any extra CAR blocks to the response.
+When the location parameter is set to `eof`, which is currently the only supported value, the server SHOULD respond with <0x00 byte> .
 
-The metadata block is a regular CAR block with the following properties:
+The only supported value for the data type parameter is `json`. This signifies that the metadata MUST be a JSON object.
 
-- CID specifies multicodec `car-metadata` (`0x04ff`), see
-  [multicodec#334](https://github.com/multiformats/multicodec/pull/334).
+This parameter MUST only be used with CAR `version=1`.
 
-- The payload contains metadata encoded as DAG-CBOR.
+When the parameter is not set or does not equal `eof+json`, the server SHOULD NOT add any extra blocks to the response, neither the 0x00 byte nor any metadata.
 
-The metadata MUST include the following fields:
+When `meta=eof+json`, the JSON object SHOULD conform to the following [JSON schema](https://json-schema.org/).
 
-- `len` - byte length of the CAR data (excluding the metadata block)
-- `b3h` - Blake3 hash (checksum) of the CAR data (excluding the metadata block).
-- `b3h_sig` - A signature over `` using server's Ed2559 identity.
-  - `len` is encoded as `varint`,
-  - `b3h` is encoded as 32 bytes,
-  - The effective query as executed by the gateway. This query is the request url
-    path and query string arguments.
+```json
+{
+  "type": "object",
+  "properties": {
+    "data": {
+      "description": "Properties of the response",
+      "type": "object"
+    },
+    "error": {
+      "description": "Error message",
+      "type": "string"
+    },
+    "sig": {
+      "description": "A signature, using the server's Ed25519 identity, over the metadata properties object",
+      "type": "string"
+    }
+  },
+  "required": []
+}
+```
+
+The properties object can include any fields that the server would like to implement. The following JSON schema explicitly defines certain fields that have existing use cases, in order to reach a convention on their definition.
+
+```json
+{
+  "type": "object",
+  "properties": {
+    "car_bytes": {
+      "description": "The total byte length of the CAR stream (excluding the 0x00 byte and the metadata block)",
+      "type": "integer"
+    },
+    "data_bytes": {
+      "description": "Total byte length of the flat file before it was encoded into a CAR file",
+      "type": "integer"
+    },
+    "block_count": {
+      "description": "Total number of blocks present in the CAR stream (excluding the 0x00 byte and the metadata block, but including duplicates when present)",
+      "type": "integer"
+    },
+    "car_cid": {
+      "description": "A hash of the CAR stream giving a CIDv1 with 0x0202 codec",
+      "type": "string"
+    },
+    "b3checksum": {
+      "description": "A Blake3 hash (checksum) of the CAR stream (excluding the 0x00 byte and the metadata block)",
+      "type": "string"
+    },
+    "dag_params": {
+      "description": "A map with DAG params like dag-scope, entity-bytes from [IPIP-402](https://specs.ipfs.tech/ipips/ipip-0402/)",
+      "type": "object",
+      "properties": {
+        "dag-scope": {
+          "description": "See [IPIP-402](https://specs.ipfs.tech/ipips/ipip-0402/) for the definition",
+          "type": "string"
+        },
+        "entity-bytes": {
+          "description": "See [IPIP-402](https://specs.ipfs.tech/ipips/ipip-0402/) for the definition",
+          "type": "string"
+        }
+      },
+      "required": []
+    },
+    "car_params": {
+      "description": "A map with CAR content type params like order and dups from [IPIP-412](https://specs.ipfs.tech/ipips/ipip-0412/)",
+      "type": "object",
+      "properties": {
+        "order": {
+          "description": "See [IPIP-412](https://specs.ipfs.tech/ipips/ipip-0412/) for the definition.",
+          "type": "string"
+        },
+        "dups": {
+          "description": "See [IPIP-412](https://specs.ipfs.tech/ipips/ipip-0412/) for the definition.",
+          "type": "string"
+        }
+      },
+      "required": []
+    }
+  },
+  "required": []
+}
+```
 
 ## CAR format parameters and determinism
 
diff --git a/src/ipips/ipip-0431.md b/src/ipips/ipip-0431.md
index f47fce654..bba336216 100644
--- a/src/ipips/ipip-0431.md
+++ b/src/ipips/ipip-0431.md
@@ -31,7 +31,7 @@ retrieval attestation after the entire response was sent to the client.
 
 Aside from this specific use case, the IPFS Ecosystem at large has no reliable mechanism
 to signal that a CAR file transmission over HTTP completed successfully.
-However, we need this in order to be able to use CARs as a way of serving streaming
+We need this in order to be able to use CARs as a way of serving streaming
 responses for queries. One way of solving this problem is to append an extra block
 at the end of the CAR stream with information that clients can use to check whether
 all CAR blocks have been received.
@@ -46,11 +46,19 @@ HTTP client to opt-in via `Accept` header and Gateway to indicate via
 
 The proposed solution introduces a new parameter for the CAR content type in HTTP requests and responses: `meta`.
 
-When the CAR content type parameter `meta` is set to `eof`, the Gateway will write one additional CAR
-block with metadata to the response, after it sent all CAR blocks.
+The `meta` parameter allows clients to request that the server include additional metadata about the CAR along with the response body.
 
-The metadata format is DAG-CBOR and open to extension, allowing standardized
-userland experimentation similar to the Extensible Data field from IPNS V2.
+The value of this parameter includes both the location where the metadata is given (e.g. `eof`) and the type of data received (e.g. `json`), separated by a `+`, giving a value such as `meta=eof+json`.
+
+When the location parameter is set to `eof`, which is currently the only supported value, the server SHOULD respond with <0x00 byte> .
+
+The only supported value for the data type parameter is `json`. This signifies that the metadata MUST be a JSON object.
+
+This parameter MUST only be used with CAR `version=1`.
+
+When the parameter is not set or does not equal `eof+json`, the server SHOULD NOT add any extra blocks to the response, neither the 0x00 byte nor any metadata.
+
+This results in an example content type of `application/vnd.ipld.car;version=1;meta=eof+json`.
 
 See [CAR `meta` (content type parameter)](/http-gateways/trustless-gateway/#car-meta-content-type-parameter) in Trustless
 Gateway specification for more details.
@@ -68,9 +76,65 @@ in the future.
 - Clients of trustless gateways can use the fields from the metadata as an attestation that they
   performed the retrieval from the given server.
 
-- The `len` field in the metadata block allows clients to verify whether they received all CAR
+- For example, the metadata block could include a `car_bytes` field, the byte length of the CAR stream (excluding the metadata block). This would allow clients to verify whether they received all CAR
   bytes, which provides a backward-compatible solution for the [CARv1 streaming problem](https://github.com/ipfs/specs/pull/332) until new CAR version is introduced.
 
+- As another example, the metadata object includes the `error` field, allowing the server to pass back additional information about why the response is an error, such as why the CAR stream was incomplete.
+
+- In the SPARK use case, retrieval clients would like to prove they have retrieved an entire file from a specific retrieval provider that has implemented the trustless gateway spec. The additional metadata block allows checksums and signatures to be passed along with the data, allowing the retrieval client to create a proof of correct retrieval. For SPARK, the metadata properties object SHOULD include the following fields:
+
+```json
+{
+  "type": "object",
+  "properties": {
+    "car_bytes": {
+      "description": "The total byte length of the CAR stream (excluding the 0x00 byte and the metadata block)",
+      "type": "integer"
+    },
+    "b3checksum": {
+      "description": "A Blake3 hash (checksum) of the CAR stream (excluding the 0x00 byte and the metadata block)",
+      "type": "string"
+    },
+    "content_path": {
+      "description": "The url path in the request as executed by the gateway",
+      "type": "string"
+    },
+    "dag_params": {
+      "description": "A map with DAG params like dag-scope, entity-bytes from [IPIP-402](https://specs.ipfs.tech/ipips/ipip-0402/)",
+      "type": "object",
+      "properties": {
+        "dag-scope": {
+          "description": "See [IPIP-402](https://specs.ipfs.tech/ipips/ipip-0402/) for the definition",
+          "type": "string"
+        },
+        "entity-bytes": {
+          "description": "See [IPIP-402](https://specs.ipfs.tech/ipips/ipip-0402/) for the definition",
+          "type": "string"
+        }
+      },
+      "required": []
+    },
+    "car_params": {
+      "description": "A map with CAR content type params like order and dups from [IPIP-412](https://specs.ipfs.tech/ipips/ipip-0412/)",
+      "type": "object",
+      "properties": {
+        "order": {
+          "description": "See [IPIP-412](https://specs.ipfs.tech/ipips/ipip-0412/) for the definition.",
+          "type": "string"
+        },
+        "dups": {
+          "description": "See [IPIP-412](https://specs.ipfs.tech/ipips/ipip-0412/) for the definition.",
+          "type": "string"
+        }
+      },
+      "required": []
+    }
+  },
+  "required": ["car_bytes", "b3checksum", "content_path", "dag_params", "car_params"]
+}
+```
+The metadata `sig` field SHOULD also be populated with a signature, made using the server's Ed25519 identity, over the metadata properties object.
+
 ### Compatibility
 
 The new feature requires clients to explicitly ask the server to include the extra block via `Accept` header,
@@ -132,7 +196,15 @@ Spend energy on creating CARv3 that solves the problems from "Motivation" sectio
 - native truncation detection and standardized error handling and passing during streaming
 - support for things like [Large Blocks](https://discuss.ipfs.tech/t/supporting-large-ipld-blocks/15093/)
 
-TODO: link to some public artifact about CARv3
+TODO: link to some public artifact about CARv3
+
+#### Create a new multicodec for this metadata block
+
+Initially, we proposed to create a new multicodec for this metadata block called `car-metadata`. This was ruled out due to concerns documented [here](https://github.com/multiformats/multicodec/pull/334#issuecomment-1668086641).
+
+#### Using CBOR instead of JSON for the metadata block
+
+We could use CBOR instead of JSON for the metadata block. However, it was [decided](https://github.com/ipfs/specs/pull/431#issuecomment-1719634928) to opt for user readability over number of bytes, since CBOR doesn't greatly reduce the number of bytes in a key value map compared with JSON.
 
 ## Test fixtures