
Commit 8b78e79

achingbrain, 2color, and SgtPooki authored
fix: content-type response header hints how to process response (#426)
## JSON codec

```TypeScript
import { verifiedFetch } from '@helia/verified-fetch'
import * as json from 'multiformats/codecs/json'

const res = await verifiedFetch('ipfs://bafyJSONCodec')

// get object representation of JSON body
console.info(await res.json()) // { ... }

// alternative way to do the same thing
const obj = json.decode(new Uint8Array(await res.arrayBuffer()))
console.info(obj) // { ... }
```

## DAG-JSON codec

```TypeScript
import { verifiedFetch } from '@helia/verified-fetch'
import * as dagJson from '@ipld/dag-json'

const res = await verifiedFetch('ipfs://bafyDAGJSONCodec')

// get spec-compliant plain-old-JSON object that happens to contain a CID
console.info(await res.json()) // { cid: { '/': 'Qmfoo' } }

// ...or use @ipld/dag-json to get rich-object version
const obj = dagJson.decode(new Uint8Array(await res.arrayBuffer()))
console.info(obj) // { cid: CID('Qmfoo') }
```

## DAG-CBOR codec

```TypeScript
import { verifiedFetch } from '@helia/verified-fetch'
import * as dagCbor from '@ipld/dag-cbor'

const res = await verifiedFetch('ipfs://bafyDAGCBORCodec')

if (res.headers.get('content-type') === 'application/json') {
  // CBOR was JSON-friendly
  console.info(await res.json()) // { ... }
} else {
  // CBOR was not JSON-friendly, must use @ipld/dag-cbor to decode
  const obj = dagCbor.decode(new Uint8Array(await res.arrayBuffer()))
  console.info(obj) // { ... }
}
```

If the `DAG-CBOR` block contains anything that cannot round-trip to JSON (e.g. `CID`s, `Uint8Array`s, `BigInt`s, etc), the content type will be `application/octet-stream` - `.json()` will throw and so `@ipld/dag-cbor` must be used to decode the resolved value of `.arrayBuffer()`.

---------

Co-authored-by: Daniel Norman <1992255+2color@users.noreply.github.com>
Co-authored-by: Daniel N <2color@users.noreply.github.com>
Co-authored-by: Russell Dempsey <1173416+SgtPooki@users.noreply.github.com>
1 parent ea39b48 commit 8b78e79

6 files changed

+730
-58
lines changed

README.md

+205-6
@@ -13,9 +13,24 @@
1313
1414
# About
1515

16+
<!--
17+
18+
!IMPORTANT!
19+
20+
Everything in this README between "# About" and "# Install" is automatically
21+
generated and will be overwritten the next time the doc generator is run.
22+
23+
To make changes to this section, please update the @packageDocumentation section
24+
of src/index.js or src/index.ts
25+
26+
To experiment with formatting, please run "npm run docs" from the root of this
27+
repo and examine the changes made.
28+
29+
-->
30+
1631
`@helia/verified-fetch` provides a [fetch](https://developer.mozilla.org/en-US/docs/Web/API/Fetch_API)-like API for retrieving content from the [IPFS](https://ipfs.tech/) network.
1732

18-
All content is retrieved in a [trustless manner](https://www.techopedia.com/definition/trustless), and the integrity of all bytes are verified by comparing hashes of the data.
33+
All content is retrieved in a [trustless manner](https://www.techopedia.com/definition/trustless), and the integrity of all bytes are verified by comparing hashes of the data. By default, CIDs are retrieved over HTTP from [trustless gateways](https://specs.ipfs.tech/http-gateways/trustless-gateway/).
1934

2035
This is a marked improvement over `fetch` which offers no such protections and is vulnerable to all sorts of attacks like [Content Spoofing](https://owasp.org/www-community/attacks/Content_Spoofing), [DNS Hijacking](https://en.wikipedia.org/wiki/DNS_hijacking), etc.
2136

@@ -45,7 +60,7 @@ const json = await resp.json()
4560
import { verifiedFetch } from '@helia/verified-fetch'
4661
import { CID } from 'multiformats/cid'
4762

48-
const cid = CID.parse('bafyFoo') // some image file
63+
const cid = CID.parse('bafyFoo') // some json file
4964
const response = await verifiedFetch(cid)
5065
const json = await response.json()
5166
```
@@ -126,7 +141,7 @@ const json = await resp.json()
126141

127142
### Custom content-type parsing
128143

129-
By default, `@helia/verified-fetch` sets the `Content-Type` header as `application/octet-stream` - this is because the `.json()`, `.text()`, `.blob()`, and `.arrayBuffer()` methods will usually work as expected without a detailed content type.
144+
By default, if the response can be parsed as JSON, `@helia/verified-fetch` sets the `Content-Type` header as `application/json`, otherwise it sets it as `application/octet-stream` - this is because the `.json()`, `.text()`, `.blob()`, and `.arrayBuffer()` methods will usually work as expected without a detailed content type.
130145

131146
If you require an accurate content-type you can provide a `contentTypeParser` function as an option to `createVerifiedFetch` to handle parsing the content type.
132147

@@ -140,14 +155,198 @@ import { fileTypeFromBuffer } from '@sgtpooki/file-type'
140155

141156
const fetch = await createVerifiedFetch({
142157
gateways: ['https://trustless-gateway.link'],
143-
routers: ['http://delegated-ipfs.dev'],
158+
routers: ['http://delegated-ipfs.dev']
159+
}, {
144160
contentTypeParser: async (bytes) => {
145161
// call to some magic-byte recognition library like magic-bytes, file-type, or your own custom byte recognition
146-
return fileTypeFromBuffer(bytes)?.mime
162+
const result = await fileTypeFromBuffer(bytes)
163+
return result?.mime
147164
}
148165
})
149166
```
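As a rough illustration of the "custom byte recognition" mentioned in the comment above, the sketch below checks the leading bytes against a few well-known file signatures. `sniffContentType` and `SIGNATURES` are hypothetical names introduced here, not part of `@helia/verified-fetch`, and the list of formats is deliberately tiny.

```typescript
// Hypothetical helper: minimal magic-byte sniffing for a handful of formats.
// A real project would more likely use a dedicated library.
const SIGNATURES: Array<{ mime: string, magic: number[] }> = [
  { mime: 'image/png', magic: [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a] },
  { mime: 'image/jpeg', magic: [0xff, 0xd8, 0xff] },
  { mime: 'image/gif', magic: [0x47, 0x49, 0x46, 0x38] }, // "GIF8"
  { mime: 'application/pdf', magic: [0x25, 0x50, 0x44, 0x46] } // "%PDF"
]

function sniffContentType (bytes: Uint8Array): string | undefined {
  for (const { mime, magic } of SIGNATURES) {
    // the buffer must be at least as long as the signature and match it byte for byte
    if (bytes.length >= magic.length && magic.every((b, i) => bytes[i] === b)) {
      return mime
    }
  }

  // unknown format: let the caller fall back to the default content type
  return undefined
}
```

Assuming the parser signature shown in the example above, this could be supplied as `contentTypeParser: async (bytes) => sniffContentType(bytes)`.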
150167

168+
### IPLD codec handling
169+
170+
IPFS supports several data formats (typically referred to as codecs) which are included in the CID. `@helia/verified-fetch` attempts to abstract away some of the details for easier consumption.
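To make "included in the CID" concrete: every CID carries a numeric multicodec code identifying how its block is encoded. The lookup below is a small sketch covering only the codecs discussed in this section; `CODEC_NAMES` and `codecName` are hypothetical names used for illustration.

```typescript
// Multicodec codes for the codecs covered in this section.
const CODEC_NAMES: Record<number, string> = {
  0x70: 'dag-pb',
  0x71: 'dag-cbor',
  0x0129: 'dag-json',
  0x0200: 'json'
}

// Hypothetical helper: map a codec code to a readable name.
// With the `multiformats` package the code would typically come from
// `CID.parse('bafy...').code`.
function codecName (code: number): string {
  return CODEC_NAMES[code] ?? `unknown (0x${code.toString(16)})`
}

console.info(codecName(0x70))
console.info(codecName(0x0129))
```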
171+
172+
#### DAG-PB
173+
174+
[DAG-PB](https://ipld.io/docs/codecs/known/dag-pb/) is the codec we are most likely to encounter; it is what [UnixFS](https://github.com/ipfs/specs/blob/main/UNIXFS.md) uses under the hood.
175+
176+
##### Using the DAG-PB codec as a Blob
177+
178+
```typescript
179+
import { verifiedFetch } from '@helia/verified-fetch'
180+
181+
const res = await verifiedFetch('ipfs://Qmfoo')
182+
const blob = await res.blob()
183+
184+
console.info(blob) // Blob { size: x, type: 'application/octet-stream' }
185+
```
186+
187+
##### Using the DAG-PB codec as an ArrayBuffer
188+
189+
```typescript
190+
import { verifiedFetch } from '@helia/verified-fetch'
191+
192+
const res = await verifiedFetch('ipfs://Qmfoo')
193+
const buf = await res.arrayBuffer()
194+
195+
console.info(buf) // ArrayBuffer { [Uint8Contents]: < ... >, byteLength: x }
196+
```
197+
198+
##### Using the DAG-PB codec as a stream
199+
200+
```typescript
201+
import { verifiedFetch } from '@helia/verified-fetch'
202+
203+
const res = await verifiedFetch('ipfs://Qmfoo')
204+
const reader = res.body?.getReader()
205+
206+
while (reader != null) {
207+
const next = await reader.read()
208+
209+
if (next?.done === true) {
210+
break
211+
}
212+
213+
if (next?.value != null) {
214+
console.info(next.value) // Uint8Array(x) [ ... ]
215+
}
216+
}
217+
```
218+
219+
##### Content-Type
220+
221+
When fetching `DAG-PB` data, the content type will be set to `application/octet-stream` unless a custom content-type parser is configured.
222+
223+
#### JSON
224+
225+
The JSON codec is a very simple codec: a block parseable with this codec is a JSON string encoded into a `Uint8Array`.
226+
227+
##### Using the JSON codec
228+
229+
```typescript
230+
import * as json from 'multiformats/codecs/json'
231+
232+
const block = new TextEncoder().encode('{ "hello": "world" }')
233+
const obj = json.decode(block)
234+
235+
console.info(obj) // { hello: 'world' }
236+
```
237+
238+
##### Content-Type
239+
240+
When the `JSON` codec is encountered, the `Content-Type` header of the response will be set to `application/json`.
241+
242+
#### DAG-JSON
243+
244+
[DAG-JSON](https://ipld.io/docs/codecs/known/dag-json/) expands on the `JSON` codec, adding the ability to contain [CID](https://docs.ipfs.tech/concepts/content-addressing/)s which act as links to other blocks, and byte arrays.
245+
246+
`CID`s and byte arrays are represented using special object structures with a single `"/"` property.
247+
248+
Using `DAG-JSON` has two important caveats:
249+
250+
1. Your `JSON` structure cannot contain an object with only a `"/"` property, as it will be interpreted as a special type.
251+
2. Since `JSON` has no technical limit on number sizes, `DAG-JSON` also allows numbers larger than `Number.MAX_SAFE_INTEGER`. JavaScript requires use of `BigInt`s to represent numbers larger than this, and `JSON.parse` does not support them, so precision will be lost.
252+
253+
Otherwise this codec follows the same rules as the `JSON` codec.
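The second caveat can be seen with plain `JSON.parse`, independent of any IPFS library. The sketch below parses an integer one above `2 ** 53` and shows that the resulting `number` is no longer the value that was written, while `BigInt` keeps it exact.

```typescript
// 9007199254740993 is 2^53 + 1, one more than a double can represent exactly
const text = '{ "n": 9007199254740993 }'

const parsed: { n: number } = JSON.parse(text)

// JSON.parse silently rounds to the nearest representable double (2^53)
console.info(parsed.n === 2 ** 53)
console.info(Number.isSafeInteger(parsed.n))

// BigInt can hold the original value, but JSON.parse never produces one
const exact = BigInt('9007199254740993')
console.info(BigInt(parsed.n) === exact)
```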
254+
255+
##### Using the DAG-JSON codec
256+
257+
```typescript
258+
import * as dagJson from '@ipld/dag-json'
259+
260+
const block = new TextEncoder().encode(`{
261+
"hello": "world",
262+
"cid": {
263+
"/": "baeaaac3imvwgy3zao5xxe3de"
264+
},
265+
"buf": {
266+
"/": {
267+
"bytes": "AAECAwQ"
268+
}
269+
}
270+
}`)
271+
272+
const obj = dagJson.decode(block)
273+
274+
console.info(obj)
275+
// {
276+
// hello: 'world',
277+
// cid: CID(baeaaac3imvwgy3zao5xxe3de),
278+
// buf: Uint8Array(5) [ 0, 1, 2, 3, 4 ]
279+
// }
280+
```
281+
282+
##### Content-Type
283+
284+
When the `DAG-JSON` codec is encountered in the requested CID, the `Content-Type` header of the response will be set to `application/json`.
285+
286+
`DAG-JSON` data can be parsed from the response by using the `.json()` function, which will return `CID`s/byte arrays as plain `{ "/": ... }` objects:
287+
288+
```typescript
289+
import { verifiedFetch } from '@helia/verified-fetch'
290+
import * as dagJson from '@ipld/dag-json'
291+
292+
const res = await verifiedFetch('ipfs://bafyDAGJSON')
293+
294+
// either:
295+
const obj = await res.json()
296+
console.info(obj.cid) // { "/": "baeaaac3imvwgy3zao5xxe3de" }
297+
console.info(obj.buf) // { "/": { "bytes": "AAECAwQ" } }
298+
```
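When working with the plain objects returned by `.json()`, it can be useful to tell links and byte arrays apart from ordinary objects. The type guards below are a minimal sketch based on the `{ "/": ... }` shapes shown above; the names (`isDagJsonLink`, `isDagJsonBytes`, and so on) are hypothetical and are not exported by `@ipld/dag-json`.

```typescript
interface DagJsonLink { '/': string }
interface DagJsonBytes { '/': { bytes: string } }

function isRecord (value: unknown): value is Record<string, unknown> {
  return typeof value === 'object' && value !== null && !Array.isArray(value)
}

// true only for objects whose single property is "/"
function hasOnlySlashKey (value: unknown): value is Record<'/', unknown> {
  if (!isRecord(value)) {
    return false
  }

  const keys = Object.keys(value)
  return keys.length === 1 && keys[0] === '/'
}

// { "/": "<cid string>" }
function isDagJsonLink (value: unknown): value is DagJsonLink {
  return hasOnlySlashKey(value) && typeof value['/'] === 'string'
}

// { "/": { "bytes": "<base64 string>" } }
function isDagJsonBytes (value: unknown): value is DagJsonBytes {
  if (!hasOnlySlashKey(value)) {
    return false
  }

  const inner = value['/']
  return isRecord(inner) &&
    Object.keys(inner).length === 1 &&
    typeof inner['bytes'] === 'string'
}
```

These guards only inspect the shape of the object; they do not validate that the string is a well-formed CID or valid base64.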
299+
300+
Alternatively, it can be decoded using the `@ipld/dag-json` module and the `.arrayBuffer()` method, in which case you will get `CID` objects and `Uint8Array`s:
301+
302+
```typescript
303+
import { verifiedFetch } from '@helia/verified-fetch'
304+
import * as dagJson from '@ipld/dag-json'
305+
306+
const res = await verifiedFetch('ipfs://bafyDAGJSON')
307+
308+
// or:
309+
const obj = dagJson.decode(new Uint8Array(await res.arrayBuffer()))
310+
console.info(obj.cid) // CID(baeaaac3imvwgy3zao5xxe3de)
311+
console.info(obj.buf) // Uint8Array(5) [ 0, 1, 2, 3, 4 ]
312+
```
313+
314+
#### DAG-CBOR
315+
316+
[DAG-CBOR](https://ipld.io/docs/codecs/known/dag-cbor/) uses the [Concise Binary Object Representation](https://cbor.io/) format for serialization instead of JSON.
317+
318+
It supports more data types in a safer way than JSON and is smaller on the wire, so it is usually preferable to JSON or DAG-JSON.
319+
320+
##### Content-Type
321+
322+
Not all data types supported by `DAG-CBOR` can be successfully turned into JSON and back into the same binary form.
323+
324+
When a decoded block can be round-tripped to JSON, the `Content-Type` will be set to `application/json`. In this case the `.json()` method on the `Response` object can be used to obtain an object representation of the response.
325+
326+
When it cannot, the `Content-Type` will be `application/octet-stream` - in this case the `@ipld/dag-cbor` module must be used to deserialize the return value from `.arrayBuffer()`.
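To give a feel for what "can be round-tripped to JSON" means, the sketch below walks a decoded value and rejects anything JSON cannot represent, such as `BigInt`s, `Uint8Array`s and class instances like `CID`. This is an approximation for illustration only: `isJsonSafe` is a hypothetical helper, and the check the library actually performs may differ (for example in how it treats floats).

```typescript
// Hypothetical helper: does this decoded value contain only JSON-friendly types?
function isJsonSafe (value: unknown): boolean {
  if (value === null) {
    return true
  }

  switch (typeof value) {
    case 'string':
    case 'boolean':
      return true
    case 'number':
      // NaN and ±Infinity have no JSON representation
      return Number.isFinite(value)
    case 'object':
      break
    default:
      // bigint, undefined, function, symbol
      return false
  }

  if (Array.isArray(value)) {
    return value.every(item => isJsonSafe(item))
  }

  // only plain objects pass; Uint8Array, CID and other class instances do not
  if (Object.getPrototypeOf(value) !== Object.prototype) {
    return false
  }

  return Object.values(value as Record<string, unknown>).every(item => isJsonSafe(item))
}

console.info(isJsonSafe({ hello: 'world', list: [1, 2, { nested: null }] }))
console.info(isJsonSafe({ buf: new Uint8Array([0, 1, 2]) }))
```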
327+
328+
##### Detecting JSON-safe DAG-CBOR
329+
330+
If the `Content-Type` header of the response is `application/json`, the `.json()` method may be used to access the response body in object form, otherwise the `.arrayBuffer()` method must be used to decode the raw bytes using the `@ipld/dag-cbor` module.
331+
332+
```typescript
333+
import { verifiedFetch } from '@helia/verified-fetch'
334+
import * as dagCbor from '@ipld/dag-cbor'
335+
336+
const res = await verifiedFetch('ipfs://bafyDagCborCID')
337+
let obj
338+
339+
if (res.headers.get('Content-Type') === 'application/json') {
340+
// DAG-CBOR data can be safely decoded as JSON
341+
obj = await res.json()
342+
} else {
343+
// response contains non-JSON friendly data types
344+
obj = dagCbor.decode(new Uint8Array(await res.arrayBuffer()))
345+
}
346+
347+
console.info(obj) // ...
348+
```
349+
151350
## Comparison to fetch
152351

153352
This module attempts to act as similarly to the `fetch()` API as possible.
@@ -165,7 +364,7 @@ This library supports the following methods of fetching web3 content from IPFS:
165364
2. IPNS protocol: `ipns://<peerId>` & `ipns://<publicKey>` & `ipns://<hostUri_Supporting_DnsLink_TxtRecords>`
166365
3. CID instances: An actual CID instance `CID.parse('bafy...')`
167366

168-
As well as support for pathing & params for item 1 & 2 above according to [IPFS - Path Gateway Specification](https://specs.ipfs.tech/http-gateways/path-gateway) & [IPFS - Trustless Gateway Specification](https://specs.ipfs.tech/http-gateways/trustless-gateway/). Further refinement of those specifications specifically for web-based scenarios can be found in the [Web Pathing Specification IPIP](https://github.com/ipfs/specs/pull/453).
367+
As well as support for pathing & params for items 1 & 2 above according to [IPFS - Path Gateway Specification](https://specs.ipfs.tech/http-gateways/path-gateway) & [IPFS - Trustless Gateway Specification](https://specs.ipfs.tech/http-gateways/trustless-gateway/). Further refinement of those specifications specifically for web-based scenarios can be found in the [Web Pathing Specification IPIP](https://github.com/ipfs/specs/pull/453).
169368

170369
If you pass a CID instance, it assumes you want the content for that specific CID only, and does not support pathing or params for that CID.
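The string forms above share a common shape: a protocol, a root (CID or IPNS name), an optional path and optional params. The regular expression below is a simplified sketch of how such a string breaks down; `parseIpfsUrl` is a hypothetical helper and is not how `@helia/verified-fetch` parses its input.

```typescript
interface ParsedIpfsUrl {
  protocol: 'ipfs' | 'ipns'
  root: string // CID, peer ID, public key or DNSLink host
  path: string // '' when no path is present
  query: string // '' when no params are present
}

// protocol://root[/path][?query] - any #fragment is ignored
const IPFS_URL = /^(ipfs|ipns):\/\/([^/?#]+)(\/[^?#]*)?(?:\?([^#]*))?/

function parseIpfsUrl (url: string): ParsedIpfsUrl | undefined {
  const match = IPFS_URL.exec(url)

  if (match == null) {
    return undefined
  }

  return {
    protocol: match[1] as 'ipfs' | 'ipns',
    root: match[2] ?? '',
    path: match[3] ?? '',
    query: match[4] ?? ''
  }
}

console.info(parseIpfsUrl('ipfs://bafyFoo/path/to/file.json?format=raw'))
```

A CID instance has no path or params to split out, which is consistent with the behaviour described above.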
171370

package.json

+9-7
@@ -142,25 +142,26 @@
142142
},
143143
"dependencies": {
144144
"@helia/block-brokers": "^2.0.1",
145-
"@helia/dag-cbor": "^3.0.0",
146-
"@helia/dag-json": "^3.0.0",
147145
"@helia/http": "^1.0.1",
148146
"@helia/interface": "^4.0.0",
149147
"@helia/ipns": "^6.0.0",
150-
"@helia/json": "^3.0.0",
151148
"@helia/routers": "^1.0.0",
152149
"@helia/unixfs": "^3.0.0",
153-
"@ipld/dag-cbor": "^9.1.0",
154-
"@ipld/dag-json": "^10.1.7",
155-
"@ipld/dag-pb": "^4.0.8",
150+
"@ipld/dag-cbor": "^9.2.0",
151+
"@ipld/dag-json": "^10.2.0",
152+
"@ipld/dag-pb": "^4.1.0",
156153
"@libp2p/interface": "^1.1.2",
157154
"@libp2p/peer-id": "^4.0.5",
155+
"cborg": "^4.0.9",
158156
"hashlru": "^2.3.0",
159157
"ipfs-unixfs-exporter": "^13.5.0",
160-
"multiformats": "^13.0.1",
158+
"multiformats": "^13.1.0",
161159
"progress-events": "^1.0.0"
162160
},
163161
"devDependencies": {
162+
"@helia/dag-cbor": "^3.0.0",
163+
"@helia/dag-json": "^3.0.0",
164+
"@helia/json": "^3.0.0",
164165
"@helia/utils": "^0.0.1",
165166
"@libp2p/logger": "^4.0.5",
166167
"@libp2p/peer-id-factory": "^4.0.5",
@@ -171,6 +172,7 @@
171172
"datastore-core": "^9.2.8",
172173
"helia": "^4.0.1",
173174
"it-last": "^3.0.4",
175+
"it-to-buffer": "^4.0.5",
174176
"magic-bytes.js": "^1.8.0",
175177
"sinon": "^17.0.1",
176178
"sinon-ts": "^2.0.0",
