
Commit 114f3a4

2color, SgtPooki, achingbrain, and semantic-release-bot authored
fix: set cache-control header correctly (#19)
* fix: set cache-control header conditionally

  Fixes #17

* chore: bump deps
* feat: implement new ipns record&answer properties (#23)
* feat: implement new ipns record&answer properties
* fix: parseUrlString response includes defined ttl, set ttl if ipnsCached
* test: fix firefox failure
* feat: support http range header (#10)
* chore: limit body parameters to the types used
* chore: add response-header helper and tests
* feat: add range header parsing support
* feat: verified-fetch supports range-requests
* test: fix dns test asserting test failure since we are catching it now
* fix: return 500 error when streaming unixfs content throws
* fix: cleanup code and unexecuting tests hiding errors
* chore: some cleanup and code coverage
* tmp: most things working
* fix: stream slicing and test correctness
* chore: fixed some ByteRangeContext tests
* test: add back header helpers
* fix: unixfs tests are passing
* fix: range-requests on raw content
* feat: tests are passing

  moved transform stream over to https://github.com/SgtPooki/streams

* chore: log string casing
* chore: use 502 response instead of 500
* chore: use libp2p/interface for types in src
* chore: failing to create range resp logs error
* chore: Apply suggestions from code review
* chore: fix broken tests from github PR patches (my own)
* chore: re-enable stream tests for ByteRangeContext
* chore: clean up getBody a bit
* chore: ByteRangeContext getBody cleanup
* chore: apply suggestions from code review

  Co-authored-by: Alex Potsides <alex@achingbrain.net>

* fix: getSlicedBody uses correct types
* chore: remove extra stat call
* chore: fix jsdoc with '*/'
* chore: fileSize is public property, but should not be used
* test: fix blob comparisons that broke or were never working properly
* chore: Update byte-range-context.ts

  Co-authored-by: Alex Potsides <alex@achingbrain.net>

* chore: jsdoc cleanup
* Revert "chore: fileSize is public property, but should not be used"

  This reverts commit 46dc133.

* chore: jsdoc comments explaining .fileSize use
* chore: isRangeRequest is public
* chore: getters/setters update
* chore: remove unnecessary _contentRangeHeaderValue
* chore: ByteRangeContext uses setFileSize and getFileSize
* chore: remove .stat changes that are no longer needed

---------

Co-authored-by: Alex Potsides <alex@achingbrain.net>

* chore(release): 1.2.0 [skip ci]

  ## @helia/verified-fetch [1.2.0](https://github.com/ipfs/helia-verified-fetch/compare/@helia/verified-fetch-1.1.3...@helia/verified-fetch-1.2.0) (2024-03-15)

  ### Features

  * support http range header ([#10](#10)) ([9f5078a](9f5078a))

  ### Trivial Changes

  * fix build ([#22](#22)) ([01261fe](01261fe))

* chore(release): 1.7.0 [skip ci]

  ## @helia/verified-fetch-interop [1.7.0](https://github.com/ipfs/helia-verified-fetch/compare/@helia/verified-fetch-interop-1.6.0...@helia/verified-fetch-interop-1.7.0) (2024-03-15)

  ### Dependencies

  * **@helia/verified-fetch:** upgraded to 1.2.0

* chore: apply pr comments
* fix: some ipns ttl precision cleanup

---------

Co-authored-by: Alex Potsides <alex@achingbrain.net>
Co-authored-by: semantic-release-bot <semantic-release-bot@martynus.net>

* chore: add matchUrlGroups typeguard
* chore: remove unnecessary headerValue != null check
* test: remove unnecessary redefinition of verifiedFetch

---------

Co-authored-by: Daniel N <2color@users.noreply.github.com>
Co-authored-by: Russell Dempsey <1173416+SgtPooki@users.noreply.github.com>
Co-authored-by: Alex Potsides <alex@achingbrain.net>
Co-authored-by: semantic-release-bot <semantic-release-bot@martynus.net>
1 parent a240056 commit 114f3a4

7 files changed: +284 -36 lines changed

packages/verified-fetch/src/utils/parse-resource.ts

+3 -2
@@ -32,8 +32,9 @@ export async function parseResource (resource: Resource, { ipns, logger }: Parse
       cid,
       protocol: 'ipfs',
       path: '',
-      query: {}
-    }
+      query: {},
+      ttl: 29030400 // 1 year for ipfs content
+    } satisfies ParsedUrlStringResults
   }
 
   throw new TypeError(`Invalid resource. Cannot determine CID from resource: ${resource}`)

packages/verified-fetch/src/utils/parse-url-string.ts

+81 -19
@@ -2,11 +2,11 @@ import { peerIdFromString } from '@libp2p/peer-id'
 import { CID } from 'multiformats/cid'
 import { TLRU } from './tlru.js'
 import type { RequestFormatShorthand } from '../types.js'
-import type { IPNS, ResolveDNSLinkProgressEvents, ResolveResult } from '@helia/ipns'
+import type { DNSLinkResolveResult, IPNS, IPNSResolveResult, ResolveDNSLinkProgressEvents, ResolveResult } from '@helia/ipns'
 import type { ComponentLogger } from '@libp2p/interface'
 import type { ProgressOptions } from 'progress-events'
 
-const ipnsCache = new TLRU<ResolveResult>(1000)
+const ipnsCache = new TLRU<DNSLinkResolveResult | IPNSResolveResult>(1000)
 
 export interface ParseUrlStringInput {
   urlString: string
@@ -23,30 +23,80 @@ export interface ParsedUrlQuery extends Record<string, string | unknown> {
   filename?: string
 }
 
-export interface ParsedUrlStringResults {
-  protocol: string
-  path: string
-  cid: CID
+interface ParsedUrlStringResultsBase extends ResolveResult {
+  protocol: 'ipfs' | 'ipns'
   query: ParsedUrlQuery
+
+  /**
+   * seconds as a number
+   */
+  ttl?: number
 }
 
+export type ParsedUrlStringResults = ParsedUrlStringResultsBase
+
 const URL_REGEX = /^(?<protocol>ip[fn]s):\/\/(?<cidOrPeerIdOrDnsLink>[^/?]+)\/?(?<path>[^?]*)\??(?<queryString>.*)$/
 const PATH_REGEX = /^\/(?<protocol>ip[fn]s)\/(?<cidOrPeerIdOrDnsLink>[^/?]+)\/?(?<path>[^?]*)\??(?<queryString>.*)$/
 const PATH_GATEWAY_REGEX = /^https?:\/\/(.*[^/])\/(?<protocol>ip[fn]s)\/(?<cidOrPeerIdOrDnsLink>[^/?]+)\/?(?<path>[^?]*)\??(?<queryString>.*)$/
 const SUBDOMAIN_GATEWAY_REGEX = /^https?:\/\/(?<cidOrPeerIdOrDnsLink>[^/?]+)\.(?<protocol>ip[fn]s)\.([^/?]+)\/?(?<path>[^?]*)\??(?<queryString>.*)$/
 
-function matchURLString (urlString: string): Record<string, string> {
+interface MatchUrlGroups {
+  protocol: 'ipfs' | 'ipns'
+  cidOrPeerIdOrDnsLink: string
+  path?: string
+  queryString?: string
+}
+
+function matchUrlGroupsGuard (groups?: null | { [key in string]: string; } | MatchUrlGroups): groups is MatchUrlGroups {
+  const protocol = groups?.protocol
+  if (protocol == null) return false
+  const cidOrPeerIdOrDnsLink = groups?.cidOrPeerIdOrDnsLink
+  if (cidOrPeerIdOrDnsLink == null) return false
+  const path = groups?.path
+  const queryString = groups?.queryString
+
+  return ['ipns', 'ipfs'].includes(protocol) &&
+    typeof cidOrPeerIdOrDnsLink === 'string' &&
+    (path == null || typeof path === 'string') &&
+    (queryString == null || typeof queryString === 'string')
+}
+
+function matchURLString (urlString: string): MatchUrlGroups {
   for (const pattern of [URL_REGEX, PATH_REGEX, PATH_GATEWAY_REGEX, SUBDOMAIN_GATEWAY_REGEX]) {
     const match = urlString.match(pattern)
 
-    if (match?.groups != null) {
-      return match.groups
+    if (matchUrlGroupsGuard(match?.groups)) {
+      return match.groups satisfies MatchUrlGroups
     }
   }
 
   throw new TypeError(`Invalid URL: ${urlString}, please use ipfs://, ipns://, or gateway URLs only`)
 }
 
+/**
+ * determines the TTL for the resolved resource that will be used for the `Cache-Control` header's `max-age` directive.
+ * max-age is in seconds
+ *
+ * @see https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Cache-Control#response_directives
+ *
+ * If we have ipnsTtlNs, it will be a BigInt representing "nanoseconds". We need to convert it back to seconds.
+ *
+ * For more TTL nuances:
+ *
+ * @see https://github.com/ipfs/js-ipns/blob/16e0e10682fa9a663e0bb493a44d3e99a5200944/src/index.ts#L200
+ * @see https://github.com/ipfs/js-ipns/pull/308
+ */
+function calculateTtl (resolveResult?: IPNSResolveResult | DNSLinkResolveResult): number | undefined {
+  if (resolveResult == null) {
+    return undefined
+  }
+  const dnsLinkTtl = (resolveResult as DNSLinkResolveResult).answer?.TTL
+  const ipnsTtlNs = (resolveResult as IPNSResolveResult).record?.ttl
+  // For some reason, ipns "nanoseconds" are 1e-8 of a second, instead of 1e-9.
+  const ipnsTtl = ipnsTtlNs != null ? Number(ipnsTtlNs / BigInt(1e8)) : undefined
+  return dnsLinkTtl ?? ipnsTtl
+}
+
 /**
  * For dnslinks see https://specs.ipfs.tech/http-gateways/subdomain-gateway/#host-request-header
  * DNSLink names include . which means they must be inlined into a single DNS label to provide unique origin and work with wildcard TLS certificates.
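
The matching step in this hunk can be tried in isolation. The following is a minimal sketch, not the package's code: it reuses two of the four patterns from the diff verbatim, and the names `UrlGroups`, `toUrlGroups`, and `matchUrl` are hypothetical. Instead of a type predicate it copies the captured groups into a typed object, which is one alternative way to get a narrowed return type. The identifier `bafyexample` in the usage note is a placeholder, not a valid CID; nothing here validates CIDs.

```typescript
// Two of the four patterns from the diff, copied as-is.
const URL_REGEX = /^(?<protocol>ip[fn]s):\/\/(?<cidOrPeerIdOrDnsLink>[^/?]+)\/?(?<path>[^?]*)\??(?<queryString>.*)$/
const PATH_REGEX = /^\/(?<protocol>ip[fn]s)\/(?<cidOrPeerIdOrDnsLink>[^/?]+)\/?(?<path>[^?]*)\??(?<queryString>.*)$/

// Hypothetical result shape for this sketch.
interface UrlGroups {
  protocol: 'ipfs' | 'ipns'
  cidOrPeerIdOrDnsLink: string
  path: string
  queryString: string
}

// Validate the untyped regex groups and copy them into a typed object.
function toUrlGroups (groups: Record<string, string | undefined> | undefined): UrlGroups | undefined {
  if (groups == null) return undefined
  const protocol = groups['protocol']
  const cidOrPeerIdOrDnsLink = groups['cidOrPeerIdOrDnsLink']
  if (protocol !== 'ipfs' && protocol !== 'ipns') return undefined
  if (cidOrPeerIdOrDnsLink == null) return undefined
  return {
    protocol,
    cidOrPeerIdOrDnsLink,
    path: groups['path'] ?? '',
    queryString: groups['queryString'] ?? ''
  }
}

// Try each pattern in order and return the first valid match.
function matchUrl (urlString: string): UrlGroups {
  for (const pattern of [URL_REGEX, PATH_REGEX]) {
    const parsed = toUrlGroups(urlString.match(pattern)?.groups)
    if (parsed != null) return parsed
  }
  throw new TypeError(`Invalid URL: ${urlString}`)
}
```

With these definitions, `matchUrl('/ipns/example.com')` yields protocol `ipns`, identifier `example.com`, and empty `path` and `queryString`, while `matchUrl('ipfs://bafyexample/dir/file.txt?format=raw')` yields path `dir/file.txt` and query string `format=raw`.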
@@ -89,32 +139,36 @@ export async function parseUrlString ({ urlString, ipns, logger }: ParseUrlStrin
   let cid: CID | undefined
   let resolvedPath: string | undefined
   const errors: Error[] = []
+  let resolveResult: IPNSResolveResult | DNSLinkResolveResult | undefined
 
   if (protocol === 'ipfs') {
     try {
       cid = CID.parse(cidOrPeerIdOrDnsLink)
+      /**
+       * no ttl set. @link {setCacheControlHeader}
+       */
     } catch (err) {
       log.error(err)
       errors.push(new TypeError('Invalid CID for ipfs://<cid> URL'))
     }
   } else {
-    let resolveResult = ipnsCache.get(cidOrPeerIdOrDnsLink)
+    // protocol is ipns
+    resolveResult = ipnsCache.get(cidOrPeerIdOrDnsLink)
 
     if (resolveResult != null) {
       cid = resolveResult.cid
       resolvedPath = resolveResult.path
       log.trace('resolved %s to %c from cache', cidOrPeerIdOrDnsLink, cid)
     } else {
-      // protocol is ipns
-      log.trace('attempting to resolve PeerId for %s', cidOrPeerIdOrDnsLink)
+      log.trace('Attempting to resolve PeerId for %s', cidOrPeerIdOrDnsLink)
       let peerId = null
       try {
+        // try resolving as an IPNS name
         peerId = peerIdFromString(cidOrPeerIdOrDnsLink)
         resolveResult = await ipns.resolve(peerId, { onProgress: options?.onProgress })
-        cid = resolveResult?.cid
-        resolvedPath = resolveResult?.path
+        cid = resolveResult.cid
+        resolvedPath = resolveResult.path
         log.trace('resolved %s to %c', cidOrPeerIdOrDnsLink, cid)
-        ipnsCache.set(cidOrPeerIdOrDnsLink, resolveResult, 60 * 1000 * 2)
       } catch (err) {
         if (peerId == null) {
           log.error('could not parse PeerId string "%s"', cidOrPeerIdOrDnsLink, err)
@@ -126,6 +180,7 @@ export async function parseUrlString ({ urlString, ipns, logger }: ParseUrlStrin
       }
 
       if (cid == null) {
+        // cid is still null, try resolving as a DNSLink
         let decodedDnsLinkLabel = cidOrPeerIdOrDnsLink
         if (isInlinedDnsLink(cidOrPeerIdOrDnsLink)) {
           decodedDnsLinkLabel = dnsLinkLabelDecoder(cidOrPeerIdOrDnsLink)
@@ -138,7 +193,6 @@ export async function parseUrlString ({ urlString, ipns, logger }: ParseUrlStrin
           cid = resolveResult?.cid
           resolvedPath = resolveResult?.path
           log.trace('resolved %s to %c', decodedDnsLinkLabel, cid)
-          ipnsCache.set(cidOrPeerIdOrDnsLink, resolveResult, 60 * 1000 * 2)
         } catch (err: any) {
           log.error('could not resolve DnsLink for "%s"', cidOrPeerIdOrDnsLink, err)
           errors.push(err)
@@ -155,6 +209,13 @@ export async function parseUrlString ({ urlString, ipns, logger }: ParseUrlStrin
     throw new AggregateError(errors, `Invalid resource. Cannot determine CID from URL "${urlString}"`)
   }
 
+  const ttl = calculateTtl(resolveResult)
+
+  if (resolveResult != null) {
+    // use the ttl for the resolved resource for the cache, but fall back to 2 minutes if not available
+    ipnsCache.set(cidOrPeerIdOrDnsLink, resolveResult, ttl ?? 60 * 1000 * 2)
+  }
+
   // parse query string
   const query: Record<string, any> = {}
 
@@ -177,9 +238,10 @@ export async function parseUrlString ({ urlString, ipns, logger }: ParseUrlStrin
   return {
     protocol,
     cid,
-    path: joinPaths(resolvedPath, urlPath),
-    query
-  }
+    path: joinPaths(resolvedPath, urlPath ?? ''),
+    query,
+    ttl
+  } satisfies ParsedUrlStringResults
 }
 
 /**
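
One detail in this file is worth checking when reviewing: `calculateTtl` is documented as returning seconds, while the fallback passed to `ipnsCache.set` is written as `60 * 1000 * 2`, which reads like milliseconds. The diff does not show which unit `TLRU.set` expects, so the sketch below assumes milliseconds and shows how the two values could be kept in one unit. The types `ResolveLike` and the functions `ttlSeconds` and `cacheTtlMs` are hypothetical stand-ins, not the `@helia/ipns` types, and the `1e8` divisor simply mirrors the diff.

```typescript
// Hypothetical, simplified shape covering only the fields calculateTtl reads.
interface ResolveLike {
  answer?: { TTL: number } // DNSLink answer, TTL in seconds
  record?: { ttl?: bigint } // IPNS record TTL as a BigInt
}

// Assumption: the cache expects milliseconds (not confirmed by the diff).
const FALLBACK_CACHE_MS = 60 * 1000 * 2

// Mirrors calculateTtl: prefer the DNSLink TTL, else convert the IPNS TTL to seconds.
function ttlSeconds (result?: ResolveLike, unitsPerSecond: bigint = BigInt(1e8)): number | undefined {
  if (result == null) return undefined
  const dnsLinkTtl = result.answer?.TTL
  const ipnsTtlRaw = result.record?.ttl
  const ipnsTtl = ipnsTtlRaw != null ? Number(ipnsTtlRaw / unitsPerSecond) : undefined
  return dnsLinkTtl ?? ipnsTtl
}

// Convert the seconds value to milliseconds before handing it to the cache.
function cacheTtlMs (seconds?: number): number {
  return seconds != null ? seconds * 1000 : FALLBACK_CACHE_MS
}
```

Under that assumption, a DNSLink answer with `TTL: 60` would be cached for `cacheTtlMs(60)`, that is 60000 ms, instead of 60 ms; a missing TTL keeps the two-minute fallback.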

packages/verified-fetch/src/utils/response-headers.ts

+36
@@ -1,3 +1,39 @@
+interface CacheControlHeaderOptions {
+  /**
+   * This should be seconds as a number.
+   *
+   * See https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Cache-Control#response_directives
+   */
+  ttl?: number
+  protocol: 'ipfs' | 'ipns'
+  response: Response
+}
+
+/**
+ * Implementations may place an upper bound on any TTL received, as noted in Section 8 of [rfc2181].
+ * If TTL value is unknown, implementations should not send a Cache-Control
+ * No matter if TTL value is known or not, implementations should always send a Last-Modified header with the timestamp of the record resolution.
+ *
+ * @see https://specs.ipfs.tech/http-gateways/path-gateway/#cache-control-response-header
+ */
+export function setCacheControlHeader ({ ttl, protocol, response }: CacheControlHeaderOptions): void {
+  let headerValue: string
+  if (protocol === 'ipfs') {
+    headerValue = 'public, max-age=29030400, immutable'
+  } else if (ttl == null) {
+    /**
+     * default limit for unknown TTL: "use 5 minute as default fallback when it is not available."
+     *
+     * @see https://github.com/ipfs/boxo/issues/329#issuecomment-1995236409
+     */
+    headerValue = 'public, max-age=300'
+  } else {
+    headerValue = `public, max-age=${ttl}`
+  }
+
+  response.headers.set('cache-control', headerValue)
+}
+
 /**
  * This function returns the value of the `Content-Range` header for a given range.
  * If you know the total size of the body, pass it as `byteSize`
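
The branching in `setCacheControlHeader` can be exercised without a `Response` object. The sketch below isolates the header-value selection as a pure function; `cacheControlValue` and the two constants are hypothetical names, and the values are the ones in the diff. Note that 29030400 seconds is 336 days (48 weeks), so the "1 year" comment elsewhere in the commit is an approximation.

```typescript
type Protocol = 'ipfs' | 'ipns'

// Values taken from the diff; the constant names are made up for this sketch.
const IMMUTABLE_MAX_AGE_SECONDS = 29030400 // 336 days
const DEFAULT_IPNS_MAX_AGE_SECONDS = 300 // 5 minute fallback when no TTL is known

// Pick the Cache-Control value: immutable for ipfs, TTL-bound (or default) for ipns.
function cacheControlValue (protocol: Protocol, ttl?: number): string {
  if (protocol === 'ipfs') {
    return `public, max-age=${IMMUTABLE_MAX_AGE_SECONDS}, immutable`
  }
  return `public, max-age=${ttl ?? DEFAULT_IPNS_MAX_AGE_SECONDS}`
}
```

For `ipfs` the TTL argument is ignored, so `cacheControlValue('ipfs', 60)` still returns the immutable value; for `ipns`, `cacheControlValue('ipns', 60)` returns `public, max-age=60` and `cacheControlValue('ipns')` falls back to `public, max-age=300`.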

packages/verified-fetch/src/verified-fetch.ts

+7 -1
@@ -22,6 +22,7 @@ import { getETag } from './utils/get-e-tag.js'
 import { getStreamFromAsyncIterable } from './utils/get-stream-from-async-iterable.js'
 import { tarStream } from './utils/get-tar-stream.js'
 import { parseResource } from './utils/parse-resource.js'
+import { setCacheControlHeader } from './utils/response-headers.js'
 import { badRequestResponse, movedPermanentlyResponse, notAcceptableResponse, notSupportedResponse, okResponse, badRangeResponse, okRangeResponse, badGatewayResponse } from './utils/responses.js'
 import { selectOutputType, queryFormatToAcceptHeader } from './utils/select-output-type.js'
 import { walkPath } from './utils/walk-path.js'
@@ -441,11 +442,15 @@ export class VerifiedFetch {
     let cid: ParsedUrlStringResults['cid']
     let path: ParsedUrlStringResults['path']
     let query: ParsedUrlStringResults['query']
+    let ttl: ParsedUrlStringResults['ttl']
+    let protocol: ParsedUrlStringResults['protocol']
     try {
       const result = await parseResource(resource, { ipns: this.ipns, logger: this.helia.logger }, options)
       cid = result.cid
       path = result.path
       query = result.query
+      ttl = result.ttl
+      protocol = result.protocol
     } catch (err) {
       this.log.error('error parsing resource %s', resource, err)
 
@@ -516,7 +521,8 @@ export class VerifiedFetch {
     }
 
     response.headers.set('etag', getETag({ cid, reqFormat, weak: false }))
-    response.headers.set('cache-control', 'public, max-age=29030400, immutable')
+
+    setCacheControlHeader({ response, ttl, protocol })
     // https://specs.ipfs.tech/http-gateways/path-gateway/#x-ipfs-path-response-header
     response.headers.set('X-Ipfs-Path', resource.toString())
