Implement HTTP caching #3231
I've looked into implementing this as an interceptor, and need some information in order to proceed:
Can you elaborate on what you have in mind?
The closest implementation will be the
I have some interest in implementing it...
I was trying to work out the easiest way of caching the response object in memory, given that I assume it would come through with the body as a readable stream that needs to be cloned. I'd assume the Web Cache API already has to implement such behavior. I'll step out of the implementation-detail discussion now if someone else is willing to take over.
@Uzlopak I'm very time poor, so I'm more than happy to step back on my side! Note you'll probably want to implement rfc9111, which appears to obsolete rfc7234.
I am also not very time "rich", but I have already invested some hours reading RFC 9111. The only thing that was not clear to me is how we ensure that we don't open a security issue :D What I mean is: in a browser context you have one user, and fetch calls can be cached because the cache is user-scoped. In Node we don't have users, so an HTTP cache would be globally activated(?) in Node.js. So we need something like setGlobalDispatcher, but for caching? setGlobalHttpCache?
I would start with an interceptor or Agent that implements the HTTP caching protocol, and then worry about how we enable it all the time. I think it's possible to have it always enabled and not cache any private data: that's how CDNs work, after all. Before worrying about this, we need the implementation ;).
Ok, I will start tonight. ;)
I've implemented http-cache-semantics by reading the RFC start to finish and translating each paragraph into code. I highly recommend that approach, because it results in good comments with references to their RFC sections (I wish I had kept more of them in my code), and helps ensure that non-obvious behaviors are not missed.
Usually when I implement an RFC or any other spec, I print it out. Then, like you say, I use the text for commenting the code. Once I've implemented a paragraph, I mark it in the printout too: yellow marker for implemented, pink marker for has an issue and should be checked again. When the whole text is colored, I know the spec is implemented completely ;).
Any updates on this?
ping |
We have already been working a bit on this: nxtedition/nxt-undici#3. Maybe we should sync our efforts?
That sounds good to me; how much progress have y'all made on it?
Basing off of some ideas discussed here as well as ones in nxtedition/nxt-undici, what do y'all think of this API?

```ts
interface CacheInterceptorGlobalOptions {
  // Function for creating the cache key
  makeKey?: (opts: Dispatcher.RequestOptions) => string;
  store?: CacheStore;
}

// Interface responsible for storing the cached responses.
interface CacheStore {
  get(key: string): Promise<string>;
  put(opts: CachePutOptions): Promise<void>;
}

type CachePutOptions = {
  key: string;
  value: string;
  size: number;
  vary?: string;
  expires?: number;
  // Subject to change depending on implementation specifics, but the idea is to
  // provide the cache store with all the info we're given in the cache control
  // directives
};
```

Example usage:

```js
const { Client } = require('undici')
const cacheInterceptor = require('../lib/interceptor/cache')

const client = new Client('https://google.com')
  .compose(cacheInterceptor({
    makeKey: (opts) => `${opts.origin}:${opts.method}:${opts.path}`,
    store: {
      get: async (key) => {/* ... */},
      put: async (opts) => {/* ... */}
    }
  }))

client.request({ path: '/', method: 'GET' }).then(/* ... */);
```

Defaults
The spec recommends in Section 2 that the cache key be at least the request's URI and method. For the default cache store, I think we could use:

```sql
CREATE TABLE IF NOT EXISTS cacheInterceptor(
  key TEXT PRIMARY KEY NOT NULL,
  value TEXT NOT NULL,
  vary TEXT,
  size INTEGER,
  expires INTEGER
  -- Subject to change depending on implementation specifics
) STRICT;

CREATE INDEX IF NOT EXISTS idxCacheInterceptorExpires ON cacheInterceptor(expires);
```

For purging old responses, we can go with the same approach as in nxtedition/nxt-undici#4 and purge them in the
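To make the proposed `CacheStore` shape concrete, here is a minimal in-memory sketch of it. The `Map`-based storage, the lazy expiry check, and the class name are illustrative assumptions on my part, not the eventual implementation:

```js
// Hypothetical in-memory CacheStore matching the proposed get/put interface.
// The Map storage and lazy expiry purge here are illustrative only.
class MemoryCacheStore {
  constructor () {
    this.entries = new Map()
  }

  // Resolves with the cached value, or undefined when missing or expired.
  async get (key) {
    const entry = this.entries.get(key)
    if (!entry) return undefined
    if (entry.expires !== undefined && entry.expires <= Date.now()) {
      this.entries.delete(key) // lazily drop stale entries on read
      return undefined
    }
    return entry.value
  }

  // opts mirrors the proposed CachePutOptions shape.
  async put ({ key, value, size, vary, expires }) {
    this.entries.set(key, { value, size, vary, expires })
  }
}
```

A real store would also enforce `size` limits and honor `vary`; this sketch only demonstrates the key/value and expiry mechanics.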
Why can't this be simplified to something like:

```ts
interface CacheInterceptorGlobalOptions {
  store?: CacheStore;
}

// Interface responsible for storing the cached responses.
interface CacheStore {
  get(key: Dispatcher.RequestOptions): Promise<??>;
  put(key: Dispatcher.RequestOptions, opts: CachePutOptions): Promise<void>;
}

type CachePutOptions = {
  value: string;
  size: number;
  vary?: string;
  expires?: number;
  // Subject to change depending on implementation specifics, but the idea is to
  // provide the cache store with all the info we're given in the cache control
  // directives
};
```

I'm unsure we could always streamline the key to a string without overhead.
I think key can always be
It probably will be, but IMO it'd still be good to allow it to be overridden if there's some use case that calls for it. It's definitely not needed right now, though, and can always be added later.
Depending on the scope of the cache, we might also need to add the origin to the mix (as recommended by the spec as well).
I thought in this case the server should respond with a Vary header containing
I still haven't gotten an answer as to why the cache key needs to be a string. I think we should keep it maximally flexible and pass in the request options.
That won't work, though: you can have different request options that produce the same result. That's why the spec says you should set the key to method + path and then use the Vary header to determine which request headers are relevant.
RFC 9111 Section 2 (https://www.rfc-editor.org/rfc/rfc9111.html#section-2) says
I can see no mention of "method+path"; I would infer the URI must be absolute, as different hosts could provide unrelated responses with the same path. Am I missing something? I thought the Vary field cannot have a value which includes the host? According to https://httpwg.org/specs/rfc9110.html#field.vary,
method+path === "the request method and target URI" ?! |
https://www.ibm.com/docs/en/cics-ts/6.x?topic=concepts-components-url — path = the specific resource on the host
If different hosts can provide unrelated responses, then they should reply with
That doesn't say anything about not including the host header.
Cache is always split per URL, and Vary only adds additional granularity within the single URL. (Cache is technically also split per method, but PUT/POST invalidates cached GET, so in practice you can just cache GET.) Varying by
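The behavior described above (cache split per URL, only GET cached in practice, unsafe methods invalidating the cached entry) can be sketched roughly as follows. The class and method names are illustrative, not from undici:

```js
const UNSAFE_METHODS = new Set(['POST', 'PUT', 'PATCH', 'DELETE'])

// Illustrative sketch: a cache keyed per URL, where a successful unsafe
// request invalidates whatever cached GET response exists for that URL
// (per RFC 9111 §4.4, "Invalidating Stored Responses").
class UrlScopedCache {
  constructor () {
    this.byUrl = new Map() // url -> cached GET response
  }

  lookup (method, url) {
    if (method !== 'GET') return undefined // in practice only GET is cached
    return this.byUrl.get(url)
  }

  store (method, url, response) {
    if (method === 'GET') this.byUrl.set(url, response)
  }

  // Call this when a response to any request for `url` arrives.
  onResponse (method, url, statusCode) {
    if (UNSAFE_METHODS.has(method) && statusCode < 400) {
      this.byUrl.delete(url) // POST/PUT/etc. invalidates the cached GET
    }
  }
}
```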
Any chance you could refer to the relevant parts of the spec?
+1
It allows each cache store to have its own
+1 on a custom way to generate the cache key, but only if we advise that this will cause the cache to deviate from the spec.
A good HTTP cache should have two cache levels: one keyed by URL, and within that, one keyed by the request headers named in Vary.
If the second lookup misses, it doesn't invalidate anything yet; it only goes to fetch a response. If the server responds with the same Vary, the new response is stored as an additional variant. Only when the server sends a different Vary
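A rough sketch of the two-level lookup discussed above: an outer map keyed by URL, with inner variants matched against the request-header values named by Vary. All names here are illustrative assumptions, not undici API:

```js
// Illustrative two-level cache: URL -> list of variants, where each variant
// records the request-header values (from Vary) it was stored under.
class TwoLevelCache {
  constructor () {
    this.byUrl = new Map()
  }

  // Level 1: look up by URL. Level 2: find a variant whose recorded
  // vary headers all match the incoming request's headers.
  lookup (url, requestHeaders) {
    const variants = this.byUrl.get(url) ?? []
    const hit = variants.find((v) =>
      Object.entries(v.varyHeaders)
        .every(([name, value]) => requestHeaders[name] === value)
    )
    return hit ? hit.response : undefined
  }

  // Store a response along with the request-header values named by Vary.
  store (url, requestHeaders, varyNames, response) {
    const varyHeaders = {}
    for (const name of varyNames) varyHeaders[name] = requestHeaders[name]
    const variants = this.byUrl.get(url) ?? []
    variants.push({ varyHeaders, response })
    this.byUrl.set(url, variants)
  }
}
```

A miss at the second level would fall through to the origin, as described above; replacing the variant set on a changed Vary is left out for brevity.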
Just as an update: I'm planning to have at least a draft PR open by this Friday. Not all of the caching directives or optional behaviors will be supported in it, just to limit its scope (the required ones will be implemented, of course). Follow-up PRs will implement more.
Would be nice to see a Node cache similar to the Web API Cache: https://developer.mozilla.org/en-US/docs/Web/API/Cache/matchAll
I'd recommend prioritizing the interceptor work first; afterwards we can evaluate how to set up/port Cache as a Web API.
Implements bare-bones http caching as per rfc9111 Closes nodejs#3231 Closes nodejs#2760 Closes nodejs#2256 Closes nodejs#1146 Signed-off-by: flakey5 <73616808+flakey5@users.noreply.github.com>
Implements bare-bones http caching as per rfc9111 Closes nodejs#3231 Closes nodejs#2760 Closes nodejs#2256 Closes nodejs#1146 Co-authored-by: Carlos Fuentes <me@metcoder.dev> Co-authored-by: Robert Nagy <ronagy@icloud.com> Co-authored-by: Isak Törnros <isak.tornros@hotmail.com> Signed-off-by: flakey5 <73616808+flakey5@users.noreply.github.com>
This would solve...
Performance and robustness issues associated with fetching the same HTTP resource multiple times.
The implementation should look like...
Use an interceptor(?) to implement HTTP caching using the Cache-Control header. I'm inclined to think that the Expires header should not be implemented, as HTTP 1.0 has extremely limited use.
Additional context
Originally raised in #3221.
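As a sketch of what Cache-Control-based freshness could look like for that interceptor: RFC 9111 §4.2 says a stored response is fresh while its age is below its freshness lifetime, which the max-age directive supplies explicitly. The helper names below are my own, and this deliberately ignores s-maxage, the Age header, heuristics, and Expires:

```js
// Illustrative max-age parser. Ignores s-maxage, Age, heuristics, etc.
function parseMaxAge (cacheControl) {
  if (!cacheControl) return undefined
  // Match max-age only at the start or after a comma, so s-maxage is skipped.
  const match = /(?:^|,)\s*max-age\s*=\s*(\d+)/i.exec(cacheControl)
  return match ? Number(match[1]) : undefined
}

// A stored response is fresh while its age is below its freshness lifetime
// (RFC 9111 §4.2). storedAtMs is when we cached it; nowMs is the current time.
function isFresh (cacheControl, storedAtMs, nowMs = Date.now()) {
  const maxAge = parseMaxAge(cacheControl)
  if (maxAge === undefined) return false // no explicit lifetime: treat as stale
  const ageSeconds = (nowMs - storedAtMs) / 1000
  return ageSeconds < maxAge
}
```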