-
Notifications
You must be signed in to change notification settings - Fork 2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Apollo Server 2.0 - Caching #1163
Conversation
081abe1
to
edbb2c8
Compare
edbb2c8
to
395da0f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a great start! We could probably merge now, so that we have an api that people can start playing with. Without some documentation, I hesitate to hit the button. I'll attempt to put together some thing for your morning.
My biggest worry right now is that the caching does not occur cross-request, since a new cache is created during each request. Also I didn't see a place for the user to pass in their own cache, which will soon be possible in #1169. Completely understand the hesitancy, since the api's have been changing so much
export abstract class RESTDataSource<TContext = any> { | ||
abstract baseURL: string; | ||
|
||
httpCache!: HTTPCache; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I generally like variable scopes explicit, public
in this case.
const httpCache = new HTTPCache(); | ||
|
||
for (const dataSource of Object.values(dataSources)) { | ||
dataSource.httpCache = httpCache; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This explicit coupling of data sources to an httpCache is worrisome. Maybe a lifecycle method, such as willReceiveCache
or willReceiveContext
might be a bit better. In addition to the httpCache, the primitive cache should be an argument.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The other option would be to provide them as arguments to the dataSources api. Also I'm worried that these methods are called for each request, so with the current setup, it doesn't appear that there would be any caching, since the HttpCache is created each time
|
||
export type Params = { [name: string]: any }; | ||
|
||
export abstract class RESTDataSource<TContext = any> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome! We'll eventually create an interface to share between DataSource
s. Starting with a concrete use case is definitely the way to go.
Starting to write some tests. It might be nice to have the context value be sources instead of |
It appears that we require that a rest response be parsable json. If a rest endpoint returns a string, then we should treat it as such |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks nice!
That said, the examples so far don't really make use of the fact that each DataSource is attached to an incoming request (created anew) — not clear to me that it gives you anything you don't get just from import { id } from './datasources
. Presumably there are plans here, like integrating DataLoader, tracing, auth/context, etc? Would be nice to see at least one of those things existing in practice before 2.0 release.
@@ -121,6 +122,10 @@ export class ApolloServerBase { | |||
delete requestOptions.persistedQueries; | |||
} | |||
|
|||
if (!requestOptions.cache) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should stick to the philosophy that all extra features can be turned off with false
.
@@ -0,0 +1,114 @@ | |||
import fetch, { Request, Response, Headers } from 'node-fetch'; | |||
|
|||
Object.assign(global, { fetch, Request, Response, Headers }); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be conditional, or ideally not actually global? I think it would be cleaner to have code explicitly import these symbols from apollo-server-env. I know I would want to be able to find (while reading say RESTDataSource) where that URL and Request symbol come from...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've been going back and forth on this, so definitely open to changing it. The reason for making them global was that this is how environments other than Node (browser, Cloudflare, fly.io) expose them.
We also have existing code (including apollo-engine-reporting
) relying on the types being global (because it compiles with the dom
lib), so this would require those to also import from apollo-server-env
.
@@ -0,0 +1,47 @@ | |||
import { URL, URLSearchParams } from 'url'; | |||
|
|||
Object.assign(global, { URL, URLSearchParams }); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be conditional, or ideally not actually global?
"dependencies": { | ||
"apollo-server-env": "2.0.0-beta.7", | ||
"http-cache-semantics": "^4.0.0", | ||
"lru-cache": "^4.1.3" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We're using quick-lru for PQs — unless there's a compelling reason we should try to use the same one for both.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lru-cache
supports max size based on the size of the items in it (not just the number of items), so that seemed important for a response cache.
); | ||
} | ||
|
||
private async fetch( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
seems like this might as well be protected if you wanted to be able to call it with a calculated method
response: Response, | ||
policy: CachePolicy, | ||
): Promise<Response> { | ||
if (!response.headers.has('Cache-Control') || !policy.storable()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This first clause deserves a comment. Specifically, this means we are not caching the "cacheable by default" response types like 200 based on a heuristic based on its last-modified. Seems reasonable but worth documenting.
It also means we don't support the somewhat older "expires" header, which maybe was an oversight?
} | ||
const response = await this.httpCache.fetch(request, init); | ||
if (response.ok) { | ||
return response.json(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So several times while reading this code I've been confused by a fact about async functions that perhaps I should just internalize more strongly: you can return a Promise explicitly from them, or return an expression that is calling another async function, without having to explicitly await, because nested Promises are unwrapped.
As far as I understand, in an async function f1, calling another async function f2 like return f2()
and return await f2()
are identical. But I definitely felt odd reading this function and thinking "ok, you have to await response.text()
but not response.json()
? that's weird!".
Would it be reasonable to consistently use await when calling nested async functions (assuming you aren't actually trying to manipulate the Promise of course) or should I just learn the language?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My mental model for this is that you're always returning a promise from an async function, and with that returning the result of another async function doesn't seem surprising. Also not entirely sure if the generated code with an explicit await is identical, although I'd expect it to be optimized away. We may want to see what conventions other people have adopted.
} | ||
const response = await this.httpCache.fetch(request, init); | ||
if (response.ok) { | ||
return response.json(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also @evans has a much more concrete point: we should be able to use this method to fetch non-JSON results as well.
|
||
const cacheKey = cacheKeyFor(request); | ||
|
||
const body = await response.text(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These lines assume that you are always getting text/UTF8 and not say some binary protocol.
quick-lru or keyv (I forget which one) uses https://www.npmjs.com/package/json-buffer which is nice I think.
|
||
if (policy.satisfiesWithoutRevalidation(policyRequestFrom(request))) { | ||
const headers = policy.responseHeaders(); | ||
return new Response(body, { status: policy._status, headers }); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of depending on this internal detail, just save status
separately in our cache alongside policy
and body
. (There's another case of it being used below; I'm pretty sure that revalidatedPolicy._status is always equal to policy._status so you can just use the same saved value.)
willSendRequest?(request: Request): void; | ||
public willSendRequest?(request: Request): void; | ||
|
||
public willReceiveCache(httpCache: HTTPCache) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmmm, not sure I understand the benefits of this. It will actually open up users to mistakes when they override these methods and forget to call super (or set the properties themselves).
If the idea is that we'd like to have a lifecycle method that's called when the datasource is fully initialized, we should probably define it separately and call it when both cache and context have been set.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That makes sense! We don't want users to mess with those "lifecycles", then we should probably have a ready
or willReceiveRequest
lifecycle for after the httpCache and context are set.
|
||
public willSendRequest?(request: Request): void; | ||
|
||
public willReceiveCache(httpCache: HTTPCache) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@glasser pointed out that these names aren't the best, since the data source is really just receiving the cache, so something like receiveCache
would probably be better
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not convinced we need these at all, but if we do I like the consistency with React's willReceiveProps
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
want to make sure #1163 (comment) doesn't get buried
This PR adds a first version of an
apollo-server-caching
package that will be used by the new Data Source API. It usesjest
for testing, which means that its tests don't yet run as part of CI.In addition to the new package, this PR attempts to fix types for
fetch
and related classes likeRequest
andResponse
. Previously, thelib
entry intsconfig.json
containeddom
, which brings in a lot of types that don't make sense in server environments. I tried usingwebworker
instead, but that also doesn't accurately represent the actual environments we'll be running in. In addition, the types we do use from those declarations contained various omissions and mistakes.To fix this, I've added a top-level
types
directory with slimmed down declarations forfetch
and related types, as well as the WHATWGURL
andURLSearchParams
(which are used in Data Sources). These are all available in global scope on Cloudflare Workers and fly.io. On Node, this requires polyfills that for now I've included inapollo-server-caching
. We may want to move these toapollo-server-core
or extract them into a separate package. I'm not sure what the best approach is, so suggestions are very welcome.