What is caching?
A cache is a software component that stores data so that future requests for that data can be served faster (infographic).
Important
Caching a GraphQL API is different from caching endpoint-based APIs (e.g. REST), and it is harder.
A typical GraphQL request looks like this:

    POST /graphql HTTP/1.1
    content-type: application/json

    {
      "query": "{ shop { name } }",
      "variables": null,
      "operationName": null
    }

With the `POST` HTTP method we cannot use normal HTTP caching. But the GraphQL spec does not require us to use only `POST`, so we can use the `GET` HTTP verb instead.
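
As a rough illustration of the `GET` option, the sketch below encodes the same request into a URL, so that ordinary HTTP caches can key on it. The helper name `toGetUrl` and the `GraphQLRequest` type are made up for this example; the parameter names `query`, `variables` and `operationName` follow the common GraphQL-over-HTTP convention.

```typescript
// Hypothetical request shape, mirroring the JSON body used with POST.
interface GraphQLRequest {
  query: string;
  variables?: Record<string, unknown> | null;
  operationName?: string | null;
}

// Hypothetical helper: turns a GraphQL query into a cache-friendly GET URL.
function toGetUrl(endpoint: string, req: GraphQLRequest): string {
  const params: string[] = [`query=${encodeURIComponent(req.query)}`];
  if (req.variables) {
    // Variables travel as a URL-encoded JSON string.
    params.push(`variables=${encodeURIComponent(JSON.stringify(req.variables))}`);
  }
  if (req.operationName) {
    params.push(`operationName=${encodeURIComponent(req.operationName)}`);
  }
  return `${endpoint}?${params.join("&")}`;
}

const shopUrl = toGetUrl("/graphql", { query: "{ shop { name } }" });
// shopUrl === "/graphql?query=%7B%20shop%20%7B%20name%20%7D%20%7D"
```

This only makes sense for queries; mutations change state and should keep using `POST`.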
HTTP caching works really well when:
- We have immutable resources with a very long `max-age`.
  - Things like JS/CSS assets.
  - `max-age` is a year in the future.
  - Add a hash or version number to the file name, so whenever you want to deploy a new version you can simply rename the file. Your clients then know that they need to fetch the new version.
- We have mutable resources with a very clear expiration date.
- Our clients are browsers:
  - Set the proper headers on your backend.
  - Browsers will take care of the rest [1] for you. No coding required.
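
To make the header side concrete, here is a minimal sketch of picking a `Cache-Control` value per file. The function name `cacheControlFor`, the file-name pattern, and the 60-second default are assumptions for illustration only; `max-age` and `immutable` are standard `Cache-Control` directives.

```typescript
// One year, the "very long max-age" used for immutable assets.
const ONE_YEAR_SECONDS = 365 * 24 * 60 * 60; // 31536000

// Assumed naming convention: name.<8 or more hex chars>.<js|css>,
// e.g. app.3f9a1c2b.js. Adjust to whatever your build tool emits.
const HASHED_ASSET = /\.[0-9a-f]{8,}\.(js|css)$/;

// Hypothetical policy: hashed assets never change, so they can be cached
// for a year; everything else gets a short, explicit lifetime.
function cacheControlFor(path: string, mutableMaxAgeSeconds: number = 60): string {
  if (HASHED_ASSET.test(path)) {
    return `public, max-age=${ONE_YEAR_SECONDS}, immutable`;
  }
  return `public, max-age=${mutableMaxAgeSeconds}`;
}
```

With this policy, `cacheControlFor("/static/app.3f9a1c2b.js")` yields `public, max-age=31536000, immutable`, while `cacheControlFor("/index.html")` yields `public, max-age=60`.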
Caution
HTTP caching is not a good fit if:
- Our resources are mutable and the server needs to revalidate the cache, meaning the request has to go to the server anyway. To support revalidation, the server has to send `ETag` or `Last-Modified` headers with its responses, which the client echoes back in `If-None-Match` or `If-Modified-Since`. And most APIs are like this!
- We have different clients that aren't browsers. Browsers do not need to do anything to benefit from HTTP caching, but other clients usually do not get it for free. This means that we need to take care of it at the application level :).
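
The revalidation round trip described above can be sketched as follows. The names `revalidate` and `RevalidationResult` are hypothetical, and the comparison is simplified to an exact string match (real `If-None-Match` headers can carry several values and weak validators).

```typescript
interface RevalidationResult {
  status: 200 | 304;
  headers: Record<string, string>;
  body?: string;
}

// `currentEtag` is assumed to be supplied by the caller (for example a
// version number or content hash); how it is derived is out of scope here.
function revalidate(
  ifNoneMatch: string | undefined,
  currentEtag: string,
  loadBody: () => string,
): RevalidationResult {
  // `no-cache` lets the client store the response but forces it to
  // revalidate with the server before every reuse.
  const headers = { ETag: currentEtag, "Cache-Control": "no-cache" };
  if (ifNoneMatch === currentEtag) {
    // The client's copy is still valid: answer without a body.
    return { status: 304, headers };
  }
  return { status: 200, headers, body: loadBody() };
}
```

Note that even in the `304` case the request still reaches the server, which is exactly why this scenario benefits less from plain HTTP caching.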
Note
Make sure to read the consideration section!
Application-level caching can take several forms:
- Caching a single resolver: this particular resolver is probably very slow, so we want to make it faster.
- Caching frequently accessed data:
  - Our app reads from the cache instead of querying the underlying DB.
  - E.g. IdentityCache, which caches data in Memcached.
- Caching all `Queries`:
  - We can do this with resolver-level caching using directives.
  - In Apollo Server this is known as server-side caching on a per-field basis.
Note
Obviously, we do not want to cache `Mutations` or Introspection*.
*If introspection is enabled and accessible.
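
As an illustration of caching a single slow resolver, the sketch below wraps a resolver function in a small in-memory cache with a time-to-live. Everything here (`withCache`, the simplified `Resolver` type, keying on `JSON.stringify(args)`) is an assumption made for the example and not the API of any particular GraphQL server.

```typescript
// A resolver here is any function from arguments to a value; real GraphQL
// servers also pass parent/context/info, omitted to keep the sketch small.
type Resolver<A, R> = (args: A) => R;

// Hypothetical wrapper: remembers results per argument set for `ttlMs`.
// `now` is injectable so that expiry can be tested without waiting.
function withCache<A, R>(
  resolver: Resolver<A, R>,
  ttlMs: number,
  now: () => number = Date.now,
): Resolver<A, R> {
  const store = new Map<string, { value: R; expiresAt: number }>();
  return (args: A): R => {
    // Naive key: assumes args serialize deterministically.
    const key = JSON.stringify(args);
    const hit = store.get(key);
    if (hit && hit.expiresAt > now()) {
      return hit.value; // cache hit: skip the slow resolver
    }
    const value = resolver(args);
    store.set(key, { value, expiresAt: now() + ttlMs });
    return value;
  };
}
```

A slow resolver `loadShop` could then be exposed as `withCache(loadShop, 60_000)`. If the resolver returns a promise, the promise itself is what gets cached, so concurrent callers share one in-flight request.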
- We declare whether an `Object` is cacheable or not.
- The cache key determines whether a lookup is a cache hit or a cache miss.
- Cache key structure: `AppName:Query:Variables:OperationName`
Important
`Query` and `Variables` need to be normalized first and then hashed. The hashed values are what we use in the key string.
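
A minimal sketch of building such a key is shown below. The normalization rules are assumptions chosen for brevity (whitespace collapsing for the query, sorted keys for the variables), and SHA-256 is just one reasonable hash choice; the function names are hypothetical.

```typescript
import { createHash } from "node:crypto";

// Assumption: normalizing a query means collapsing whitespace. This is naive
// (it also touches whitespace inside string literals); a real implementation
// would more likely parse the query and print it back in a canonical form.
function normalizeQuery(query: string): string {
  return query.replace(/\s+/g, " ").trim();
}

// Assumption: normalizing variables means serializing with sorted keys, so
// that { a: 1, b: 2 } and { b: 2, a: 1 } produce the same string.
function normalizeVariables(value: unknown): string {
  if (value === undefined) return "null";
  if (Array.isArray(value)) {
    return `[${value.map((item) => normalizeVariables(item)).join(",")}]`;
  }
  if (value !== null && typeof value === "object") {
    const obj = value as Record<string, unknown>;
    const entries = Object.keys(obj)
      .sort()
      .map((k) => `${JSON.stringify(k)}:${normalizeVariables(obj[k])}`);
    return `{${entries.join(",")}}`;
  }
  return JSON.stringify(value);
}

function sha256(text: string): string {
  return createHash("sha256").update(text).digest("hex");
}

// Follows the AppName:Query:Variables:OperationName layout.
function buildCacheKey(
  appName: string,
  query: string,
  variables: unknown,
  operationName: string | null | undefined,
): string {
  return [
    appName,
    sha256(normalizeQuery(query)),
    sha256(normalizeVariables(variables)),
    operationName ?? "",
  ].join(":");
}
```

Because of the normalization step, two requests that differ only in formatting or in variable order map to the same key and therefore share a cache entry.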
- Measure how many requests you have.
- How many of them are cacheable.
- How many aren't.
- How frequently your cache gets invalidated.
If a large share of your requests is cacheable and caching is implemented correctly, it will have a big impact.
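
One way to collect the numbers listed above is a handful of counters, as in this sketch. The class name `CacheMetrics` and its methods are hypothetical; in practice these values would probably be exported to whatever monitoring system is already in place.

```typescript
// Hypothetical counters a server could update on every request.
class CacheMetrics {
  private total = 0;
  private cacheable = 0;
  private invalidations = 0;

  recordRequest(isCacheable: boolean): void {
    this.total += 1;
    if (isCacheable) this.cacheable += 1;
  }

  recordInvalidation(): void {
    this.invalidations += 1;
  }

  summary() {
    return {
      total: this.total,
      cacheable: this.cacheable,
      uncacheable: this.total - this.cacheable,
      // Share of requests that could be served from a cache at all.
      cacheableRatio: this.total === 0 ? 0 : this.cacheable / this.total,
      invalidations: this.invalidations,
    };
  }
}
```

A high `cacheableRatio` combined with few invalidations suggests caching is worth the effort; a low ratio suggests the gain will be small.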
Query complexity (and query depth) is important since:
- Our cache storage might run out of space if we cache a lot of queries.
- Or, on the other hand, we might be evicting cached data from our storage too soon if the cardinality [2] of our queries is too high.
Note
This won't be an issue if you're serving internal clients, since the variety of queries won't be out of control.
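
The trade-off between running out of space and evicting too soon can be illustrated with a bounded cache. The sketch below is a small least-recently-used (LRU) cache; LRU is an assumed eviction policy picked for the example, since the text does not say which policy the storage uses.

```typescript
// Minimal LRU cache: holds at most `capacity` entries and evicts the one
// that was used least recently. Relies on Map preserving insertion order.
class LruCache<V> {
  private readonly entries = new Map<string, V>();
  private readonly capacity: number;

  constructor(capacity: number) {
    this.capacity = capacity;
  }

  get(key: string): V | undefined {
    const value = this.entries.get(key);
    if (value === undefined) return undefined;
    // Re-insert to mark this key as most recently used.
    this.entries.delete(key);
    this.entries.set(key, value);
    return value;
  }

  set(key: string, value: V): void {
    this.entries.delete(key);
    this.entries.set(key, value);
    if (this.entries.size > this.capacity) {
      // The first key in iteration order is the least recently used one.
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) this.entries.delete(oldest);
    }
  }

  has(key: string): boolean {
    return this.entries.has(key);
  }
}
```

With a fixed capacity, the space problem goes away, but if clients send many distinct queries the entries are pushed out before they are ever reused, which is the high-cardinality problem described above.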
Note
Learn more about query complexity in NestJS here and query depth in NodeJS here.