Cache
Making a request to the database layer is quite slow, and serlo.org has so many users that our infrastructure would crash if we performed a database request for each visit to our site. Thus we store responses of the database layer in a cache which we can reuse for subsequent requests. Behind the scenes we use Redis for this job, which is a key-value database. This means that Redis stores values which can be addressed, modified and retrieved given a unique key.
Let's see how we use the cache in an example: When you perform a request to dataSources.serlo.getUuid({ id: 1 }), we first look in the cache whether we already have saved an older response for this request. In order to do so we have a function which assigns to each service request a unique key. In this case it is the key de.serlo.org/api/uuid/1. Note that given the key de.serlo.org/api/uuid/1 we know that the request must have been dataSources.serlo.getUuid({ id: 1 }) and vice versa. Thus the relationship service request <-> cache key is a 1:1 relationship.
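The one-to-one mapping between requests and cache keys can be sketched as a pair of functions. This is a minimal illustration, not the actual implementation: the names UuidRequest, toCacheKey and fromCacheKey are hypothetical, and only the key format de.serlo.org/api/uuid/<id> is taken from the example above.

```typescript
// Hypothetical request shape for getUuid; only the key format comes from the text
interface UuidRequest {
  id: number
}

const KEY_PREFIX = 'de.serlo.org/api/uuid/'

// Request -> cache key
function toCacheKey(request: UuidRequest): string {
  return `${KEY_PREFIX}${request.id}`
}

// Cache key -> request, or null if the key does not belong to getUuid
function fromCacheKey(key: string): UuidRequest | null {
  if (key.indexOf(KEY_PREFIX) !== 0) return null
  const rest = key.slice(KEY_PREFIX.length)
  if (!/^\d+$/.test(rest)) return null
  return { id: Number(rest) }
}

console.log(toCacheKey({ id: 1 }))
console.log(fromCacheKey('de.serlo.org/api/uuid/1'))
```

Because both directions are pure functions of their input, the same request always ends up at the same cache entry, and a key found in the cache tells you which request would refresh it.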
Now we look into our cache whether we have stored an older response for dataSources.serlo.getUuid({ id: 1 }) under the key de.serlo.org/api/uuid/1. If this is the case we return it. If not, we perform a request to the database layer, store the returned value in the cache for later and return it.
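The lookup described above can be sketched as follows. This is a simplified illustration, not the code of the API: a Map stands in for Redis, fetchFromDatabaseLayer is a hypothetical placeholder for the real request, and everything is synchronous to keep the example short.

```typescript
// In-memory stand-in for Redis (assumption for illustration only)
const cache = new Map<string, unknown>()

// Counts how often the (fake) database layer is asked
let databaseRequests = 0

// Hypothetical placeholder for a slow request to the database layer
function fetchFromDatabaseLayer(id: number): unknown {
  databaseRequests += 1
  return { id }
}

function getUuid({ id }: { id: number }): unknown {
  const key = `de.serlo.org/api/uuid/${id}`

  // 1. Return the stored response if there is one
  if (cache.has(key)) return cache.get(key)

  // 2. Otherwise ask the database layer, store the response and return it
  const value = fetchFromDatabaseLayer(id)
  cache.set(key, value)
  return value
}

getUuid({ id: 1 }) // cache miss: asks the database layer
getUuid({ id: 1 }) // cache hit: served from the cache
console.log(databaseRequests)
```

The second call never reaches fetchFromDatabaseLayer, which is exactly the saving the cache is meant to provide.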
Congratulations! 😄 We have a cache, and once we have stored a response in it we never have to repeat the same request. 🎉 Right? ... Unfortunately this is not the case. 😢 There is a reason for the quote "There are only two hard things in Computer Science: cache invalidation and naming things." Keeping the cache in sync with the database is a pain in the a**, and invalid or outdated cache values have already caused a lot of problems. So, these are our strategies for updating the cache...
Each mutation changes something. Thus, most mutations need to be reflected in an update of the cache as well. Therefore the createMutation() function for service endpoints in the model lets you specify an updateCache() function. This function is called after the mutation, with the arguments of the mutation and the value it returned, in order to update the cache:
createThread = createMutation({
...
updateCache({ payload, value }) {
// implement updates of the cache here
}
})
See https://github.com/serlo/api.serlo.org/blob/main/packages/server/src/model/serlo.ts for some examples.
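To illustrate how such a hook could be wired up, here is a self-contained sketch. It is not the real createMutation() from the repository: the MutationSpec type, the mutate field, the payload shape and the cache key are all hypothetical and only mirror the behaviour described above (updateCache runs after the mutation with its payload and its returned value).

```typescript
// Simplified, hypothetical version of createMutation(); the real one differs
interface MutationSpec<P, R> {
  mutate: (payload: P) => R
  updateCache?: (args: { payload: P; value: R }) => void
}

function createMutation<P, R>(spec: MutationSpec<P, R>): (payload: P) => R {
  return (payload: P): R => {
    const value = spec.mutate(payload)
    // Called after the mutation with its arguments and its result
    if (spec.updateCache) spec.updateCache({ payload, value })
    return value
  }
}

// In-memory stand-in for the cache (assumption for illustration only)
const cache = new Map<string, unknown>()

// Hypothetical mutation: payload shape and key format are made up
const createThread = createMutation({
  mutate: (payload: { objectId: number; title: string }) => ({
    threadId: 100,
    title: payload.title,
  }),
  updateCache({ payload }) {
    // Drop the cached thread list so that it is refetched on the next read
    cache.delete(`de.serlo.org/api/threads/${payload.objectId}`)
  },
})
```

Deleting the affected key is the simplest possible update; a real updateCache() could instead write a new value computed from payload and value.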
We are still in the process of migrating from the legacy monolithic serlo.org system to our new infrastructure. Thus there are some mutating requests which are directly handled by the legacy system and which are not passed through the API. In this case the API cannot update its cache directly and needs to be informed otherwise that there has been a change.
For this case the API exposes mutation endpoints for updating the cache, see ~/schema/cache/types.graphql:
type Mutation {
_cache: _cacheMutation!
}
type _cacheMutation {
# Updates the cache entry to a certain value
set(input: CacheSetInput!): CacheSetResponse!
# Deletes a cache entry so that it needs to be refetched
remove(input: CacheRemoveInput!): CacheRemoveResponse!
# Forces the API to update some of the following keys
update(input: CacheUpdateInput!): CacheUpdateResponse!
}
In the legacy serlo.org system there are listeners watching for changes to the database which haven't been passed through the API (for example changing a taxonomy term). In case of a change they use the above endpoints to update the cache of the API or to invalidate some keys.
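The effect of the three operations on the cache can be sketched as below. This is only an illustration under assumptions: the input shapes ({ key, value }, { key }, { keys }) are guesses and may differ from the real CacheSetInput, CacheRemoveInput and CacheUpdateInput types, a Map stands in for the cache, and fetchFresh is a hypothetical placeholder for a request to the database layer.

```typescript
// In-memory stand-in for the API cache (assumption for illustration only)
const cache = new Map<string, unknown>()

// Hypothetical placeholder for a request to the database layer
function fetchFresh(key: string): unknown {
  return `fresh value for ${key}`
}

// Input shapes are assumptions; the real input types may differ
const cacheMutations = {
  // Overwrite the entry with a value supplied by the caller
  set(input: { key: string; value: unknown }): void {
    cache.set(input.key, input.value)
  },
  // Drop the entry so that the next read has to refetch it
  remove(input: { key: string }): void {
    cache.delete(input.key)
  },
  // Refetch each listed key and store the new value
  update(input: { keys: string[] }): void {
    for (const key of input.keys) {
      cache.set(key, fetchFresh(key))
    }
  },
}
```

A listener in the legacy system would pick remove when it only knows that something changed, and set when it already has the new value at hand.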
In order to have a mechanism that updates cache values regularly, we use the stale-while-revalidate algorithm. Here we assign to each service endpoint a maximum age for its cache entries. When a cache value is older than this age, we say that the cache entry is stale. We still return it to keep responses fast; however, we update the cache entry in the background.
To do so we put the cache key in a job queue of cache entries which need to be updated. There is a list of swr queue workers which take a key from this queue and update it. With this architecture we perform update requests in sequence and not in parallel in case a lot of cache keys become stale at once. This ensures we do not overload the database layer and the database with a lot of requests at once. In short: a stale entry is returned immediately, its key is put on the queue, and a worker refreshes the entry later.
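The stale-while-revalidate flow with a queue can be sketched like this. It is a simplified, synchronous illustration and not the actual worker implementation: the Map, the one-hour MAX_AGE_MS, fetchFresh and the function names are all assumptions, and the current time is passed in explicitly so the behaviour is easy to follow.

```typescript
interface CacheEntry {
  value: unknown
  lastModified: number // timestamp in milliseconds
}

const MAX_AGE_MS = 60 * 60 * 1000 // hypothetical maximum age: one hour

const cache = new Map<string, CacheEntry>()
const swrQueue: string[] = []

// Hypothetical placeholder for a request to the database layer
function fetchFresh(key: string): unknown {
  return `fresh value for ${key}`
}

function get(key: string, now: number): unknown {
  const entry = cache.get(key)
  if (entry === undefined) {
    // Nothing cached yet: fetch, store and return
    const value = fetchFresh(key)
    cache.set(key, { value, lastModified: now })
    return value
  }
  if (now - entry.lastModified > MAX_AGE_MS) {
    // Stale: schedule an update but still answer from the cache
    swrQueue.push(key)
  }
  return entry.value
}

// A worker takes one key at a time, so updates run in sequence
function runWorker(now: number): void {
  let key = swrQueue.shift()
  while (key !== undefined) {
    cache.set(key, { value: fetchFresh(key), lastModified: now })
    key = swrQueue.shift()
  }
}

cache.set('de.serlo.org/api/uuid/1', { value: 'old', lastModified: 0 })
const stale = get('de.serlo.org/api/uuid/1', MAX_AGE_MS + 1) // old value, key is queued
runWorker(MAX_AGE_MS + 1)
const fresh = get('de.serlo.org/api/uuid/1', MAX_AGE_MS + 2) // refreshed value
console.log(stale, fresh)
```

The caller that triggers the update still receives the old value; only later requests see the refreshed entry.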
To be able to control the flow of requests and not overwhelm the backends with update requests, we decided to implement a parameter swrFrequency that lets us limit the percentage of stale cache keys that actually get put on the queue. For example: If it is set to 10%, a key is put on the queue in only 10% of the cases in which a stale value is requested. It can be activated in times of high load and deactivated when it is no longer necessary.
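One way to realise such a limit is random sampling, sketched below. This is an assumption about how the parameter could work, not a description of the actual code: swrFrequency is represented here as a fraction between 0 and 1, and maybeEnqueue and its injectable random source are hypothetical.

```typescript
// Share of stale reads that are put on the queue (0 = none, 1 = all).
// Representing the percentage as a fraction is an assumption of this sketch.
const swrFrequency = 0.1

const swrQueue: string[] = []

// `random` is injectable so the behaviour can be checked deterministically
function maybeEnqueue(key: string, random: () => number = Math.random): boolean {
  if (random() < swrFrequency) {
    swrQueue.push(key)
    return true
  }
  return false
}
```

With swrFrequency at 0.1, roughly one in ten requests for a stale value triggers an update; frequently requested keys are still refreshed soon, while the total number of update requests drops.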
Even with the above mechanisms we can end up with invalid cache entries due to bugs or other reasons. When a user reports such an error we might want to react instantly and not wait until the cache entry gets updated. In those cases you can update the cache manually. In order to do this you can perform a request directly against the remove mutation of the _cache endpoint:
- Go to https://serlo.org and log in
- Go to https://frontend.serlo.org/___graphql and perform the following request:
mutation {
_cache {
remove(key: "<cache key you want to delete>")
}
}
Have a look at the definition of the service endpoint (like serlo.ts) to see what the cache key for a particular resource is.
Note that this is only allowed for software developers at serlo.org. If you do not have permission to remove an invalid cache entry, go to https://community.serlo.org/channel/feature-requests-and-bugs and make a request in this channel.