-
Notifications
You must be signed in to change notification settings - Fork 2.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement InMemoryCache garbage collection and eviction. #5310
Conversation
|
||
// Although the EntityCache class is abstract, it contains concrete | ||
// implementations of the various NormalizedCache interface methods that | ||
// are inherited by the Root and Layer subclasses. | ||
|
||
public toObject(): NormalizedCacheObject { | ||
return this.data; | ||
return { ...this.data }; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because the StoreWriter
class updates entities non-destructively, this shallow copy of this.data
is sufficient to provide a complete, immutable snapshot of the cache.
|
||
// If you're ever tempted to do this, you probably want to use cache.clear() | ||
// instead, but evicting the ROOT_QUERY should work at least. | ||
expect(cache.evict({ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also called "Welcome to the fragment zone"
if (idsToRemove.length) { | ||
let root: EntityCache = this; | ||
while (root instanceof Layer) root = root.parent; | ||
idsToRemove.forEach(root.delete, root); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By the way, this idiom is equivalent to
idsToRemove.forEach(id => root.delete(id))
except that it does not allocate a new function object.
This implementation currently requires calling cache.gc() manually, since the timing of garbage collection is subject to developer taste.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@benjamn as always, this is wonderful work filled with helpful comments and the occasional chuckle when reading the tests. I really appreciate your styling in the commits and actual implementation here.
I've left one comment that I'd love answered before moving this to an approve. Namely, do we still need the query
parameter to the eviction option?
public gc() { | ||
const ids = this.getRootIdSet(); | ||
const snapshot = this.toObject(); | ||
ids.forEach(id => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This method for creating a stack is just so cool
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree!!
const idsToRemove = Object.keys(snapshot); | ||
if (idsToRemove.length) { | ||
let root: EntityCache = this; | ||
while (root instanceof Layer) root = root.parent; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, so this works up the layers to delete ids
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The way I think about it:
- Optimistic layers are temporary, so we don't really need to worry about garbage collecting them, since they should be totally removed sometime soon.
- The goal of garbage collection, then, is to remove unreachable IDs from the
Root
layer of the cache. - The
gc()
method might have been called against aLayer
object, which is important because optimisticLayer
s can retain otherwise unreachable entities.- If the layers didn't matter for garbage collection, we could just skip to the root at the beginning of the
gc()
method. - Instead, we skip to the root at the end of the
gc()
method to perform the final deletions of unreachable entities. That's what this code does.
- If the layers didn't matter for garbage collection, we could just skip to the root at the beginning of the
@@ -125,27 +211,27 @@ export namespace EntityCache { | |||
// of the EntityCache.Root class. | |||
class Layer extends EntityCache { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is probably out of scope for this PR, but I think it would be quite helpful to write a set of comments on the architecture of the cache and its layers.
In practice (by which I mean while writing tests), I found it way too difficult to keep track of owner objects effectively. If an application developer wishes to implement a more sophisticated root ID tracking system, they have all the tools they need. In theory, the developer should balance retain and release calls, though one can always call release repeatedly until it returns 0 to release the ID completely.
Eviction always succeeds if the given options.rootId is contained by the cache, but it does not automatically trigger garbage collection, since the developer might want to perform serveral evictions before triggering a single garbage collection. Resolves apollographql/apollo-feature-requests#4. Supersedes #4681.
Per this discussion with @jbaxleyiii: #5310 (comment) tl;dr: This could be a breaking change for other Apollo Cache implementations, but InMemoryCache and the Hermes cache are the only two cache implementations we know about, and neither of them have implemented the evict method so far.
a1fd24c
to
7c5b4e5
Compare
Per this discussion with @jbaxleyiii: #5310 (comment) tl;dr: This could be a breaking change for other Apollo Cache implementations, but InMemoryCache and the Hermes cache are the only two cache implementations we know about, and neither of them have implemented the evict method so far.
7c5b4e5
to
3b659f8
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is AMAZING @benjamn! Thanks very much for working on this, and integrating these changes in such a well thought out and easy to grok manner. I have a few comments below (mostly thoughts or questions), and an additional one here - it looks like we have a circular dependency:
Circular dependency: src/cache/inmemory/inMemoryCache.ts -> src/cache/inmemory/writeToStore.ts -> src/cache/inmemory/inMemoryCache.ts
It isn't caused by the changes in this PR, but just a heads up in-case you want to address it here.
Thanks again!
Based on discussion with @hwillson here: #5310 (comment)
Per this discussion with @jbaxleyiii: #5310 (comment) tl;dr: This could be a breaking change for other Apollo Cache implementations, but InMemoryCache and the Hermes cache are the only two cache implementations we know about, and neither of them have implemented the evict method so far.
Based on discussion with @hwillson here: #5310 (comment)
Thank you for this.
Is this still the case? |
This PR adds a handful of powerful new methods to the official
InMemoryCache
class:cache.gc()
method removes any unreachable objects from the normalized cache, using a tracing garbage collection strategy (as opposed to reference counting). Reachability is determined starting from a set of root entity IDs, which includes any IDs that have been directly written to the cache—including the syntheticROOT_QUERY
ID, which is the default ID used when no specific ID is provided. From this root set of reachable IDs, garbage collection traces all child references (computed lazily and cached aggressively) until the entire reachable reference graph has been visited. Any remaining unvisited IDs are then removed from the cache, and thecache.gc()
function returns a list of those removed ID strings.The
cache.retain(id)
method can be called to treat specific entities as immediately reachable "root" entities, which protects them (and all their transitive child references) from garbage collection. Note that this method is called automatically for the IDs written usingwriteQuery
orwriteFragment
. Our hope is that this method will not be needed very often, though it can be essential when you don't want certain data to be removed.The
cache.release(id)
method undoes the effect ofcache.retain(id)
, allowing the given ID to be garbage collected once it becomes otherwise unreachable. Note thatcache.release(id)
must be called at least as many times ascache.retain(id)
was called for a givenid
. Both methods return the new retainment count. This retainment count is not a "reference count" in the usual sense, since most entities will never need to be retained or released, and we are not using a reference counting garbage collection strategy. In particular,cache.release(id)
does not automatically trigger garbage collection when the retainment count falls back to zero.The
cache.evict(id)
method forcibly removes the givenrootId
from the cache, but does not immediately trigger garbage collection, since you might want to evict multiple IDs before triggering a single garbage collection. With that said, removing an entity from the cache can have significant consequences for the reachability of other entities, so eviction should usually be followed by garbage collection.Closes #3965, #4681, #4917, #4916
Part of apollographql/apollo-feature-requests#4