better gc #3679
I think a stored, rescalable bloom filter could be a good way to go. We already use a bloom filter for Has request filtering, so both GC and Has caching would benefit from it.
Another disk-based alternative is to use two sorted lists. First write out a list of the colored set, then sort it (duplicates can be removed during the sorting phase). Then create another list of all keys that are candidates for removal and sort it as well. Then simply compare the two lists. The second list is not strictly needed, since a binary search can be performed against the colored-set list, but it will help with performance as the keys are currently not guaranteed to be returned in any order. As just a way to reduce memory: forgo any set data structures and use a list as before, but keep it in memory. A sorted list is bound to be more memory efficient than any deterministic set data structure. By not using a set for the colored list, there is a risk of doing duplicate work (as you can't instantly check whether a node has already been visited); this may or may not be a problem. Just an idea.
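The two-sorted-lists comparison described above boils down to a single merge pass once both lists are sorted. A minimal sketch in Go (`sweepable` and the example keys are invented for illustration, not go-ipfs APIs):

```go
package main

import (
	"fmt"
	"sort"
)

// sweepable returns the candidate keys that are NOT present in the
// colored (reachable) set, by sorting both lists and walking them in
// one merge pass. Duplicates in the colored list are simply skipped
// by the cursor, mirroring the dedup-during-sort step.
func sweepable(colored, candidates []string) []string {
	sort.Strings(colored)
	sort.Strings(candidates)

	var out []string
	i := 0
	for _, c := range candidates {
		// advance the colored cursor past keys smaller than c
		for i < len(colored) && colored[i] < c {
			i++
		}
		if i < len(colored) && colored[i] == c {
			continue // c is reachable; keep it
		}
		out = append(out, c)
	}
	return out
}

func main() {
	colored := []string{"Qmb", "Qma", "Qmb"} // duplicates are tolerated
	candidates := []string{"Qma", "Qmc", "Qmd"}
	fmt.Println(sweepable(colored, candidates)) // → [Qmc Qmd]
}
```

The merge pass is O(n+m) after the sorts, and both sorts can be done externally on disk, which is the point of the proposal.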
In practice, this is a huge cost. Very frequently we have situations where 100 or so nodes point to the same large graphs (for example, the go-ipfs source directory in my repo, which I add to ipfs very frequently).
Though I do like the idea of pushing the colored set to disk as a sorted list (or, alternatively, a binary search tree, if you want to do heap-style indexing).
@lgierth you were wanting to discuss better garbage collection strategies. Want to weigh in here?
I think it's worth us adding an on-disk bloom filter; we can make it quite large so the false-positive rate is low enough, and just keep it up to date as new pins are added. I would also like to have an LRU cache of recently used objects so we can avoid garbage collecting things we might be working with but haven't pinned yet. It could even be time based (don't gc any objects created in the past hour?)
Dumping this tweet thread here: https://twitter.com/TimSweeneyEpic/status/1019008128933924869
A quick note on alternatives to LRU for asset caching: I came up with something for a similar application, http://minkirri.apana.org.au/wiki/DecayingLFUCacheExpiry. It should be only a tiny bit more complicated than LRU, but should handle scans better.
It is probably very useful to study the current state of the art in GC algorithms, which are all on the JVM. |
@ianopolous The problem is that all/most collectors for Java are moving collectors, which is much harder to do when your dataset is on disk.
@Kubuxu Yep, I'm aware of that, but you could easily do that with multiple datastores if needed. Though I don't think moving is critical to many of their ideas; e.g. ZGC is a single-generation GC, so the only moving there is compaction. Naively I would guess that you could just no-op that part and leave the rest of the algorithm mostly unchanged. Alternatively, you could use a datastore that is trivially a single large file, with a mapping from Cid to offset, and then you could use one of their algorithms "unchanged". I'm not a GC expert, but I'm sure there are useful ideas here in the state of the art.
Programming-language GC systems aren't really the correct target. We should look at content-addressed/deduplicating filesystems (almost all of which use reference counting).
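The reference-counting approach used by deduplicating filesystems can be sketched in a few lines. `store`, `Link`, and `Unlink` are hypothetical names for illustration; a real blockstore would persist the counts:

```go
package main

import "fmt"

// Refcount sketch: each link to a block bumps a counter, each unlink
// decrements it, and a block is deleted the moment its count reaches
// zero. No mark/sweep pass over the whole store is ever needed.
type store struct {
	refs map[string]int
}

// Link records one more reference to cid (e.g. a new file or pin).
func (s *store) Link(cid string) { s.refs[cid]++ }

// Unlink drops one reference; it reports whether the block was freed.
func (s *store) Unlink(cid string) bool {
	s.refs[cid]--
	if s.refs[cid] <= 0 {
		delete(s.refs, cid) // the block could be removed from disk here
		return true
	}
	return false
}

func main() {
	s := &store{refs: map[string]int{}}
	s.Link("QmSharedBlob") // referenced by two files
	s.Link("QmSharedBlob")
	fmt.Println(s.Unlink("QmSharedBlob")) // false: one reference remains
	fmt.Println(s.Unlink("QmSharedBlob")) // true: freed immediately
}
```

The trade-off relative to mark/sweep: frees are instant and incremental, but every add/remove of a link must update counts transactionally, and counts for a deep DAG must be maintained recursively.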
@Stebalien Do any of those filesystems have the concept of deep graphs of links between parts of files, like ipfs's IPLD? More than just a single layer of, say, metadata linking to data fragments (I don't know)? If not, then the topology of a managed language's object graph would seem more similar to me, and chasing that graph is the main task of a GC. The main difference I can see is the absence of cycles in a merkle tree. Regardless, I just wanted to point to a potential source of good ideas.
I've heard of at least one that uses merkle-links for directories (so directory copies are free). But the one I'm thinking of is a networked filesystem. We'll probably want to look at network-based filesystem papers for inspiration. My concern with programming languages is that:
I have a large enough ipfs repo (over 300 million blocks) that it takes a very large amount of memory to perform a gc; computing the marked set is expensive.
I'm thinking that using something like a bloom filter could make this process use much less memory, at the expense of not cleaning out every block. The difficulty here is that false positives while enumerating the set of pinned objects could drastically lower performance (we could accidentally think a block that points to everything is pinned and end up cleaning out nothing), so selecting parameters to avoid this is important.
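Parameter selection for such a filter follows the standard bloom-filter sizing formulas. A small calculator (the numbers in the example are illustrative, not a recommendation; assumes a 64-bit platform):

```go
package main

import (
	"fmt"
	"math"
)

// bloomParams picks a bit-array size m and hash count k for n expected
// pinned blocks and a target false-positive rate p, using the standard
// formulas m = -n·ln(p)/ln(2)² and k = (m/n)·ln(2). Driving p very low
// is what guards against the failure mode above, where a false
// positive on a high-fanout node effectively pins everything.
func bloomParams(n, p float64) (m int64, k int) {
	mf := -n * math.Log(p) / (math.Ln2 * math.Ln2)
	return int64(math.Ceil(mf)), int(math.Round(mf / n * math.Ln2))
}

func main() {
	// e.g. 300 million blocks with a one-in-a-billion target rate
	m, k := bloomParams(300e6, 1e-9)
	fmt.Printf("%.1f GiB of bits, %d hashes\n", float64(m)/8/(1<<30), k)
	// → 1.5 GiB of bits, 30 hashes
}
```

So even a very aggressive false-positive target costs only a few bits of disk or memory per block, which is tiny next to the blocks themselves.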
Another (potentially more complicated but more accurate) option is to use a disk-backed prefix tree to store the enumerated pinset (with heavy caching, up to some memory limit, to keep performance tolerable). This just offloads the memory cost of storing the sets to disk, which is generally acceptable, but it would prevent people from running a GC when their disk is full, which is generally considered a bad thing.
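The prefix-tree idea in its simplest in-memory form, for reference; a real implementation would page nodes to disk and keep hot nodes cached, which is the whole point. `trie` here is a toy:

```go
package main

import "fmt"

// A byte-wise prefix tree (trie) over keys. Keys sharing a prefix
// share nodes, so a large pinset of similarly-prefixed CIDs is stored
// compactly, and membership checks walk at most len(key) nodes.
type trie struct {
	children map[byte]*trie
	terminal bool // true if a key ends exactly at this node
}

func newTrie() *trie { return &trie{children: map[byte]*trie{}} }

func (t *trie) Insert(key string) {
	n := t
	for i := 0; i < len(key); i++ {
		next, ok := n.children[key[i]]
		if !ok {
			next = newTrie()
			n.children[key[i]] = next
		}
		n = next
	}
	n.terminal = true
}

func (t *trie) Contains(key string) bool {
	n := t
	for i := 0; i < len(key); i++ {
		next, ok := n.children[key[i]]
		if !ok {
			return false
		}
		n = next
	}
	return n.terminal
}

func main() {
	pins := newTrie()
	pins.Insert("QmFoo")
	fmt.Println(pins.Contains("QmFoo"), pins.Contains("QmBar")) // → true false
}
```

Unlike the bloom filter, this structure is exact (no false positives), which is why it trades memory for disk rather than for accuracy.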
I'm also interested in strategies that can do "a little gc": something that allows us to quickly free a smaller subset of the blocks, without the overhead of performing an entire gc scan. Implementing something that lets us do this may require rethinking how pinsets and objects are stored.
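One possible shape for "a little gc" is a budgeted sweep that stops once a target number of bytes has been freed. This is only a sketch under the assumption that some set of unpinned candidates is already known; it removes the sweep cost, not the marking cost:

```go
package main

import "fmt"

type block struct {
	cid  string
	size int64
}

// littleGC frees unpinned blocks only until a byte budget is met,
// then stops, so the caller can reclaim "just enough" space without
// walking the whole store.
func littleGC(unpinned []block, budget int64) (freed []string) {
	var total int64
	for _, b := range unpinned {
		if total >= budget {
			break
		}
		total += b.size
		freed = append(freed, b.cid)
	}
	return freed
}

func main() {
	blocks := []block{{"Qm1", 4096}, {"Qm2", 262144}, {"Qm3", 4096}}
	fmt.Println(littleGC(blocks, 200000)) // stops once ~200 KB is freed
}
```

Prioritizing the candidate list (oldest first, or largest first) would make the partial sweep free the most useful space soonest.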