-
-
Notifications
You must be signed in to change notification settings - Fork 3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Garbage collection removes filestore/urlstore references #5765
Comments
Also, maybe tangential, but seems like it could help: we should keep a mapping of the urls/filepaths we've added -> the resultant hashes. This would make using the urlstore a bit easier for some users (makes it easier to check if ipfs already knows about a given url) and also makes the above 'pinning' of filestore stuff easier. |
does this happen because pins are at the block level, so if you're not storing any blocks there's nothing to pin and the object gets GC'd? that sort of makes sense but definitely isn't the behavior I'd expect if this is hard to fix we should at minimum update the docs, e.g. |
Any ideas on fixing this @whyrusleeping - since you found the code where the problem was happening, I'd hoped it would be a quick fix. |
I think there's a bit of a disconnect here. This is issue is about an enhancement, not a bug. The issue here is: Filestore references are tiny, we (optionally) shouldn't GC them even if they're not pinned because keeping them won't really waste much space. This issue isn't about us garbage collecting pinned content (with or without the filestore). If you're seeing pinned content getting GCed, that's a bug; please file a new report. |
Hm. Reading the linked conversation, it looks like you and @whyrusleeping may also have been talking past each-other. Are you pinning these files? |
@whyrusleeping said that urlstore automatically pinned things, and I believe his analysis was that this wasn't the problem. Our problem is simple ... we have a process for adding a URL via the HTTP API to a local IPFS go instance (with arguments as provided by @parkan) Then at some random point in the future all these references become invalid. The comments above about GC, Intermediate blocks, and Pinning are just analysis by various people as to WHY the bug is happening. By the way ... this bug is serious enough that when you combine IPFS losing data with being unable to tell what has been lost; and clients not getting errors when trying to retrieve has meant we've had to turn IPFS off for most purposes because there is no way to reliably use urlstore and retrieve the references stored. |
Looks like that analysis was incorrect. |
And to be clear ...I'm assuming that "pinning" in this case it means that urlstore will keep the reference, and not that urlstore will make a copy (and pin) the content. |
thanks @Stebalien! my head kind of hurts after trying to understand the semantics of the various node types even w/o filestore/urlstore, so I cannot be super helpful here the only thing that gives me pause w.r.t. your diagnosis is that the repo in this case appears to sometimes lose normally added items (thumbnail images) as well, but maybe these are two separate issues @mitra42 yes, pinning would mean that the reference is retained (i.e. the "repo loses items" problem stops, at least for urlstore objects) |
Yes. When adding urlstore/filestore content, we create "fake" pointer blocks and "pin" those.
Are these being added via a simple |
If we can fix the urlstore items getting lost then we might be able to diagnose them separately. That could have been a different issue, and might be fixed now. |
Let's make sure to test our assumptions here. Once we merge this, and get the archive node running it, we should try pulling a few things through the urlstore, and then explicitly running a gc and making sure that doesnt cause trouble. One other potential issue is if @mitra42 is using the http api, then maybe we should be explicitly passing the |
This shouldn't be necessary. If it is, that's another bug. |
I'm more than happy to add pin=true to the Http API if that is what is required ... but people keep saying it shouldnt be necessary. |
It's not necessary, I've just checked. |
You mean with the fix in the PR commented above ? If so then do you have any idea how long before that fix hits a new stable version? |
Yes. Currently, the only way to do this is to use
No idea. I'd like to get some of our bitswap improvements in before the next release, if possible, but we won't hold the release for them. |
Fixed in 0.4.19. |
There should be an option for garbage collection to not collect filestore/urlstore references, theyre pretty small, and some users would probably want to keep that around while clearing up the other cached content.
One caveat here is that if we have the intermediate nodes of something added via the filestore, if we're not garbage collecting the filestore stuff, we shouldnt be garbage collecting those intermediate nodes.
The text was updated successfully, but these errors were encountered: