Slow ipfs cat on 251 MB .ipfs #2174
What are the |
I now run v4.0 and things are much better. With almost empty .ipfs : With 251MB .ipfs : But this is still a factor of 20. |
How long does it take to |
Sorry, but what is this |
|
On a small .ipfs :
With 251MB .ipfs :
|
ic, that would be the worst case. Or rather, the hash lookup should be more equivalent to e: clarification |
Tested the small .ipfs: ~110MB .ipfs ~475MB .ipfs The scaling looks reasonable. I can't reproduce the bug (~2s |
I'm seeing a similar issue, currently looking at these stats:
With a My version is
So it seems that there is something specific about Update: |
Here is a profiling report: https://ipfs.io/ipfs/QmUBq6RW1fLcuUpKqrkGTdaXaVKGzFgFwNVPG84QaLGyTY/ipfs.svg Looks like for some reason
It is due to |
It does, since A middle ground that maintains this feature is to cache information of the repo size. Additionally, this could be used to provide the repo's dedup ratio (much like in zfs). zfs has quota as an option as well. |
I am thinking about using custom walk function and using ModTime to filter out changes. As files are immutable in we would only need to store sizes of directories. @dignifiedquire can you run: |
@Kubuxu there you go
|
This still doesn't address the fact that all the blocks have to be traversed. As stated, I'm for caching (list of keys, reference count, and size) into a dedup table, saved into a file in |
We don't need to traverse all blocks. If a directory's ModTime is the same as the last one we recorded, we can skip walking it. This should already improve times by a factor of 4, based on @dignifiedquire's data. I am all for full caching, but it requires changes in the repo format (and more things to keep in sync, and so on). The quick and dirty option would be to disable |
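A minimal sketch of the ModTime idea above, in Python rather than Go for brevity. It assumes the blocks live in a two-level layout (a root directory containing shard directories that hold immutable block files); `repo_size`, `dir_size`, and the in-memory `cache` dict are hypothetical names for illustration, not part of go-ipfs. One caveat: a directory's ModTime only changes when entries are added or removed, and on filesystems with coarse timestamps two changes within the same tick can be missed.

```python
import os


def dir_size(path):
    """Sum the sizes of the regular files directly inside `path`."""
    total = 0
    with os.scandir(path) as entries:
        for entry in entries:
            if entry.is_file(follow_symlinks=False):
                total += entry.stat(follow_symlinks=False).st_size
    return total


def repo_size(blocks_root, cache):
    """Return the total size of all shard directories under `blocks_root`.

    `cache` maps shard path -> (mtime_ns, size) and is updated in place.
    A shard is rescanned only when its ModTime differs from the cached one.
    Files placed directly in `blocks_root` are not counted in this sketch.
    """
    total = 0
    with os.scandir(blocks_root) as shards:
        for shard in shards:
            if not shard.is_dir(follow_symlinks=False):
                continue
            mtime = shard.stat(follow_symlinks=False).st_mtime_ns
            cached = cache.get(shard.path)
            if cached is None or cached[0] != mtime:
                # Shard is new or changed: walk it and refresh the cache entry.
                cached = (mtime, dir_size(shard.path))
                cache[shard.path] = cached
            total += cached[1]
    return total
```

The cache here lives only in memory; persisting it between runs is the part that would need somewhere to store it in the repo.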
ic, though the leap from "keeping track of ModTime of directories" to "keeping track of the keys and their properties" is not that far. The full caching can be implemented as an in-memory table throughout the daemon's (or a daemon-less command) life, then serialized/synced only when the repo closes.
|
I don't even think we need to traverse the directory. We can likely just keep a log of our current size based on blocks added and removed (track datastore puts and deletes) and write it out periodically to the datastore. I don't think we care about a perfectly exact size on disk; it's just a rough estimate to determine if we want to GC now or later. |
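The put/delete tracking described above could look roughly like the following sketch. It is written in Python for brevity and wraps a plain dict as a stand-in for the datastore; `SizeTrackingStore`, `flush_every`, `initial_size`, and the `persisted` attribute are all hypothetical names, and the "write it out" step is reduced to serializing a snapshot into a string.

```python
import json


class SizeTrackingStore:
    """Wrap a dict-like block store and keep a running size estimate."""

    def __init__(self, backing, initial_size=0, flush_every=100):
        self.backing = backing
        # In a real repo this would be read back from the last snapshot.
        self.size = initial_size
        self.flush_every = flush_every
        self._dirty = 0          # changes since the last snapshot
        self.persisted = None    # stand-in for a value written to the datastore

    def put(self, key, data):
        old = self.backing.get(key)
        self.backing[key] = data
        # Overwriting an existing key only counts the difference in size.
        self.size += len(data) - (len(old) if old is not None else 0)
        self._touch()

    def delete(self, key):
        old = self.backing.pop(key, None)
        if old is not None:
            self.size -= len(old)
            self._touch()

    def _touch(self):
        self._dirty += 1
        if self._dirty >= self.flush_every:
            self.flush()

    def flush(self):
        """Write the current estimate out; called periodically and on close."""
        self.persisted = json.dumps({"size": self.size})
        self._dirty = 0
```

Because the estimate is only flushed every `flush_every` changes, a crash can leave the persisted value slightly stale, which matches the "rough estimate" goal stated above.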
This is quite an important bug; it makes big repos on my server almost unusable. |
Is there a reason I am thinking a quick fix would be to just disable the Conditional GC. A better fix would be to run the Conditional GC every now and then on commands that add blocks such as |
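One way to run the conditional GC only "every now and then" is to throttle the check, as in this sketch (Python for brevity). `ConditionalGC`, `storage_max`, `watermark`, and `min_interval` are hypothetical names chosen for illustration; `get_size` and `run_gc` are callbacks the caller supplies, so nothing here reflects how go-ipfs actually wires this up.

```python
import time


class ConditionalGC:
    """Run GC only when the repo is near its limit, and check at most
    once per `min_interval` seconds."""

    def __init__(self, get_size, run_gc, storage_max,
                 watermark=0.9, min_interval=60.0, clock=time.monotonic):
        self.get_size = get_size        # callback returning the size estimate
        self.run_gc = run_gc            # callback performing the collection
        self.storage_max = storage_max  # assumed size limit, in bytes
        self.watermark = watermark      # fraction of the limit that triggers GC
        self.min_interval = min_interval
        self.clock = clock
        self._last_check = None

    def maybe_collect(self):
        """Return True if GC ran, False if it was skipped."""
        now = self.clock()
        if (self._last_check is not None
                and now - self._last_check < self.min_interval):
            return False  # checked recently; skip the size lookup entirely
        self._last_check = now
        if self.get_size() < self.storage_max * self.watermark:
            return False  # enough room left
        self.run_gc()
        return True
```

A command that adds blocks would call `maybe_collect()` before writing; most calls then return immediately without touching the disk.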
The reason it's on ipfs cat (albeit a poor reason) is that ipfs cat will fetch content from the network and store it on disk before sending it back to you. We wanted to make sure we always tried our hardest to have enough space to execute the command. I think removing that call for now is okay until we find a better way to do the conditional GC stuff. |
Ahh, forgot that it could fetch from the network. |
Going to close this for now since we no longer do a conditional GC for cat. |
This is on an ipfs node with a .ipfs size of 251,7 MB
On a new empty ipfs node the same thing is instantaneous.
Is it normal that it's so slow? Is ipfs.io/ipfs as slow as that?
ps : running 0.4 dev