Automagic GC #43
Comments
The only form of entry-deletion that
Overall, I think it is worth pursuing on both fronts (joblib and diskcache). joblib: I do not think it has any central DB across items where such information could be added, so it might be a bit too much to ask, but since it would be generally useful, it is worth raising that question/issue to see what the joblib authors think. diskcache: worth raising an issue suggesting flufl.lock (I have no experience with it myself) for locking, to provide support for NFS-mounted partitions. If that gets implemented, we should then indeed look into it. As for our own: I take back my idea of writing our own as a "good way forward" ;) But I wonder if we could/should provide an abstraction that allows users to choose between joblib and diskcache for their specific use cases? I think joblib might be needed if arguments or return values are numpy ndarrays; I am not sure whether diskcache supports that. I do not think we have immediate use cases like that, but I think they might arise.
Note: another, more subtle need for GC, or maybe even for an alternative caching mechanism: we ran out of inodes on drogon, and fscacher's cache seems to be the major contributor :-/
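To put numbers on a report like the one above, something along these lines could be used. It is only a rough sketch: `cache_usage` is a hypothetical helper, not part of fscacher, and it counts directory entries as an approximation of inode use (hard links would be counted more than once).

```python
import os


def cache_usage(root):
    """Return (entry_count, total_bytes) for everything below ``root``.

    entry_count approximates the number of inodes consumed: one per
    subdirectory and one per file (hard links are not de-duplicated).
    """
    n_entries = 0
    n_bytes = 0
    for dirpath, dirnames, filenames in os.walk(root):
        n_entries += len(dirnames) + len(filenames)
        for name in filenames:
            try:
                # lstat so that symlinks are measured, not their targets
                n_bytes += os.lstat(os.path.join(dirpath, name)).st_size
            except OSError:
                # the file vanished or is unreadable; skip it
                pass
    return n_entries, n_bytes
```

Calling `cache_usage(os.path.expanduser("~/.cache/fscacher"))` would then show how many entries and bytes the cache currently holds.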
Followed up on joblib with joblib/joblib#1183
joblib got
ATM, with heavy use of fscacher in dandi-cli creating thousands of small files, some file transfers are being affected. It would be great to get back to this issue and finalize the solution. I think behaving
I feel that we need to think about garbage collection now. /home on drogon ran out of space, and fscacher has contributed, e.g. in my case it was 4GB for ~/.cache/fscacher. Ideally it should be some kind of LRU mechanism.
Unfortunately, I think we cannot rely on atime for deciding which cached files to remove, since the filesystem might be mounted with relatime or noatime, so we might need another way to decide what is recent and what is not. Another idea could be to just help the user decide by creating lead directories for those extra tags (versions) we add into the function signature (but they could become too long too fast :-/). But then we could maybe make the decision at that level: the oldest created/modified ones go. If one was modified recently, that means the particular version is still in use.
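A minimal sketch of that last idea, under assumptions that are not taken from fscacher itself: the cache root is assumed to contain one lead directory per version tag, and the names `mark_used`, `dir_size` and `collect_garbage` are hypothetical. Recency is recorded by explicitly bumping the directory's mtime with `os.utime` on every cache hit, which does not depend on the filesystem updating atime on reads. There is no locking here, so it is not safe against concurrent writers as written.

```python
import os
import shutil
from pathlib import Path


def mark_used(version_dir):
    """Record a cache hit by explicitly setting atime/mtime to now."""
    os.utime(version_dir, None)


def dir_size(path):
    """Total size in bytes of the regular files below ``path``."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(path):
        for name in filenames:
            try:
                total += os.lstat(os.path.join(dirpath, name)).st_size
            except OSError:
                pass  # file disappeared while we were walking
    return total


def collect_garbage(cache_root, max_bytes):
    """Remove the least recently modified version directories until the
    cache fits into ``max_bytes``; return the names that were removed."""
    dirs = [p for p in Path(cache_root).iterdir() if p.is_dir()]
    sizes = {p: dir_size(p) for p in dirs}
    total = sum(sizes.values())
    removed = []
    # oldest mtime first, i.e. the version that was used longest ago
    for p in sorted(dirs, key=lambda p: p.stat().st_mtime):
        if total <= max_bytes:
            break
        shutil.rmtree(p)
        total -= sizes[p]
        removed.append(p.name)
    return removed
```

Working at the level of whole version directories keeps the bookkeeping to one timestamp per version, at the cost of evicting all entries of a stale version at once.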
Any ideas @jwodder ?