
Automagic GC #43

Open
yarikoptic opened this issue Mar 28, 2021 · 6 comments


yarikoptic commented Mar 28, 2021

I feel that we need to think about garbage collection now. /home on drogon ran out of space, and fscacher has contributed -- e.g. in my case its cache was 4GB under ~/.cache/fscacher.

  • Ideally it would use some kind of LRU mechanism.
    Unfortunately, I think we cannot rely on atime to decide which cached files to remove, since the filesystem might be mounted with relatime or noatime, so we might need another way to determine what is recent and what is not.

  • Another idea: help the user decide by creating leading directories for the extra tags (versions) we add into the function signature (though those could become too long too fast :-/). We could then make the decision at that level -- the oldest created/modified directories go first. If a directory was modified recently, that particular version is still in use.
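Since atime cannot be trusted under relatime/noatime, one workaround along the lines of the first bullet is for fscacher to record its own "last used" timestamps and sweep by those. A minimal sketch, assuming a hypothetical cache layout of one directory per entry (the `.last_used` sidecar file and the age threshold are illustrative assumptions, not existing fscacher behavior):

```python
import shutil
import time
from pathlib import Path

CACHE_ROOT = Path("~/.cache/fscacher").expanduser()  # assumed location
MAX_AGE = 30 * 24 * 3600  # evict entries unused for 30 days (illustrative)

def touch_entry(entry_dir: Path) -> None:
    """Record that a cache entry was just used (our own 'atime')."""
    (entry_dir / ".last_used").write_text(str(time.time()))

def sweep(root: Path = CACHE_ROOT, max_age: float = MAX_AGE) -> None:
    """Delete entry directories whose recorded last use is too old."""
    now = time.time()
    for entry_dir in root.iterdir():
        if not entry_dir.is_dir():
            continue
        stamp = entry_dir / ".last_used"
        try:
            last_used = float(stamp.read_text())
        except (FileNotFoundError, ValueError):
            # No record yet: fall back to the directory's mtime
            last_used = entry_dir.stat().st_mtime
        if now - last_used > max_age:
            shutil.rmtree(entry_dir, ignore_errors=True)
```

The cost is one extra tiny write per cache hit, which would make the inode problem mentioned below slightly worse, so the timestamps could instead live in a single shared file or database.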

Any ideas @jwodder ?


jwodder commented Mar 29, 2021

The only form of entry-deletion that joblib.Memory supports is deleting an entire cache. Here are various ideas I've come up with, in no particular order:

  • We ask joblib to add LRU-based cache clearing.
  • We ask joblib for the ability to iterate through a Memory cache & delete individual items and then implement LRU-based cache clearing ourselves via either:
    • Deleting items based on the file access time (as of the time the function was called & cached) in the file fingerprints
    • Asking joblib to add a "last accessed time" property to each cache item and using that to delete old items
  • We ask diskcache to support use of flufl.lock for NFS-safe locking, and then we switch fscacher from joblib to diskcache.
  • We write our own caching system in which each function's cache is a single SQLite3 database file with a column for last access time and with locking performed using flufl.lock.
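The last idea above can be sketched in a few lines. This is a hedged illustration only (class and column names are assumptions, and the flufl.lock integration for NFS safety is omitted; SQLite's own locking covers a single host):

```python
import pickle
import sqlite3
import time

class SqliteCache:
    """Sketch of a per-function cache: one SQLite file, LRU via last_access."""

    def __init__(self, path: str) -> None:
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS cache ("
            " key TEXT PRIMARY KEY,"
            " value BLOB,"
            " last_access REAL)"
        )

    def get(self, key: str):
        row = self.db.execute(
            "SELECT value FROM cache WHERE key = ?", (key,)
        ).fetchone()
        if row is None:
            return None
        # Refresh the access time on every hit so eviction is truly LRU
        self.db.execute(
            "UPDATE cache SET last_access = ? WHERE key = ?",
            (time.time(), key),
        )
        self.db.commit()
        return pickle.loads(row[0])

    def set(self, key: str, value) -> None:
        self.db.execute(
            "INSERT OR REPLACE INTO cache VALUES (?, ?, ?)",
            (key, pickle.dumps(value), time.time()),
        )
        self.db.commit()

    def evict_older_than(self, max_age: float) -> None:
        """Drop entries not accessed within max_age seconds."""
        self.db.execute(
            "DELETE FROM cache WHERE last_access < ?",
            (time.time() - max_age,),
        )
        self.db.commit()
```

A single database file per function would also address the inode-exhaustion problem noted later in this thread, since thousands of small files collapse into one.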

@yarikoptic (Member, Author) commented

Overall -- I think it is worth pursuing both (joblib and diskcache) fronts.

joblib -- I do not think it has any central DB across items in which to store such information, so it might be a bit too much to ask; but since it would be generally useful, it is worth raising the question/issue and seeing what the joblib authors think.

diskcache -- worth raising an issue suggesting flufl.lock (I have no experience with it myself) for locking, to provide support for NFS-mounted partitions. If it gets implemented, we should then indeed look into diskcache as a replacement (but see below) for joblib.

our own -- I take back my idea of writing our own as a "good way forward" ;) but I wonder if we could/should provide an abstraction to allow users to choose between joblib and diskcache for their specific use cases? I think joblib might be needed if arguments or return values are numpy ndarrays -- I am not sure diskcache would support those. I do not think we have immediate use cases like that, but they might arise.
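The abstraction idea could look something like the following sketch. All names here (`CacheBackend` and the concrete backends) are hypothetical, not existing fscacher API; the joblib and diskcache wrappers import their libraries lazily so neither is a hard dependency:

```python
from abc import ABC, abstractmethod

class CacheBackend(ABC):
    """Hypothetical pluggable-backend interface for fscacher."""

    @abstractmethod
    def memoize(self, func):
        """Return a cached version of func."""

class JoblibBackend(CacheBackend):
    """joblib-backed: handles numpy ndarrays in arguments/results."""

    def __init__(self, directory: str) -> None:
        from joblib import Memory  # third-party, imported lazily
        self._memory = Memory(directory, verbose=0)

    def memoize(self, func):
        return self._memory.cache(func)

class DiskcacheBackend(CacheBackend):
    """diskcache-backed: built-in size limits and eviction policies."""

    def __init__(self, directory: str) -> None:
        import diskcache  # third-party, imported lazily
        self._cache = diskcache.Cache(directory)

    def memoize(self, func):
        return self._cache.memoize()(func)

class DictBackend(CacheBackend):
    """Trivial in-memory backend, useful for tests."""

    def __init__(self) -> None:
        self._store = {}

    def memoize(self, func):
        def wrapper(*args):
            if args not in self._store:
                self._store[args] = func(*args)
            return self._store[args]
        return wrapper
```

Users with ndarray-heavy workloads would pick `JoblibBackend`; everyone else could default to a backend with built-in eviction.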

@yarikoptic (Member, Author) commented

Note: another, more subtle need for GC, or maybe even for an alternative caching mechanism -- we ran out of inodes on drogon, and fscacher's cache seems to be the major contributor :-/

@yarikoptic (Member, Author) commented

Followed up on joblib with joblib/joblib#1183

@yarikoptic (Member, Author) commented

joblib got

ATM, with heavy use of fscacher in dandi-cli creating thousands of small files, we are seeing some file transfers affected. It would be great to get back to this issue and finalize a solution. I think behaving git gc style -- triggering GC after some X days since it was last considered -- would be handy.
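The git gc-style trigger is cheap to implement regardless of which eviction mechanism is chosen: record when GC last ran in a stamp file and only reconsider after some interval. A minimal sketch (the stamp-file name and the one-week interval are illustrative assumptions):

```python
import time
from pathlib import Path

GC_INTERVAL = 7 * 24 * 3600  # reconsider GC once a week (illustrative)

def maybe_gc(cache_dir: Path, gc_func, interval: float = GC_INTERVAL) -> bool:
    """Run gc_func() only if enough time has passed since the last run.

    Returns True if GC actually ran, False if it was skipped.
    """
    stamp = cache_dir / ".last-gc"
    try:
        last = float(stamp.read_text())
    except (FileNotFoundError, ValueError):
        last = 0.0  # never ran (or stamp corrupted): run now
    if time.time() - last < interval:
        return False
    gc_func()
    stamp.write_text(str(time.time()))
    return True
```

Like `git gc --auto`, this would make cleanup an occasional side effect of normal use rather than a chore the user has to remember.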
