
Multi-Tiered HF Cache #2816

Open
XenonMolecule opened this issue Jan 31, 2025 · 2 comments

Comments

@XenonMolecule

Is your feature request related to a problem? Please describe.
When working on a shared compute cluster, it can be helpful to maintain a shared Hugging Face cache for popular models. However, if everyone uses a single cache, it quickly fills up with one-off models and datasets that nobody else needs.
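
For context, a shared cache is typically set up today by pointing everyone at the same directory via the HF_HUB_CACHE (or HF_HOME) environment variable. A minimal sketch, with a placeholder path:

```python
import os

# Point the huggingface_hub cache at a shared, group-readable directory.
# "/shared/hf-cache" is a placeholder path for illustration.
os.environ["HF_HUB_CACHE"] = "/shared/hf-cache"

# Must be set before huggingface_hub is imported so the cache location is picked up.
from huggingface_hub import snapshot_download

snapshot_download("bert-base-uncased")  # lands in the shared cache
```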

Describe the solution you'd like
Ideally, there would be a two-tiered cache structure: a read-only shared cache, checked first to see whether the server has already downloaded the model and populated by admin users with an agreed-upon set of models, and then the user's personal read-write cache as the second tier. Personal-use models and datasets that aren't worth keeping as a shared server resource would be downloaded into the personal cache.
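
To illustrate the lookup order I have in mind, here is a rough user-space approximation built on the public snapshot_download API. The paths and helper name are made up, and the exception raised on a local-only cache miss (LocalEntryNotFoundError in recent huggingface_hub versions) may vary between versions; this is a sketch of the desired behaviour, not an existing feature:

```python
from huggingface_hub import snapshot_download
from huggingface_hub.utils import LocalEntryNotFoundError

SHARED_CACHE = "/shared/hf-cache"       # read-only, curated by admins (placeholder)
PERSONAL_CACHE = "/home/me/.cache/hf"   # read-write, per user (placeholder)

def two_tier_snapshot_download(repo_id: str, **kwargs) -> str:
    """Hypothetical helper: serve from the shared cache when possible,
    otherwise download into the personal cache."""
    try:
        # local_files_only=True means "cache hit or raise", so the read-only
        # shared cache is never written to.
        return snapshot_download(
            repo_id, cache_dir=SHARED_CACHE, local_files_only=True, **kwargs
        )
    except LocalEntryNotFoundError:
        # Shared-tier miss: fall back to the personal read-write cache.
        return snapshot_download(repo_id, cache_dir=PERSONAL_CACHE, **kwargs)
```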

Describe alternatives you've considered

  1. N-tiered cache structure: let users specify a list of cache directories and, on a miss in every one of them, write to the highest (or lowest) write-enabled tier (see the sketch after this list). This would subsume the solution described above and be a more future-proof design, but it adds more complexity than is necessarily needed for this application.
  2. Simply share a single read-write cache among all members of the server. This is not ideal: once the cache runs out of disk space, nobody can download more models, users are forced back onto personal caches anyway, and that defeats the purpose of the shared cache.
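
To make alternative 1 concrete, here is the same idea generalised to an ordered list of tiers, again with placeholder paths and a hypothetical helper name; only the last tier is assumed to be write-enabled:

```python
from huggingface_hub import snapshot_download
from huggingface_hub.utils import LocalEntryNotFoundError

# Ordered cache tiers, most widely shared first; only the last is write-enabled.
# All paths are placeholders for illustration.
CACHE_TIERS = ["/shared/hf-cache", "/scratch/team/hf-cache", "/home/me/.cache/hf"]

def n_tier_snapshot_download(repo_id: str, **kwargs) -> str:
    """Hypothetical helper: probe each tier read-only, then download into
    the last (write-enabled) tier on a full miss."""
    for cache_dir in CACHE_TIERS:
        try:
            # local_files_only=True turns this into a pure cache lookup.
            return snapshot_download(
                repo_id, cache_dir=cache_dir, local_files_only=True, **kwargs
            )
        except LocalEntryNotFoundError:
            continue  # miss in this tier, try the next one
    # Missed every tier: download into the write-enabled cache.
    return snapshot_download(repo_id, cache_dir=CACHE_TIERS[-1], **kwargs)
```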

Additional context
Others are already using shared Hugging Face caches (https://benjijang.com/posts/2024/01/shared-hf-cache/), so it would be great to add this multi-tiered cache support! Apologies if such a feature already exists and I'm unaware of it.

@Wauplin
Contributor

Wauplin commented Jan 31, 2025

Hi @XenonMolecule, thanks for explaining your feature request in detail. At the moment the codebase doesn't support having a read-only shared cache plus a read-write personal one. I understand the pros of such a feature, but adding it would be a major change to the download logic and might be error-prone. For now I'd like to gauge interest from the community before starting any plans. I'll keep this issue open; anyone is welcome to comment and react if interested :)

As a workaround, here are a few things you might be interested in:

@XenonMolecule
Author

Thanks, @Wauplin! Even as I suggested it, I thought this might be tricky to coordinate between huggingface_hub and the various libraries built on it, such as datasets and transformers, which seem to assume a single cache dir. Still, I'm keeping my hopes up that this can make it onto the roadmap for a longer-horizon update in the future!

I appreciate your suggestions; I'll use them for now! Excited to see how much community interest there is in this idea.
