Adding allow_patterns and ignore_patterns to the Hugging Face cache. #480

Merged: 5 commits, Jul 31, 2023

@varunshenoy (Contributor) commented Jul 26, 2023

Introduces `allow_patterns` and `ignore_patterns` in the config file so that users can avoid downloading the same weights more than once. HF repos often ship the same model weights in several formats within a single revision, such as .bin, .h5, .fp16.safetensors, and .safetensors.

Both allow_patterns and ignore_patterns follow fnmatch conventions.
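As a rough illustration of what fnmatch-style filtering means for a repo's file list, here is a minimal sketch. It is not the code from this PR: the helper name `filter_repo_files` and the sample file names are hypothetical, and the rule that a file must match at least one allow pattern and no ignore pattern is an assumption about how the two options combine.

```python
from fnmatch import fnmatchcase


def filter_repo_files(filenames, allow_patterns=None, ignore_patterns=None):
    """Return the file names that pass both pattern lists.

    Hypothetical helper for illustration only. Assumed semantics:
    - allow_patterns=None means "allow everything";
    - ignore_patterns=None means "ignore nothing";
    - a file is kept if it matches any allow pattern and no ignore pattern.
    """
    selected = []
    for name in filenames:
        # fnmatchcase is the case-sensitive, platform-independent variant of fnmatch.
        if allow_patterns is not None and not any(
            fnmatchcase(name, pattern) for pattern in allow_patterns
        ):
            continue
        if ignore_patterns is not None and any(
            fnmatchcase(name, pattern) for pattern in ignore_patterns
        ):
            continue
        selected.append(name)
    return selected


# Made-up file listing resembling a repo that ships several weight formats.
repo_files = [
    "config.json",
    "pytorch_model.bin",
    "tf_model.h5",
    "model.safetensors",
    "model.fp16.safetensors",
]

kept = filter_repo_files(
    repo_files,
    allow_patterns=["*.json", "*.safetensors"],
    ignore_patterns=["*.fp16.*"],
)
print(kept)
```

Note that in fnmatch, `*` also matches path separators, so a pattern like `*.bin` would match a file in a subdirectory as well.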

This PR also swaps `repo_id` and `file_name` during cache copying, for cleaner logs.

TODO: add tests.

@varunshenoy varunshenoy marked this pull request as ready for review July 27, 2023 17:37
@varunshenoy varunshenoy requested a review from bolasim July 27, 2023 17:37
@varunshenoy varunshenoy merged commit 05d877b into main Jul 31, 2023
@varunshenoy varunshenoy deleted the varunshenoy/hf-cache-snapshot branch July 31, 2023 21:26
aspctu added a commit that referenced this pull request Aug 3, 2023
* Cleanup old asyncio threadpool settings.

* Add `version.txt` into build to prevent cache migrations (#490)

* added hf token to dockerfile

* add prints

* move hf token to the part where its needed

* add version.txt

* added version.txt

* remove unnecessary files

* adding files back

* reverting

* Adding `allow_patterns` and `ignore_patterns` to the Hugging Face cache. (#480)

* remove assert

* both ignore and allow working

* fix for verbose

* remove prints

* Fix streaming issues.

* Added documentation for HF caching (#492)

* added hf_cache docs

* fix ticks

* Update configuration.md

* clean up Truss docs (#491)

* clean up Truss docs

* fix links and lints

* lint

* Comment updates.

* Bump version.

* Controlling supervisord retries (#496)

* update live reload docs (#494)

* update live reload docs

* remove leading space

* remove dead links from flan-t5 readme

* Update README.md

* Removing extraneous file (#498)

* Enable Hugging Face secrets during build from Truss (#499)

* added hf token to dockerfile

* add prints

* move hf token to the part where its needed

* successfully mounting secrets

* update cache warmer to grab secret

* match data dir copy

* bump pyproject

* add os to cache_warmer

* bump pyproject

* add is_trusted

* revert version

* change names to be hf_access_token

* rename is_trusted and use Path

* Adding support for VLLM server (#495)

---------

Co-authored-by: Sidharth Shanker <sid.shanker@baseten.co>
Co-authored-by: Varun Shenoy <vnshenoy@stanford.edu>
Co-authored-by: Philip Kiely - Baseten <98474633+philipkiely-baseten@users.noreply.github.com>
Co-authored-by: joostinyi <63941848+joostinyi@users.noreply.github.com>