-
Notifications
You must be signed in to change notification settings - Fork 572
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improvements non git mixin #618
Improvements non git mixin #618
Conversation
Bring up to speed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is quite a nice improvement to the existing PR. Feel free to merge that PR in mine, and then let's discuss what is needed on it before we can merge.
What needs to be taken care of before we can merge in main
, in my opinion:
- Taking care of the 5GB files and chunking them before uploading them. Ideally the size would be given manually to the method, similarly to how it's done in
datasets
'push_to_hub
method which already respects that paradigm (see Push to hub capabilities forDataset
andDatasetDict
datasets#3098) - I may be mistaken but I don't think progress bars are implemented yet for
upload_file
- Very explicit documentation mentioning that for complex workflows,
Repository
should be used as it respects the git workflow
Anything else? Exciting to see progress on this!
if commit_message is not None: | ||
headers["Commit-Summary"] = commit_message |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice!
Nope! That should be about it for this PR. Will make the adjustment + merge, then will work towards the list |
…erzr/huggingface_hub into improvements-non-git-mixin
Add in commit_message + tests Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
Add in commit_message + tests Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
Add in commit_message + tests Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
* Propose a non-git mixin * Fix merge * Improvements non git mixin (#618) Add in commit_message + tests Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr> * Big refactor * FIX typos in contributing.md * Remove redefined logger in HfApi.py * Use upload_folder in both mixins + some docstring * moved back logger to top of hf_api >< * space in documentation can be ambiguous * WIP started deprecation * deprecate skip lfs file and use Path * added decorator to deprecate specific arguments + unittests for it * simplified tests * proper decorators * fix docstring * hubmixin: fixed existing tests + add http one * unique repo names across tests * make push_to_hub_keras work + tests * logs are not overwritten in push_to_hub_keras * flake8 * refacto push_to_hub from mixin.save_pretrained * deprecate positional argument in version 0.12 * remove docstring for deprecated skip_lfs_files * delete old logs when uploading keras model to hub * remove TODO in tests * remove useless todo * Update src/huggingface_hub/hub_mixin.py * flake8 * remove un-explicit _generate_url helper * exclude some folders from flake8 * clean pr * move out tests fixing to other issue * introduce decorator to deprecate tests * docstring * fix pattern in expect_deprecation decorator * docstring * remove extras['ml'] integration * Apply suggestions from code review Co-authored-by: Omar Sanseviero <osanseviero@gmail.com> * use expect_deprecation decorator in test * remove caret in comment * explicit mocked model * delete unused and failing tett * optional path_in_repo for root * revert back token docstring * Update src/huggingface_hub/hf_api.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> Co-authored-by: Zachary Mueller <muellerzr@gmail.com> Co-authored-by: Wauplin <lucainp@gmail.com> Co-authored-by: Omar Sanseviero <osanseviero@gmail.com>
This is the start of getting the
improvements-non-git-mixin
up to speed and worked towards its completion.Basic changes found here:
upload_file
now takes in acommit_message
. This was part of the intended implementation, so trickled it up. This also allows for the back-ends to write custom messages for if something is a weight, tokenizer, etc, mimicking how Repository can do so already.test_hubmixin
that makes sure the timings for uploading a new file to a blank repo, and uploading a new file to a repo with fairly large files is about the same. Currently the fake repo uploads 5 .5mb files (locally this took 12 seconds to accomplish. If we want to lower that to one .5mb file that's okay)