Implemented CLI upload functionality #1618
Hey @martinbrose, thanks again for the great job on the PR! Together with #1617, it will be a great improvement for the users 🔥 FYI, I just pushed a new commit (acc570b) in which I:
I also extensively tested the command locally. I hope I have covered all possible use cases. Here is the current UX:
# Upload file (implicit path in repo)
huggingface-cli upload my-cool-model ./my-cool-model.safetensors
# Upload file (explicit path in repo)
huggingface-cli upload my-cool-model ./my-cool-model.safetensors model.safetensors
# Upload directory (implicit paths)
huggingface-cli upload my-cool-model
# Upload directory (explicit local path, explicit path in repo)
huggingface-cli upload my-cool-model ./models/my-cool-model .
# Upload filtered directory (example: tensorboard logs except for the last run)
huggingface-cli upload my-cool-model ./model/training /logs --include "*.tfevents.*" --exclude "*20230905*"
# Upload private dataset
huggingface-cli upload Wauplin/my-cool-dataset ./data . --repo-type=dataset --private
# Upload with token
huggingface-cli upload Wauplin/my-cool-model --token=hf_****
# Sync local Space with Hub (upload new files, delete removed files)
huggingface-cli upload Wauplin/space-example --repo-type=space --exclude="/logs/*" --delete="*" --commit-message="Sync local Space with Hub"
# Schedule commits every 30 minutes
huggingface-cli upload Wauplin/my-cool-model --every=30
cc @osanseviero @julien-c @LysandreJik for another review round, especially on the CLI signature. It's more or less what was described in #1543 (comment).
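To make the --include / --exclude example above more concrete, here is a minimal sketch of how such a filter could behave. It is an illustration only: filter_paths is a hypothetical helper, and it assumes shell-style matching through Python's fnmatch, which may differ from what the CLI actually does.

```python
from fnmatch import fnmatch
from typing import Iterable, List, Optional


def filter_paths(
    paths: Iterable[str],
    include: Optional[List[str]] = None,
    exclude: Optional[List[str]] = None,
) -> List[str]:
    """Hypothetical helper: keep paths matching any include pattern and no exclude pattern."""
    kept = []
    for path in paths:
        # When include patterns are given, a path must match at least one of them.
        if include is not None and not any(fnmatch(path, p) for p in include):
            continue
        # A path matching any exclude pattern is dropped.
        if exclude is not None and any(fnmatch(path, p) for p in exclude):
            continue
        kept.append(path)
    return kept


# Made-up file names, mirroring the "tensorboard logs except for the last run" example.
logs = [
    "events.out.tfevents.20230901.host",
    "events.out.tfevents.20230905.host",
    "checkpoint.pt",
]
selected = filter_paths(logs, include=["*.tfevents.*"], exclude=["*20230905*"])
print(selected)
```

With these made-up names, only the 20230901 event file is kept: the 20230905 run is excluded and checkpoint.pt matches no include pattern.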
Super cool to be able to schedule your upload with this as well! 👍
The implementation looks great to me!
I've added some comments on the API, based on what I'd expect coming from Linux, but feel free to disregard them if you feel strongly. The implementation is clean and it'll greatly simplify model uploading outside of Python runtimes.
Thanks both for your work!
huggingface-cli upload my-cool-model ./models

# Upload directory (implicit local_path, implicit path_in_repo)
huggingface-cli upload my-cool-model
What is this supposed to do exactly? Reading it, I understand it to be the following: upload the my-cool-model folder and its content to the <user>/my-cool-model repository, the contents of the folder being at the root of the repository.
However, that doesn't seem to be the case: it seems to upload all folders and files in the current working directory to the root of the <user>/my-cool-model repository. Is this expected?
Reproducer:
mkdir hfh-test && cd hfh-test
mkdir folder-1
touch folder-1/file.txt
mkdir folder-2
touch folder-2/file.txt
# Current repo structure
# ./
# ../
# folder-1/
#     file.txt
# folder-2/
#     file.txt
huggingface-cli upload folder-1
I would expect the folder-1 repository to contain the contents of the local folder-1, but it contains everything that was in the working directory: https://huggingface.co/lysandre/folder-1/tree/main
I would expect the command that does what this one just did to be huggingface-cli upload folder-1 .
Thanks for the feedback @LysandreJik. My initial thought was that huggingface-cli upload my-cool-model would upload the current directory to the Hub (at root level), so huggingface-cli upload my-cool-model would be equivalent to huggingface-cli upload my-cool-model . . (a bit like when you have a local repo and run git add . && git commit -m "something" && git push).
That being said, the use case you've just described (i.e. repo_id == name of a local folder) is perfectly valid as well. I've updated the CLI accordingly. Now the behavior looks like this:
# Current repo structure
# ./
# ../
# folder-1/
#     file.txt
# folder-2/
#     file.txt
# Upload "./folder-1" content to "Wauplin/folder-1"
# On the hub: file.txt
>>> huggingface-cli upload folder-1
# Upload "./folder-1" content to "huggingface/folder-1"
# On the hub: file.txt
>>> huggingface-cli upload huggingface/folder-1
# Upload "./folder-1" and "./folder-2" to "Wauplin/my-cool-model"
# On the hub: folder-1/file.txt and folder-2/file.txt
>>> huggingface-cli upload my-cool-model .
# Upload "./folder-1" and "./folder-2" to "Wauplin/my-cool-model" under "./data"
# On the hub: data/folder-1/file.txt and data/folder-2/file.txt
>>> huggingface-cli upload my-cool-model . data/
# Raise exception => user must set local path explicitly
>>> huggingface-cli upload folder-3
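The rules listed above can be sketched roughly as follows. This is not the code merged in the PR: resolve_upload_paths and its cwd parameter are hypothetical names, and the file case (a file keeps its own name when no path in repo is given) is an assumption based on the "implicit path_in_repo" file example in this thread.

```python
import os
import tempfile
from typing import Optional, Tuple


def resolve_upload_paths(
    repo_id: str,
    local_path: Optional[str] = None,
    path_in_repo: Optional[str] = None,
    cwd: str = ".",
) -> Tuple[str, str]:
    """Hypothetical helper: return (local_path, path_in_repo) for an upload command."""
    if local_path is None:
        # Implicit local path: only accepted when a folder named like the repo exists.
        repo_name = repo_id.split("/")[-1]
        candidate = os.path.join(cwd, repo_name)
        if not os.path.isdir(candidate):
            raise ValueError(
                f"No folder named '{repo_name}' found; set the local path explicitly."
            )
        # Folder content goes to the root of the repo.
        return candidate, "."

    full_path = os.path.join(cwd, local_path)
    if path_in_repo is None:
        # Assumption: a file keeps its name, folder content goes to the repo root.
        path_in_repo = os.path.basename(full_path) if os.path.isfile(full_path) else "."
    return full_path, path_in_repo
```

For instance, calling resolve_upload_paths("huggingface/folder-1") from a directory that contains folder-1/ resolves to that folder and the repo root, while resolve_upload_paths("folder-3") raises because no such folder exists.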
docs/source/en/guides/upload.md
Outdated
# Upload directory (implicit path_in_repo)
huggingface-cli upload my-cool-model ./models
I would also kinda expect this to upload the contents of models to my-cool-model at the root of my-cool-model, but it uploads them to the subfolder models.
Updated the CLI and now that's the case. Also added a unit test for this use case. By default all content is uploaded at root level unless specified differently.
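As a rough illustration of "root level unless specified differently", the sketch below maps files inside the uploaded folder to their resulting paths on the Hub. map_to_repo_paths is a hypothetical helper written for this example, not a function of the library.

```python
import posixpath
from typing import Dict, List


def map_to_repo_paths(relative_files: List[str], path_in_repo: str = ".") -> Dict[str, str]:
    """Hypothetical helper: map files (relative to the uploaded folder) to paths in the repo."""
    mapping = {}
    for rel in relative_files:
        # Join with the target folder, then normalize away "." segments and a leading "/".
        target = posixpath.normpath(posixpath.join(path_in_repo, rel)).lstrip("/")
        mapping[rel] = target
    return mapping


files = ["folder-1/file.txt", "folder-2/file.txt"]
print(map_to_repo_paths(files))           # default: files land at the repo root
print(map_to_repo_paths(files, "data/"))  # explicit path in repo: files land under data/
```

With the default, folder-1/file.txt stays folder-1/file.txt; with "data/" it becomes data/folder-1/file.txt, matching the examples given earlier in the thread.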
docs/source/en/guides/upload.md
Outdated
# Upload file (implicit path_in_repo)
huggingface-cli upload my-cool-model model.safetensors
This works as I would expect it to work 👌
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Thanks for the great feedback @LysandreJik! I have revisited the API a bit given your comments:
TL;DR: no implicit stuff except when it's obvious. I'll merge this as soon as the CI is green :)
Codecov Report
Patch coverage:
Additional details and impacted files

@@            Coverage Diff             @@
##             main    #1618      +/-   ##
==========================================
- Coverage   82.28%   81.86%   -0.43%
==========================================
  Files          61       62       +1
  Lines        6849     6964     +115
==========================================
+ Hits         5636     5701      +65
- Misses       1213     1263      +50

☔ View full report in Codecov by Sentry.
Thank you both for your work! 🙌
Addressed #1543: