DOC API docstring improvements #731
Thank you very much for this PR! I left a couple of suggestions 😄
Two quick questions
- Do we want new lines between arguments? We don't do this anywhere else I think
- Should we also add the argument types in the docstring?
src/huggingface_hub/file_download.py
Outdated
Cloudfront is replicated over the globe so downloads are way faster for the end user (and it also lowers our
bandwidth costs).
"""Resolve a url of a file from the given information.
Maybe people don't know what "resolve a URL" means. WDYT of something along these lines?
Suggested change:
- """Resolve a url of a file from the given information.
+ """Creates a URL of a file based on the given information.
I think since there's an example, it should be clear. I'm not sure about using create here.
maybe "construct the URL"?
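For readers unsure what "resolving" or "constructing" the URL amounts to, here is a minimal sketch. The helper name `build_file_url`, the URL layout, and the default revision `main` are assumptions made for illustration; this is not the implementation in `file_download.py`.

```python
from typing import Optional
from urllib.parse import quote

# Assumed URL layout, for illustration only.
URL_TEMPLATE = "https://huggingface.co/{repo_id}/resolve/{revision}/{filename}"


def build_file_url(
    repo_id: str,
    filename: str,
    subfolder: Optional[str] = None,
    revision: str = "main",
) -> str:
    """Hypothetical helper: build a file URL from its parts."""
    if subfolder:
        # A subfolder is simply prepended to the file name.
        filename = f"{subfolder}/{filename}"
    # Percent-encode the revision so a branch name containing "/" stays one path segment.
    return URL_TEMPLATE.format(
        repo_id=repo_id, revision=quote(revision, safe=""), filename=filename
    )
```

Nothing is fetched here: the function only assembles a string from the given information, which is the behaviour the one-line summary is trying to describe.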
src/huggingface_hub/file_download.py
Outdated
Args:
    repo_id: A user or an organization name and a repo name separated by a
        ``/``.

nit: I don't think we add empty lines between parameters
I put them there in cases where there are many arguments. Without the lines it's not very readable to me, and they don't affect the rendered version anyway. Happy to remove them if you think they really should be removed.
I would go for consistency now and update later, WDYT?
Should we also add the argument types in the docstring?
The docs suggest doing that only if type annotations are not present. I would be happier if we could put the default values in the doc though.
Thank you for this! 🚀 it looks great
This looks great, thanks for working on it! I've added comments, only related to the format.
src/huggingface_hub/file_download.py
Outdated
are more than a few MBs.

Args:
    repo_id: A namespace (user or an organization) name and a repo name
I'd rather we add the types/default values directly in the docstring. That's what we do in the rest of the repository, and that's what's expected by the doc-builder tool which we'll use to build the docs for huggingface_hub. See documentation here.
We also add newlines between the argument name and type, and the description of that argument.
I'll add a few proposals. Right now the docs are set up with the Sphinx/RST format in mind (`` for markdown's `, ` for markdown's *, etc.), so for consistency's sake the proposals I'll add will reuse that format as well. If it isn't clear, feel free to use the format as defined in the document shared above, as the conversion should take place in 1-2 days anyway.
Other examples of this in huggingface_hub are in Repository:
huggingface_hub/src/huggingface_hub/repository.py
Lines 371 to 403 in 6dac5f4
"""
Instantiate a local clone of a git repo.
If specifying a `clone_from`:
will clone an existing remote repository, for instance one
that was previously created using ``HfApi().create_repo(name=repo_name)``.
``Repository`` uses the local git credentials by default, but if required, the ``huggingface_token``
as well as the git ``user`` and the ``email`` can be explicitly specified.
If `clone_from` is used, and the repository is being instantiated into a non-empty directory,
e.g. a directory with your trained model files, it will automatically merge them.
Args:
    local_dir (``str``):
        path (e.g. ``'my_trained_model/'``) to the local directory, where the ``Repository`` will be initalized.
    clone_from (``str``, `optional`):
        repository url (e.g. ``'https://huggingface.co/philschmid/playground-tests'``).
    repo_type (``str``, `optional`):
        To set when creating a repo: et to "dataset" or "space" if creating a dataset or space, default is model.
    use_auth_token (``str`` or ``bool``, `optional`, defaults to ``True``):
        huggingface_token can be extract from ``HfApi().login(username, password)`` and is used to authenticate against the hub
        (useful from Google Colab for instance).
    git_user (``str``, `optional`):
        will override the ``git config user.name`` for committing and pushing files to the hub.
    git_email (``str``, `optional`):
        will override the ``git config user.email`` for committing and pushing files to the hub.
    revision (``str``, `optional`):
        Revision to checkout after initializing the repository. If the revision doesn't exist, a
        branch will be created with that revision name from the default branch's current HEAD.
    private (``bool``, `optional`, defaults to ``False``):
        whether the repository is private or not.
    skip_lfs_files (``bool``, `optional`, defaults to ``False``):
        whether to skip git-LFS files or not.
"""
Suggested change:
- repo_id: A namespace (user or an organization) name and a repo name
+ repo_id (``str``):
+     A namespace (user or an organization) name and a repo name
src/huggingface_hub/file_download.py
Outdated
repo_id: A namespace (user or an organization) name and a repo name
    separated by a ``/``.

filename: The name of the file in the repo.
Suggested change:
- filename: The name of the file in the repo.
+ filename (``str``):
+     The name of the file in the repo.
src/huggingface_hub/file_download.py
Outdated
filename: The name of the file in the repo.

subfolder: An optional value corresponding to a folder inside the repo.
Suggested change:
- subfolder: An optional value corresponding to a folder inside the repo.
+ subfolder (``str``, `optional`): An optional value corresponding to a folder inside the repo.
src/huggingface_hub/file_download.py
Outdated
user_agent: The user-agent info in the form of a dictionary or a
    string.

force_download: Whether the file should be downloaded even if it
When there is a default value other than None:
Suggested change:
- force_download: Whether the file should be downloaded even if it
+ force_download (``bool``, `optional`, defaults to ``False``):
+     Whether the file should be downloaded even if it
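Putting the suggestions above together, a complete docstring in the requested format might look like the sketch below. The function `download_file`, its body, and the wording of the `force_download` description are hypothetical and only serve to show the layout: type and default on the argument line, description indented on the next line.

```python
from typing import Optional


def download_file(
    repo_id: str,
    filename: str,
    subfolder: Optional[str] = None,
    force_download: bool = False,
) -> str:
    """Return a placeholder path for a file in a repo (illustrative only).

    Args:
        repo_id (``str``):
            A namespace (user or an organization) name and a repo name
            separated by a ``/``.
        filename (``str``):
            The name of the file in the repo.
        subfolder (``str``, `optional`):
            An optional value corresponding to a folder inside the repo.
        force_download (``bool``, `optional`, defaults to ``False``):
            Hypothetical wording: whether to download again when a local
            copy is already present.
    """
    # Placeholder body: join the parts so the example stays runnable.
    parts = [repo_id, subfolder, filename] if subfolder else [repo_id, filename]
    return "/".join(parts)
```

Arguments whose default is `None` are marked only as `optional`; a default is spelled out only when it is something other than `None`, as with `force_download`.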
Raises:
    In case of non-recoverable file (non-existent or inaccessible url + no cache on disk).
    - ``EnvironmentError`` if ``use_auth_token=True`` and the token cannot
      be found.

    - ``OSError`` if ETag cannot be determined.

    - ``ValueError`` if the file cannot be downloaded and cannot be found
      locally.
Love that!
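One detail worth keeping in mind when reading this `Raises` section: in Python 3, `EnvironmentError` is an alias of `OSError`, so a caller cannot tell the first two cases apart by exception type alone. The sketch below uses a hypothetical stand-in function, `load_or_fail`, that raises the three documented exception types; it is not the library's download code.

```python
def load_or_fail(token_found: bool, etag_found: bool, cached: bool) -> str:
    """Hypothetical stand-in that raises the three documented exception types."""
    if not token_found:
        raise EnvironmentError("token cannot be found")
    if not etag_found:
        raise OSError("ETag cannot be determined")
    if not cached:
        raise ValueError("file cannot be downloaded and cannot be found locally")
    return "ok"


def describe(**kwargs: bool) -> str:
    """Show which handler catches each documented error."""
    try:
        return load_or_fail(**kwargs)
    except OSError as err:
        # Also catches EnvironmentError, which is the same class in Python 3.
        return f"OSError: {err}"
    except ValueError as err:
        return f"ValueError: {err}"
```

A caller that needs to distinguish the missing-token case from the ETag case would have to inspect the message or rely on a more specific exception class, if one is provided.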
# Note: at some point maybe this format of storage should actually replace
# the flat storage structure we've used so far (initially from allennlp
# if I remember correctly).
Better suited here, indeed!
Hope this looks better. If there isn't anything I've missed, @LysandreJik, please feel free to merge :)
This is great work, thanks for that @adrinjalali! Merging.
This PR improves the docstrings for a few functions under huggingface_hub.