Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DOC API docstring improvements #731

Merged
merged 9 commits into from
Mar 15, 2022
Merged

DOC API docstring improvements #731

merged 9 commits into from
Mar 15, 2022

Conversation

adrinjalali
Copy link
Contributor

This PR improves the docstrings for a few functions under huggingface_hub.

@adrinjalali adrinjalali requested a review from LysandreJik March 2, 2022 13:59
src/huggingface_hub/file_download.py Outdated Show resolved Hide resolved
src/huggingface_hub/file_download.py Outdated Show resolved Hide resolved
@osanseviero osanseviero self-requested a review March 2, 2022 14:58
Copy link
Contributor

@osanseviero osanseviero left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you very much for this PR! I left a couple of suggestions 😄

Two quick questions

  • Do we want new lines between arguments? We don't do this anywhere else I think
  • Should we also add the argument types in the docstring?


Cloudfront is replicated over the globe so downloads are way faster for the end user (and it also lowers our
bandwidth costs).
"""Resolve a url of a file from the given information.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe people don't know what resolve a URL means. WDYT Of something along these lines?

Suggested change
"""Resolve a url of a file from the given information.
"""Creates a URL of a file based in the given information.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think since there's an example, it should be clear. I'm not sure about using create here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe "construct the URL"?

Args:
repo_id: A user or an organization name and a repo name seperated by a
``/``.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I don't think we add empty lines between parameters

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I put them there in cases where there are many arguments. Without the lines it's not very readable to me, and they don't affect the rendered version anyway. Happy to remove them if you think they really should be removed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would go for consistency now and update later, WDYT?

src/huggingface_hub/file_download.py Outdated Show resolved Hide resolved
src/huggingface_hub/file_download.py Outdated Show resolved Hide resolved
src/huggingface_hub/file_download.py Show resolved Hide resolved
src/huggingface_hub/file_download.py Outdated Show resolved Hide resolved
src/huggingface_hub/file_download.py Outdated Show resolved Hide resolved
src/huggingface_hub/file_download.py Outdated Show resolved Hide resolved
src/huggingface_hub/file_download.py Outdated Show resolved Hide resolved
src/huggingface_hub/file_download.py Outdated Show resolved Hide resolved
Copy link
Contributor Author

@adrinjalali adrinjalali left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we also add the argument types in the docstring?

The docs suggest doing that only if type annotations are not present. I would be happier if we could put the default values in the doc though.


Cloudfront is replicated over the globe so downloads are way faster for the end user (and it also lowers our
bandwidth costs).
"""Resolve a url of a file from the given information.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think since there's an example, it should be clear. I'm not sure about using create here.

Args:
repo_id: A user or an organization name and a repo name seperated by a
``/``.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I put them there in cases where there are many arguments. Without the lines it's not very readable to me, and they don't affect the rendered version anyway. Happy to remove them if you think they really should be removed.

src/huggingface_hub/file_download.py Outdated Show resolved Hide resolved
Copy link
Contributor

@osanseviero osanseviero left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for this! 🚀 it looks great

Copy link
Member

@LysandreJik LysandreJik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great, thanks for working on it! I've added comments, only related to the format.

are more than a few MBs.

Args:
repo_id: A namespace (user or an organization) name and a repo name
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd rather we add the types/default values directly in the docstring. That's what we do in the rest of the repository, and that's what's expected by the doc-builder tool which we'll use to build the docs for huggingface_hub. See documentation here.

We also add newlines between the argument name and type, and the description of that argument.

I'll add a few proposals. Right now the docs were setup with the Sphinx/RST format in mind (`` for markdown's `, ` for markdown's *, etc.), so for consistency's sake the proposals I'll add will re-use that format as well. If it isn't clear, feel free to use the format as defined in the document shared above, as the conversion should take place in 1-2 days anyway.

Other examples of this in huggingface_hub are in Repository:

"""
Instantiate a local clone of a git repo.
If specifying a `clone_from`:
will clone an existing remote repository, for instance one
that was previously created using ``HfApi().create_repo(name=repo_name)``.
``Repository`` uses the local git credentials by default, but if required, the ``huggingface_token``
as well as the git ``user`` and the ``email`` can be explicitly specified.
If `clone_from` is used, and the repository is being instantiated into a non-empty directory,
e.g. a directory with your trained model files, it will automatically merge them.
Args:
local_dir (``str``):
path (e.g. ``'my_trained_model/'``) to the local directory, where the ``Repository`` will be initalized.
clone_from (``str``, `optional`):
repository url (e.g. ``'https://huggingface.co/philschmid/playground-tests'``).
repo_type (``str``, `optional`):
To set when creating a repo: et to "dataset" or "space" if creating a dataset or space, default is model.
use_auth_token (``str`` or ``bool``, `optional`, defaults to ``True``):
huggingface_token can be extract from ``HfApi().login(username, password)`` and is used to authenticate against the hub
(useful from Google Colab for instance).
git_user (``str``, `optional`):
will override the ``git config user.name`` for committing and pushing files to the hub.
git_email (``str``, `optional`):
will override the ``git config user.email`` for committing and pushing files to the hub.
revision (``str``, `optional`):
Revision to checkout after initializing the repository. If the revision doesn't exist, a
branch will be created with that revision name from the default branch's current HEAD.
private (``bool``, `optional`, defaults to ``False``):
whether the repository is private or not.
skip_lfs_files (``bool``, `optional`, defaults to ``False``):
whether to skip git-LFS files or not.
"""

Suggested change
repo_id: A namespace (user or an organization) name and a repo name
repo_id (``str``):
A namespace (user or an organization) name and a repo name

repo_id: A namespace (user or an organization) name and a repo name
seperated by a ``/``.

filename: The name of the file in the repo.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
filename: The name of the file in the repo.
filename (``str``):
The name of the file in the repo.


filename: The name of the file in the repo.

subfolder: An optional value corresponding to a folder inside the repo.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
subfolder: An optional value corresponding to a folder inside the repo.
subfolder (``str``, `optional`): An optional value corresponding to a folder inside the repo.

user_agent: The user-agent info in the form of a dictionary or a
string.

force_download: Whether the file should be downloaded even if it
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When there is a default value other than None:

Suggested change
force_download: Whether the file should be downloaded even if it
force_download (``bool``, `optional`, defaults to ``False``):
Whether the file should be downloaded even if it

Comment on lines 661 to +668
Raises:
In case of non-recoverable file (non-existent or inaccessible url + no cache on disk).
- ``EnvironmentError`` if ``use_auth_token=True`` and the token cannot
be found.

- ``OSError`` if ETag cannot be determined.

- ``ValueError`` if the file cannot be downloaded and cannot be found
locally.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Love that!

Comment on lines +97 to +100
# Note: at some point maybe this format of storage should actually replace
# the flat storage structure we've used so far (initially from allennlp
# if I remember correctly).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Better suited here, indeed!

@adrinjalali
Copy link
Contributor Author

Hope this looks better. If there aren't anything I've missed @LysandreJik please feel free to merge :)

Copy link
Member

@LysandreJik LysandreJik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great work, thanks for that @adrinjalali! Merging.

@LysandreJik LysandreJik merged commit df4c6d9 into huggingface:main Mar 15, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants