Implement list_files_info + tests #1435

Merged
Wauplin merged 8 commits into main from feat-get-paths-infos
Apr 18, 2023

Conversation

Wauplin
Contributor

@Wauplin Wauplin commented Apr 12, 2023

This PR adds list_files_info, a wrapper around two endpoints:

  1. /paths-info to get info about a list of paths
  2. /tree to recursively get info about files in a folder. It is paginated so we return a Python generator instead of a list.
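
Why a paginated endpoint is better exposed as a generator can be shown with a minimal sketch. It does not reproduce the huggingface_hub implementation: `iter_paginated`, `fetch_page`, the cursor scheme, and the fake in-memory pages are all hypothetical names and data chosen for illustration only.

```python
from typing import Callable, Dict, Iterator, List, Optional, Tuple

# Hypothetical page type: (items on this page, cursor of the next page or None).
Page = Tuple[List[Dict], Optional[str]]


def iter_paginated(fetch_page: Callable[[Optional[str]], Page]) -> Iterator[Dict]:
    """Yield items one by one, requesting the next page only when needed."""
    cursor: Optional[str] = None
    while True:
        items, cursor = fetch_page(cursor)
        yield from items
        if cursor is None:  # no further page advertised
            return


# Fake in-memory "server" standing in for a paginated endpoint (assumption).
_FAKE_PAGES: Dict[Optional[str], Page] = {
    None: ([{"path": "README.md"}, {"path": "config.json"}], "page-2"),
    "page-2": ([{"path": "vae/model.bin"}], None),
}
calls: List[Optional[str]] = []


def fake_fetch(cursor: Optional[str]) -> Page:
    calls.append(cursor)  # record each simulated request
    return _FAKE_PAGES[cursor]


files = iter_paginated(fake_fetch)
first = next(files)                # triggers only the first request
rest = [f["path"] for f in files]  # consumes the remaining pages
```

With this shape, a caller who only needs the first few entries never pays for the later requests, which a list-returning function could not offer.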

With list_files_info, we should be able to implement a CommitOperationCopy-like operation, as discussed in #1083 and https://github.com/huggingface/moon-landing/issues/4370 (internal link). For more implementation details, see the related discussion on Slack (internal) as well as this comment.

TODO:

Examples:

  1. Get information about files on a repo:
list(list_files_info("lysandre/arxiv-nlp", ["README.md", "config.json"]))
# [RepoFile: {
# {'blob_id': '43bd404b159de6fba7c2f4d3264347668d43af25',
# 'lfs': None,
# 'rfilename': 'README.md',
# 'size': 391,
# 'type': 'file'}
# }, RepoFile: {
# {'blob_id': '2f9618c3a19b9a61add74f70bfb121335aeef666',
# 'lfs': None,
# 'rfilename': 'config.json',
# 'size': 554,
# 'type': 'file'}
# }]
  2. List LFS files from the "vae/" folder in "stabilityai/stable-diffusion-2":
>>> [info.rfilename for info in list_files_info("stabilityai/stable-diffusion-2", "vae") if info.lfs is not None]
['vae/diffusion_pytorch_model.bin', 'vae/diffusion_pytorch_model.safetensors']
  3. List all files:
>>> [info.rfilename for info in list_files_info("glue", repo_type="dataset")]
['.gitattributes', 'README.md', 'dataset_infos.json', 'glue.py']
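
The filtering pattern used in the examples above can be tried without network access. The sketch below is only an illustration: `FileInfo` is a hypothetical stand-in that mimics the `rfilename`, `size`, and `lfs` fields shown in the output above, and the listed entries (sizes, LFS payloads) are made up.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List, Optional


@dataclass
class FileInfo:
    """Hypothetical stand-in exposing the fields shown in the output above."""

    rfilename: str
    size: int
    lfs: Optional[dict] = None


def lfs_filenames(infos: Iterable[FileInfo]) -> List[str]:
    """Return the names of entries that carry LFS metadata."""
    return [info.rfilename for info in infos if info.lfs is not None]


def fake_listing() -> Iterator[FileInfo]:
    # Made-up entries; sizes and LFS payloads are illustrative only.
    yield FileInfo("vae/config.json", 611)
    yield FileInfo("vae/diffusion_pytorch_model.bin", 1000, lfs={"size": 1000})
    yield FileInfo("vae/diffusion_pytorch_model.safetensors", 1000, lfs={"size": 1000})


lfs_files = lfs_filenames(fake_listing())
```

Because the helper accepts any iterable, it works the same way on a generator as on a list, so the listing is consumed lazily.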

@HuggingFaceDocBuilderDev

HuggingFaceDocBuilderDev commented Apr 12, 2023

The documentation is not available anymore as the PR was closed or merged.

@codecov

codecov bot commented Apr 13, 2023

Codecov Report

Patch coverage: 97.36% and project coverage change: +0.10 🎉

Comparison is base (25e20ef) 84.34% compared to head (0ff64da) 84.44%.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1435      +/-   ##
==========================================
+ Coverage   84.34%   84.44%   +0.10%     
==========================================
  Files          52       52              
  Lines        5430     5465      +35     
==========================================
+ Hits         4580     4615      +35     
  Misses        850      850              
Impacted Files                          Coverage Δ
src/huggingface_hub/__init__.py         75.75% <ø> (ø)
src/huggingface_hub/hf_api.py           89.28% <97.29%> (+0.38%) ⬆️
src/huggingface_hub/hf_file_system.py   90.55% <100.00%> (ø)


@Wauplin Wauplin added this to the in next release? milestone Apr 17, 2023
@Wauplin Wauplin changed the title from "Implement get_files_info + tests" to "Implement list_files_info + tests" Apr 17, 2023
Member

@LysandreJik LysandreJik left a comment


Extensive testing suite! Thanks for your work @Wauplin

@Wauplin Wauplin merged commit a433410 into main Apr 18, 2023
@Wauplin Wauplin deleted the feat-get-paths-infos branch April 18, 2023 19:10
3 participants