Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Transfer the hffs code to hfh #1420

Merged
merged 19 commits into from
Apr 6, 2023
Merged

Transfer the hffs code to hfh #1420

merged 19 commits into from
Apr 6, 2023

Conversation

mariosasko
Copy link
Contributor

@mariosasko mariosasko commented Apr 3, 2023

Transfers the hffs code (and documentation from huggingface/hffs#13) to hfh.

The CI failures seem unrelated.

@HuggingFaceDocBuilderDev
Copy link

HuggingFaceDocBuilderDev commented Apr 3, 2023

The documentation is not available anymore as the PR was closed or merged.

@mariosasko mariosasko requested a review from Wauplin April 3, 2023 16:59
@mariosasko mariosasko marked this pull request as ready for review April 3, 2023 16:59
@mariosasko
Copy link
Contributor Author

Cc @stevhliu for the docs review

Copy link
Member

@stevhliu stevhliu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome, this looks great! 👍

docs/source/guides/overview.mdx Outdated Show resolved Hide resolved
docs/source/guides/filesystem.mdx Outdated Show resolved Hide resolved
docs/source/guides/filesystem.mdx Outdated Show resolved Hide resolved
docs/source/guides/filesystem.mdx Outdated Show resolved Hide resolved
docs/source/guides/filesystem.mdx Outdated Show resolved Hide resolved
docs/source/guides/filesystem.mdx Outdated Show resolved Hide resolved
docs/source/package_reference/hf_filesystem.mdx Outdated Show resolved Hide resolved
docs/source/package_reference/hf_filesystem.mdx Outdated Show resolved Hide resolved
src/huggingface_hub/hf_file_system.py Outdated Show resolved Hide resolved
mariosasko and others added 5 commits April 3, 2023 19:42
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Copy link
Contributor

@Wauplin Wauplin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for working on that @mariosasko! I left some comments to fix but mostly cosmetic. Nothing to change logic-wise. Really looking forward to get this merged and released! 🎉

out = set()
path = [self._strip_protocol(p) for p in path]
for p in path:
if has_magic(p):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Never used this method, what is it doing? (it doesn't seem to be documented in the official docs 😕)

docs/source/_toctree.yml Outdated Show resolved Hide resolved
docs/source/guides/filesystem.mdx Outdated Show resolved Hide resolved
docs/source/package_reference/hf_filesystem.mdx Outdated Show resolved Hide resolved
docs/source/guides/filesystem.mdx Outdated Show resolved Hide resolved
src/huggingface_hub/hf_file_system.py Outdated Show resolved Hide resolved
src/huggingface_hub/hf_file_system.py Outdated Show resolved Hide resolved
src/huggingface_hub/hf_file_system.py Outdated Show resolved Hide resolved
src/huggingface_hub/hf_file_system.py Outdated Show resolved Hide resolved
src/huggingface_hub/hf_file_system.py Outdated Show resolved Hide resolved
@mariosasko mariosasko requested a review from Wauplin April 5, 2023 13:44
@mariosasko
Copy link
Contributor Author

Ready for the final review

Copy link
Contributor

@Wauplin Wauplin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Final approval from me!

I made a small update to rename ResolvedPath to HfFileSystemResolvedPath to make it more explicitly hffs-related. It shouldn't be used by the user anyway but since it's exposed at root-level it's best to be explicit IMO.

@codecov
Copy link

codecov bot commented Apr 6, 2023

Codecov Report

Patch coverage: 90.55% and project coverage change: +29.80 🎉

Comparison is base (7a3e1e8) 54.55% compared to head (4a7ec33) 84.36%.

❗ Current head 4a7ec33 differs from pull request most recent head efa3d92. Consider uploading reports for the commit efa3d92 to get more accurate results

Additional details and impacted files
@@             Coverage Diff             @@
##             main    #1420       +/-   ##
===========================================
+ Coverage   54.55%   84.36%   +29.80%     
===========================================
  Files          48       49        +1     
  Lines        5006     5238      +232     
===========================================
+ Hits         2731     4419     +1688     
+ Misses       2275      819     -1456     
Impacted Files Coverage Δ
src/huggingface_hub/__init__.py 75.75% <ø> (ø)
src/huggingface_hub/hf_api.py 88.90% <ø> (+44.65%) ⬆️
src/huggingface_hub/hf_file_system.py 90.51% <90.51%> (ø)
src/huggingface_hub/utils/__init__.py 100.00% <100.00%> (ø)

... and 32 files with indirect coverage changes

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report in Codecov by Sentry.
📢 Do you have feedback about the report comment? Let us know in this issue.

@lhoestq
Copy link
Member

lhoestq commented Apr 6, 2023

For some reason those are not working:

This one raises FileNotFoundError but it's maybe normal - we expect unquoted revisions as params

>>> fs.ls("datasets/lhoestq/demo1", revision="refs%2Fconvert%2Fparquet")
File ~/.pyenv/versions/3.9.15/envs/hf-datasets/lib/python3.9/site-packages/huggingface_hub/hf_file_system.py:168, in HfFileSystem.resolve_path(self, path, revision)
    166                     raise FileNotFoundError(path) from err
    167             else:
--> 168                 raise FileNotFoundError(path) from err
    169 else:
    170     repo_id = path

FileNotFoundError: lhoestq/demo1

but those one should work IMO:

>>> fs.ls("datasets/lhoestq/demo1", revision="refs/convert/parquet")
File ~/.pyenv/versions/3.9.15/envs/hf-datasets/lib/python3.9/site-packages/huggingface_hub/hf_file_system.py:120, in HfFileSystem.resolve_path.<locals>._align_revision_in_path_with_revision(revision_in_path, revision)
    118 if revision is not None:
    119     if revision_in_path is not None and revision_in_path != revision:
--> 120         raise ValueError(
    121             f'Revision specified in path ("{revision_in_path}") and in `revision` argument ("{revision}")'
    122             " are not the same."
    123         )
    124 else:
    125     revision = revision_in_path

ValueError: Revision specified in path ("refs") and in `revision` argument ("refs/convert/parquet") are not the same.
>>> fs.ls("datasets/lhoestq/demo1@refs%2Fconvert%2Fparquet")
File ~/.pyenv/versions/3.9.15/envs/hf-datasets/lib/python3.9/site-packages/huggingface_hub/hf_file_system.py:120, in HfFileSystem.resolve_path.<locals>._align_revision_in_path_with_revision(revision_in_path, revision)
    118 if revision is not None:
    119     if revision_in_path is not None and revision_in_path != revision:
--> 120         raise ValueError(
    121             f'Revision specified in path ("{revision_in_path}") and in `revision` argument ("{revision}")'
    122             " are not the same."
    123         )
    124 else:
    125     revision = revision_in_path

ValueError: Revision specified in path ("refs") and in `revision` argument ("refs/convert/parquet") are not the same.

while it does exist on the Hub: https://huggingface.co/datasets/lhoestq/demo1/tree/refs%2Fconvert%2Fparquet

@mariosasko
Copy link
Contributor Author

This one raises FileNotFoundError but it's maybe normal - we expect unquoted revisions as params

Yes, we want to be consistent with HfApi

@mariosasko mariosasko merged commit e9dbca6 into main Apr 6, 2023
@mariosasko mariosasko deleted the hffs branch April 6, 2023 15:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants