Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

List branches and tags from a repo #1276

Merged
merged 6 commits into from
Dec 21, 2022
Merged

List branches and tags from a repo #1276

merged 6 commits into from
Dec 21, 2022

Conversation

Wauplin
Copy link
Contributor

@Wauplin Wauplin commented Dec 20, 2022

Resolve #1260 (cc @antoche).

List branches, tags and "converts" from a repo on the Hub. Implemented, tested and documented the list_repo_refs method + RepoRefs data structure.
I did not document extensively what the convert ref is as I don't expect much usage of this in hfh (see docs)

>>> from huggingface_hub import HfApi
>>> api = HfApi()
>>> api.list_repo_refs("gpt2")
RepoRefs(branches=[RefInfo(name='main', ref='refs/heads/main', target_commit='e7da7f221d5bf496a48136c0cd264e630fe9fcc8')], converts=[], tags=[])

>>> api.list_repo_refs("bigcode/the-stack", repo_type='dataset')
RepoRefs(
    branches=[
        RefInfo(name='main', ref='refs/heads/main', target_commit='18edc1591d9ce72aa82f56c4431b3c969b210ae3'),
        RefInfo(name='v1.1.a1', ref='refs/heads/v1.1.a1', target_commit='f9826b862d1567f3822d3d25649b0d6d22ace714')
    ],
    converts=[],
    tags=[
        RefInfo(name='v1.0', ref='refs/tags/v1.0', target_commit='c37a8cd1e382064d8aced5e05543c5f7753834da')
    ]
)

@HuggingFaceDocBuilderDev
Copy link

HuggingFaceDocBuilderDev commented Dec 20, 2022

The documentation is not available anymore as the PR was closed or merged.

Copy link
Member

@LysandreJik LysandreJik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very clean PR, love it!

@@ -2449,6 +2449,41 @@ def test_request_space_hardware(self, post_mock: Mock) -> None:
)


class ListRepoRefsTest(unittest.TestCase):
@classmethod
@with_production_testing
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Smart to only define it on the setUpClass for it to act on all subsequent methods. I wonder if it wouldn't be cleaner to put it on the class itself so that it's very clear that it should affect the entire class?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually I did that as a workaround because

@with_production_testing
class ListRepoRefsTest(unittest.TestCase):
    @classmethod
    def setUpClass(cls):

was not working for me 😢
But I have to admit my debug time on that was approximately 30s, just the time to try some possibilities and take the first one that worked 🙄

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Haha in that case we can also keep it there, it was just a nit, to be fair.

Copy link
Member

@julien-c julien-c left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just left a small nit on naming

src/huggingface_hub/__init__.py Outdated Show resolved Hide resolved
target_commit: str

@classmethod
def from_dict(cls, data: Dict) -> "RefInfo":
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you're not just overriding the init from dataclass?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(dont remember the best practice in Python dataclasses)

Copy link
Contributor Author

@Wauplin Wauplin Dec 20, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep why not. Best practice depends on your use case I guess.

Here the GitRefs will never be initialized by someone else than use from server-side data so it makes sense to put it in the init. It's not good practice when you force the users to have your data format as input instead of letting the default dataclass constructor. (at least IMO)

@Wauplin Wauplin merged commit 3c422cf into main Dec 21, 2022
@Wauplin Wauplin deleted the 1260-list-branches branch December 21, 2022 10:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

List available branches for a given model/dataset
5 participants