-
Notifications
You must be signed in to change notification settings - Fork 581
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Check if repo or file exists #1591
Check if repo or file exists #1591
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for all of your work @martinbrose ! I've carefully reviewed you PR and added some comments to it. Most of them are cosmetic changes to format the docstring as expected.
I have also added a suggestion for to tests the new feature on private repos.
Finally, you also need to add repo_exists
and file_exists
as root methods in huggingface_hub
. All main methods are accessible either with a custom HfApi
client or at root level. To do so, you need to:
- define the root methods here
- add them to this list in
__init__.py
- run
make style
(which will runpython utils/check_static_imports.py --update
) - commit your changes
Please let me know if you have any questions! 🤗
tests/test_hf_api.py
Outdated
@retry_endpoint | ||
def test_repo_exists(self): | ||
self.assertTrue(self._api.repo_exists("gpt2")) | ||
self.assertFalse(self._api.repo_exists("repo-that-does-not-exist")) # missing repo | ||
|
||
@retry_endpoint | ||
def test_file_exists(self): | ||
self.assertTrue(self._api.file_exists("config.json", "gpt2")) | ||
self.assertFalse(self._api.file_exists("config.json", "repo-that-does-not-exist")) # missing repo | ||
self.assertFalse(self._api.file_exists("file-does-not-exist", "gpt2")) # missing file | ||
self.assertFalse( | ||
self._api.file_exists("config.json", "gpt2", revision="revision-that-does-not-exist") | ||
) # missing revision | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since you've asked, here is a suggestion to test private repos. I would create a new class HfApiRepoFileExistsTest
to test your two new methods so that tests are self-contained. In the setup of the tests, you create a private repo + upload a dumb file to it. Then you can try repo/file existence in various cases. Finally, you delete the repo in the tearDown
method. All of this runs on the staging environment.
class HfApiRepoFileExistsTest(HfApiCommonTest):
def setUp(self) -> None:
super().setUp()
self.repo_id = self._api.create_repo(repo_name(), private=True).repo_id
self._api.upload_file(repo_id=self.repo_id, path_in_repo="file.txt", path_or_fileobj=b"content")
def tearDown(self) -> None:
self._api.delete_repo(self.repo_id)
return super().tearDown()
@retry_endpoint
def test_repo_exists(self):
self.assertTrue(self._api.repo_exists(self.repo_id))
self.assertTrue(self._api.repo_exists(self.repo_id, token=False)) # private repo
self.assertFalse(self._api.repo_exists("repo-that-does-not-exist")) # missing repo
@retry_endpoint
def test_file_exists(self):
self.assertTrue(self._api.file_exists(self.repo_id, "file.txt"))
self.assertFalse(self._api.file_exists("repo-that-does-not-exist", "file.txt")) # missing repo
self.assertFalse(self._api.file_exists(self.repo_id, "file-does-not-exist")) # missing file
self.assertFalse(
self._api.file_exists(self.repo_id, "file.txt", revision="revision-that-does-not-exist")
) # missing revision
self.assertFalse(self._api.file_exists(self.repo_id, "file.txt", token=False)) # private repo
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this in addition to the test I have added to class HfApiEndpointsTest(HfApiCommonTest)
?
Currently, some of the tests in HfApiRepoFileExistsTest(HfApiCommonTest)
are failing...
I believe I don't understand all the above-proposed tests completely yet.
I need to read up on what behavior private repos should have and get a fresh look at this in a few days.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These tests would be a replacement to the tests you've written. I haven't changed much of the logic compared to what you did except:
- move them to a standalone class. It's better than extending existing classes. It allows to do
pytest tests/test_hf_api.py -k HfApiRepoFileExistsTest
to run only those ones. - use
HfApiCommonTest
=> used to setup aHfApi
client pointing to the staging environment. The user token is already set for this one (TOKEN
). It is a copy of the real Hub that we use only for tests. You can see it as a playground in which even if you do something bad, it's not a problem. - add a
setUp
method: this one is run before each test. It creates a new repo on the staging Hub + upload a first file to it. - add a
tearDown
method: this one is run after each test. It deletes the temporary repo we've created. - finally, redefine the tests you've written. I have added 2 cases to check that
file_exists
andrepo_exists
are returning false when ran on a private repo without the accurate token.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One thing I'd like to clarify. In the tests you propose for test_repo_exists
, you propose:
self.assertTrue(self._api.repo_exists(self.repo_id, token=False)) # private repo
Maybe I'm confusing something here, but shouldn't that be a self.assertFalse
?
Because we are providing a false token, so we shouldn't see the private repo?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe I'm confusing something here, but shouldn't that be a self.assertFalse?
Yes totally! My bad 😁
Many thanks for the detailed review points! I'll attend to them in the next few days. |
The documentation is not available anymore as the PR was closed or merged. |
Hi @Wauplin, I believe I addressed all of your review points. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me! Well done for the tests, those are really important! 👍 🎉
Addresses #36.