Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ENH Deprecate clone_from behavior #952

Merged
merged 29 commits into from
Aug 30, 2022

Conversation

merveenoyan
Copy link
Contributor

Solves #905.

@merveenoyan merveenoyan requested a review from osanseviero July 21, 2022 16:37
Copy link
Contributor

@osanseviero osanseviero left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice! Thanks for this. Could you also add a small test that checks that a warning is raised please?

Here is an example:

with pytest.warns(
FutureWarning,
match="`task_name` input argument is deprecated. Pass `tags` instead.",
):

@HuggingFaceDocBuilderDev
Copy link

HuggingFaceDocBuilderDev commented Jul 21, 2022

The documentation is not available anymore as the PR was closed or merged.

@merveenoyan
Copy link
Contributor Author

merveenoyan commented Jul 21, 2022

plot twist: this behavior was used mainly for tests? ✨ so many tests depend on clone_from creating repositories ✨
I took a look at the tests and they have setUp (e.g. test_clone_with_repo_name_and_org), which creates a repository remotely so I don't understand why it tries to create a repository with clone_from and raises the warning (which is only raised when you get HTTPError for not having the repository remotely). 🦆 I'll take a look tomorrow.

later edit: some setUps do not create repository remotely that's why they raise HTTPErrors. (e.g. ones in test_repocard.py and snapshot download tests)

@merveenoyan
Copy link
Contributor Author

merveenoyan commented Jul 21, 2022

I added creation of repositories to setups and managed to reduce failing tests to 3, will look for those on why they fail.
later edit: after next commit the test I've written for this has passed though. (needed to skip repo creation during setup.)

Copy link
Member

@LysandreJik LysandreJik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me! We should be proactive about reaching out to repositories that have the Repository set to create the repository when it doesn't exist. I think transformers does that under the hood, I'll see to change that there. Do we know any other libraries that leverage Repository to create their repositories on the hub?

@merveenoyan
Copy link
Contributor Author

maybe @osanseviero knows @LysandreJik 🙂

Copy link
Contributor

@osanseviero osanseviero left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot for this PR! I think we can merge after two things:

  • I think we should revert the changes in the tests as I would not expect changes in existing tests since we're only adding a warning. That way we're sure nothing changes :)
  • Once the above is reverted, let's get to green and merge 🔥

@@ -655,6 +656,12 @@ def clone_from(self, repo_url: str, use_auth_token: Union[bool, str, None] = Non
" on Hugging Face Hub."
)
else:
warnings.warn(
"Creating a repository through clone_from is deprecated"
"will be removed in v0.10.",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe let's change this to v0.11., as 0.9 is already the next release, that way we have 2 releases before deprecating

@@ -204,9 +204,12 @@ def setUpClass(cls):
def setUp(self) -> None:
self.repo_path = Path(tempfile.mkdtemp())
self.REPO_NAME = repo_name()
url = self._api.create_repo(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When adding some changes, if you have to change tests, that's a red flag to watch out! As you're just adding a warning, nothing should change in any existing tests. It's ok if current tests depend on clone_from, I would revert the changes to keep them as they were as we're not changing any behaviour now. We should change the text when actually deprecating this. WDYT?

@@ -140,6 +143,19 @@ def test_clone_from_space(self):
use_auth_token=self._token,
)

def test_clone_from_deprecation_warning(self):
"""SKIP CREATION"""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would remove this docstring as the name of the test is already great 🔥

WORKING_REPO_DIR,
clone_from=f"{USER}/{uuid.uuid4()}",
use_auth_token=self._token,
repo_type="dataset",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: feel free to remove repo_type. The simpler the test the easier to understand, so anything not strictly needed I would remove

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did it because there was two of those tests, one was for model repository and one was for dataset repository.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that given the behaviour you're changing is ok to just cover one case

@osanseviero
Copy link
Contributor

osanseviero commented Jul 23, 2022

@LysandreJik @merveenoyan quick recap of 3rd party libraries with push_to_hub that I'm aware of:

🟡 Adapter Transformers (note they pin to be below 0.8) cc @calpt, pointer is at https://github.com/adapter-hub/adapter-transformers/blob/master/src/transformers/utils/hub.py#L1014. So this won't break as they pin the version, but we should update this nevertheless to either create the repo before or use the non-git-based methods
🟢 ML Agents: they create the repo before cloning it
🟢 Stable Baselines 3 and RL Zoo: use non-git based methods now! (cc @simoninithomas for visibility)
🟢 Sentence Transformers creates the repo before
🟢 SpaCy uses non-git based methods now! 🥳

@osanseviero
Copy link
Contributor

Hey @merveenoyan, we will likely do a release next week. There were some very minor changes to be done in the PR before merging, would be great if we could merge the PR some time early *next week.

@merveenoyan
Copy link
Contributor Author

@osanseviero I created a new branch from 1bec939 I'm testing atm. Once it's done I'll merge to this branch and ask you to review.

@Wauplin
Copy link
Contributor

Wauplin commented Aug 17, 2022

@merveenoyan I'm finally getting back to you !

PR has been merged and you can now use this decorator to deprecate arguments and this one to test it. Hope it will be easy from now on and sorry for the delay ! 😬

@merveenoyan
Copy link
Contributor Author

@Wauplin I feel like it's best if I just close this PR and start from scratch instead of rewinding and rebasing things 😅

@Wauplin
Copy link
Contributor

Wauplin commented Aug 29, 2022

Yes indeed, sorry for the mess

@merveenoyan
Copy link
Contributor Author

merveenoyan commented Aug 29, 2022

@Wauplin in this case, clone_from as an argument is not really deprecated, what is deprecated inside is using clone_from to create a new repo from scratch (you should still be able to clone a repo using clone_from argument) in that case, I shouldn’t be decorating the function taking clone_from, but only decorating the tests right?
(however, test decoration goes hand in hand with deprecation so this is not really useful)

so for this one, is it the best if I manually edit tests? or would decorating tests only work?

@merveenoyan merveenoyan reopened this Aug 29, 2022
@Wauplin
Copy link
Contributor

Wauplin commented Aug 29, 2022

@merveenoyan Indeed we don't have a decorator for deprecating a specific usage of a valid argument (I don't even know how we could do such a decorator 😁 ).

However you can still use the decorator in the tests. Cherry-picking this commit should solve your problem merveenoyan@19e82fc.

@merveenoyan
Copy link
Contributor Author

@Wauplin as said by Simon, -W error::FutureWarning turn matching warnings into exceptions so the future warning is not raised with pytest.warns as used by the decorator you've written thus the tests fail. (we need z for errors instead?)
see

def _inner_decorator(test_function: Callable) -> Callable:
        @wraps(test_function)
        def _inner_test_function(*args, **kwargs):
            with pytest.warns(FutureWarning, match=f".*'{function_name}'.*"):
                return test_function(*args, **kwargs)

        return _inner_test_function

I will have to edit the tests myself or we can change the pytest.warns or how the tests are executed. What do you say?

@Wauplin
Copy link
Contributor

Wauplin commented Aug 29, 2022

In theory pytest.warns is catching during the execution of the test. The -W error::FutureWarning flag turns the warning into an error only at the end when all tests are done and pytest gather the results.

I think if there is still an issue it less in a typo or something 😕 I'm having a quick look at it.

tests/test_repository.py Outdated Show resolved Hide resolved
tests/test_repository.py Outdated Show resolved Hide resolved
@Wauplin
Copy link
Contributor

Wauplin commented Aug 29, 2022

Seems like:

  • test_clone_with_endpoint is not decorated.
  • test_clone_with_repo_name_and_user_namespace: there are 2 of them. They should not be decorated but one is.
  • test_clone_with_repo_name_org_and_no_auth_token : there are 2 of them. They should be decorated but one is not.

I can't really make the changes directly because it's on a fork 😕

@codecov
Copy link

codecov bot commented Aug 29, 2022

Codecov Report

Merging #952 (801c560) into main (128d114) will increase coverage by 21.17%.
The diff coverage is 100.00%.

@@             Coverage Diff             @@
##             main     #952       +/-   ##
===========================================
+ Coverage   60.27%   81.44%   +21.17%     
===========================================
  Files          31       31               
  Lines        3421     3423        +2     
===========================================
+ Hits         2062     2788      +726     
+ Misses       1359      635      -724     
Impacted Files Coverage Δ
src/huggingface_hub/repository.py 80.00% <100.00%> (+0.58%) ⬆️
src/huggingface_hub/__init__.py 78.37% <0.00%> (+2.70%) ⬆️
src/huggingface_hub/_commit_api.py 92.24% <0.00%> (+3.44%) ⬆️
src/huggingface_hub/utils/_deprecation.py 100.00% <0.00%> (+4.34%) ⬆️
src/huggingface_hub/commands/user.py 33.33% <0.00%> (+9.52%) ⬆️
src/huggingface_hub/community.py 91.66% <0.00%> (+16.66%) ⬆️
src/huggingface_hub/lfs.py 74.52% <0.00%> (+17.83%) ⬆️
src/huggingface_hub/file_download.py 85.35% <0.00%> (+22.37%) ⬆️
src/huggingface_hub/utils/_errors.py 94.36% <0.00%> (+26.76%) ⬆️
... and 9 more

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

@merveenoyan merveenoyan requested a review from Wauplin August 29, 2022 18:32
@Wauplin Wauplin merged commit 7b57719 into huggingface:main Aug 30, 2022
@osanseviero
Copy link
Contributor

🥳

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants