Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refacto Repository tests #1277

Merged
merged 16 commits into from
Jan 12, 2023
Merged

Refacto Repository tests #1277

merged 16 commits into from
Jan 12, 2023

Conversation

Wauplin
Copy link
Contributor

@Wauplin Wauplin commented Dec 21, 2022

TL;DR:

  1. Sorry for huge diff 😢
  2. Repository tests work on Windows (related to Add windows machine for CI #1112)
  3. Repository tests run in parallel
  4. Repository tests use /tmp dir
  5. Less redundancy

First of all, sorry for the huge diff in the PR 😢 .
I started it as a preliminary work for #1112 (Windows CI):

  1. I started to refacto how paths are generated => no more "directory/whatever_file.txt" as it breaks on Windows
  2. While doing that, I removed the WORKING_DIR_FIXTURE to replace it by a proper /tmp folder
    => Avoid to have to clean it after tests (when tests were failing, I often ended up with untracked folders in my huggingface_hub folder that I had to delete)
  3. Then I read more the tests and realized we were doing a lot of redundant work (creating 1 new repo on the Hub for each test even though we do not modify it)
    1. I've split repository tests between TestRepositoryShared and TestRepositoryUniqueRepos. In shared test, a single repo is created on the Hub and cloned multiple times. No files are pushed to this repo during the tests.
      => Save quite some time of setup/teardown. In Unique test, a repo is created per test (same as before) so that they are independent from each other.
    2. I also added attributes like self.repo_id, self.repo_path and self.repo_url and an helper self.clone_from(...) to avoid reduncancy (e.g. recomputing repo_id=f"{USER}/{self.REPO_NAME}", local_dir="{WORKING_DIR}/{REPO_NAME}" all the time).
  4. And finally I removed some tests that were duplicated (*).

=> Now we have a Repository test suite that should be "iso" compared to before but running in parallel (~1min instead of 8) and on Windows.

(*) For example in dataset tests, we are testing that cloning from a different repo type works. But also testing that we can commit, pull, push,... which is quite redundant because what we really want is to check if a repo can be cloned with a repo_type. All subsequent actions are already tested separately ("test only 1 feature at a time").

(EDIT: "non repository" tests are failing but it's not due to this PR 😕)

@HuggingFaceDocBuilderDev
Copy link

HuggingFaceDocBuilderDev commented Dec 21, 2022

The documentation is not available anymore as the PR was closed or merged.

@Wauplin Wauplin marked this pull request as ready for review December 22, 2022 11:12
@Wauplin Wauplin changed the title [WIP] Refacto repository tests Refacto Repository tests Dec 22, 2022
@huggingface huggingface deleted a comment from codecov bot Dec 22, 2022
Copy link
Member

@LysandreJik LysandreJik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Much cleaner this way! Thanks for the refactor @Wauplin.

@Wauplin
Copy link
Contributor Author

Wauplin commented Jan 12, 2023

Thanks @LysandreJik for the review and sorry again for the big diff to review. I'll fix conflicts, wait for the checks and merge it :)

@Wauplin Wauplin merged commit 3179d1c into main Jan 12, 2023
@Wauplin Wauplin deleted the refacto-offline-repository-tests branch January 12, 2023 12:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants