
[RFC] Contrib test suite + tests for timm and sentence_transformers #1200

Merged — 27 commits merged into main on Nov 28, 2022

Conversation

Wauplin
Contributor

@Wauplin Wauplin commented Nov 18, 2022

First discussed in #1190. The goal is to proactively detect breaking changes and deprecation warnings in downstream libraries. This is a very first implementation, with tests for the timm and sentence_transformers libraries to validate the concept. For a start, I think we can have a GitHub workflow triggered only manually or on ci_contrib_* branches. It doesn't really make sense to run it on each PR or on main.

How does it work?

The contrib folder contains simple end-to-end scripts to test the integration of huggingface_hub in downstream libraries. The main goal is to proactively notice breaking changes and deprecation warnings. Each library is tested in its own virtualenv (with its own dependencies). Here is the workflow for the timm library:

  • Create a virtualenv under ./contrib/timm/.venv
  • Install the lib requirements from ./contrib/timm/requirements.txt (installs timm from the main branch; configurable for each lib)
  • Uninstall huggingface_hub (if any)
  • Install huggingface_hub from source to test against the latest version
  • Run the timm tests: pytest ./contrib/timm
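The steps above can be sketched as a small Python helper. This is illustrative only — the actual project drives these steps from the Makefile, not from Python; the paths are taken from the list above:

```python
# Sketch of the per-library setup described above (illustrative only — the
# real project drives these steps from the Makefile, not from Python).
import subprocess
import sys
from pathlib import Path

def contrib_setup_commands(lib: str, contrib_root: Path = Path("contrib")) -> list[list[str]]:
    """Build the command sequence for one contrib library."""
    venv = contrib_root / lib / ".venv"
    pip = venv / "bin" / "pip"
    return [
        [sys.executable, "-m", "venv", str(venv)],          # 1. create the virtualenv
        [str(pip), "install", "-r", str(contrib_root / lib / "requirements.txt")],  # 2. lib deps
        [str(pip), "uninstall", "-y", "huggingface_hub"],   # 3. drop any pinned hfh
        [str(pip), "install", "-e", "."],                   # 4. hfh from source
        [str(venv / "bin" / "pytest"), str(contrib_root / lib)],  # 5. run the lib's tests
    ]

def run_contrib_setup(lib: str) -> None:
    """Execute the steps sequentially, stopping on the first failure."""
    for cmd in contrib_setup_commands(lib):
        subprocess.run(cmd, check=True)
```

Running each lib in its own virtualenv is what allows conflicting dependency pins across contrib libraries.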

See #1200 (comment) for more details.

How to add a new library?

To add another contrib lib, one must:

  1. Create a subfolder with the lib name. Example: ./contrib/transformers
  2. Create a requirements.txt file specific to this lib. Example: ./contrib/transformers/requirements.txt
  3. Implement tests for this lib. Example: ./contrib/transformers/test_push_to_hub.py
  4. Run make style to automatically update the makefile and .github/workflows/contrib-tests.yml with the new lib.
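To illustrate step 3, a minimal test file for a new lib might look like the sketch below. This is purely hypothetical: transformers is not part of this PR, the downstream API call is only shown as a comment, and the helper name is a placeholder rather than part of the real suite:

```python
# Hypothetical ./contrib/transformers/test_push_to_hub.py — illustrative only.
# pytest collects any test_* function found under ./contrib/<lib>.

def make_repo_name(run_id: str) -> str:
    # The real contrib tests use a fixture (repo_name) that yields a unique
    # repo per run, plus a cleanup_repo fixture that deletes it afterwards.
    return f"contrib-test-transformers-{run_id}"

def test_push_to_hub() -> None:
    repo_name = make_repo_name("local")
    # The real test body would exercise the downstream library here, e.g.:
    #   model.push_to_hub(repo_name)   # hypothetical downstream API
    # A breaking change or DeprecationWarning in huggingface_hub would then
    # surface as a failure in this lib's isolated virtualenv.
    assert repo_name.startswith("contrib-test-transformers-")
```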

Run contrib tests in CI

Contrib tests can be manually triggered in GitHub with the Contrib tests workflow. CI is also triggered on branches starting with ci_contrib_*.

Tests are not run in the default test suite (on each PR) as this would slow down the development process. The goal is to notice breaking changes, not to prevent them. In particular, it is useful to trigger the workflow before a release to make sure it will not cause too much friction.

Run contrib tests locally

Tests are separated to avoid conflicts between version dependencies. Before running tests, a virtual env must be set up for each contrib library. To do so, run:

# Run setup in parallel to save time 
make contrib_setup -j4

# Run tests
# Optional: -j4 to run in parallel. Output will be messy in that case.
make contrib_test -j4

# Run only "timm" tests
make contrib_setup_timm
make contrib_test_timm

See #1200 (comment) for more details.

Todo:

  • separated virtual envs
  • working CI
  • working makefile with multiple contrib
  • README
  • from_pretrained test for timm

@HuggingFaceDocBuilderDev

HuggingFaceDocBuilderDev commented Nov 18, 2022

The documentation is not available anymore as the PR was closed or merged.

Contributor

@osanseviero osanseviero left a comment

Very cool! I left some minor questions, but overall this looks to be going in the right direction! 🔥

Resolved review threads: contrib/requirements.txt, .github/workflows/contrib-tests.yml, contrib/test_timm.py
@Wauplin
Contributor Author

Wauplin commented Nov 21, 2022

@osanseviero thanks for your feedback, it helped a lot!
I made some changes to the contrib structure. There is now a common requirements.txt, and for each lib a folder with a test file and another requirements.txt. The workflow now launches one process/job per dependent library:

  • create virtualenv for timm
  • install common requirements
  • install timm requirements
  • uninstall huggingface_hub (if any)
  • install huggingface_hub from source code
  • run timm tests

This way we don't have to handle conflicts between different dependencies/versions. It makes local testing a bit more complex as it requires one env per contrib lib. Hopefully not many people will actually need to run these locally. What I plan to do is add a script (either separate or in the makefile) to handle the venvs. WDYT in general?

@Wauplin
Contributor Author

Wauplin commented Nov 22, 2022

Refactored the makefile a bit. It is now possible to set up and run all tests locally by running:

# Setup all virtualenvs
make contrib_setup

# Run all tests
make contrib_tests

# Setup and run all tests at once
make contrib

# Delete all virtual envs (if corrupted)
make contrib_clear

And for a specific lib:

# Setup timm tests
make contrib_setup_timm

# Run timm tests
make contrib_test_timm

# Setup and run timm tests at once
make contrib_timm

# Delete timm virtualenv
make contrib_clear_timm

Contributor

@osanseviero osanseviero left a comment

Looking neat! I want to make a second pass through this PR, and let's also see if others have any thoughts.

Resolved review thread: contrib/README.md
Co-authored-by: Omar Sanseviero <osanseviero@gmail.com>
@Wauplin Wauplin changed the title [RFC] First draft for a contrib test suite + test for timm contrib [RFC] Contrib test suite + test for timm and sentence_transformers Nov 23, 2022
@Wauplin Wauplin changed the title [RFC] Contrib test suite + test for timm and sentence_transformers [RFC] Contrib test suite + tests for timm and sentence_transformers Nov 23, 2022
Resolved review thread: .github/workflows/contrib-tests.yml
workflow_dispatch:
push:
branches:
- ci_contrib_*
Contributor

should we run this for main as well?

Contributor Author

@Wauplin Wauplin Nov 25, 2022

I don't think so. If we want to trigger it manually from main, that's possible, but if we run it all the time, the main branch could end up in a ❌ status, as I expect contrib tests to fail (code can break in a downstream library without any change on our side).

Resolved review threads: .github/workflows/contrib-tests.yml, contrib/timm/test_timm.py


@contextlib.contextmanager
def production_endpoint() -> Generator:
Contributor

Is this really needed? With HF_ENDPOINT you could change the endpoint you're using

Contributor Author

Is this really needed?

I'd say yes — it's essentially the same as what already exists in the hfh tests/. The problem with HF_ENDPOINT is that it is evaluated only once, at startup. What I want here is to make all calls go to the staging environment (especially pushing to repos) except a few calls that have to go to the production environment (especially loading models).
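The pattern being described can be sketched as a small context manager. This is a simplified assumption of how `production_endpoint()` might work — the endpoint URLs and the mutable-module-state approach here are illustrative, not the actual hfh implementation:

```python
# A simplified sketch of the production_endpoint() idea (assumed URLs and a
# module-level variable stand in for the actual implementation).
import contextlib
from typing import Iterator

STAGING_ENDPOINT = "https://hub-ci.huggingface.co"   # assumed staging URL
PRODUCTION_ENDPOINT = "https://huggingface.co"

# Unlike HF_ENDPOINT (read once at import time), this value is read on every
# call, so it can be swapped in the middle of a test.
_current_endpoint = STAGING_ENDPOINT

def get_endpoint() -> str:
    """What test helpers would consult before each HTTP call."""
    return _current_endpoint

@contextlib.contextmanager
def production_endpoint() -> Iterator[None]:
    """Temporarily route calls to production (e.g. loading a model), then
    fall back to staging (e.g. pushing to test repos)."""
    global _current_endpoint
    previous = _current_endpoint
    _current_endpoint = PRODUCTION_ENDPOINT
    try:
        yield
    finally:
        _current_endpoint = previous
```

The try/finally guarantees the staging endpoint is restored even if the wrapped test code raises.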

Another solution could be to upload test models to staging, but then we wouldn't notice if a model changed in production.

@Wauplin
Contributor Author

Wauplin commented Nov 25, 2022

Thanks for the review @osanseviero. I made some changes and addressed all of your comments.
Let's have a final review from @LysandreJik once he is back from holidays and then merge the PR.

Member

@LysandreJik LysandreJik left a comment

Looks great! I left a few comments

Comment on lines +15 to +18
contrib: [
"sentence_transformers",
"timm",
]
Member

Great idea to use a matrix here!

Member

Should this matrix be generated from an ls first, so that we have individual jobs for each folder under contrib without the need to specify each folder here? (nitpick though, this should (or shouldn't) be done in a follow-up PR)

Contributor Author

Added in f20ab77 and 6cb96e7 a script that lists contrib tests and updates the Makefile and GitHub workflow file accordingly. The script is integrated with make quality and make style to make it easy for contributors.
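Such a generator script might enumerate the contrib folders roughly like this (a sketch only — the actual script added in f20ab77/6cb96e7 may differ):

```python
# Sketch of a contrib lister (the actual script from the PR may differ).
from pathlib import Path

def list_contrib_libs(contrib_dir: Path = Path("contrib")) -> list[str]:
    """Every subfolder of ./contrib holding a requirements.txt is treated as a
    contrib lib (this skips shared helpers with no requirements of their own)."""
    return sorted(
        p.name
        for p in contrib_dir.iterdir()
        if p.is_dir() and (p / "requirements.txt").is_file()
    )

if __name__ == "__main__":
    # `make style` would inject this list into the Makefile targets and the
    # workflow's matrix.contrib entry.
    print(list_contrib_libs())
```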

Comment on lines 11 to 12
4. Edit `makefile` to add the lib to `CONTRIB_LIBS` variable. Example: `CONTRIB_LIBS := timm transformers`
5. Edit `.github/workflows/contrib-tests.yml` to add the lib to `matrix.contrib` list. Example: `contrib: ["timm", "transformers"]`
Member

I'd eventually look into automating these two so that it's slightly less error-prone, but as said above: nitpick

Contributor Author

Now done by make style and make quality. Good call to reduce contribution efforts.


Contrib tests can be [manually triggered in GitHub](https://github.com/huggingface/huggingface_hub/actions) with the `Contrib tests` workflow.

Tests are not run in the default test suite (for each PR) as this would slow down development process. The goal is to notice breaking changes, not to avoid them. In particular, it is interesting to trigger it before a release to make sure it will not cause too much friction.
Member

We could also run them once a week just to check

Contributor Author

Cron job added in d5949fa. It will run every week on Saturday at midnight.
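For reference, a weekly Saturday-midnight schedule in a GitHub Actions workflow looks like this (a sketch consistent with the description above; the exact committed YAML may differ):

```yaml
on:
  workflow_dispatch:
  push:
    branches:
      - ci_contrib_*
  schedule:
    - cron: "0 0 * * 6"  # every Saturday at 00:00 UTC
```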

Comment on lines 1 to 2
pytest
pytest-env
Member

imo we could just require the testing extra here instead of adding another requirements.txt file.

Contributor Author

Good call. Made the change.

Comment on lines +26 to +34
@pytest.mark.xfail(
reason=(
"Production endpoint is hardcoded in sentence_transformers when pushing to Hub."
)
)
def test_push_to_hub(
multi_qa_model: SentenceTransformer, repo_name: str, cleanup_repo: None
) -> None:
multi_qa_model.save_to_hub(repo_name)
Member

Would be nice to eventually have a test that doesn't fail, to ensure that save to hub actually works :) We could have a specific org for that, like skops does, but I understand it's a bit complex to set up + very annoying to have testing artifacts on the actual Hub.

Contributor Author

Would be nice to eventually have a test that doesn't fail to ensure that save to hub actually works :)

Yes, completely agree on that. I'd like to do that later. I am about to open an issue/PR on the sentence_transformers side to test that properly.
Worst-case scenario, I have set a reminder for myself in 10 days.

Comment on lines +18 to +20
def test_push_to_hub(repo_name: str, cleanup_repo: None) -> None:
model = timm.create_model("resnet18")
timm.models.hub.push_to_hf_hub(model, repo_name)
Member

Should we also test that the model pushed is according to what we expect? For example, that we can redownload it and use it once again, as this would be the usual workflow?

Contributor Author

(discussed offline) The decision has been made that the contrib/ test suite's purpose is only to catch deprecation warnings and breaking changes in downstream libraries. Testing the validity of a pushed/downloaded model is therefore out of scope here.
This can be reevaluated in the future :)

Contributor

@osanseviero osanseviero left a comment

Agree with @LysandreJik feedback, other than that it LGTM! 🔥 great work

My only concern is identifying issues in 3rd-party libraries close to release rather than earlier in the process, so the more often we can run this (e.g. every week sounds great), the better.

@codecov

codecov bot commented Nov 28, 2022

Codecov Report

Base: 84.37% // Head: 84.33% // Decreases project coverage by -0.03% ⚠️

Coverage data is based on head (b148848) compared to base (22c1431).
Patch has no changes to coverable lines.

❗ Current head b148848 differs from pull request most recent head f20ab77. Consider uploading reports for the commit f20ab77 to get more accurate results

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1200      +/-   ##
==========================================
- Coverage   84.37%   84.33%   -0.04%     
==========================================
  Files          44       44              
  Lines        4365     4355      -10     
==========================================
- Hits         3683     3673      -10     
  Misses        682      682              
Impacted Files Coverage Δ
src/huggingface_hub/file_download.py 88.09% <0.00%> (-0.35%) ⬇️


☔ View full report at Codecov.

@Wauplin
Contributor Author

Wauplin commented Nov 28, 2022

@osanseviero @LysandreJik thanks for the last reviews :)
I've made some last-minute changes:

  • added a script to list contrib tests and update the Makefile/GitHub action. It's now integrated into make style and make quality.
  • added a cron job to run tests every Saturday at midnight. Let's see how it goes. Later work could keep us updated on the cron result (a Slack message, for instance?). This can be done later, as my main concern is to actually get more contrib tests first.
  • removed contrib/requirements.txt in favor of .[testing]. Let's see if at some point we need a more specific requirement.

So I think we are now finally good to go 😄 🔥

@Wauplin Wauplin merged commit b33c1f2 into main Nov 28, 2022
@Wauplin Wauplin deleted the 1190-rfc-add-contrib-test-suite branch November 28, 2022 15:55
Development

Successfully merging this pull request may close these issues.

Create a contrib/ folder with example scripts from downstream libraries