-
Notifications
You must be signed in to change notification settings - Fork 572
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[RFC] Contrib test suite + tests for timm and sentence_transformers #1200
Changes from all commits
f22d1cb
fc1da54
24427d3
64185aa
572ca56
e4766a7
af45cdb
821aa04
2a0d7c6
339af7f
68c26b2
1eb0a09
d4a72e7
aa1ffad
930c29e
625768d
7b28a5f
987ec8a
12aa4ba
173aff8
cd11be8
b148848
d8ddba6
f20ab77
d5949fa
6cb96e7
b299077
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,45 @@ | ||
name: Contrib tests | ||
|
||
on: | ||
workflow_dispatch: | ||
schedule: | ||
- cron: '0 0 * * 6' # Run once a week, Saturday midnight | ||
push: | ||
branches: | ||
- ci_contrib_* | ||
|
||
jobs: | ||
build: | ||
runs-on: ubuntu-latest | ||
strategy: | ||
fail-fast: false | ||
matrix: | ||
contrib: [ | ||
"sentence_transformers", | ||
"timm", | ||
] | ||
Comment on lines
+17
to
+20
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Great idea to use a matrix here! There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Should this matrix be governed by a There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. |
||
|
||
steps: | ||
- uses: actions/checkout@v2 | ||
- name: Set up Python 3.8 | ||
Wauplin marked this conversation as resolved.
Show resolved
Hide resolved
|
||
uses: actions/setup-python@v2 | ||
with: | ||
python-version: 3.8 | ||
|
||
# Install pip | ||
- name: Install pip | ||
run: pip install --upgrade pip | ||
|
||
# Install downstream library and its specific dependencies | ||
- name: Install ${{ matrix.contrib }} | ||
run: pip install -r contrib/${{ matrix.contrib }}/requirements.txt | ||
|
||
# Install huggingface_hub from source code + testing extras | ||
- name: Install `huggingface_hub` | ||
run: | | ||
pip uninstall -y huggingface_hub | ||
pip install .[testing] | ||
|
||
# Run tests | ||
- name: Run tests | ||
run: pytest contrib/${{ matrix.contrib }} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,20 +1,61 @@ | ||
.PHONY: quality style test | ||
.PHONY: contrib quality style test | ||
|
||
|
||
check_dirs := tests src utils setup.py | ||
check_dirs := contrib src tests utils setup.py | ||
|
||
|
||
quality: | ||
black --check $(check_dirs) | ||
isort --check-only $(check_dirs) | ||
flake8 $(check_dirs) | ||
mypy src | ||
python utils/check_contrib_list.py | ||
python utils/check_static_imports.py | ||
|
||
style: | ||
black $(check_dirs) | ||
isort $(check_dirs) | ||
python utils/check_static_imports.py --update-file | ||
python utils/check_contrib_list.py --update | ||
python utils/check_static_imports.py --update | ||
|
||
test: | ||
pytest ./tests/ | ||
|
||
# Taken from https://stackoverflow.com/a/12110773 | ||
# Commands: | ||
# make contrib_setup_timm : setup tests for timm | ||
# make contrib_test_timm : run tests for timm | ||
# make contrib_timm : setup and run tests for timm | ||
# make contrib_clear_timm : delete timm virtual env | ||
# | ||
# make contrib_setup : setup ALL tests | ||
# make contrib_test : run ALL tests | ||
# make contrib : setup and run ALL tests | ||
# make contrib_clear : delete all virtual envs | ||
# Use -j4 flag to run jobs in parallel. | ||
CONTRIB_LIBS := sentence_transformers timm | ||
CONTRIB_JOBS := $(addprefix contrib_,${CONTRIB_LIBS}) | ||
CONTRIB_CLEAR_JOBS := $(addprefix contrib_clear_,${CONTRIB_LIBS}) | ||
CONTRIB_SETUP_JOBS := $(addprefix contrib_setup_,${CONTRIB_LIBS}) | ||
CONTRIB_TEST_JOBS := $(addprefix contrib_test_,${CONTRIB_LIBS}) | ||
|
||
contrib_clear_%: | ||
rm -rf contrib/$*/.venv | ||
|
||
contrib_setup_%: | ||
python3 -m venv contrib/$*/.venv | ||
./contrib/$*/.venv/bin/pip install -r contrib/$*/requirements.txt | ||
./contrib/$*/.venv/bin/pip uninstall -y huggingface_hub | ||
./contrib/$*/.venv/bin/pip install -e .[testing] | ||
|
||
contrib_test_%: | ||
./contrib/$*/.venv/bin/python -m pytest contrib/$* | ||
|
||
contrib_%: | ||
make contrib_setup_$* | ||
make contrib_test_$* | ||
|
||
contrib: ${CONTRIB_JOBS}; | ||
contrib_clear: ${CONTRIB_CLEAR_JOBS}; echo "Successful contrib tests." | ||
contrib_setup: ${CONTRIB_SETUP_JOBS}; echo "Successful contrib setup." | ||
contrib_test: ${CONTRIB_TEST_JOBS}; echo "Successful contrib tests." |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,70 @@ | ||
# Contrib test suite | ||
|
||
The contrib folder contains simple end-to-end scripts to test integration of `huggingface_hub` in downstream libraries. The main goal is to proactively notice breaking changes and deprecation warnings. | ||
|
||
## Add tests for a new library | ||
|
||
To add another contrib lib, one must: | ||
1. Create a subfolder with the lib name. Example: `./contrib/transformers` | ||
2. Create a `requirements.txt` file specific to this lib. Example `./contrib/transformers/requirements.txt` | ||
3. Implements tests for this lib. Example: `./contrib/transformers/test_push_to_hub.py` | ||
4. Run `make style`. This will edit both `makefile` and `.github/workflows/contrib-tests.yml` to add the lib to list of libs to test. Make sure changes are accurate before committing. | ||
|
||
## Run contrib tests on CI | ||
|
||
Contrib tests can be [manually triggered in GitHub](https://github.com/huggingface/huggingface_hub/actions) with the `Contrib tests` workflow. | ||
|
||
Tests are not run in the default test suite (for each PR) as this would slow down development process. The goal is to notice breaking changes, not to avoid them. In particular, it is interesting to trigger it before a release to make sure it will not cause too much friction. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. We could also run them once a week just to check There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Cron job added by d5949fa. Will run every week on Saturday midnight. |
||
|
||
## Run contrib tests locally | ||
|
||
Tests must be ran individually for each dependent library. Here is an example to run | ||
`timm` tests. Tests are separated to avoid conflicts between version dependencies. | ||
|
||
### Run all contrib tests | ||
|
||
Before running tests, a virtual env must be setup for each contrib library. To do so, run: | ||
|
||
```sh | ||
# Run setup in parallel to save time | ||
make contrib_setup -j4 | ||
``` | ||
|
||
Then tests can be run | ||
|
||
```sh | ||
# Optional: -j4 to run in parallel. Output will be messy in that case. | ||
make contrib_test -j4 | ||
``` | ||
|
||
Optionally, it is possible to setup and run all tests in a single command. However this | ||
take more time as you don't need to setup the venv each time you run tests. | ||
|
||
```sh | ||
make contrib -j4 | ||
``` | ||
|
||
Finally, it is possible to delete all virtual envs to get a fresh start for contrib tests. | ||
After running this command, `contrib_setup` will have to re-download/re-install all dependencies. | ||
|
||
``` | ||
make contrib_clear | ||
``` | ||
|
||
### Run contrib tests for a single lib | ||
|
||
Instead of running tests for all contrib libraries, you can run a specific lib: | ||
|
||
```sh | ||
# Setup timm tests | ||
make contrib_setup_timm | ||
|
||
# Run timm tests | ||
make contrib_test_timm | ||
|
||
# (or) Setup and run timm tests at once | ||
make contrib_timm | ||
|
||
# Delete timm virtualenv if corrupted | ||
make contrib_clear_timm | ||
``` |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,57 @@ | ||
import time | ||
import uuid | ||
from typing import Generator | ||
|
||
import pytest | ||
|
||
from huggingface_hub import HfFolder, delete_repo | ||
|
||
|
||
@pytest.fixture(scope="session") | ||
def token() -> str: | ||
# Not critical, only usable on the sandboxed CI instance. | ||
return "hf_94wBhPGp6KrrTH3KDchhKpRxZwd6dmHWLL" | ||
|
||
|
||
@pytest.fixture(scope="session") | ||
def user() -> str: | ||
return "__DUMMY_TRANSFORMERS_USER__" | ||
|
||
|
||
@pytest.fixture(autouse=True, scope="session") | ||
def login_as_dummy_user(token: str) -> Generator: | ||
"""Login with dummy user token on machine | ||
|
||
Once all tests are completed, set back previous token.""" | ||
# Remove registered token | ||
old_token = HfFolder().get_token() | ||
HfFolder().save_token(token) | ||
|
||
yield # Run all tests | ||
|
||
# Set back token once all tests have passed | ||
if old_token is not None: | ||
HfFolder().save_token(old_token) | ||
|
||
|
||
@pytest.fixture | ||
def repo_name(request) -> None: | ||
""" | ||
Return a readable pseudo-unique repository name for tests. | ||
|
||
Example: "repo-2fe93f-16599646671840" | ||
""" | ||
prefix = request.module.__name__ # example: `test_timm` | ||
id = uuid.uuid4().hex[:6] | ||
ts = int(time.time() * 10e3) | ||
return f"repo-{prefix}-{id}-{ts}" | ||
|
||
|
||
@pytest.fixture | ||
def cleanup_repo(user: str, repo_name: str) -> None: | ||
"""Delete the repo at the end of the tests. | ||
|
||
TODO: Adapt to handle `repo_type` as well | ||
""" | ||
yield # run test | ||
delete_repo(repo_id=f"{user}/{repo_name}") |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
git+https://github.com/UKPLab/sentence-transformers.git#egg=sentence-transformers |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,34 @@ | ||
import pytest | ||
|
||
from sentence_transformers import SentenceTransformer, util | ||
|
||
from ..utils import production_endpoint | ||
|
||
|
||
@pytest.fixture(scope="module") | ||
def multi_qa_model() -> SentenceTransformer: | ||
with production_endpoint(): | ||
return SentenceTransformer("multi-qa-MiniLM-L6-cos-v1") | ||
|
||
|
||
def test_from_pretrained(multi_qa_model: SentenceTransformer) -> None: | ||
# Example taken from https://www.sbert.net/docs/hugging_face.html#using-hugging-face-models. | ||
query_embedding = multi_qa_model.encode("How big is London") | ||
passage_embedding = multi_qa_model.encode( | ||
[ | ||
"London has 9,787,426 inhabitants at the 2011 census", | ||
"London is known for its financial district", | ||
] | ||
) | ||
print("Similarity:", util.dot_score(query_embedding, passage_embedding)) | ||
|
||
|
||
@pytest.mark.xfail( | ||
reason=( | ||
"Production endpoint is hardcoded in sentence_transformers when pushing to Hub." | ||
) | ||
) | ||
def test_push_to_hub( | ||
multi_qa_model: SentenceTransformer, repo_name: str, cleanup_repo: None | ||
) -> None: | ||
multi_qa_model.save_to_hub(repo_name) | ||
Comment on lines
+26
to
+34
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Would be nice to eventually have a test that doesn't fail to ensure that save to hub actually works :) we could have a specific org for that, like skops does, but I understand it's a bit complex to setup + very annoying to have testing artifacts on the actual hub There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Yes, completely agree on that. I'd like to do that later. I am about to open an issue/PR on sentence transformers side to test that properly. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,2 @@ | ||
# Timm | ||
git+https://github.com/rwightman/pytorch-image-models.git#egg=timm |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,20 @@ | ||
import timm | ||
|
||
from ..utils import production_endpoint | ||
|
||
|
||
MODEL_ID = "nateraw/timm-resnet50-beans" | ||
|
||
|
||
@production_endpoint() | ||
def test_load_from_hub() -> None: | ||
# Test load only config | ||
_ = timm.models.hub.load_model_config_from_hf(MODEL_ID) | ||
|
||
# Load entire model from Hub | ||
_ = timm.create_model("hf_hub:" + MODEL_ID, pretrained=True) | ||
|
||
|
||
def test_push_to_hub(repo_name: str, cleanup_repo: None) -> None: | ||
model = timm.create_model("resnet18") | ||
timm.models.hub.push_to_hf_hub(model, repo_name) | ||
Comment on lines
+18
to
+20
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Should we also test that the model pushed is according to what we expect? For example, that we can redownload it and use it once again, as this would be the usual workflow? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. (discussed offline) Decision has been taken that the contrib/ test suite purpose is only to test the deprecation warnings in downstream libraries. Testing the validity of a pushed/downloaded model is therefore out of scope here. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,59 @@ | ||
import contextlib | ||
from typing import Generator | ||
from unittest.mock import patch | ||
|
||
|
||
@contextlib.contextmanager | ||
def production_endpoint() -> Generator: | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Is this really needed? With There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
I'd say yes, kinda the same as what already exists in the Another solution could be to upload test models to staging but then we wouldn't notice if a model changed in production. |
||
"""Patch huggingface_hub to connect to production server in a context manager. | ||
|
||
Ugly way to patch all constants at once. | ||
TODO: refactor when https://github.com/huggingface/huggingface_hub/issues/1172 is fixed. | ||
|
||
Example: | ||
```py | ||
def test_push_to_hub(): | ||
# Pull from production Hub | ||
with production_endpoint(): | ||
model = ...from_pretrained("modelname") | ||
|
||
# Push to staging Hub | ||
model.push_to_hub() | ||
``` | ||
""" | ||
PROD_ENDPOINT = "https://huggingface.co" | ||
ENDPOINT_TARGETS = [ | ||
"huggingface_hub.constants", | ||
"huggingface_hub._commit_api", | ||
"huggingface_hub.hf_api", | ||
"huggingface_hub.lfs", | ||
"huggingface_hub.commands.user", | ||
"huggingface_hub.utils._git_credential", | ||
] | ||
|
||
PROD_URL_TEMPLATE = PROD_ENDPOINT + "/{repo_id}/resolve/{revision}/{filename}" | ||
URL_TEMPLATE_TARGETS = [ | ||
"huggingface_hub.constants", | ||
"huggingface_hub.file_download", | ||
] | ||
|
||
from huggingface_hub.hf_api import api | ||
|
||
patchers = ( | ||
[patch(target + ".ENDPOINT", PROD_ENDPOINT) for target in ENDPOINT_TARGETS] | ||
+ [ | ||
patch(target + ".HUGGINGFACE_CO_URL_TEMPLATE", PROD_URL_TEMPLATE) | ||
for target in URL_TEMPLATE_TARGETS | ||
] | ||
+ [patch.object(api, "endpoint", PROD_URL_TEMPLATE)] | ||
) | ||
|
||
# Start all patches | ||
for patcher in patchers: | ||
patcher.start() | ||
|
||
yield | ||
|
||
# Stop all patches | ||
for patcher in patchers: | ||
patcher.stop() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we run this for main as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think so. If we want to make trigger it manually from
main
it's possible but if we run it all the time, themain
branch could end up in a ❌ status as I expect contrib tests to fail (code can break in downstream library without a change on our side).