[RFC] Contrib test suite + tests for timm and sentence_transformers (#1200)

* First draft for a contrib test suite + test for timm contrib

* run only Python 3.8

* remove commented code

* Run contrib tests in separate environments

* fix ci

* fix ci again

* and now ?

* stupid me

* this time ?

* Refactor how to run contrib tests locally

* add tests for sentence_transformers

* make style

* Update contrib/README.md

Co-authored-by: Omar Sanseviero <osanseviero@gmail.com>

* Adapt timm tests

* Include feedback from osanseviero

* script to check contrib list is accurate

* Use [testing] requirements as contrib common dependencies

* add check_contrib_list in github workflow

* code quality

Co-authored-by: Omar Sanseviero <osanseviero@gmail.com>
Wauplin and osanseviero authored Nov 28, 2022
1 parent 22c1431 commit b33c1f2
Showing 15 changed files with 453 additions and 9 deletions.
45 changes: 45 additions & 0 deletions .github/workflows/contrib-tests.yml
@@ -0,0 +1,45 @@
name: Contrib tests

on:
  workflow_dispatch:
  schedule:
    - cron: '0 0 * * 6' # Run once a week, Saturday midnight
  push:
    branches:
      - ci_contrib_*

jobs:
  build:
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false
      matrix:
        contrib: [
          "sentence_transformers",
          "timm",
        ]

    steps:
      - uses: actions/checkout@v2
      - name: Set up Python 3.8
        uses: actions/setup-python@v2
        with:
          python-version: 3.8

      # Install pip
      - name: Install pip
        run: pip install --upgrade pip

      # Install downstream library and its specific dependencies
      - name: Install ${{ matrix.contrib }}
        run: pip install -r contrib/${{ matrix.contrib }}/requirements.txt

      # Install huggingface_hub from source code + testing extras
      - name: Install `huggingface_hub`
        run: |
          pip uninstall -y huggingface_hub
          pip install .[testing]

      # Run tests
      - name: Run tests
        run: pytest contrib/${{ matrix.contrib }}
1 change: 1 addition & 0 deletions .github/workflows/python-quality.yml
@@ -30,6 +30,7 @@ jobs:
       - run: black --check tests src
       - run: isort --check-only tests src
       - run: flake8 tests src
+      - run: python utils/check_contrib_list.py
       - run: python utils/check_static_imports.py

       # Run type checking at least on huggingface_hub root file to check all modules
47 changes: 44 additions & 3 deletions Makefile
@@ -1,20 +1,61 @@
-.PHONY: quality style test
+.PHONY: contrib quality style test


-check_dirs := tests src utils setup.py
+check_dirs := contrib src tests utils setup.py


 quality:
 	black --check $(check_dirs)
 	isort --check-only $(check_dirs)
 	flake8 $(check_dirs)
 	mypy src
+	python utils/check_contrib_list.py
 	python utils/check_static_imports.py

 style:
 	black $(check_dirs)
 	isort $(check_dirs)
-	python utils/check_static_imports.py --update-file
+	python utils/check_contrib_list.py --update
+	python utils/check_static_imports.py --update

 test:
 	pytest ./tests/
+
+# Taken from https://stackoverflow.com/a/12110773
+# Commands:
+#	make contrib_setup_timm : setup tests for timm
+#	make contrib_test_timm  : run tests for timm
+#	make contrib_timm       : setup and run tests for timm
+#	make contrib_clear_timm : delete timm virtual env
+#
+#	make contrib_setup : setup ALL tests
+#	make contrib_test  : run ALL tests
+#	make contrib       : setup and run ALL tests
+#	make contrib_clear : delete all virtual envs
+# Use -j4 flag to run jobs in parallel.
+CONTRIB_LIBS := sentence_transformers timm
+CONTRIB_JOBS := $(addprefix contrib_,${CONTRIB_LIBS})
+CONTRIB_CLEAR_JOBS := $(addprefix contrib_clear_,${CONTRIB_LIBS})
+CONTRIB_SETUP_JOBS := $(addprefix contrib_setup_,${CONTRIB_LIBS})
+CONTRIB_TEST_JOBS := $(addprefix contrib_test_,${CONTRIB_LIBS})
+
+contrib_clear_%:
+	rm -rf contrib/$*/.venv
+
+contrib_setup_%:
+	python3 -m venv contrib/$*/.venv
+	./contrib/$*/.venv/bin/pip install -r contrib/$*/requirements.txt
+	./contrib/$*/.venv/bin/pip uninstall -y huggingface_hub
+	./contrib/$*/.venv/bin/pip install -e .[testing]
+
+contrib_test_%:
+	./contrib/$*/.venv/bin/python -m pytest contrib/$*
+
+contrib_%:
+	make contrib_setup_$*
+	make contrib_test_$*
+
+contrib: ${CONTRIB_JOBS};
+contrib_clear: ${CONTRIB_CLEAR_JOBS}; echo "Successful contrib clear."
+contrib_setup: ${CONTRIB_SETUP_JOBS}; echo "Successful contrib setup."
+contrib_test: ${CONTRIB_TEST_JOBS}; echo "Successful contrib tests."
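
The `utils/check_contrib_list.py` script called by the `quality` and `style` targets is not shown in this part of the diff. The following is only a rough sketch of what such a check could look like, assuming that contrib libraries are discovered as subfolders of `contrib/` containing a `requirements.txt`. The names `list_contrib_libs` and `check_makefile` are hypothetical and are not taken from the actual script.

```python
import re
from pathlib import Path
from typing import List


def list_contrib_libs(contrib_dir: Path) -> List[str]:
    """Return the sorted names of subfolders that contain a requirements.txt.

    Assumption: this is how a contrib library is recognized.
    """
    return sorted(
        path.name
        for path in contrib_dir.iterdir()
        if path.is_dir() and (path / "requirements.txt").is_file()
    )


def check_makefile(makefile_text: str, libs: List[str]) -> bool:
    """Return True if the `CONTRIB_LIBS := ...` line lists exactly `libs`."""
    match = re.search(r"^CONTRIB_LIBS := (.*)$", makefile_text, flags=re.MULTILINE)
    return match is not None and match.group(1).split() == libs


if __name__ == "__main__":
    libs = list_contrib_libs(Path("contrib"))
    makefile_ok = check_makefile(Path("Makefile").read_text(), libs)
    print("Makefile is up to date" if makefile_ok else "Makefile is outdated")
```

A similar comparison could be made against the `matrix.contrib` list in `.github/workflows/contrib-tests.yml`.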
70 changes: 70 additions & 0 deletions contrib/README.md
@@ -0,0 +1,70 @@
# Contrib test suite

The contrib folder contains simple end-to-end scripts to test integration of `huggingface_hub` in downstream libraries. The main goal is to proactively notice breaking changes and deprecation warnings.

## Add tests for a new library

To add another contrib lib, one must:
1. Create a subfolder with the lib name. Example: `./contrib/transformers`
2. Create a `requirements.txt` file specific to this lib. Example: `./contrib/transformers/requirements.txt`
3. Implement tests for this lib. Example: `./contrib/transformers/test_push_to_hub.py`
4. Run `make style`. This will edit both `Makefile` and `.github/workflows/contrib-tests.yml` to add the lib to the list of libs to test. Make sure the changes are accurate before committing.

## Run contrib tests on CI

Contrib tests can be [manually triggered in GitHub](https://github.com/huggingface/huggingface_hub/actions) with the `Contrib tests` workflow.

Tests are not run in the default test suite (for each PR) as this would slow down the development process. The goal is to notice breaking changes, not to avoid them. In particular, it is useful to trigger the workflow before a release to make sure the release will not cause too much friction for downstream libraries.

## Run contrib tests locally

Tests must be run separately for each dependent library, each in its own virtual env,
to avoid conflicts between dependency versions. The `timm` tests are used as an example below.

### Run all contrib tests

Before running tests, a virtual env must be set up for each contrib library. To do so, run:

```sh
# Run setup in parallel to save time
make contrib_setup -j4
```

Then tests can be run:

```sh
# Optional: -j4 to run in parallel. Output will be messy in that case.
make contrib_test -j4
```

Optionally, it is possible to set up and run all tests in a single command. However, this
takes more time, since the virtual envs do not need to be set up again each time you run the tests.

```sh
make contrib -j4
```

Finally, it is possible to delete all virtual envs to get a fresh start for contrib tests.
After running this command, `contrib_setup` will have to re-download/re-install all dependencies.

```sh
make contrib_clear
```

### Run contrib tests for a single lib

Instead of running tests for all contrib libraries, you can run them for a specific lib:

```sh
# Setup timm tests
make contrib_setup_timm

# Run timm tests
make contrib_test_timm

# (or) Setup and run timm tests at once
make contrib_timm

# Delete timm virtualenv if corrupted
make contrib_clear_timm
```
Empty file added contrib/__init__.py
Empty file.
57 changes: 57 additions & 0 deletions contrib/conftest.py
@@ -0,0 +1,57 @@
import time
import uuid
from typing import Generator

import pytest

from huggingface_hub import HfFolder, delete_repo


@pytest.fixture(scope="session")
def token() -> str:
    # Not critical, only usable on the sandboxed CI instance.
    return "hf_94wBhPGp6KrrTH3KDchhKpRxZwd6dmHWLL"


@pytest.fixture(scope="session")
def user() -> str:
    return "__DUMMY_TRANSFORMERS_USER__"


@pytest.fixture(autouse=True, scope="session")
def login_as_dummy_user(token: str) -> Generator:
    """Login with dummy user token on machine
    Once all tests are completed, set back previous token."""
    # Remember the currently registered token (if any), then log in as the dummy user
    old_token = HfFolder().get_token()
    HfFolder().save_token(token)

    yield  # Run all tests

    # Set back the previous token once all tests have completed
    if old_token is not None:
        HfFolder().save_token(old_token)


@pytest.fixture
def repo_name(request) -> str:
    """
    Return a readable pseudo-unique repository name for tests.
    Example: "repo-2fe93f-16599646671840"
    """
    prefix = request.module.__name__  # example: `test_timm`
    unique_id = uuid.uuid4().hex[:6]
    ts = int(time.time() * 10e3)
    return f"repo-{prefix}-{unique_id}-{ts}"


@pytest.fixture
def cleanup_repo(user: str, repo_name: str) -> Generator:
    """Delete the repo at the end of the tests.
    TODO: Adapt to handle `repo_type` as well
    """
    yield  # run test
    delete_repo(repo_id=f"{user}/{repo_name}")
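
The naming scheme used by the `repo_name` fixture can be tried outside of pytest. A minimal sketch, where `make_repo_name` is a hypothetical standalone helper (not part of the commit) that takes the prefix as an argument instead of reading it from `request.module.__name__`:

```python
import time
import uuid


def make_repo_name(prefix: str) -> str:
    """Build a readable, pseudo-unique name: repo-<prefix>-<6 hex chars>-<timestamp>."""
    unique_id = uuid.uuid4().hex[:6]  # random part, so two calls rarely collide
    ts = int(time.time() * 10e3)  # 10e3 == 10_000: time in tenths of a millisecond
    return f"repo-{prefix}-{unique_id}-{ts}"


if __name__ == "__main__":
    print(make_repo_name("test_timm"))
```

The random part makes two names generated within the same tick very unlikely to collide, while the prefix and timestamp make leftover repos easy to trace back to a test run.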
Empty file added contrib/sentence_transformers/__init__.py
Empty file.
1 change: 1 addition & 0 deletions contrib/sentence_transformers/requirements.txt
@@ -0,0 +1 @@
git+https://github.com/UKPLab/sentence-transformers.git#egg=sentence-transformers
34 changes: 34 additions & 0 deletions contrib/sentence_transformers/test_sentence_transformers.py
@@ -0,0 +1,34 @@
import pytest

from sentence_transformers import SentenceTransformer, util

from ..utils import production_endpoint


@pytest.fixture(scope="module")
def multi_qa_model() -> SentenceTransformer:
    with production_endpoint():
        return SentenceTransformer("multi-qa-MiniLM-L6-cos-v1")


def test_from_pretrained(multi_qa_model: SentenceTransformer) -> None:
    # Example taken from https://www.sbert.net/docs/hugging_face.html#using-hugging-face-models.
    query_embedding = multi_qa_model.encode("How big is London")
    passage_embedding = multi_qa_model.encode(
        [
            "London has 9,787,426 inhabitants at the 2011 census",
            "London is known for its financial district",
        ]
    )
    print("Similarity:", util.dot_score(query_embedding, passage_embedding))


@pytest.mark.xfail(
    reason=(
        "Production endpoint is hardcoded in sentence_transformers when pushing to Hub."
    )
)
def test_push_to_hub(
    multi_qa_model: SentenceTransformer, repo_name: str, cleanup_repo: None
) -> None:
    multi_qa_model.save_to_hub(repo_name)
Empty file added contrib/timm/__init__.py
Empty file.
2 changes: 2 additions & 0 deletions contrib/timm/requirements.txt
@@ -0,0 +1,2 @@
# Timm
git+https://github.com/rwightman/pytorch-image-models.git#egg=timm
20 changes: 20 additions & 0 deletions contrib/timm/test_timm.py
@@ -0,0 +1,20 @@
import timm

from ..utils import production_endpoint


MODEL_ID = "nateraw/timm-resnet50-beans"


@production_endpoint()
def test_load_from_hub() -> None:
    # Test load only config
    _ = timm.models.hub.load_model_config_from_hf(MODEL_ID)

    # Load entire model from Hub
    _ = timm.create_model("hf_hub:" + MODEL_ID, pretrained=True)


def test_push_to_hub(repo_name: str, cleanup_repo: None) -> None:
    model = timm.create_model("resnet18")
    timm.models.hub.push_to_hf_hub(model, repo_name)
59 changes: 59 additions & 0 deletions contrib/utils.py
@@ -0,0 +1,59 @@
import contextlib
from typing import Generator
from unittest.mock import patch


@contextlib.contextmanager
def production_endpoint() -> Generator:
    """Patch huggingface_hub to connect to production server in a context manager.
    Ugly way to patch all constants at once.
    TODO: refactor when https://github.com/huggingface/huggingface_hub/issues/1172 is fixed.
    Example:
    ```py
    def test_push_to_hub():
        # Pull from production Hub
        with production_endpoint():
            model = ...from_pretrained("modelname")
        # Push to staging Hub
        model.push_to_hub()
    ```
    """
    PROD_ENDPOINT = "https://huggingface.co"
    ENDPOINT_TARGETS = [
        "huggingface_hub.constants",
        "huggingface_hub._commit_api",
        "huggingface_hub.hf_api",
        "huggingface_hub.lfs",
        "huggingface_hub.commands.user",
        "huggingface_hub.utils._git_credential",
    ]

    PROD_URL_TEMPLATE = PROD_ENDPOINT + "/{repo_id}/resolve/{revision}/{filename}"
    URL_TEMPLATE_TARGETS = [
        "huggingface_hub.constants",
        "huggingface_hub.file_download",
    ]

    from huggingface_hub.hf_api import api

    patchers = (
        [patch(target + ".ENDPOINT", PROD_ENDPOINT) for target in ENDPOINT_TARGETS]
        + [
            patch(target + ".HUGGINGFACE_CO_URL_TEMPLATE", PROD_URL_TEMPLATE)
            for target in URL_TEMPLATE_TARGETS
        ]
        # `api.endpoint` is the base URL of the Hub, not the file URL template
        + [patch.object(api, "endpoint", PROD_ENDPOINT)]
    )

    # Start all patches
    for patcher in patchers:
        patcher.start()

    try:
        yield
    finally:
        # Stop all patches, even if the wrapped code raised
        for patcher in patchers:
            patcher.stop()
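
The "patch the same constant in several modules" pattern used by `production_endpoint` can be illustrated without `huggingface_hub` installed. The sketch below is an alternative formulation, not the code from this commit: it uses `contextlib.ExitStack` instead of manual `start()`/`stop()` calls, and the `constants` and `hf_api` objects are hypothetical stand-ins for modules that each hold their own copy of `ENDPOINT`.

```python
import contextlib
from types import SimpleNamespace
from typing import Generator
from unittest.mock import patch

# Hypothetical stand-ins for two modules holding their own copy of a constant.
constants = SimpleNamespace(ENDPOINT="https://staging.example")
hf_api = SimpleNamespace(ENDPOINT="https://staging.example")


@contextlib.contextmanager
def use_endpoint(endpoint: str) -> Generator[None, None, None]:
    """Temporarily replace ENDPOINT on every stand-in module."""
    patchers = [
        patch.object(target, "ENDPOINT", endpoint) for target in (constants, hf_api)
    ]
    # ExitStack undoes every entered patch, even if the wrapped code raises.
    with contextlib.ExitStack() as stack:
        for patcher in patchers:
            stack.enter_context(patcher)
        yield


if __name__ == "__main__":
    with use_endpoint("https://prod.example"):
        print(constants.ENDPOINT, hf_api.ENDPOINT)
    print(constants.ENDPOINT, hf_api.ENDPOINT)
```

Because it is built with `contextlib.contextmanager`, such a helper can be used both in a `with` statement and as a decorator, which is how `production_endpoint()` is applied in `contrib/timm/test_timm.py`.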