Sync/v4.17.0 #311

Merged
merged 880 commits on Mar 23, 2022
Commits (880)
87918d3
[examples/Flax] add a section about GPUs (#15198)
patil-suraj Jan 31, 2022
5a70987
Fix TFLEDModel (#15356)
ydshieh Jan 31, 2022
a5ecbf7
correct positionla emb size (#15441)
patil-suraj Jan 31, 2022
6915174
[RobertaTokenizer] remove inheritance on GPT2Tokenizer (#15429)
patil-suraj Jan 31, 2022
09f9d07
Misfiring tf warnings (#15442)
Rocketknight1 Jan 31, 2022
d984b10
Add 'with torch.no_grad()' to BEiT integration test forward passes (#…
itsTurner Jan 31, 2022
125a288
Update modeling_wav2vec2.py (#15423)
peregilk Jan 31, 2022
0c17e76
Error when group_by_length is used with an IterableDataset (#15437)
sgugger Jan 31, 2022
d4f201b
skip test for XGLM (#15445)
patil-suraj Jan 31, 2022
d12ae81
[generate] fix synced_gpus default (#15446)
stas00 Jan 31, 2022
af5c332
remove "inputs" in tf common test script (no longer required) (#15262)
ydshieh Feb 1, 2022
dc05dd5
Fix TF Causal LM models' returned logits (#15256)
ydshieh Feb 1, 2022
2ca6268
fix from_vision_text_pretrained doc example (#15453)
ydshieh Feb 1, 2022
1c9648c
[M2M100, XGLM] fix positional emb resize (#15444)
patil-suraj Feb 1, 2022
d2749cf
Update README.md (#15462)
kamalkraj Feb 1, 2022
6d585fe
replace assert with exception for padding_side arg in `PreTrainedToke…
SaulLu Feb 1, 2022
7b8bdd8
fix the `tokenizer_config.json` file for the slow tokenizer when a fa…
SaulLu Feb 1, 2022
f427e75
use mean instead of elementwise_mean in XLMPredLayer (#15436)
ydshieh Feb 1, 2022
37800f1
[BartTokenizer] remove inheritance on RobertaTokenizer (#15461)
patil-suraj Feb 1, 2022
8e5d4e4
`Trainer.push_to_hub` always tries to push to the Hub (#15463)
sgugger Feb 1, 2022
d0b5ed1
Harder check for IndexErrors in QA scripts (#15438)
sgugger Feb 1, 2022
c157c7e
Update fine-tune docs (#15259)
stevhliu Feb 2, 2022
b9418a1
Update tutorial docs (#15165)
stevhliu Feb 2, 2022
1d94d57
Add option to resize like torchvision's Resize (#15419)
NielsRogge Feb 2, 2022
d718c0c
[Wav2Vec2ProcessorWithLM] add alpha & beta to batch decode & decode (…
patrickvonplaten Feb 2, 2022
623d8cb
Adding support for `microphone` streaming within pipeline. (#15046)
Narsil Feb 2, 2022
44b21f1
Save code of registered custom models (#15379)
sgugger Feb 2, 2022
dd360d5
fix error posted in issue #15448 (#15480)
bugface Feb 2, 2022
13297ac
Fic docstring of ASR pipeline (#15481)
sgugger Feb 2, 2022
c74f3d4
Add W&B backend for hyperparameter sweep (#14582)
AyushExel Feb 2, 2022
45cac3f
Fix labels stored in model config for token classification examples (…
sgugger Feb 2, 2022
39b5d1a
fix set truncation attribute in `__init__` of `PreTrainedTokenizerBas…
SaulLu Feb 2, 2022
5ec368d
Correct eos_token_id settings in generate (#15403)
thinksoso Feb 2, 2022
71dccd0
fix (#15494)
ydshieh Feb 3, 2022
f5d98da
fix load_weight_prefix (#15101)
ydshieh Feb 3, 2022
e2b6e73
[Flax tests] Disable scheduled GPU tests (#15503)
patrickvonplaten Feb 3, 2022
9016612
Add general vision docstrings (#15501)
NielsRogge Feb 3, 2022
4f5faaf
[deepspeed] fix a bug in a test (#15493)
stas00 Feb 3, 2022
f1a4c4e
[WIP] Add preprocess_logits_for_metrics Trainer param (#15473)
davidleonfdez Feb 3, 2022
21dcaec
[deepspeed docs] memory requirements (#15506)
stas00 Feb 3, 2022
525dbbf
Remove loss from some flax models docs & examples (#15492)
ydshieh Feb 3, 2022
486260c
use kwargs (#15509)
ydshieh Feb 4, 2022
854a0d5
Handle PyTorch to Flax conversion of 1D convolutions (#15519)
sanchit-gandhi Feb 4, 2022
bbe9c69
Fix TFRemBertEncoder all_hidden_states (#15510)
ydshieh Feb 4, 2022
31be2f4
[deepspeed docs] Megatron-Deepspeed info (#15488)
stas00 Feb 4, 2022
ac6aa10
Standardize semantic segmentation models outputs (#15469)
sgugger Feb 4, 2022
8ce1330
[deepspeed docs] DeepSpeed ZeRO Inference (#15486)
stas00 Feb 4, 2022
e02bdce
Revert "Handle PyTorch to Flax conversion of 1D convolutions (#15519)…
patrickvonplaten Feb 7, 2022
5f1918a
[ASR pipeline] correct asr pipeline for seq2seq models (#15541)
patrickvonplaten Feb 7, 2022
c47d259
[torch_int_div] Correct true division in generation (#15498)
patrickvonplaten Feb 7, 2022
84eec9e
Add ConvNeXT (#15277)
NielsRogge Feb 7, 2022
75b13f8
[Trainer] Deeper length checks for IterableDatasetShard (#15539)
anton-l Feb 7, 2022
a459f7f
Add ASR CTC streaming example (#15309)
anton-l Feb 7, 2022
7a1412e
Wav2Vec2 models must either throw or deal with add_apater (#15409)
FremyCompany Feb 7, 2022
6775b21
Remove Longformers from ONNX-supported models (#15273)
lewtun Feb 7, 2022
131e258
Fix TF T5/LED missing cross attn in retrun values (#15511)
ydshieh Feb 7, 2022
ad1d3c4
Make TF Wav2Vec2 outputs the same as PT's version (#15530)
ydshieh Feb 7, 2022
552f8d3
Create a custom model guide (#15489)
stevhliu Feb 7, 2022
0fe17f3
FX tracing improvement (#14321)
michaelbenayoun Feb 7, 2022
87d08af
electra is added to onnx supported model (#15084)
aaron-dunamu Feb 8, 2022
0acd84f
[GPTJ] fix docs (#15558)
patil-suraj Feb 8, 2022
6a5472a
Force use_cache to be False in PyTorch (#15385)
ydshieh Feb 8, 2022
8406fa6
Add TFSpeech2Text (#15113)
gante Feb 8, 2022
077c00c
feat(flax): allow encoder_outputs in generate (#15554)
borisdayma Feb 8, 2022
fcb4f11
:memo: Add codecarbon callback to docs (#15563)
nateraw Feb 8, 2022
a6885db
[Flax tests] fix test_model_outputs_equivalence (#15571)
patil-suraj Feb 9, 2022
ba3f9a7
logger.warn --> logger.warning (#15572)
ydshieh Feb 9, 2022
b5c6fde
PoC for a ProcessorMixin class (#15549)
sgugger Feb 9, 2022
d923f76
add model scaling section (#15119)
lvwerra Feb 9, 2022
7732d0f
Upgrade black to version ~=22.0 (#15565)
LysandreJik Feb 9, 2022
1f60bc4
Make sure custom configs work with Transformers (#15569)
sgugger Feb 9, 2022
9e00566
Add Wav2Vec2 Adapter Weights to Flax (#15566)
sanchit-gandhi Feb 9, 2022
7029240
Upgrade click version (#15579)
LysandreJik Feb 9, 2022
f588cf4
[Flax tests/FlaxBert] make from_pretrained test faster (#15561)
patil-suraj Feb 9, 2022
0113aae
Add implementation of typical sampling (#15504)
cimeister Feb 9, 2022
2b5603f
Constrained Beam Search [without disjunctive decoding] (#15416)
cwkeam Feb 9, 2022
eed3186
Trigger doc build
sgugger Feb 9, 2022
b1ba03e
Fix quality
sgugger Feb 9, 2022
315e674
Fix tests hub failure (#15580)
sgugger Feb 9, 2022
2584808
update serving_output for some TF models (#15568)
ydshieh Feb 9, 2022
dee17d5
[trainer docs] document how to select specific gpus (#15551)
stas00 Feb 9, 2022
a86ee22
Add link (#15588)
NielsRogge Feb 9, 2022
c722753
Expand tutorial for custom models (#15587)
sgugger Feb 9, 2022
644ec05
Make slow tests slow
sgugger Feb 10, 2022
e923917
Reformat tokenization_fnet
LysandreJik Feb 10, 2022
cb7ed6e
Add Tensorflow handling of ONNX conversion (#13831)
Albertobegue Feb 10, 2022
3d5dea9
Add example batch size to all commands (#15596)
patrickvonplaten Feb 10, 2022
724e51c
Compute loss independent from decoder for TF EncDec models (as #14139…
ydshieh Feb 10, 2022
3a2ed96
Fix Seq2SeqTrainer (#15603)
NielsRogge Feb 10, 2022
2e8b85f
Add local and TensorFlow ONNX export examples to docs (#15604)
lewtun Feb 10, 2022
c0864d9
Correct JSON format (#15600)
ngoquanghuy99 Feb 10, 2022
45c7b5b
[Generate] Small refactor (#15611)
patrickvonplaten Feb 10, 2022
6cf06d1
Mark "code in the Hub" API as experimental (#15624)
sgugger Feb 11, 2022
7e4844f
Enable ONNX export when PyTorch and TensorFlow installed in the same …
lewtun Feb 11, 2022
3fae83d
TF: Add informative warning for inexistent CPU backprop ops (#15612)
gante Feb 11, 2022
8c03df1
Rebase (#15606)
mishig25 Feb 11, 2022
2f40c72
TF MT5 embeddings resize (#15567)
gante Feb 11, 2022
85aee09
🖍 remove broken link (#15615)
stevhliu Feb 11, 2022
2dce350
Fix _configuration_file argument getting passed to model (#15629)
sgugger Feb 11, 2022
f15c99f
[deepspeed docs] misc additions (#15585)
stas00 Feb 11, 2022
fcb0f74
[research_projects] deal with security alerts (#15594)
stas00 Feb 11, 2022
7a32e47
Custom feature extractor (#15630)
sgugger Feb 11, 2022
4f403ea
Fix grammar in tokenizer_summary (#15614)
derenrich Feb 11, 2022
52d2e6f
Add push to hub to feature extractor (#15632)
sgugger Feb 11, 2022
f52746d
[Fix doc example] FlaxVisionEncoderDecoder (#15626)
ydshieh Feb 14, 2022
2b8599b
Fix a bug that ignores max_seq_len in preprocess (#15238)
wptoux Feb 14, 2022
ec15da2
Report only the failed imports in `requires_backends` (#15636)
tkukurin Feb 14, 2022
b090b79
Make Swin work with VisionEncoderDecoderModel (#15527)
NielsRogge Feb 14, 2022
0f71c29
Remove redundant error logging in from_pretrained() method (#15631)
lewtun Feb 14, 2022
2e11a04
Register feature extractor (#15634)
sgugger Feb 14, 2022
e314c19
fix bug for the log of RNG states are not properly loaded exception…
muzhi1991 Feb 15, 2022
041fdc4
[SpeechEncoderDecoder] Make sure no EOS is generated in test (#15655)
patrickvonplaten Feb 15, 2022
41168a4
logger doc
FrancescoSaverioZuppichini Feb 15, 2022
05a8580
Revert "logger doc"
FrancescoSaverioZuppichini Feb 15, 2022
e1cbc07
Require tokenizers>=0.11.1 (#15266)
aphedges Feb 15, 2022
9eb7e9b
Fix ASR pipelines from local directories with wav2vec models that hav…
versae Feb 15, 2022
86a7845
Fix typo in speech2text2 doc (#15617)
jonrbates Feb 15, 2022
45f5658
Allow custom code for Processors (#15649)
sgugger Feb 15, 2022
67047b8
add scores to Wav2Vec2WithLMOutput (#15413)
arampacha Feb 15, 2022
7bc4a01
Update bad_words_ids usage (#15641)
ngoquanghuy99 Feb 15, 2022
80f1a59
updated with latest PL and Ray (#15653)
Feb 15, 2022
f45ac11
Add section about doc testing (#15659)
patrickvonplaten Feb 15, 2022
5d8be09
Fix quality
sgugger Feb 15, 2022
28e6155
add a network debug script and document it (#15652)
stas00 Feb 15, 2022
cdf19c5
Re-export `KeyDataset`. (#15645)
Narsil Feb 15, 2022
a3dbbc3
Add `decoder_kwargs` to send to LM on asr pipeline. (#15646)
Narsil Feb 15, 2022
2e12b90
TF generate refactor - Greedy Search (#15562)
patrickvonplaten Feb 15, 2022
faf4ff5
[pipeline doc] fix api (#15660)
stas00 Feb 15, 2022
1690319
Fix TFSequenceSummary's activation (#15643)
ydshieh Feb 15, 2022
943e2aa
Fix model equivalence tests (#15670)
LysandreJik Feb 15, 2022
1ddf3c2
Fix vit test (#15671)
LysandreJik Feb 15, 2022
e3d1a8d
Add a missing space in a deprecation message (#15651)
bryant1410 Feb 16, 2022
bee361c
[t5/t0/mt5 models] faster/leaner custom layer norm (#14656)
stas00 Feb 16, 2022
2d02f7b
Add push_to_hub method to processors (#15668)
sgugger Feb 16, 2022
b87c044
Usage examples for logger (#15657)
FrancescoSaverioZuppichini Feb 16, 2022
d4692ad
Fix dec_attn_mask in TFTransfoXLMainLayer (#15665)
ydshieh Feb 16, 2022
bc3379e
🔥 Remove build_doc_test github action (#15680)
coyotte508 Feb 16, 2022
cdc51ff
Add register method to AutoProcessor (#15669)
sgugger Feb 16, 2022
3a4376d
[Wav2Vec2ProcessorWithLM] Fix auto processor with lm (#15683)
patrickvonplaten Feb 16, 2022
66828a1
Fix Funnel configuration doc (#15686)
ydshieh Feb 16, 2022
f65fe36
Implementation of activations as pytorch modules (#15616)
eldarkurtic Feb 16, 2022
0e91f88
Add image classification notebook (#15667)
NielsRogge Feb 17, 2022
f84e0db
Add PoolFormer (#15531)
tanaymeh Feb 17, 2022
92a537d
Minor fix on README.md (#15688)
ydshieh Feb 17, 2022
426b962
Fix shapes in model docstrings (#15696)
gchhablani Feb 17, 2022
5788217
Add SimMIM (#15586)
NielsRogge Feb 17, 2022
240cc6c
Adding a model, more doc for pushing to the hub (#15690)
FrancescoSaverioZuppichini Feb 18, 2022
e93763d
fix CLIP fast tokenizer and change some properties of the slow versio…
SaulLu Feb 18, 2022
416dff7
Fix SiluActivation (#15718)
sgugger Feb 18, 2022
f8ff3fa
TF: add initializer_std with a small value in TFFunnelModelTester (#1…
ydshieh Feb 18, 2022
68dec6b
Fix DETR model deprecation warnings for int div (#15702)
gautierdag Feb 18, 2022
2f2fefd
Fix LongformerModel hidden states (#15537)
ydshieh Feb 18, 2022
ae1f835
Add PLBart (#13269)
gchhablani Feb 18, 2022
d5083c3
style_doc handles decorators in examples (#15719)
sgugger Feb 18, 2022
83f45cd
Fix auto (#15706)
LysandreJik Feb 18, 2022
3de1290
fix: hfdeepspeed config argument (#15711)
jaketae Feb 18, 2022
60ba482
fix bug in PT speech-encoder-decoder (#15699)
sanchit-gandhi Feb 18, 2022
2c2a31f
Add missing PLBart entry in README (#15721)
gchhablani Feb 18, 2022
a63bd36
Remove input and target reset after preprocessing (#15741)
SSardorf Feb 21, 2022
5444687
Fix minor comment typos (#15740)
Crabzmatic Feb 21, 2022
86119c1
add VisionTextDualEncoder and CLIP fine-tuning script (#15701)
patil-suraj Feb 21, 2022
142b69f
Add layer_idx to CrossAttention of GPT2 model (#15730)
hyunwoongko Feb 21, 2022
3956b13
TF text classification examples (#15704)
gante Feb 21, 2022
0187c6f
revert temporary addition to test next version of CLIPTokenizerFast (…
SaulLu Feb 21, 2022
38bed91
added link to our writing-doc document (#15756)
FrancescoSaverioZuppichini Feb 22, 2022
2c3fcc6
TF train_step docstring (#15755)
gante Feb 22, 2022
32295b1
Gelu10 (#15676)
mfuntowicz Feb 22, 2022
c44d367
Time stamps for CTC models (#15687)
patrickvonplaten Feb 22, 2022
2cdb6db
fixed pipeline code (#15607)
Moumeneb1 Feb 22, 2022
3db2e8f
Fix typo on examples/pytorch/question-answering (#15644)
dreamgonfly Feb 22, 2022
db57bb2
Cleanup transformers-cli (#15767)
julien-c Feb 22, 2022
05a12a0
Fix `HfArgumentParser` when passing a generator (#15758)
bryant1410 Feb 22, 2022
f9582c2
Adding ZeroShotImageClassificationPipeline (#12119)
Narsil Feb 23, 2022
24588c6
[M2M100, XGLM] fix create_position_ids_from_inputs_embeds (#15751)
patil-suraj Feb 23, 2022
a3e607d
Supporting Merges.txt files than contain an endline. (#15782)
Narsil Feb 23, 2022
de73786
[CLIP] fix grad ckpt (#15789)
patil-suraj Feb 23, 2022
1b23979
[ViLT] Fix checkpoint url in config (#15790)
patil-suraj Feb 23, 2022
9e71d46
Enable `image-segmentation` on `AutoModelForSemanticSegmentation` (#1…
Narsil Feb 23, 2022
32f5de1
[doc] custom_models: mention security features of the Hub (#15768)
julien-c Feb 23, 2022
3f76bf5
Align documentation with code defaults (#15468)
lsb Feb 23, 2022
a1efc82
HTML dev docs (#15678)
coyotte508 Feb 23, 2022
86636f5
Fix indent in doc-builder CI (#15798)
coyotte508 Feb 23, 2022
fecb08c
🧼 NLP task guides (#15731)
stevhliu Feb 23, 2022
29c10a4
[Test refactor 1/5] Per-folder tests reorganization (#15725)
LysandreJik Feb 23, 2022
0400b22
[Test refactor 2/5] Tests fetcher (#15726)
LysandreJik Feb 23, 2022
d3ae2bd
[Test refactor 3/5] Notification service improvement (#15727)
LysandreJik Feb 23, 2022
4c737f0
[Test refactor 4/5] Improve the scheduled tests (#15728)
LysandreJik Feb 23, 2022
a0e3480
[Test refactor 5/5] Build docker images (#15729)
LysandreJik Feb 23, 2022
6336017
Fix build_documentation CI (#15803)
coyotte508 Feb 23, 2022
c475f3c
Scheduled tests should only run on a daily basis
LysandreJik Feb 23, 2022
309e87e
Docker images should only run on a daily basis
LysandreJik Feb 23, 2022
bb7949b
Fix model templates (#15806)
LysandreJik Feb 23, 2022
7f921bc
Fix add-new-model-like when old model checkpoint is not found (#15805)
sgugger Feb 24, 2022
d1fcc90
Fix from_pretrained with default base_model_prefix (#15814)
sgugger Feb 24, 2022
35ecf99
Revert changes in logit size for semantic segmentation models (#15722)
sgugger Feb 24, 2022
ca57b45
[Unispeech] Fix slow tests (#15818)
patrickvonplaten Feb 24, 2022
2f0f903
[Barthez Tokenizer] Fix saving (#15815)
patrickvonplaten Feb 24, 2022
cbf4391
[TFXLNet] Correct tf xlnet generate (#15822)
patrickvonplaten Feb 24, 2022
b7e292a
Fix the push run (#15807)
LysandreJik Feb 24, 2022
074645e
Fix semantic segmentation pipeline test (#15826)
sgugger Feb 25, 2022
7963578
Fix dummy_inputs() to dummy_inputs in symbolic_trace doc (#15776)
pbelevich Feb 25, 2022
7566734
Add model specific output classes to PoolFormer model docs (#15746)
tanaymeh Feb 25, 2022
ad0d7d1
Adding the option to return_timestamps on pure CTC ASR models. (#15792)
Narsil Feb 25, 2022
4818bf7
HFTracer.trace should use/return self.graph to be compatible with tor…
pbelevich Feb 25, 2022
8635407
Fix tf.concatenate + test past_key_values for TF models (#15774)
ydshieh Feb 25, 2022
bf1fe32
[examples/summarization and translation] fix readme (#15833)
patil-suraj Feb 25, 2022
fd5b05e
Add ONNX Runtime quantization for text classification notebook (#15817)
echarlaix Feb 25, 2022
0118c4f
Re-enable doctests for the quicktour (#15828)
sgugger Feb 25, 2022
0b5bf6a
Framework split model report (#15825)
LysandreJik Feb 25, 2022
84eaa6a
Add TFConvNextModel (#15750)
sayakpaul Feb 25, 2022
935a76d
[UniSpeechSat] correct unispeech sat (#15847)
patrickvonplaten Feb 28, 2022
e3342ed
Flax Speech-Encoder-Decoder Model (#15613)
sanchit-gandhi Feb 28, 2022
410e26c
Fix (deprecated) ONNX exporter to account for new tf2onnx API (#15856)
lewtun Feb 28, 2022
97f9b8a
Fixing the timestamps with chunking. (#15843)
Narsil Feb 28, 2022
ddbb485
[TF-PT-Tests] Fix PyTorch - TF tests for different GPU devices (#15846)
patrickvonplaten Feb 28, 2022
df5a409
Add Data2Vec (#15507)
edugp Mar 1, 2022
9863f7d
[Benchmark tools] Deprecate all (#15848)
patrickvonplaten Mar 1, 2022
54f0db4
Add PT + TF automatic builds (#15860)
LysandreJik Mar 1, 2022
3f2e636
Update TF LM examples (#15855)
gante Mar 1, 2022
e064f08
Add time stamps for wav2vec2 with lm (#15854)
patrickvonplaten Mar 1, 2022
c008afe
Add link to notebooks (#15791)
NielsRogge Mar 1, 2022
7ff9d45
Scatter should run on CUDA (#15872)
LysandreJik Mar 1, 2022
286fdc6
[vision] Add problem_type support (#15851)
NielsRogge Mar 1, 2022
afca0d5
use python 3.7 for flax self-push tests (#15865)
patil-suraj Mar 1, 2022
00eaffc
Bump up doc node version to 16 (#15874)
mishig25 Mar 1, 2022
2642692
No self-hosted runner for dev documentation (#15710)
LysandreJik Mar 1, 2022
6ccfa21
Inference for multilingual models (#15836)
stevhliu Mar 1, 2022
b842d72
fix deepspeed tests (#15881)
stas00 Mar 2, 2022
d1a2907
Remove stash for now (#15882)
LysandreJik Mar 2, 2022
4bfe75b
M2M100 support for ONNX export (#15193)
michaelbenayoun Mar 2, 2022
4004072
[Bart] Fix implementation note doc (#15879)
patrickvonplaten Mar 2, 2022
8a13349
Add TF generate sample tests with all logit processors (#15852)
gante Mar 2, 2022
6e57a56
Adding timestamps for CTC with LM in ASR pipeline. (#15863)
Narsil Mar 2, 2022
05c237e
Update TF QA example (#15870)
gante Mar 2, 2022
2eb7bb1
Updates in Trainer to support new features in SM Model Parallel libra…
rahul003 Mar 2, 2022
e535c38
Fix tiny typo (#15884)
rhjohnstone Mar 2, 2022
d83d22f
Maskformer (#15682)
FrancescoSaverioZuppichini Mar 2, 2022
8fd4731
Fix Bug in FlaxWav2Vec2 Slow Test (#15887)
sanchit-gandhi Mar 2, 2022
96ae92b
[SegFormer] Add deprecation warning (#15889)
NielsRogge Mar 2, 2022
baab5e7
TF generate refactor - Sample (#15793)
gante Mar 2, 2022
130b987
[XGLM] run sampling test on CPU to be deterministic (#15892)
patil-suraj Mar 2, 2022
89be34c
Fix SegformerForImageClassification (#15895)
NielsRogge Mar 2, 2022
3d22428
Update delete-dev-doc job to match build-dev-doc (#15891)
sgugger Mar 2, 2022
7e8ae01
Release: v4.17.0
sgugger Mar 2, 2022
8529a85
[Fix link in pipeline doc] (#15906)
patrickvonplaten Mar 3, 2022
198c335
[Doctests] Fix ignore bug and add more doc tests (#15911)
patrickvonplaten Mar 3, 2022
bfa6546
remove files from 'v4.17.0' before merge
calpt Mar 23, 2022
d3e3977
Merge stripped branch 'v4.17.0'
calpt Mar 23, 2022
ac17f31
Post-merge fixes
calpt Mar 23, 2022
8cbaeac
workflow_dispatch for tests
calpt Mar 23, 2022
1 change: 1 addition & 0 deletions .github/workflows/tests_torch.yml
@@ -17,6 +17,7 @@ on:
- 'templates/**'
- 'tests/**'
- 'utils/**'
workflow_dispatch:

jobs:
check_code_quality:
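
The `workflow_dispatch:` trigger added above lets the torch test workflow be started by hand instead of only on push/PR events. A minimal sketch of a manual run with the GitHub CLI (the branch name is illustrative):

```bash
# Manually dispatch the torch test workflow on a chosen branch
# (requires the workflow_dispatch trigger added above; the ref is illustrative).
gh workflow run tests_torch.yml --ref master
```
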
3 changes: 3 additions & 0 deletions .gitignore
@@ -167,3 +167,6 @@ scripts/git-strip-merge

# .lock
*.lock

# DS_Store (MacOS)
.DS_Store
72 changes: 55 additions & 17 deletions CONTRIBUTING.md
@@ -175,51 +175,82 @@ Follow these steps to start contributing:
5. Develop the features on your branch.

As you work on the features, you should make sure that the test suite
passes:
passes. You should run the tests impacted by your changes like this:

```bash
$ pytest tests/<TEST_TO_RUN>.py
```

You can also run the full suite with the following command, but it takes
a beefy machine to produce a result in a decent amount of time now that
Transformers has grown a lot. Here is the command for it:

```bash
$ make test
```

Note, that this command uses `-n auto` pytest flag, therefore, it will start as many parallel `pytest` processes as the number of your computer's CPU-cores, and if you have lots of those and a few GPUs and not a great amount of RAM, it's likely to overload your computer. Therefore, to run the test suite, you may want to consider using this command instead:
For more information about tests, check out the
[dedicated documentation](https://huggingface.co/docs/transformers/testing)

🤗 Transformers relies on `black` and `isort` to format its source code
consistently. After you make changes, apply automatic style corrections and code verifications
that can't be automated in one go with:

```bash
$ python -m pytest -n 3 --dist=loadfile -s -v ./tests/
$ make fixup
```

Adjust the value of `-n` to fit the load your hardware can support.
This target is also optimized to only work with files modified by the PR you're working on.

`adapter-transformers` relies on `black` and `isort` to format its source code
consistently. After you make changes, format them with:
If you prefer to run the checks one after the other, the following command apply the
style corrections:

```bash
$ make style
```

`adapter-transformers` also uses `flake8` to check for coding mistakes. Quality
`adapter-transformers` also uses `flake8` and a few custom scripts to check for coding mistakes. Quality
control runs in CI, however you can also run the same checks with:

```bash
$ make quality
```
You can do the automatic style corrections and code verifications that can't be automated in one go:

Finally we have a lot of scripts that check we didn't forget to update
some files when adding a new model, that you can run with

```bash
$ make fixup
$ make repo-consistency
```

This target is also optimized to only work with files modified by the PR you're working on.
To learn more about those checks and how to fix any issue with them, check out the
[documentation](https://huggingface.co/docs/transformers/pr_checks)

If you're modifying documents under `docs/source`, make sure to validate that
they can still be built. This check also runs in CI. To run a local check
make sure you have installed the documentation builder requirements, by
running `pip install .[tf,torch,docs]` once from the root of this repository
and then run:
make sure you have installed the documentation builder requirements. First you will need to clone the
repository containing our tools to build the documentation:

```bash
$ pip install git+https://github.com/huggingface/doc-builder
```

Then, make sure you have all the dependencies to be able to build the doc with:

```bash
$ make docs
$ pip install ".[docs]"
```

Finally run the following command from the root of the repository:

```bash
$ doc-builder build transformers docs/source/ --build_dir ~/tmp/test-build
```

This will build the documentation in the `~/tmp/test-build` folder where you can inspect the generated
Markdown files with your favorite editor. You won't be able to see the final rendering on the website
before your PR is merged, we are actively working on adding a tool for this.

Once you're happy with your changes, add changed files using `git add` and
make a commit with `git commit` to record your changes locally:

@@ -273,8 +304,15 @@ Follow these steps to start contributing:
- If you are adding a new tokenizer, write tests, and make sure
`RUN_SLOW=1 python -m pytest tests/test_tokenization_{your_model_name}.py` passes.
CircleCI does not run the slow tests, but github actions does every night!
6. All public methods must have informative docstrings that work nicely with sphinx. See `modeling_ctrl.py` for an
6. All public methods must have informative docstrings that work nicely with sphinx. See `modeling_bert.py` for an
example.
7. Due to the rapidly growing repository, it is important to make sure that no files that would significantly weigh down the repository are added. This includes images, videos and other non-text files. We prefer to leverage a hf.co hosted `dataset` like
the ones hosted on [`hf-internal-testing`](https://huggingface.co/hf-internal-testing) in which to place these files and reference
them by URL. We recommend putting them in the following dataset: [huggingface/documentation-images](https://huggingface.co/datasets/huggingface/documentation-images).
If an external contribution, feel free to add the images to your PR and ask a Hugging Face member to migrate your images
to this dataset.

See more about the checks run on a pull request in our [PR guide](pr_checks)

### Tests

@@ -326,7 +364,7 @@ $ python -m unittest discover -s examples -t examples -v

### Style guide

For documentation strings, `transformers` follows the [google style](https://google.github.io/styleguide/pyguide.html).
For documentation strings, 🤗 Transformers follows the [google style](https://google.github.io/styleguide/pyguide.html).
Check our [documentation writing guide](https://github.com/huggingface/transformers/tree/master/docs#writing-documentation---specification)
for more information.

@@ -350,7 +388,7 @@ You can now use `make` from any terminal (Powershell, cmd.exe, etc) 🎉

### Syncing forked master with upstream (HuggingFace) master

To avoid pinging the upstream repository which adds reference notes to each upstream PR and sends unnessary notifications to the developers involved in these PRs,
To avoid pinging the upstream repository which adds reference notes to each upstream PR and sends unnecessary notifications to the developers involved in these PRs,
when syncing the master branch of a forked repository, please, follow these steps:
1. When possible, avoid syncing with the upstream using a branch and PR on the forked repository. Instead merge directly into the forked master.
2. If a PR is absolutely necessary, use the following steps after checking out your branch:
16 changes: 6 additions & 10 deletions Makefile
@@ -31,25 +31,25 @@ deps_table_check_updated:

autogenerate_code: deps_table_update

# Check that source code meets quality standards
# Check that the repo is in a good state

# NOTE FOR adapter-transformers: The following check is skipped as not all copies implement adapters yet
# python utils/check_copies.py
# python utils/check_table.py
# python utils/check_dummies.py
# python utils/tests_fetcher.py --sanity_check
extra_quality_checks:
repo-consistency:
python utils/check_repo.py
python utils/check_inits.py
python utils/check_adapters.py

# this target runs checks on all files

quality:
black --check $(check_dirs)
isort --check-only $(check_dirs)
python utils/custom_init_isort.py --check_only
flake8 $(check_dirs)
${MAKE} extra_quality_checks
python utils/style_doc.py src/transformers docs/source --max_len 119 --check_only

# Format source code automatically and check is there are any problems left that need manual fixing

@@ -58,6 +58,7 @@ extra_style_checks:
python utils/style_doc.py src/transformers docs/source --max_len 119

# this target runs checks on all files and potentially modifies some of them

style:
black $(check_dirs)
isort $(check_dirs)
@@ -66,7 +67,7 @@ style:

# Super fast fix and check target that only works on relevant modified files since the branch was made

fixup: modified_only_fixup extra_style_checks autogenerate_code extra_quality_checks
fixup: modified_only_fixup extra_style_checks autogenerate_code repo-consistency

# Make marked copies of snippets of codes conform to the original

@@ -96,11 +97,6 @@ test-sagemaker: # install sagemaker dependencies in advance with pip install .[s
TEST_SAGEMAKER=True python -m pytest -n auto -s -v ./tests/sagemaker


# Check that docs can build

docs:
cd docs && make html SPHINXOPTS="-W -j 4"

# Release stuff

pre-release:
19 changes: 18 additions & 1 deletion tests/conftest.py → conftest.py
@@ -15,14 +15,15 @@
# tests directory-specific settings - this file is run automatically
# by pytest before any tests are run

import doctest
import sys
import warnings
from os.path import abspath, dirname, join


# allow having multiple repository checkouts and not needing to remember to rerun
# 'pip install -e .[dev]' when switching between checkouts and running tests.
git_repo_path = abspath(join(dirname(dirname(__file__)), "src"))
git_repo_path = abspath(join(dirname(__file__), "src"))
sys.path.insert(1, git_repo_path)

# silence FutureWarning warnings in tests since often we can't act on them until
@@ -59,3 +60,19 @@ def pytest_sessionfinish(session, exitstatus):
# If no tests are collected, pytest exists with code 5, which makes the CI fail.
if exitstatus == 5:
session.exitstatus = 0


# Doctest custom flag to ignore output.
IGNORE_RESULT = doctest.register_optionflag('IGNORE_RESULT')

OutputChecker = doctest.OutputChecker


class CustomOutputChecker(OutputChecker):
def check_output(self, want, got, optionflags):
if IGNORE_RESULT & optionflags:
return True
return OutputChecker.check_output(self, want, got, optionflags)


doctest.OutputChecker = CustomOutputChecker
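
With `conftest.py` moved to the repository root, pytest runs started from the root also load the custom `IGNORE_RESULT` doctest flag, which is used as a `# doctest: +IGNORE_RESULT` directive on example lines whose output should not be compared. A minimal sketch of running doctests this way (the module path is illustrative, not one touched by this PR):

```bash
# Run doctests for a single module from the repository root so the root conftest.py
# (and its IGNORE_RESULT option flag) is picked up; the module path is illustrative.
python -m pytest --doctest-modules src/transformers/models/bert/modeling_bert.py -sv
```
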
22 changes: 22 additions & 0 deletions docker/transformers-all-latest-gpu/Dockerfile
@@ -0,0 +1,22 @@
FROM nvidia/cuda:11.2.2-cudnn8-runtime-ubuntu20.04
LABEL maintainer="Hugging Face"

ARG DEBIAN_FRONTEND=noninteractive

RUN apt update
RUN apt install -y git libsndfile1-dev tesseract-ocr espeak-ng python3 python3-pip ffmpeg
RUN python3 -m pip install --no-cache-dir --upgrade pip

ARG REF=master
RUN git clone https://github.com/huggingface/transformers && cd transformers && git checkout $REF
RUN python3 -m pip install --no-cache-dir -e ./transformers[dev,onnxruntime]

RUN python3 -m pip install --no-cache-dir -U torch tensorflow
RUN python3 -m pip uninstall -y flax jax
RUN python3 -m pip install --no-cache-dir torch-scatter -f https://data.pyg.org/whl/torch-$(python3 -c "from torch import version; print(version.__version__.split('+')[0])")+cu102.html
RUN python3 -m pip install --no-cache-dir git+https://github.com/facebookresearch/detectron2.git pytesseract https://github.com/kpu/kenlm/archive/master.zip
RUN python3 -m pip install -U "itsdangerous<2.1.0"

# When installing in editable mode, `transformers` is not recognized as a package.
# this line must be added in order for python to be aware of transformers.
RUN cd transformers && python3 setup.py develop
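
This image clones `transformers` at build time, so it can be built from any context; the `REF` build argument selects which git ref is checked out. A minimal sketch of a local build (the tag and ref are illustrative):

```bash
# Build the combined PyTorch + TensorFlow CI image; tag and REF value are illustrative.
docker build --build-arg REF=v4.17.0 -t transformers-all-latest-gpu docker/transformers-all-latest-gpu
```
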
16 changes: 16 additions & 0 deletions docker/transformers-doc-builder/Dockerfile
@@ -0,0 +1,16 @@
FROM python:3.8
LABEL maintainer="Hugging Face"

RUN apt update
RUN git clone https://github.com/huggingface/transformers

RUN python3 -m pip install --no-cache-dir --upgrade pip && python3 -m pip install --no-cache-dir git+https://github.com/huggingface/doc-builder ./transformers[dev,deepspeed]
RUN apt-get -y update && apt-get install -y libsndfile1-dev && apt install -y tesseract-ocr

RUN python3 -m pip install --no-cache-dir torch-scatter -f https://data.pyg.org/whl/torch-$(python -c "from torch import version; print(version.__version__.split('+')[0])")+cpu.html
RUN python3 -m pip install --no-cache-dir torchvision git+https://github.com/facebookresearch/detectron2.git pytesseract https://github.com/kpu/kenlm/archive/master.zip
RUN python3 -m pip install --no-cache-dir pytorch-quantization --extra-index-url https://pypi.ngc.nvidia.com
RUN python3 -m pip install -U "itsdangerous<2.1.0"

RUN doc-builder build transformers transformers/docs/source --build_dir doc-build-dev --notebook_dir notebooks/transformers_doc --clean --version pr_$PR_NUMBER
RUN rm -rf doc-build-dev
21 changes: 21 additions & 0 deletions docker/transformers-pytorch-deepspeed-latest-gpu/Dockerfile
@@ -0,0 +1,21 @@
FROM nvcr.io/nvidia/pytorch:21.03-py3
LABEL maintainer="Hugging Face"

ARG DEBIAN_FRONTEND=noninteractive

RUN apt -y update
RUN apt install -y libaio-dev
RUN python3 -m pip install --no-cache-dir --upgrade pip

ARG REF=master
RUN git clone https://github.com/huggingface/transformers && cd transformers && git checkout $REF
RUN python3 -m pip install --no-cache-dir -e ./transformers[testing,deepspeed]

RUN git clone https://github.com/microsoft/DeepSpeed && cd DeepSpeed && rm -rf build && \
DS_BUILD_CPU_ADAM=1 DS_BUILD_AIO=1 DS_BUILD_UTILS=1 python3 -m pip install -e . --global-option="build_ext" --global-option="-j8" --no-cache -v --disable-pip-version-check 2>&1

# When installing in editable mode, `transformers` is not recognized as a package.
# this line must be added in order for python to be aware of transformers.
RUN cd transformers && python3 setup.py develop

RUN python3 -c "from deepspeed.launcher.runner import main"
44 changes: 20 additions & 24 deletions docker/transformers-pytorch-gpu/Dockerfile
@@ -1,30 +1,26 @@
FROM nvidia/cuda:10.2-cudnn7-devel-ubuntu18.04
FROM nvidia/cuda:11.2.2-cudnn8-runtime-ubuntu20.04
LABEL maintainer="Hugging Face"
LABEL repository="transformers"

RUN apt update && \
apt install -y bash \
build-essential \
git \
curl \
ca-certificates \
python3 \
python3-pip && \
rm -rf /var/lib/apt/lists
ARG DEBIAN_FRONTEND=noninteractive

RUN python3 -m pip install --no-cache-dir --upgrade pip && \
python3 -m pip install --no-cache-dir \
mkl \
torch
RUN apt update
RUN apt install -y git libsndfile1-dev tesseract-ocr espeak-ng python3 python3-pip ffmpeg
RUN python3 -m pip install --no-cache-dir --upgrade pip

RUN git clone https://github.com/NVIDIA/apex
RUN cd apex && \
python3 setup.py install && \
pip install -v --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" ./
ARG REF=master
RUN git clone https://github.com/huggingface/transformers && cd transformers && git checkout $REF
RUN python3 -m pip install --no-cache-dir -e ./transformers[dev-torch,testing]

WORKDIR /workspace
COPY . transformers/
RUN cd transformers/ && \
python3 -m pip install --no-cache-dir .
# If set to nothing, will install the latest version
ARG PYTORCH=''

CMD ["/bin/bash"]
RUN [ ${#PYTORCH} -gt 0 ] && VERSION='torch=='$PYTORCH'.*' || VERSION='torch'; python3 -m pip install --no-cache-dir -U $VERSION
RUN python3 -m pip uninstall -y tensorflow flax

RUN python3 -m pip install --no-cache-dir torch-scatter -f https://data.pyg.org/whl/torch-$(python3 -c "from torch import version; print(version.__version__.split('+')[0])")+cu102.html
RUN python3 -m pip install --no-cache-dir git+https://github.com/facebookresearch/detectron2.git pytesseract https://github.com/kpu/kenlm/archive/master.zip
RUN python3 -m pip install -U "itsdangerous<2.1.0"

# When installing in editable mode, `transformers` is not recognized as a package.
# this line must be added in order for python to be aware of transformers.
RUN cd transformers && python3 setup.py develop
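
The rewritten PyTorch-only image accepts an optional `PYTORCH` build argument: when set, `torch==<PYTORCH>.*` is installed, otherwise the latest release. A minimal sketch of pinning the version at build time (version, tag, and ref are illustrative):

```bash
# Build the PyTorch-only CI image with torch pinned to a 1.11.x release;
# version, tag, and ref are illustrative.
docker build --build-arg REF=master --build-arg PYTORCH=1.11 -t transformers-pytorch-gpu docker/transformers-pytorch-gpu
```
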