Update deprecated/unused dependencies 🧹 🧹 #36419

Open
wants to merge 20 commits into
base: main
from
48 changes: 10 additions & 38 deletions setup.py
@@ -69,53 +69,31 @@

import os
import re
import shutil
from pathlib import Path

from setuptools import Command, find_packages, setup


# Remove stale transformers.egg-info directory to avoid https://github.com/pypa/pip/issues/5466
stale_egg_info = Path(__file__).parent / "transformers.egg-info"
if stale_egg_info.exists():
print(
(
"Warning: {} exists.\n\n"
"If you recently updated transformers to 3.0 or later, this is expected,\n"
"but it may prevent transformers from installing in editable mode.\n\n"
"This directory is automatically generated by Python's packaging tools.\n"
"I will remove it now.\n\n"
"See https://github.com/pypa/pip/issues/5466 for details.\n"
).format(stale_egg_info)
)
shutil.rmtree(stale_egg_info)
Comment on lines -78 to -91
Member Author

The root issue was fixed in pip 20.1, released in early 2020. Our minimum Python version is 3.9, released in late 2020, which ships with a pip version more recent than 20.1.
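
To make the version argument concrete, here is a minimal sketch of the comparison, assuming plain dotted version strings. `parse_version` and `pip_has_egg_info_fix` are hypothetical helpers written for illustration; they are not part of pip or transformers, and a pre-release such as `20.1b1` is treated by its numeric prefix only.

```python
import re


def parse_version(version: str) -> tuple:
    """Turn a dotted version such as "20.1" into a comparable tuple of ints.

    Hypothetical helper: it keeps only the leading digits of each component
    and stops at the first component that has none.
    """
    numbers = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        numbers.append(int(match.group()))
    return tuple(numbers)


def pip_has_egg_info_fix(pip_version: str) -> bool:
    """Return True if this pip version is at least 20.1 (the release cited above)."""
    return parse_version(pip_version) >= (20, 1)


print(pip_has_egg_info_fix("19.3.1"))
print(pip_has_egg_info_fix("23.0.1"))
```

On a real machine the installed version could be read with `importlib.metadata.version("pip")` and passed to the helper.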



# IMPORTANT:
# 1. all dependencies should be listed here with their version requirements if any
# 2. once modified, run: `make deps_table_update` to update src/transformers/dependency_versions_table.py
_deps = [
"Pillow>=10.0.1,<=15.0",
"accelerate>=0.26.0",
"av",
"beautifulsoup4",
"blobfile",
"codecarbon>=2.8.1",
"cookiecutter==1.7.3",
"dataclasses",
Member Author

`dataclasses` has been part of the standard library since Python 3.7: https://peps.python.org/pep-0557/
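
As a quick illustration that no extra install is needed, the sketch below uses only the standard-library module. The `Requirement` class is a hypothetical example, not something defined in transformers.

```python
from dataclasses import dataclass, field


@dataclass
class Requirement:
    """Hypothetical record for a dependency entry, for illustration only."""

    name: str
    specifier: str = ""
    extras: list = field(default_factory=list)


req = Requirement("pytest", ">=7.2.0")
print(req)
```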

"datasets!=2.5.0",
"deepspeed>=0.9.3",
"diffusers",
"dill<0.3.5",
Member Author
@gante, Feb 26, 2025

dill was pinned to fix example tests (#17368) -- removing to test whether it is still needed

"evaluate>=0.2.0",
"faiss-cpu",
"fastapi",
"filelock",
"flax>=0.4.1,<=0.7.0",
"fsspec<2023.10.0",
Member Author
@gante, Feb 26, 2025

fsspec was pinned to fix CI (#27010) -- removing to test whether it is still needed

"ftfy",
"fugashi>=1.0",
"GitPython<3.1.19",
Member Author

The pin on GitPython was added to fix CI (#12858) -- removing to test whether it is still needed

"GitPython",
"hf-doc-builder>=0.3.0",
"huggingface-hub>=0.26.0,<1.0",
"importlib_metadata",
@@ -130,12 +108,10 @@
"keras>2.9,<2.16",
"keras-nlp>=0.3.1,<0.14.0", # keras-nlp 0.14 doesn't support keras 2, see pin on keras.
"librosa",
"natten>=0.14.6,<0.15.0",
Member Author

The pin on natten was added to fix CI (#28432) -- removing to test whether it is still needed

"natten>=0.14.6",
"nltk<=3.8.1",
"num2words",
"numpy>=1.17",
"onnxconverter-common",
Member Author

Neither `onnxconverter-common` (library name) nor `onnxconverter_common` (corresponding import) is present in our library
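
One way to check a claim like this is to scan source files for the import name. The sketch below is a simplified illustration using the standard `ast` module; `imported_modules` and `is_dependency_used` are hypothetical helpers, and the approach misses dynamic imports (for example via `importlib.import_module`) and plain string references.

```python
import ast


def imported_modules(source: str) -> set:
    """Collect the top-level module names imported by a piece of Python source."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            found.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # level == 0 means an absolute import, not a relative one
            found.add(node.module.split(".")[0])
    return found


def is_dependency_used(import_name: str, sources) -> bool:
    """Return True if any of the given sources imports `import_name`."""
    return any(import_name in imported_modules(src) for src in sources)


# Illustrative in-memory sources; a real check would read the repository's .py files.
example_sources = ["import onnxruntime\nfrom tf2onnx import convert\n"]
print(is_dependency_used("onnxconverter_common", example_sources))
print(is_dependency_used("tf2onnx", example_sources))
```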

"onnxruntime-tools>=1.4.2",
Member Author

same as above

"onnxruntime>=1.4.0",
"opencv-python",
"optimum-benchmark>=0.3.0",
@@ -144,11 +120,12 @@
"packaging>=20.0",
"parameterized",
"phonemizer",
"Pillow>=10.0.1,<=15.0",
"protobuf",
"psutil",
"pyyaml>=5.1",
"pydantic",
"pytest>=7.2.0,<8.0.0",
Member Author

The pin on pytest was added to fix CI (#28758) -- removing to test whether it is still needed

Collaborator

This is for doctest ... and I kinda think we have had issues running doctests on CircleCI for some time.
But I think it's time for us to move doctest off CircleCI anyway, so OK for me here.

"pytest>=7.2.0",
"pytest-asyncio",
"pytest-timeout",
"pytest-xdist",
@@ -267,15 +244,9 @@ def run(self):
extras["ja"] = deps_list("fugashi", "ipadic", "unidic_lite", "unidic", "sudachipy", "sudachidict_core", "rhoknp")
extras["sklearn"] = deps_list("scikit-learn")

extras["tf"] = deps_list("tensorflow", "onnxconverter-common", "tf2onnx", "tensorflow-text", "keras-nlp")
extras["tf"] = deps_list("tensorflow", "tf2onnx", "tensorflow-text", "keras-nlp")
extras["tf-cpu"] = deps_list(
"keras",
"tensorflow-cpu",
"onnxconverter-common",
"tf2onnx",
"tensorflow-text",
"keras-nlp",
"tensorflow-probability",
"keras", "tensorflow-cpu", "tf2onnx", "tensorflow-text", "keras-nlp", "tensorflow-probability"
)

extras["torch"] = deps_list("torch", "accelerate")
@@ -290,8 +261,8 @@ def run(self):

extras["tokenizers"] = deps_list("tokenizers")
extras["ftfy"] = deps_list("ftfy")
extras["onnxruntime"] = deps_list("onnxruntime", "onnxruntime-tools")
extras["onnx"] = deps_list("onnxconverter-common", "tf2onnx") + extras["onnxruntime"]
extras["onnxruntime"] = deps_list("onnxruntime")
extras["onnx"] = deps_list("tf2onnx") + extras["onnxruntime"]
extras["modelcreation"] = deps_list("cookiecutter")

extras["sagemaker"] = deps_list("sagemaker")
@@ -328,7 +299,6 @@ def run(self):
"parameterized",
"psutil",
"datasets",
"dill",
"evaluate",
"pytest-timeout",
"ruff",
@@ -345,6 +315,7 @@ def run(self):
)
+ extras["retrieval"]
+ extras["modelcreation"]
+ extras["tiktoken"]
Member Author
@gante, Feb 26, 2025

`tiktoken` is needed to run `py.test tests/models/llama/test_tokenization_llama.py::TikTokenIntegrationTests`, so it makes sense for it to be in the `testing` extra.

Some of our CI images rely on installing `testing` (e.g. see the scheduled daily torch tests), and as a result they were not running those tests.
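
The way one extra pulls in another can be sketched as below. This is a simplified illustration under stated assumptions: the `_deps` subset is made up for the example, `build_deps_table` is a hypothetical helper, and `deps_list` is assumed to do nothing more than look names up in a name-to-requirement table. The real definitions live in setup.py and may differ.

```python
import re

# Illustrative subset of requirement strings (assumption, not the real list).
_deps = ["pytest>=7.2.0", "tiktoken", "datasets!=2.5.0", "GitPython"]


def build_deps_table(requirements):
    """Map each package name to its full requirement string."""
    table = {}
    for requirement in requirements:
        # The name is everything before the first version-specifier character.
        name = re.match(r"[^!=<>~ ]+", requirement).group()
        table[name] = requirement
    return table


deps = build_deps_table(_deps)


def deps_list(*names):
    """Return the requirement strings for the given package names."""
    return [deps[name] for name in names]


extras = {}
extras["tiktoken"] = deps_list("tiktoken")
# Adding another extra's list makes `pip install <package>[testing]` install it too.
extras["testing"] = deps_list("pytest", "datasets") + extras["tiktoken"]
print(extras["testing"])
```

Because extras are plain lists, composing them with `+` is enough for an image that installs only `testing` to also receive `tiktoken`.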

)

extras["deepspeed-testing"] = extras["deepspeed"] + extras["testing"] + extras["optuna"] + extras["sentencepiece"]
@@ -418,6 +389,7 @@ def run(self):
"tqdm",
)

# TODO: when we remove `agents` from `transformers`, review entries in `_deps`.
extras["agents"] = deps_list(
"diffusers", "accelerate", "datasets", "torch", "sentencepiece", "opencv-python", "Pillow"
)
13 changes: 4 additions & 9 deletions src/transformers/dependency_versions_table.py
@@ -2,27 +2,23 @@
# 1. modify the `_deps` dict in setup.py
# 2. run `make deps_table_update``
deps = {
"Pillow": "Pillow>=10.0.1,<=15.0",
"accelerate": "accelerate>=0.26.0",
"av": "av",
"beautifulsoup4": "beautifulsoup4",
"blobfile": "blobfile",
"codecarbon": "codecarbon>=2.8.1",
"cookiecutter": "cookiecutter==1.7.3",
"dataclasses": "dataclasses",
"datasets": "datasets!=2.5.0",
"deepspeed": "deepspeed>=0.9.3",
"diffusers": "diffusers",
"dill": "dill<0.3.5",
"evaluate": "evaluate>=0.2.0",
"faiss-cpu": "faiss-cpu",
"fastapi": "fastapi",
"filelock": "filelock",
"flax": "flax>=0.4.1,<=0.7.0",
"fsspec": "fsspec<2023.10.0",
"ftfy": "ftfy",
"fugashi": "fugashi>=1.0",
"GitPython": "GitPython<3.1.19",
"GitPython": "GitPython",
"hf-doc-builder": "hf-doc-builder>=0.3.0",
"huggingface-hub": "huggingface-hub>=0.26.0,<1.0",
"importlib_metadata": "importlib_metadata",
@@ -36,12 +32,10 @@
"keras": "keras>2.9,<2.16",
"keras-nlp": "keras-nlp>=0.3.1,<0.14.0",
"librosa": "librosa",
"natten": "natten>=0.14.6,<0.15.0",
"natten": "natten>=0.14.6",
"nltk": "nltk<=3.8.1",
"num2words": "num2words",
"numpy": "numpy>=1.17",
"onnxconverter-common": "onnxconverter-common",
"onnxruntime-tools": "onnxruntime-tools>=1.4.2",
"onnxruntime": "onnxruntime>=1.4.0",
"opencv-python": "opencv-python",
"optimum-benchmark": "optimum-benchmark>=0.3.0",
@@ -50,11 +44,12 @@
"packaging": "packaging>=20.0",
"parameterized": "parameterized",
"phonemizer": "phonemizer",
"Pillow": "Pillow>=10.0.1,<=15.0",
"protobuf": "protobuf",
"psutil": "psutil",
"pyyaml": "pyyaml>=5.1",
"pydantic": "pydantic",
"pytest": "pytest>=7.2.0,<8.0.0",
"pytest": "pytest>=7.2.0",
"pytest-asyncio": "pytest-asyncio",
"pytest-timeout": "pytest-timeout",
"pytest-xdist": "pytest-xdist",
4 changes: 2 additions & 2 deletions tests/models/llama/test_tokenization_llama.py
@@ -839,13 +839,13 @@ def test_special_tokens_strip(self):
self.assertEqual(tokens, ["▁No", "<s>", "▁He"]) # spaces are eaten by rstrip / lstrip


@require_tiktoken
@require_read_token
class TikTokenIntegrationTests(unittest.TestCase):
"""
A class that regroups important test to make sure that we properly handle the special tokens.
"""

@require_tiktoken
@require_read_token
Comment on lines +847 to +848
Member Author

`@require_read_token` behaves slightly differently from the other `@require_xxx` decorators: it doesn't add a `@unittest.skipUnless`. On my machine these tests were not being run, and I can't find them on CI either (but perhaps I'm not looking in the right place).
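
For contrast, the `skipUnless`-based pattern that the comment refers to can be sketched as follows. `require_feature` is a hypothetical decorator written for this example; it is not the transformers implementation of `@require_tiktoken` or `@require_read_token`.

```python
import io
import unittest


def require_feature(available: bool):
    """Hypothetical `@require_xxx`-style decorator built on unittest.skipUnless."""
    return unittest.skipUnless(available, "test requires an unavailable feature")


class DemoTests(unittest.TestCase):
    @require_feature(False)
    def test_needs_feature(self):
        self.fail("never runs: the decorator marks this test as skipped")

    def test_always_runs(self):
        self.assertTrue(True)


def run_demo():
    """Run DemoTests quietly and report (number skipped, overall success)."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(DemoTests)
    result = unittest.TextTestRunner(stream=io.StringIO()).run(suite)
    return len(result.skipped), result.wasSuccessful()


print(run_demo())
```

A decorator built this way reports the test as skipped when the condition is false, which makes a missing dependency visible in the test report.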

def test_tiktoken_llama(self):
model_path = "hf-internal-testing/llama-3-8b-internal"
subfolder = "original"