Sklearn and Numpy Dependencies when installing REL from source (Option 3) #159

Open
leventidis opened this issue May 19, 2023 · 2 comments

Comments

@leventidis

I am trying to run a simple local server by installing REL from source (Option 3), following the tutorial at https://rel.readthedocs.io/en/latest/tutorials/e2e_entity_linking/

I have also downloaded the generic and wiki_2014 corpora that are linked on GitHub.

However, I am running into package versioning issues. For instance, just running EntityDisambiguation() gives the following:

model = EntityDisambiguation(base_url, wiki_version, config)
File "/home/aristotelis/Documents/REL/env/lib/python3.8/site-packages/REL/entity_disambiguation.py", line 67, in __init__
    self.model_lr = pkl.load(f)
ModuleNotFoundError: No module named 'sklearn.linear_model.logistic'

I tried downgrading scikit-learn (e.g., to 0.22.0), but then I get numpy float deprecation errors. The requirements.txt doesn't specify versions for these packages.

Which versions of scikit-learn and numpy should be used to correctly load the provided models? Are there specific versions of any other packages needed to successfully run the example in https://rel.readthedocs.io/en/latest/tutorials/e2e_entity_linking/?

@KDercksen
Contributor

Hi there,

Our API server has the following versions installed:

scikit-learn==0.22.2
numpy==1.19.1
torch==1.7.0
flair==0.11.3

That should be everything, let us know if you run into further problems!
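A quick way to confirm that an environment matches the versions listed above is to compare them against what is actually installed. The following is only a minimal sketch: the helper names `find_mismatches` and `installed_versions` are made up for illustration and are not part of REL, and it assumes Python 3.8+ so that `importlib.metadata` is available.

```python
from importlib import metadata

# Versions reported in this thread for the maintainers' API server.
PINNED = {
    "scikit-learn": "0.22.2",
    "numpy": "1.19.1",
    "torch": "1.7.0",
    "flair": "0.11.3",
}


def find_mismatches(pinned, installed):
    """Return {package: (expected, found)} for every package whose version differs."""
    return {
        name: (expected, installed.get(name))
        for name, expected in pinned.items()
        if installed.get(name) != expected
    }


def installed_versions(names):
    """Look up installed versions; a package that is not installed maps to None."""
    found = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    mismatches = find_mismatches(PINNED, installed_versions(PINNED))
    for name, (expected, found) in mismatches.items():
        print(f"{name}: expected {expected}, found {found}")
```

The comparison is an exact string match, so a build with a local version suffix (some torch wheels report one) would be flagged as a mismatch even though the base version is the same.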

@leventidis
Author

Thank you for the help! Unfortunately, I am still not able to run the server locally.

I deleted my environment and started a fresh one. I first installed REL using pip install git+https://github.com/informagi/REL

I noticed that this installs different versions from the ones you specified (specifically scikit-learn-1.2.2, numpy-1.24.3, torch-2.0.1, flair-0.12.2).

So I manually uninstalled those 4 packages and re-ran pip install for them with the specified versions.

Running pip freeze on my environment currently lists the following packages:

accelerate==0.19.0
aiohttp==3.8.4
aiosignal==1.3.1
anyascii==0.3.2
anyio==3.6.2
async-timeout==4.0.2
attrs==23.1.0
beautifulsoup4==4.12.2
blis==0.7.9
boto3==1.26.137
botocore==1.29.137
bpemb==0.3.4
catalogue==2.0.8
certifi==2023.5.7
charset-normalizer==3.1.0
click==8.1.3
cloudpickle==2.2.1
cmake==3.26.3
colorama==0.4.6
confection==0.0.4
conllu==4.5.2
contourpy==1.0.7
cycler==0.11.0
cymem==2.0.7
dataclasses==0.6
datasets==2.12.0
Deprecated==1.2.13
dill==0.3.6
fastapi==0.95.2
filelock==3.12.0
flair==0.11.3
fonttools==4.39.4
frozenlist==1.3.3
fsspec==2023.5.0
ftfy==6.1.1
future==0.18.3
gdown==4.4.0
gensim==4.3.1
h11==0.14.0
huggingface-hub==0.14.1
hyperopt==0.2.7
idna==3.4
importlib-metadata==3.10.1
importlib-resources==5.12.0
Janome==0.4.2
Jinja2==3.1.2
jmespath==1.0.1
joblib==1.2.0
kiwisolver==1.4.4
konoha==4.6.5
langcodes==3.3.0
langdetect==1.0.9
lit==16.0.5
lxml==4.9.2
MarkupSafe==2.1.2
matplotlib==3.7.1
more-itertools==9.1.0
mpld3==0.3
mpmath==1.3.0
multidict==6.0.4
multiprocess==0.70.14
murmurhash==1.0.9
networkx==3.1
nltk==3.8.1
numpy==1.19.1
nvidia-cublas-cu11==11.10.3.66
nvidia-cuda-cupti-cu11==11.7.101
nvidia-cuda-nvrtc-cu11==11.7.99
nvidia-cuda-runtime-cu11==11.7.99
nvidia-cudnn-cu11==8.5.0.96
nvidia-cufft-cu11==10.9.0.58
nvidia-curand-cu11==10.2.10.91
nvidia-cusolver-cu11==11.4.0.1
nvidia-cusparse-cu11==11.7.4.91
nvidia-nccl-cu11==2.14.3
nvidia-nvtx-cu11==11.7.91
overrides==3.1.0
packaging==23.1
pandas==2.0.1
pathy==0.10.1
Pillow==9.5.0
pptree==3.1
preshed==3.0.8
protobuf==3.20.2
psutil==5.9.5
py4j==0.10.9.7
pyarrow==12.0.0
pydantic==1.10.7
pyparsing==3.0.9
PySocks==1.7.1
python-dateutil==2.8.2
pytorch_revgrad==0.2.0
pytz==2023.3
PyYAML==6.0
radboud-el @ git+https://github.com/informagi/REL@a61bfc02d7aa713b470f0ed3f83af1c6c72eef8c
regex==2023.5.5
requests==2.30.0
responses==0.18.0
s3transfer==0.6.1
scikit-learn==0.22.2
scipy==1.10.1
segtok==1.5.11
sentencepiece==0.1.95
six==1.16.0
smart-open==6.3.0
sniffio==1.3.0
soupsieve==2.4.1
spacy==3.5.3
spacy-legacy==3.0.12
spacy-loggers==1.0.4
sqlitedict==2.1.0
srsly==2.4.6
starlette==0.27.0
sympy==1.12
tabulate==0.9.0
thinc==8.1.10
threadpoolctl==3.1.0
tokenizers==0.13.3
torch==1.7.0
tqdm==4.65.0
transformer-smaller-training-vocab==0.2.3
transformers==4.29.2
triton==2.0.0
typer==0.7.0
typing_extensions==4.5.0
tzdata==2023.3
urllib3==1.26.15
uvicorn==0.22.0
wasabi==1.1.1
wcwidth==0.2.6
Wikipedia-API==0.5.8
wrapt==1.15.0
xxhash==3.2.0
yarl==1.9.2
zipp==3.15.0

When I try to run the server, I now get the following stack trace:

Traceback (most recent call last):
  File "test.py", line 3, in <module>
    from REL.entity_disambiguation import EntityDisambiguation
  File "/home/aristotelis/Documents/REL/env/lib/python3.8/site-packages/REL/entity_disambiguation.py", line 16, in <module>
    from sklearn.linear_model import LogisticRegression
  File "/home/aristotelis/Documents/REL/env/lib/python3.8/site-packages/sklearn/linear_model/__init__.py", line 12, in <module>
    from ._least_angle import (Lars, LassoLars, lars_path, lars_path_gram, LarsCV,
  File "/home/aristotelis/Documents/REL/env/lib/python3.8/site-packages/sklearn/linear_model/_least_angle.py", line 30, in <module>
    method='lar', copy_X=True, eps=np.finfo(np.float).eps,
  File "/home/aristotelis/Documents/REL/env/lib/python3.8/site-packages/numpy/__init__.py", line 305, in __getattr__
    raise AttributeError(__former_attrs__[attr])
AttributeError: module 'numpy' has no attribute 'float'.
`np.float` was a deprecated alias for the builtin `float`. To avoid this error in existing code, use `float` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.float64` here.
The aliases was originally deprecated in NumPy 1.20; for more details and guidance see the original release note at:
    https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
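One thing that may be worth checking: as far as I know, the `np.float` alias was only removed in NumPy 1.24, and older releases such as 1.19.1 still provide it. The error above would therefore suggest that the interpreter running test.py is importing a newer NumPy than the 1.19.1 shown in the pip freeze output. The sketch below prints which interpreter is in use and which numpy and scikit-learn installations it resolves to, without importing them (importing sklearn is what fails here). The helper names `describe` and `np_float_alias_removed` are hypothetical, not part of REL, NumPy, or scikit-learn.

```python
import importlib.util
import sys
from importlib import metadata


def describe(package, module):
    """Report the installed version and on-disk location without importing the module."""
    try:
        version = metadata.version(package)
    except metadata.PackageNotFoundError:
        version = None
    spec = importlib.util.find_spec(module)
    return {"version": version, "origin": spec.origin if spec else None}


def np_float_alias_removed(version):
    """True for NumPy 1.24 and later, where the deprecated np.float alias is gone."""
    major, minor = (int(part) for part in version.split(".")[:2])
    return (major, minor) >= (1, 24)


if __name__ == "__main__":
    print("interpreter:", sys.executable)
    for package, module in [("numpy", "numpy"), ("scikit-learn", "sklearn")]:
        print(package, describe(package, module))
```

If the reported interpreter or the numpy location is not inside the env directory shown in the traceback, or the version is not 1.19.1, then the script is not running against the environment that pip freeze described.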

On another note, I noticed that server.py contains no make_handler() function, even though the tutorial imports it with from REL.server import make_handler. I found a make_handler() function at

def make_handler(base_url, wiki_version, models, tagger_ner, argss, logger):
and used that in server.py, but I am not sure whether that is correct or whether the tutorial for setting up the server at https://rel.readthedocs.io/en/latest/tutorials/e2e_entity_linking/ is outdated.

Thanks!
