Spacy 2.1.4 fails to initialize or load fasttext model using python2.7.5 #3734
Comments
Considering that 2.7.5 dates from May 2013, I am in favour of dropping support. On a related note, NLTK seems to be pushing to drop 2.7 support as a whole (nltk/nltk#2296). If this would help development, it may be a good idea for spaCy as well.
@BramVanroy Actually, Python 2.7 is going to retire next year (you can see how much time is left for it here: https://pythonclock.org/). Many data science and machine learning libraries will also drop it, including tensorflow, scikit-learn, numpy, pandas and many others (see https://python3statement.org/). However, it will take more time for production servers to actually drop Python 2.7, especially since some of them, like CentOS, use it within the yum package manager.
Oh wow, I didn't know the deprecation was that widespread. Thanks for the link, nice to see that so many packages are joining in.
Does it only affect Python 2.7.5? We test a more recent Python 2.7 in our CI, and it works. We'll drop Python 2.7 eventually, sure. We've already dropped Python 2.7 on Windows. Tentatively, I would suggest December 31, 2020 as the date to start developing versions which only work on Python 3.
Yes. From the discussion on the Python boards it seems that the fix was pushed between 2.7.5 and 2.7.6. 2.7.6 should be safe, so it doesn't seem bad to drop support for <= 2.7.5 specifically. The bug report also mentions Python 3.3 and 3.4, but I can't find out which sub-versions they are talking about.
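If support were dropped only for the affected 2.7 releases, the check could look roughly like the sketch below. The helper name `is_affected_py27` and the constant `FIRST_SAFE_PY27` are hypothetical and not part of spaCy. The 2.7.6 cut-off is taken from the comment above, and the sketch does not cover the Python 3.3/3.4 releases also mentioned there.

```python
import sys

# First 2.7 release that, per the discussion above, should contain the fix.
# Assumption taken from the thread, not verified here.
FIRST_SAFE_PY27 = (2, 7, 6)


def is_affected_py27(version_info=None):
    """Return True for Python 2.7 releases older than 2.7.6.

    Hypothetical helper for illustration; defaults to the running interpreter.
    """
    if version_info is None:
        version_info = sys.version_info
    major, minor, micro = tuple(version_info)[:3]
    return (major, minor) == (2, 7) and (major, minor, micro) < FIRST_SAFE_PY27


if __name__ == "__main__":
    if is_affected_py27():
        print("This interpreter is older than 2.7.6 and may hit the regex bug.")
    else:
        print("This interpreter is not in the affected 2.7 range.")
```

A package could call such a check at import time and raise a clear error instead of failing later inside regex compilation.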
@BramVanroy @honnibal I did more debugging, and the part of the regex that causes the problem is:
# host name
r"(?:(?:[a-z0-9\-]*)?[a-z0-9]+)"
Although this is considered a bug in Python 2.7.5, I think there is a small error in the code as well. Because the regex syntax
to match a string like server-local-host. PS: the above tweak might not be the optimal alternative; it is only meant to show that the regex can be fixed without dropping support for Python 2.7.5.
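To illustrate the point about the host-name fragment, here is a small sketch. It is not the tweak from the comment above, nor the change that went into spaCy. It only relies on the general fact that an optional group around a starred expression, `(?:X*)?`, accepts the same strings as `X*` alone, so the nested quantifier can be removed without changing what is matched. The name `SIMPLIFIED` and the helper `matches_fully` are assumptions made for this example.

```python
import re

# Host-name fragment quoted in the comment above.
ORIGINAL = r"(?:(?:[a-z0-9\-]*)?[a-z0-9]+)"

# Hypothetical simplification: "(?:X*)?" matches the same strings as "X*",
# so the optional wrapper around the starred character class is dropped.
SIMPLIFIED = r"(?:[a-z0-9\-]*[a-z0-9]+)"


def matches_fully(pattern, text):
    """Return True if `pattern` matches the whole of `text`."""
    return re.match(pattern + r"\Z", text) is not None


SAMPLES = ["server-local-host", "localhost", "a", "server-", "-", ""]

if __name__ == "__main__":
    for sample in SAMPLES:
        # Both columns should agree for every sample.
        print(repr(sample), matches_fully(ORIGINAL, sample), matches_fully(SIMPLIFIED, sample))
```

Both patterns accept `server-local-host` and reject strings that end in a hyphen. Whether the simplified form also avoids the Python 2.7.5 error has not been checked here.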
@mohamed-ali I think this should be fixed in v2.2. Could you try and see whether it works for you?
This issue has been automatically closed because there has been no response to a request for more information from the original author. With only the information that is currently in the issue, there's not enough information to take action. If you're the original author, feel free to reopen the issue if you have or find the answers needed to investigate further.
This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.
How to reproduce the behaviour
The following code chunk from https://github.com/explosion/spaCy/blob/master/spacy/lang/tokenizer_exceptions.py fails in Python 2.7.5, which makes init-model and spacy.load fail:
The error message:
The issue is known (SO discussion):
Was fixed:
Info about spaCy & environment:
Potential solutions
This issue doesn't occur when working with spaCy 2.0.18.
Based on the blame history here, the issue started with the switch from regex to re three months ago.
Some quick fixes would be: