Adding Lemmatizer Exceptions #595
Comments
Have you checked that the POS tag is being predicted correctly? The exceptions are POS-keyed, so they won't fire if the POS is incorrect.
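To illustrate what "POS-keyed" means here, below is a minimal sketch. The table layout and the `lookup_exception` helper are hypothetical and only model the idea; they are not spaCy's actual data structures or API.

```python
# Toy model of a POS-keyed exception table (hypothetical, not spaCy internals).
# Each part of speech has its own mapping from inflected form to lemma.
EXCEPTIONS = {
    "verb": {"fed": "feed", "went": "go"},
    "noun": {"geese": "goose"},
}


def lookup_exception(word, pos):
    """Return the exception lemma for (word, pos), or None if no entry applies."""
    return EXCEPTIONS.get(pos, {}).get(word.lower())


# The verb exception only fires when the token is actually tagged as a verb.
print(lookup_exception("fed", "verb"))
print(lookup_exception("fed", "noun"))
```

If the tagger predicts the wrong POS, the lookup goes to the wrong sub-table and the exception is silently skipped, which is why checking the predicted tag is the first thing to rule out.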
Yes, I'm looking for verbs and then for their lemmas. I had a problem with a few words (for example, "Don't feed the dog" returns the lemma "fee" for "feed"). Is there some cache to rebuild for WordNet? Here is an example:

    import textacy
    from spacy.symbols import VERB

    def get_main_verbs_of_sent(sent):
        """Return the main (non-auxiliary) verbs in a sentence."""
        return [tok for tok in sent
                if tok.pos == VERB and tok.dep_ not in {'aux', 'auxpass'}]

    # Parse the sentence, then print text, POS and lemma for each main verb.
    doc_itext = textacy.Doc("Don't feed the dog.", lang=u"en")
    for sent in doc_itext.sents:
        itext_verbs = get_main_verbs_of_sent(sent)
        for verb in itext_verbs:
            print(verb.text)
            print(verb.pos_)
            print(verb.lemma_)

feed
Thanks for the report. It turned out to be a deeper problem: "feed" is being assigned the tag VB, which means it shouldn't be lemmatized at all. The 'VB' tag is supposed to be associated with the morphological feature … Should be fixed now.
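A rough sketch of the behaviour described above, for anyone reading later. The rule list, the toy vocabulary and the `lemmatize_verb` function are assumptions made for illustration; this is not spaCy's implementation, only a model of how a suffix rule can turn "feed" into "fee" when the base-form check is skipped.

```python
# Simplified suffix rules in the style of WordNet's verb detachment rules.
VERB_RULES = [("ed", "e"), ("ed", ""), ("ing", "e"), ("ing", ""), ("s", "")]

# Toy vocabulary of known verb lemmas (hypothetical).
VERB_INDEX = {"fee", "feed", "walk"}


def lemmatize_verb(word, tag):
    """Return a lemma for a verb token, leaving base forms (tag VB) untouched."""
    if tag == "VB":
        # Base form: the word already is its own lemma, so no rules apply.
        return word
    for old, new in VERB_RULES:
        if word.endswith(old):
            candidate = word[: len(word) - len(old)] + new
            if candidate in VERB_INDEX:
                return candidate
    return word


print(lemmatize_verb("feed", "VB"))     # base form is returned unchanged
print(lemmatize_verb("feed", "VBD"))    # without the VB check, "ed" -> "e" applies
print(lemmatize_verb("walked", "VBD"))
```

In this sketch the "ed" to "e" rule matches "feed" and produces "fee", which happens to be a known lemma, so the wrong answer is accepted unless the base-form tag short-circuits the rules first.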
I tried to edit the exceptions in spacy/data/en-1.1.0/wordnet/verb.exc, but it didn't have any effect: spaCy is still returning the old lemma. Is there some other way to fix or add lemmatizer exceptions?
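For context, WordNet-style `.exc` files list one inflected form per line followed by one or more base forms, separated by whitespace. The sketch below parses that layout from a string; the `parse_exc` helper is hypothetical, and whether spaCy re-reads an edited file at load time is not something this sketch answers.

```python
def parse_exc(text):
    """Parse WordNet-style exception lines into {inflected_form: [lemmas]}."""
    exceptions = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 2:
            # Skip blank or malformed lines.
            continue
        inflected, lemmas = parts[0], parts[1:]
        exceptions[inflected] = lemmas
    return exceptions


# Sample content in the same layout as a verb exception file.
sample = "fed feed\nwent go\n"
print(parse_exc(sample))
```

A quick check like this can at least confirm that an edited entry is well-formed before looking into why the lemmatizer does not pick it up.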