Dependency parser/tagger misidentifies a verb as a noun #1021
Comments
Could be related to #1015. How many examples would one need to correctly update the pre-trained model?
I tried the following code, based on the one from #1015, but even after 100,000 iterations I had no luck making it recognise "work" as a verb:
At this point, I would prefer it to err on the side of "work" always being a verb rather than a noun (I understand that such behaviour might not be desired in the general case, but it would work for my data). In the meantime, I've found that if I replace "work" with some other verb that I'm unlikely to see in my data set, like "hasten", I get the correct dependency parse. But that feels like a very silly workaround.
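For reference, the substitution workaround can be sketched in plain Python as below. This is only an illustration of the idea, not anything spaCy provides: `swap_word` and `restore_word` are hypothetical helper names, and the sketch assumes that swapping every whole-word occurrence of "work" is acceptable, including the rarer noun uses.

```python
import re

def swap_word(text, target="work", placeholder="hasten"):
    """Replace whole-word occurrences of `target` with `placeholder`.

    Matching is case-insensitive, and the original casing is not preserved.
    Inflected forms such as "works" or "working" are left untouched.
    """
    pattern = re.compile(r"\b{}\b".format(re.escape(target)), flags=re.IGNORECASE)
    return pattern.sub(placeholder, text)

def restore_word(tokens, target="work", placeholder="hasten"):
    """Map the placeholder back to the original word in a list of token strings."""
    return [target if tok.lower() == placeholder else tok for tok in tokens]

# Usage: parse swap_word(text) instead of text, then restore the token
# strings before reporting results.
swapped = swap_word("Does this phone work?")
```

The obvious drawback is that the placeholder has to be a verb that never occurs in the real data, otherwise `restore_word` would rewrite genuine occurrences of it.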
Closing this and making #1057 the master issue – work in progress for spaCy v2.0!
This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.
Here is an example input
Here is the output:
As you can see, spaCy incorrectly classifies "work" as a noun, which (I assume) leads to the dependency parser failing to label it as the root, and thus misidentifying the root as "Does". You can play with variations of the above input, such as "Will this phone work?" or "Would this phone work?" In all of the above cases, spaCy fails to pull out "work" as the root.
This would be a minor annoyance, but I rely on the dependency parse for a lot of my downstream tasks, and the "{does/will/would} this work" pattern is common in my data. (I can think of only one kind of example where labelling "work" as a noun would be correct, such as "I did some work on this phone", but that strikes me as a rarer case than the one I have encountered, and it doesn't explain the case of a sentence starting with {would/will}.)
I don't know if this problem lies with the part-of-speech tagger, which assigns an erroneous tag to "work" and thus throws off the dependency parser, or if it's something else.
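One way to look at both stages side by side is to print the tag and the dependency label of every token for each variation. The sketch below relies only on the standard token attributes (`text`, `pos_`, `tag_`, `dep_`, `head`). The helper names are made up for illustration, and the model name `en_core_web_sm` is an assumption that depends on the installed spaCy version.

```python
def describe(doc):
    """Return (text, coarse POS, fine-grained tag, dependency label, head text) per token."""
    return [(t.text, t.pos_, t.tag_, t.dep_, t.head.text) for t in doc]

def root_of(doc):
    """Return the token whose dependency label is ROOT, or None if there is none."""
    for t in doc:
        if t.dep_ == "ROOT":
            return t
    return None

def parse_and_report(texts, model_name="en_core_web_sm"):
    """Parse each text and print its root plus the per-token analysis."""
    # Imported here so the helpers above can be used without spaCy installed.
    import spacy

    nlp = spacy.load(model_name)  # assumed model name; adjust to the installed model
    for text in texts:
        doc = nlp(text)
        root = root_of(doc)
        print(text, "-> root:", root.text if root is not None else None)
        for row in describe(doc):
            print("   ", row)

# Example call (requires spaCy and a downloaded English model):
# parse_and_report(["Will this phone work?", "Would this phone work?"])
```

If "work" already carries a noun tag in this output, the tagger is the likely culprit. If the tag is a verb but the root is still wrong, the parser is the more likely source.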
Would you have any idea about what is causing this? If so, is there a way to fix it without re-training the whole model?
Thanks!
Info about spaCy