Is the NER *actually* retrainable? #887
Labels: usage (General spaCy usage)
Comments
Sorry that this has been a bit unstable. The code in Thinc 6.5.0 is training properly for me -- I'm currently working on getting new models up for the next release of spaCy. So, try either updating to the latest Thinc, or at least removing the call to
This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.
I have seen issues #773 and #881. I have been trying to use spaCy to train an entity recognizer like those of the bot APIs (e.g. Google API.ai, Wit.ai, ...). I have a function like this (the main loop is basically copied from the tutorial), where `load_train_data()` gives me vectors in the format of the tutorial. I get the entities by running `processed_sentence = nlp(sentence)` (where `sentence` is a unicode string) and then accessing `processed_sentence.ents`.
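For context, the main loop from the tutorial presumably follows the spaCy 1.x training pattern sketched below. This is my reconstruction, not the poster's exact code: `train_ner` and its parameter names are mine, and the `GoldParse` import is deferred into the function so the sketch can be read standalone.

```python
import random

def train_ner(nlp, train_data, n_iter=20):
    """Sketch of the spaCy 1.x tutorial loop. train_data is a list of
    (raw_text, [(start_char, end_char, label), ...]) tuples."""
    from spacy.gold import GoldParse  # spaCy 1.x API

    for itn in range(n_iter):
        random.shuffle(train_data)
        for raw_text, entity_offsets in train_data:
            doc = nlp.make_doc(raw_text)
            gold = GoldParse(doc, entities=entity_offsets)
            nlp.tagger(doc)  # the v1 NER uses POS tag features
            nlp.entity.update(doc, gold)
```

Whether the function also makes the finalisation call mentioned in the maintainer's reply would matter here, since that call is what the reply suggests removing.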
I had initially tried with just a few examples, and it didn't work. Then I read this (in #773), so I thought I would try to overfit the model by training it on the same sentences for a large number of epochs and see what happens. I found lots of addresses at http://results.openaddresses.io/. I chose the addresses in Thüringen (Germany) and randomly picked 25000 of them. I want the entity recognizer to tag these as "GPE". Using 12 small sentence templates, I generated 23105 sentences from these addresses, with annotations giving the character offsets at which each GPE starts and ends (just like in the tutorial). There are fewer sentences than addresses because some templates required two addresses (they are like "I moved from {} to {}").
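The template-based generation described above can be sketched roughly as follows. This is a hypothetical reconstruction (the template strings, `make_examples`, and all names are mine); the point is that each filled-in address is recorded as a `(start_char, end_char, "GPE")` tuple in the tutorial's training-data format.

```python
import random

# Illustrative templates; slots marked "{}" are filled with addresses.
TEMPLATES = [
    "I live in {}.",
    "My office is in {}.",
    "I moved from {} to {}.",  # templates like this consume two addresses
]

def make_examples(addresses, templates):
    """Fill templates with addresses, recording character offsets for
    each inserted address as a (start, end, "GPE") entity annotation."""
    examples = []
    addresses = list(addresses)
    random.shuffle(addresses)
    pool = iter(addresses)
    for template in templates:
        n_slots = template.count("{}")
        try:
            fills = [next(pool) for _ in range(n_slots)]
        except StopIteration:
            break  # ran out of addresses
        parts = template.split("{}")
        text = ""
        entities = []
        for part, fill in zip(parts, fills):
            text += part
            start = len(text)
            text += fill
            entities.append((start, len(text), "GPE"))
        text += parts[-1]
        examples.append((text, entities))
    return examples
```

Slicing the generated text with the recorded offsets should give back exactly the inserted address, which is what the tutorial format expects.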
Finally, I trained the entity recognizer (using the function above and this dataset) for 5, 20, 50, and 100 epochs. Still, it seems all this training didn't make any difference: when I try any of the sentences from my training set, it gives me the same results it used to give without any training. For example, the sentence
"Waldstraße, 10, Dermbach is where I live" (one of the sentences in the training set) still gives me these three entities (the printing format is my own convenience; see issue #858 for the Unicode strangeness -- it shouldn't be a problem here), which are exactly what it gave me before any training.
Am I doing anything wrong? Is this training procedure somehow wrong? Any ideas? Should I try more epochs?
[Now I'll probably take a look at how RASA NLU does it... because if they manage to make it work, then I am probably making some silly mistake.]
Your Environment