
Entity recognition is inconsistent across runs #1336

Closed
jaju opened this issue Sep 19, 2017 · 8 comments
Labels
bug Bugs and behaviour differing from documentation

Comments

@jaju

jaju commented Sep 19, 2017

I'm running some entity recognition, and the results for the same sentence keep changing each time I reload the same model.

Sample run snapshot (verbatim)

>>> import spacy
>>> nlp = spacy.load('en_core_web_sm')
>>> d = nlp('The company that IBM bought had rejected Apple and Google bids')
>>> d.ents
()
>>> nlp = spacy.load('en_core_web_sm')
>>> d = nlp('The company that IBM bought had rejected Apple and Google bids')
>>> d.ents
()
>>> nlp = spacy.load('en_core_web_sm')
>>> d = nlp('The company that IBM bought had rejected Apple and Google bids')
>>> d.ents
(IBM,)
>>> nlp = spacy.load('en_core_web_sm')
>>> d = nlp('The company that IBM bought had rejected Apple and Google bids')
>>> d.ents
(IBM,)
>>> nlp = spacy.load('en_core_web_sm')
>>> d = nlp('The company that IBM bought had rejected Apple and Google bids')
>>> d.ents
(IBM, Apple)

I may be missing something, but this is unexpected, given that nothing changes between these runs.
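As an illustration (not part of the original report), one way to quantify the inconsistency is to reload the model in a loop and tally the entity tuples for the same fixed sentence. This sketch assumes the same en_core_web_sm model and spaCy 2.x API as above:

import spacy
from collections import Counter

text = 'The company that IBM bought had rejected Apple and Google bids'

results = Counter()
for _ in range(10):
    nlp = spacy.load('en_core_web_sm')  # fresh load on every iteration, as in the transcript
    results[tuple(ent.text for ent in nlp(text).ents)] += 1

print(results)  # a deterministic pipeline would produce a single key here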
Output of spacy info --markdown

Info about spaCy

  • spaCy version: 2.0.0a13
  • Platform: Darwin-16.7.0-x86_64-i386-64bit
  • Python version: 3.6.2
  • Models: en, en_core_web_sm, en_vectors_web_lg
@honnibal added the bug (Bugs and behaviour differing from documentation) label on Sep 19, 2017
@honnibal
Member

honnibal commented Sep 19, 2017

Thanks for the report -- definitely something wrong here.

@jaju
Author

jaju commented Sep 19, 2017

Please let me know if I can provide any more information, or run additional tests.
Thanks!

@honnibal
Member

I've reproduced it now, so it shouldn't take long to get a fix sorted :)

@socialglass

Something similar:

>>> doc = nlp(u"She Was among first investors to get approval for Biotech Fund.")
>>> doc.ents
(first, Biotech Fund)
>>> doc = nlp(u"She Was among the first investors to get approval for Biotech Fund.")
>>> doc.ents
()
>>> doc = nlp(u"I Was among first investors to get approval for Biotech Fund.")
>>> doc.ents
(Biotech Fund,)

@honnibal
Member

The current version on develop seems to have this fixed already. Hopefully I can get it pushed to spacy-nightly tonight. (The new model also has better parse accuracy, which is nice...)

There are two possible explanations for the inconsistency:

  1. Some model preserves its random initialization even after loading (e.g. the loaded weights are added to the randomly initialized ones instead of replacing them)

  2. Somewhere there's an out-of-bounds read. The eventual calculations would then depend on values from neighbouring memory locations, which would vary between runs.

I think 2) is the more likely explanation. The good news is that the instability also shows up in the tensor values, not just in the parser or tagger, which means there are only a few places to look. It's probably the maxout or convolution functions.
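A rough sketch of how this can be checked from Python, assuming spaCy 2.x exposes the shared tok2vec output as Doc.tensor (illustrative, not code from the thread):

import numpy
import spacy

text = 'The company that IBM bought had rejected Apple and Google bids'

tensors = []
for _ in range(3):
    nlp = spacy.load('en_core_web_sm')
    tensors.append(nlp(text).tensor.copy())  # per-token vectors produced before parser/NER decisions

# If a layer were reading out of bounds, these arrays would differ from load to load.
for other in tensors[1:]:
    print(numpy.array_equal(tensors[0], other))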

@jaju
Author

jaju commented Sep 20, 2017

I updated to the latest nightly build, and the issue has disappeared.
Thanks a lot! That's admirably quick! :)
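A short sketch of how the fix can be verified after upgrading (illustrative only; it assumes the pre-release build is the spacy being imported and that en_core_web_sm is installed):

import spacy

print(spacy.__version__)  # confirm the nightly / pre-release build is in use

text = 'The company that IBM bought had rejected Apple and Google bids'
seen = {tuple(ent.text for ent in spacy.load('en_core_web_sm')(text).ents) for _ in range(5)}
print(seen)  # a single entity tuple means the repeated reloads now agree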

@honnibal
Member

No worries!
