OverflowError: Invalid Nan value when encoding double #11757
Comments
This looks like a duplicate of #10217. The issue appears to be that you are using the enumerated ID of each label as its value, when the value has to be 0 or 1. Out of curiosity, how did you come up with the current code for assigning labels?
Brilliant, thank you. I was using the code from the Medium article "building-a-text-classifier-with-spacy". I was under the assumption that the label value had to be unique. Maybe the documentation needs to be clearer?
It's true that the label has to be unique (duplicate labels will be treated as the same thing), but the label value does not have to be unique; it just needs to be 0 or 1. This is actually called out in the API docs for textcat, which were updated to be clearer as part of #9041. Is there any other place you were checking where we could make this more explicit? I will look at adding a check for this during training.
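To illustrate the point about label values, here is a minimal sketch in plain Python of how a per-text category dict could be built so that every value is 0 or 1. The helper name `make_cats`, the label set, and the sample texts are hypothetical and are not taken from the original report; the `"cats"` key follows the annotation format described in the textcat docs.

```python
from typing import Dict, List

def make_cats(gold_label: str, all_labels: List[str]) -> Dict[str, float]:
    """Return a category dict: 1.0 for the gold label, 0.0 for every other label."""
    if gold_label not in all_labels:
        raise ValueError(f"Unknown label: {gold_label!r}")
    return {label: 1.0 if label == gold_label else 0.0 for label in all_labels}

# Hypothetical label set and data, for illustration only.
LABELS = ["sports", "politics", "tech"]
raw_data = [
    ("The match went to penalties", "sports"),
    ("New phone released", "tech"),
]

# Problematic pattern: using the enumerated ID as the value gives 0, 1, 2, ...
bad_cats = {label: i for i, label in enumerate(LABELS)}

# Intended pattern: every label is present and every value is 0.0 or 1.0.
train_data = [(text, {"cats": make_cats(label, LABELS)}) for text, label in raw_data]
```

Each `(text, annotations)` pair in `train_data` has the shape that would then be turned into a training example; `bad_cats` is shown only to contrast with the enumerated-ID approach.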
Ah, I did not read the documentation properly. It seemed a bit overwhelming (which is partly my fault), which is why I went to other tutorials. The error message is also not very clear. I think there should be a page in the spaCy docs, under TextCategorizer or Training Pipelines & Models, listing common issues with examples of how to fix them.
Thanks for the suggestions, and the note that the documentation seems overwhelming. The error message here is definitely unhelpful, and I've written a PR (#11763) to help with that part. We do have an FAQ label in Discussions, and a top-level FAQ, but I'll see if there's some way we can make these more accessible from the main docs.
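Until a clearer error is available, a pre-training sanity check along these lines may catch the problem early. This is only a sketch under the assumption that annotations are plain dicts of label to value; `check_cats` is a hypothetical helper and is not the implementation in the PR mentioned above.

```python
from typing import Dict

def check_cats(cats: Dict[str, float]) -> None:
    """Raise ValueError if any category value is something other than 0 or 1."""
    for label, value in cats.items():
        # NaN compares unequal to everything, so it is rejected here as well.
        if value not in (0, 1):
            raise ValueError(
                f"Category {label!r} has value {value!r}; expected 0 or 1."
            )

# Passes silently: all values are 0 or 1.
check_cats({"sports": 1.0, "politics": 0.0})
```

Running the check over every annotation before the training loop turns an enumerated-ID mistake into an immediate, readable error.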
Thank you very much 😃
This issue has been automatically closed because it was answered and there was no follow-up discussion.
This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.
How to reproduce the behaviour
An extract of my Python code:
After creating the data and running the spaCy train loop I get an overflow error:
Your Environment