BUG: token.ent_iob_ is str not unicode #672
Comments
Thanks for the report! All modules should definitely have unicode_literals. You can find the contribution guidelines here. Thanks again!
Should be fixed now.
spaCy version: 1.3.0
System: Ubuntu 14.04
Issue:
The value returned by token.ent_iob_ is a str, not unicode.
Code:
The above issue is reproducible with the following:
Results in:
Comments:
Pretty sure this is caused by this line in token.pyx. Possible solutions are to change that line or import unicode_literals in that file. I'm not sure how the project handles strings internally, but having all modules use unicode_literals might not be a terrible idea.
Just fixing the single line would be easy, though. If I want to submit a PR as small as this, do I need to run a bunch of tests, or can I just put u in front of each of those letters? That said, adding some kind of automated test builder to ensure that all properties and return values respect the contracts in the documentation might not be a bad idea. Alternatively, from what little I know about Cython, maybe the properties could get type declarations that would be enforced by the compiler?
Follow-up question: is there a page with instructions for contributing?
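To make the two ideas above concrete (importing unicode_literals, and an automated check that string properties really return text), here is a rough sketch. It does not use spaCy itself: FakeToken, IOB_STRINGS and non_text_string_attrs are hypothetical names made up for illustration, and the tuple is only a stand-in for whatever the line in token.pyx actually contains. The check assumes that public attributes ending in an underscore are meant to return text.

```python
# -*- coding: utf-8 -*-
from __future__ import unicode_literals

import sys

# Text type on either interpreter: `unicode` on Python 2, `str` on Python 3.
text_type = str if sys.version_info[0] >= 3 else unicode  # noqa: F821

# Hypothetical stand-in for the IOB lookup. With unicode_literals in effect,
# these literals are text on Python 2 without an explicit u prefix.
IOB_STRINGS = ("I", "O", "B")


class FakeToken(object):
    """Hypothetical token stub for illustration; not spaCy's Token class."""

    def __init__(self, ent_iob):
        self.ent_iob = ent_iob  # integer index into IOB_STRINGS

    @property
    def ent_iob_(self):
        return IOB_STRINGS[self.ent_iob]


def non_text_string_attrs(obj):
    """Return names of public attributes ending in '_' whose value is not text."""
    return [
        name
        for name in dir(obj)
        if name.endswith("_")
        and not name.startswith("_")
        and not isinstance(getattr(obj, name), text_type)
    ]
```

A test suite could call non_text_string_attrs on a sample object and fail if the returned list is non-empty, which would catch a regression like this one on Python 2.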
-- Eric