Thanks, good catch! This should probably be added to the tokenizer exceptions.
I'm currently in the process of reorganising the language data and we're gonna merge the changes pretty soon, so there's little point in fixing this in the old, messy format now. I'll do it on the new organize-language-data branch instead, so it will definitely be fixed in the next release.
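For anyone curious what "adding it to the tokenizer exceptions" could mean in practice, here is a rough, library-free sketch. It is not the project's actual language data format: the `BASE_EXCEPTIONS` table, the `add_unicode_apostrophe_variants` helper, and the toy `tokenize` function are all hypothetical names made up for illustration. The idea is just that every exception written with the ASCII apostrophe also gets a twin entry written with the Unicode apostrophe (U+2019).

```python
# Hypothetical illustration only: this is NOT the real language data format.
ASCII_APOSTROPHE = "'"
UNICODE_APOSTROPHE = "\u2019"  # RIGHT SINGLE QUOTATION MARK

# Assumed toy exception table: surface form -> list of token texts.
BASE_EXCEPTIONS = {
    "don't": ["do", "n't"],
    "can't": ["ca", "n't"],
    "I'm": ["I", "'m"],
}


def add_unicode_apostrophe_variants(exceptions):
    """Return a copy of `exceptions` that also covers the U+2019 spelling."""
    expanded = dict(exceptions)
    for surface, tokens in exceptions.items():
        if ASCII_APOSTROPHE in surface:
            variant = surface.replace(ASCII_APOSTROPHE, UNICODE_APOSTROPHE)
            expanded[variant] = [
                token.replace(ASCII_APOSTROPHE, UNICODE_APOSTROPHE)
                for token in tokens
            ]
    return expanded


EXCEPTIONS = add_unicode_apostrophe_variants(BASE_EXCEPTIONS)


def tokenize(text, exceptions=EXCEPTIONS):
    """Toy whitespace tokenizer that splits words found in the exception table."""
    tokens = []
    for word in text.split():
        # Words without an exception entry are kept as a single token.
        tokens.extend(exceptions.get(word, [word]))
    return tokens
```

With this table, `tokenize("I don\u2019t know")` splits the contraction the same way as the ASCII spelling does. Generating the variants from the existing entries, rather than typing them by hand, keeps the two spellings from drifting apart.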
When the tokenizer sees the Unicode apostrophe (’), it doesn't tokenize correctly. For example:
outputs
My Environment