Error when string contains "Word Joiner", "BOM", "OGHAM Space Mark" or "zero-width space" characters #15

Open
JamoCA opened this issue Oct 18, 2019 · 0 comments


JamoCA commented Oct 18, 2019

While importing international data from a Microsoft Excel file (using ColdFusion 2016 Update 11 with Java 11.0.2), I encountered a java.lang.NullPointerException when a string contained certain space or control characters. Is this a known issue?

Here's a list of characters (decimal & hex codes provided) that I tested against.
https://gist.github.com/JamoCA/42c3be286185aff0476d5888f0a819ff

My initial tests included the Word Joiner (decimal 8288, U+2060), BOM (decimal 65279, U+FEFF), Ogham Space Mark (decimal 5760, U+1680) and zero-width space (decimal 8203, U+200B). Each one caused the same error.

To work around it, I wrote a separate function to sanitize "unsafe" characters prior to using Junidecode.
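
The sanitizing function itself is not shown in this report. As a rough illustration only, a pre-filter of this kind might look like the Java sketch below. The class name `UnsafeCharSanitizer`, the method name `sanitize`, and the choice to turn the Ogham Space Mark into a plain space (rather than drop it) are assumptions made for the example, and the set covers only the four characters named above, not the full list in the gist.

```java
import java.util.Set;

// Hypothetical helper, not part of Junidecode: filters the four code points
// named in this report before the string is handed to the transliterator.
final class UnsafeCharSanitizer {

    // Ogham Space Mark, zero-width space, Word Joiner, BOM.
    private static final Set<Integer> UNSAFE = Set.of(0x1680, 0x200B, 0x2060, 0xFEFF);

    static String sanitize(String input) {
        if (input == null) {
            return null;
        }
        StringBuilder out = new StringBuilder(input.length());
        input.codePoints().forEach(cp -> {
            if (cp == 0x1680) {
                // Assumption: a visible space character should still separate words.
                out.append(' ');
            } else if (!UNSAFE.contains(cp)) {
                out.appendCodePoint(cp);
            }
            // The remaining code points in UNSAFE are invisible and are dropped.
        });
        return out.toString();
    }

    public static void main(String[] args) {
        // A BOM at the start and a zero-width space in the middle are removed.
        System.out.println(sanitize("\uFEFFfoo\u200Bbar"));
    }
}
```

The sanitized string would then be passed to Junidecode in place of the raw cell value. Iterating by code point rather than by `char` keeps characters outside the Basic Multilingual Plane intact.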
