ParsingException on unicode U+FFFF character #254
Comments
Can you please share your build.sbt or your spray-json library version?
It happens with the latest version, 1.3.4.
Hi @anilkumarmyla, that's by design (but it could be documented better). According to the Unicode standard
IFYK: "Because of this complicated history and confusing changes of wording in the standard over the years regarding what are now known as noncharacters, there is still considerable disagreement about their use and whether they should be considered "illegal" or "invalid" in various contexts. Particularly for implementations prior to Unicode 3.1, it should not be surprising to find legacy behavior treating U+FFFE and U+FFFF as invalid in Unicode 16-bit strings. And U+FFFF and U+10FFFF are, indeed, known to be used in various implementations as sentinels. For example, the value FFFF is used for WEOF in Windows implementations. For up-to-date Unicode implementations, however, one should use caution when choosing sentinel values. U+FFFF and U+10FFFF still have interesting numerical properties which render them likely choices for internal use as sentinels, but implementers should be aware of the fact that those values, as for all noncharacters in the standard, are also valid in Unicode strings, must be converted between UTFs, and may be encountered in Unicode data—not necessarily used with the same interpretation as for one's own sentinel use. Just be careful out there!"
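To make the sentinel problem from the quote concrete, here is a minimal Scala sketch. It assumes a scanner that reserves U+FFFF as its internal end-of-input marker; the names `SentinelDemo`, `EOI`, `charAt`, and `scannedLength` are hypothetical and are not taken from the spray-json source.

```scala
object SentinelDemo {
  // Hypothetical sentinel: U+FFFF stands in for "end of input" (an assumption for illustration).
  final val EOI = '\uFFFF'

  // Returns the char at `cursor`, or the sentinel once the input is exhausted.
  def charAt(input: String, cursor: Int): Char =
    if (cursor < input.length) input.charAt(cursor) else EOI

  // Counts how many chars are scanned before the scanner believes it reached the end.
  def scannedLength(input: String): Int = {
    var cursor = 0
    while (charAt(input, cursor) != EOI) cursor += 1
    cursor
  }

  def main(args: Array[String]): Unit = {
    println(scannedLength("abc"))        // prints 3: the whole input is scanned
    println(scannedLength("ab\uFFFFcd")) // prints 2: the embedded U+FFFF looks like end of input
  }
}
```

A scanner built this way cannot tell a literal U+FFFF in the data from the real end of the input, which is the kind of trade-off the quoted paragraph warns about.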
Thanks for the added information. There's also the paragraph directly before:
So yes, it's complicated, but I also think it's arguably still a good enough solution right now. Let's reopen to add a note to the documentation that those code points are not supported by the parser.
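Until such a note exists, callers could check their input before handing it to the parser. The following is only a sketch under assumptions: the set of unsupported characters is guessed to be just U+FFFF (the thread does not list them), and `UnsupportedCodePoints`, `Unsupported`, and `firstUnsupported` are hypothetical names, not part of spray-json.

```scala
object UnsupportedCodePoints {
  // Assumption: only U+FFFF is treated as unsupported; extend the set if others apply.
  val Unsupported: Set[Char] = Set('\uFFFF')

  // Returns the index of the first unsupported char, or None if the input is clean.
  def firstUnsupported(input: String): Option[Int] = {
    val idx = input.indexWhere(c => Unsupported.contains(c))
    if (idx >= 0) Some(idx) else None
  }

  def main(args: Array[String]): Unit = {
    println(firstUnsupported("{\"a\":1}"))   // prints None
    println(firstUnsupported("\"x\uFFFF\"")) // prints Some(2)
  }
}
```

With a check like this, the caller can reject or sanitize the input with a clear message instead of relying on the parser's exception.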
Self-explanatory with the following code