Should JsonReader ignore BOM char at start of inputStream? #43

Open
lmsurpre opened this issue Oct 15, 2021 · 0 comments

I'm processing some JSON files that I got from an external source. Everything was going well until I hit this on one of them:

Caused by: jakarta.json.stream.JsonParsingException: Unexpected char 65,279 at (line no=1, column no=1, offset=0)
    at org.eclipse.parsson.JsonTokenizer.unexpectedChar (JsonTokenizer.java:584)
    at org.eclipse.parsson.JsonTokenizer.nextToken (JsonTokenizer.java:396)
    at org.eclipse.parsson.JsonParserImpl$NoneContext.getNextEvent (JsonParserImpl.java:425)
    at org.eclipse.parsson.JsonParserImpl.next (JsonParserImpl.java:375)
    at org.eclipse.parsson.JsonReaderImpl.readObject (JsonReaderImpl.java:99)

It looks like the file starts with a byte order mark (BOM): char 65,279 is U+FEFF.
I'm fairly sure (but not positive) that a leading BOM isn't allowed in JSON, but would the Parsson authors be interested in making the parser tolerate it?

From https://datatracker.ietf.org/doc/html/rfc8259#section-8.1

Implementations MUST NOT add a byte order mark (U+FEFF) to the
beginning of a networked-transmitted JSON text. In the interests of
interoperability, implementations that parse JSON texts MAY ignore
the presence of a byte order mark rather than treating it as an
error.
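
Until the parser itself tolerates this, a caller-side workaround might look like the sketch below. It is only an illustration, not Parsson code: the class name `BomSkipper` and the method `skipBom` are hypothetical, and it assumes the input is UTF-8 encoded. It decodes the stream, peeks at the first character, and drops it only if it is U+FEFF.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.PushbackReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Parsson or jakarta.json.
final class BomSkipper {
    // U+FEFF, reported as "char 65,279" in the exception above.
    private static final int BOM = 0xFEFF;

    private BomSkipper() {}

    /**
     * Decodes the stream as UTF-8 (an assumption about the input) and
     * returns a Reader positioned after a single leading BOM, if one exists.
     */
    static Reader skipBom(InputStream in) throws IOException {
        PushbackReader reader =
                new PushbackReader(new InputStreamReader(in, StandardCharsets.UTF_8), 1);
        int first = reader.read();
        // Push back anything that is not a BOM; end of stream (-1) needs no push back.
        if (first != -1 && first != BOM) {
            reader.unread(first);
        }
        return reader;
    }
}
```

The returned `Reader` could then be handed to `Json.createReader(Reader)` in place of the raw stream. Only one leading U+FEFF is skipped, so a BOM anywhere else in the text would still be reported as an error.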
