Improve performance of org.glassfish.json.JsonParserImpl #15

Open
jjspiegel opened this issue Aug 31, 2018 · 1 comment

@jjspiegel

As more and more products use the JSON-P reference implementation in production, it is critical that parsing performance is good.

I think there is an opportunity to improve the performance of JsonParserImpl. Right now, the underlying tokenizer operates on a Java character string and is completely unaware of the underlying byte representation. In many cases, JSON is persisted as UTF-8. From RFC 8259:

JSON text exchanged between systems that are not part of a closed
ecosystem MUST be encoded using UTF-8 [RFC3629].

Java characters are represented in UTF-16 and conversion from UTF-8 to UTF-16 is often expensive.

I suggest making a special-purpose tokenizer that operates directly on the UTF-8 byte stream. Other encodings can continue to use the current code path, as they will be less common. A special-case UTF-8 tokenizer would provide the following benefits:

(1) Markup characters in the ASCII range (curly braces, brackets, string delimiters, whitespace, etc.) can be scanned with byte comparisons and never converted to UTF-16.

(2) JSON numbers, true, false, and null don't need to be converted to UTF-16.

(3) Strings (keys and values) can be converted to UTF-16 lazily so that if they are never consumed by an application, they need not be converted.

(4) Skip methods (like skipArray() and skipObject()) could avoid any character set conversion of the skipped item.
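
To make points (1), (3), and (4) concrete, here is a minimal sketch of what byte-level scanning could look like. It is not taken from org.glassfish.json: the class `Utf8SkipSketch`, its methods, and `LazyString` are hypothetical names used only for illustration. It relies on the fact that every byte of a multi-byte UTF-8 sequence has its high bit set, so comparing raw bytes against ASCII markup characters cannot produce a false match. The sketch assumes well-formed input and does not unescape backslash sequences when decoding, which a real tokenizer would have to do.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical sketch of byte-level JSON skipping; not part of org.glassfish.json. */
public final class Utf8SkipSketch {

    /** A string token that remembers only byte offsets and decodes on first use. */
    public static final class LazyString {
        private final byte[] buf;
        private final int start; // first byte after the opening quote
        private final int end;   // index of the closing quote
        private String decoded;  // stays null until get() is called

        LazyString(byte[] buf, int start, int end) {
            this.buf = buf;
            this.start = start;
            this.end = end;
        }

        public String get() {
            if (decoded == null) {
                // Assumption: the token contains no backslash escapes.
                decoded = new String(buf, start, end - start, StandardCharsets.UTF_8);
            }
            return decoded;
        }
    }

    /** buf[pos] must be '"'. Returns the index just past the closing quote. */
    public static int skipString(byte[] buf, int pos) {
        int i = pos + 1;
        while (i < buf.length) {
            byte b = buf[i];
            if (b == '\\') {
                i += 2;          // skip the escape introducer and the escaped byte
            } else if (b == '"') {
                return i + 1;    // closing quote found
            } else {
                i++;             // ordinary byte, including UTF-8 continuation bytes
            }
        }
        throw new IllegalStateException("unterminated string");
    }

    /** buf[pos] must be '{' or '['. Returns the index just past the matching close. */
    public static int skipContainer(byte[] buf, int pos) {
        int depth = 0;
        int i = pos;
        while (i < buf.length) {
            byte b = buf[i];
            if (b == '"') {
                i = skipString(buf, i); // brackets inside strings must not count
                continue;
            }
            if (b == '{' || b == '[') {
                depth++;
            } else if (b == '}' || b == ']') {
                depth--;
                if (depth == 0) {
                    return i + 1;
                }
            }
            i++;
        }
        throw new IllegalStateException("unterminated container");
    }

    /** buf[pos] must be '"'. Records the token bounds without decoding anything. */
    public static LazyString readString(byte[] buf, int pos) {
        int end = skipString(buf, pos);
        return new LazyString(buf, pos + 1, end - 1);
    }
}
```

With something along these lines, skipping an array or object touches each byte once and never allocates a `String`, and a key or value that the application never reads is never converted to UTF-16.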

@keilw
Member

keilw commented Sep 3, 2018

@jjspiegel Thanks for raising those issues. Unlike the JCP model, which has a dedicated Reference Implementation, Jakarta EE does not mandate one. Rather, it can and should have more than one implementation (in theory also at Eclipse or in other communities such as Apache, JBoss, etc.).

If the Glassfish "Spec Implementation" under the Jakarta EE umbrella is widely used, then I am pretty sure the team and community will try to address many of those issues in upcoming releases, but that does not prevent others from creating and maintaining their own independent implementations that may have advantages over the SI.

@lukasj lukasj transferred this issue from jakartaee/jsonp-api Jun 8, 2021