FastDoubleParser doesn't support all input formats as the default OpenJDK Float/Double parsers #19
Comments
@grcevski any number prefixed with 0x should be interpreted as hexadecimal. It is a very common convention that makes it obvious the number is not plain decimal.
Oh, I didn't know that Double.parseDouble() can parse numbers with a trailing format specifier. I only looked at the lexical syntax rules given in the Javadoc of Double.valueOf(String). FastDoubleParser can parse the examples that you gave if you remove the trailing "f" and "d" characters: 1.1e-23 Thank you for the link to the Double/Float test suite. I am going to run it against FastDoubleParser and fix any errors. My goal is to have an implementation that can be used as a drop-in replacement for java.lang.Double.parseDouble(String) and java.lang.Float.parseFloat(String). You are making me very curious though:
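The workaround mentioned above (removing the trailing "f" or "d" before parsing) can be sketched as follows. This is only an illustration: `SuffixStripper` and `stripTypeSuffix` are hypothetical names, not part of FastDoubleParser or the JDK, and `Double.parseDouble` stands in for whichever parser receives the stripped string.

```java
public class SuffixStripper {
    // Hypothetical helper: drops one trailing type suffix (f, F, d, D), if present.
    // Assumption: in a valid hex float the binary exponent ("p...") is mandatory and
    // its digits are decimal, so a trailing f/d is never a hex digit of valid input.
    static String stripTypeSuffix(String s) {
        String t = s.trim();
        if (t.length() < 2) {
            return t;
        }
        char last = t.charAt(t.length() - 1);
        boolean isSuffix = last == 'f' || last == 'F' || last == 'd' || last == 'D';
        return isSuffix ? t.substring(0, t.length() - 1) : t;
    }

    public static void main(String[] args) {
        String[] inputs = {"1.1e-23f", "0x.003p12f", "0x1.17742db862a4P-1d"};
        for (String in : inputs) {
            String stripped = stripTypeSuffix(in);
            // Double.parseDouble is used here only as a stand-in parser.
            System.out.println(in + " -> " + stripped + " -> " + Double.parseDouble(stripped));
        }
    }
}
```

Stripping happens before the parser sees the input, so a parser that only implements the documented lexical syntax without the suffix can still be used on these strings.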
Oops, jackson-core is an XML-Parser (not JSON). https://www.w3.org/TR/xmlschema-2/#double The lexical rules of XML schema neither allow trailing format specifiers nor hexadecimal float literals. So, I don't know why you would want support for them(?)
Thanks for the quick response @wrandelshofer. To be honest, I didn't realize the issue was with the trailing format specifiers. If it works without them, I think we are good for jackson-core!
Thanks everyone for the input. jackson-core is most commonly used for JSON parsing but there are jackson modules for other formats like XML and YAML (and lots more). JSON doesn't permit hex values - but there is an open issue where someone wants Jackson to support JSON5 which does allow hex values (https://json5.org/).
Okay, I am going to do the following:
What specification do you use for the JSON syntax? Is it ECMA-404, "The JSON data interchange syntax"?
I think ECMA-404 is the right spec to use. I confirmed it by looking at https://www.json.org/json-en.html
@wrandelshofer Jackson actually handles the lexical part (tokenization) according to the JSON spec (or, for a small number of deviations, optionally allows some alternate cases). A value handed to FDP need not be verified further as long as the set of values FDP accepts is a superset, which I think is the case (the JSON floating-point value definition is quite strict). So I think there is no need for alternate end points just for Jackson use. They may be useful for other use cases, but there is no need from the Jackson perspective.
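To illustrate how strict the JSON number definition is compared with what the JDK parser accepts, here is a minimal sketch. The regular expression is my transcription of the ECMA-404 number grammar, and `JsonNumberSubset`, `isJsonNumber`, and `jdkAccepts` are hypothetical names chosen for this example; none of this is Jackson's actual tokenizer.

```java
import java.util.regex.Pattern;

public class JsonNumberSubset {
    // ECMA-404 number: optional minus, integer part without leading zeros,
    // optional fraction, optional exponent. No hex form, no type suffix.
    static final Pattern JSON_NUMBER =
            Pattern.compile("-?(0|[1-9][0-9]*)(\\.[0-9]+)?([eE][+-]?[0-9]+)?");

    static boolean isJsonNumber(String s) {
        return JSON_NUMBER.matcher(s).matches();
    }

    // True if the JDK parser accepts the string.
    static boolean jdkAccepts(String s) {
        try {
            Double.parseDouble(s);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String[] inputs = {"-1.5e10", "0.25", "1.1e-23f", "0x.003p12f"};
        for (String in : inputs) {
            System.out.println(in + ": json=" + isJsonNumber(in) + ", jdk=" + jdkAccepts(in));
        }
    }
}
```

Strings with a trailing suffix or a hex prefix never match the JSON grammar, so a tokenizer that enforces it would not pass them on to the double parser in the first place.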
So far, I did the following: [done] Add support for trailing format specifiers to the library.
This sounds great @wrandelshofer, I think this resolves the issue. We started using the new jackson 2.14 as soon as it was released, and the speedup is great. I think with your changes this should be broadly applicable, and maybe the JDK eventually makes this the default parsing algorithm :).
The FastDoubleParser, which was recently introduced in Jackson through FasterXML/jackson-core#577, is 3-4x faster than the parser implemented in OpenJDK. This is fantastic news, since many numerical processing workloads would benefit from it.
However, the OpenJDK Double/Float parsers support a variety of input formats that the FastDoubleParser fails on, so using it can cause unexpected regressions.
For example, the FastDoubleParser will fail with a NumberFormatException on these example patterns (there are more to be found in the OpenJDK Double/Float tests):
1.1e-23f
0x.003p12f
0x1.17742db862a4P-1d
I think apart from the first one in this list, the rest are all hexadecimal if I'm not mistaken.
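As a quick check of the JDK side of this comparison, the following sketch feeds the three example strings to `Double.parseDouble`. The class name `JdkParserFormats` and its `parse` helper are hypothetical names used only for this illustration; the sketch does not call FastDoubleParser.

```java
public class JdkParserFormats {
    // The three inputs quoted above: a decimal literal with an "f" suffix and
    // two hexadecimal floating-point literals with "f" and "d" suffixes.
    static final String[] INPUTS = {"1.1e-23f", "0x.003p12f", "0x1.17742db862a4P-1d"};

    // Thin wrapper around the JDK parser so the call site is easy to swap.
    static double parse(String s) {
        return Double.parseDouble(s);
    }

    public static void main(String[] args) {
        for (String in : INPUTS) {
            // Each input parses without a NumberFormatException on the JDK.
            System.out.println(in + " -> " + parse(in));
        }
    }
}
```

The second input is a convenient sanity check: 0x.003 is 3/4096, and the exponent p12 multiplies by 4096, so the parsed value is 3.0.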