FastDoubleParser doesn't support all input formats as the default OpenJDK Float/Double parsers #19

grcevski · 2022-06-24T15:23:58Z

The FastDoubleParser was recently introduced in Jackson through this issue FasterXML/jackson-core#577 is 3-4x times faster compared to the version that's implemented in OpenJDK. This is fantastic news, since many numerical processing workloads would benefit from this.

However the OpenJDK Double/Float parsers support variety of input formats that the FastDoubleParser will fail on, therefore it can cause unexpected regressions when used.

For example, the FastDoubleParser will fail with a NumberFormatException on these example patterns (there are more to be found in the OpenJDK Double/Float tests):

1.1e-23f
0x.003p12f
0x1.17742db862a4P-1d

I think apart from the first one in this list, the rest are all hexadecimal if I'm not mistaken.

pjfanning · 2022-06-24T15:29:39Z

@grcevski any number prefixed with 0x should be interpreted as hexidecimal - it is a very common convention to make it obvious the number is not plain decimal.

wrandelshofer · 2022-06-24T17:43:22Z

Oh - I didn't know that Double.parseDouble() can parse numbers with a trailing format specifier. I only looked at the lexical syntax rules that they gave in the Javadoc of Double.valueOf(String)

FastDoubleParser can parse the examples that you gave, if you remove the trailing "f" and "d" characters:

1.1e-23
0x.003p12
0x1.17742db862a4P-1

Thank you for the link to the Double/Float test suite. I am going to run it against FastDoubleParser and fix any errors. It is my goal, to have an implementation that can be used as a drop in for java.util.Double.parseDouble(String) and java.util.Float.parseFloat(String).

You are making me very curious though:
JSON does not support trailing format specifiers. Hexadecimal float literals are also not supported. So, it would be an error if jackson-core would let FastDoubleParser parse them. How do you handle these cases in jackson-core?

wrandelshofer · 2022-06-24T17:52:26Z

Oops, jackson-core is an XML-Parser (not JSON).

https://www.w3.org/TR/xmlschema-2/#double

The lexical rules of XML schema neither allow trailing format specifiers nor hexadecimal float literals. So, I don't know why you would want support for them(?)

grcevski · 2022-06-24T18:07:48Z

Thanks for the quick response @wrandelshofer, to be honest I didn't realize it's an issue with trailing format specifiers, if it does work without them I think we are good for jackson-core!

pjfanning · 2022-06-24T18:24:04Z

Thanks everyone for the input. jackson-core is most commonly used for JSON parsing but there are jackson modules for other formats like XML and YAML (and lots more). JSON doesn't permit hex values - but there is an open issue where someone wants Jackson to support JSON5 which does allow hex values (https://json5.org/).

wrandelshofer · 2022-06-24T19:01:00Z

Okay, I am going to do the following:

Add support for trailing format specifiers to the library.
Run the JDK tests that you linked against FastDoubleParser and fix all bugs.
Change the back-end classes so that they encode errors in a magic return value(which is a primitive long) instead of throwing an Exception.
Make the back-end classes public, so that Jackson can use them directly.
Maybe? Add an entry point that only supports the JSON "number" production.

What specification do you use for the JSON syntax? Is this "ECMA-404", "The JSON data interchange syntax" ?

pjfanning · 2022-06-24T19:04:53Z

What specification do you use for the JSON syntax? Is this "ECMA-404", "The JSON data interchange syntax" ?

I think ECMA 404 is the right spec to use - confirmed it by looking at https://www.json.org/json-en.html

cowtowncoder · 2022-06-24T22:59:54Z

@wrandelshofer Jackson actually handles lexical part (tokenization) according to JSON spec (or, for a small number of deviations, optionally allowing some alternate cases). Value that would be handed to FDP need not be further verified as long as accepted set of values is a superset, which I think is the case (JSON floating-point value definition is quite strict).
So @pjfanning is correct about values accepted by jackson-core with default settings.

So I think there is no need for alternate end points just for Jackson use. Of course it may be otherwise useful for different use cases, but no need from Jackson perspective.

wrandelshofer · 2022-11-19T17:01:37Z

So far, I did the following:

[done] Add support for trailing format specifiers to the library.
[done] Run the JDK tests that you linked against FastDoubleParser and fix all bugs.
[done] Change the back-end classes so that they encode errors in a magic return value(which is a primitive long) instead of throwing an Exception.
[not done] Make the back-end classes public, so that Jackson can use them directly.
[done] Maybe? Add an entry point that only supports the JSON "number" production.

grcevski · 2022-11-23T14:23:31Z

This sounds great @wrandelshofer, I think this resolves the issue. We started using the new jackson 2.14 as soon as it was released, great speedup, I think with your changes this should be broadly applicable and maybe the JDK eventually makes this the default parsing algorithm :).

This was referenced Jun 24, 2022

FastDoubleParser doesn't support all input formats as the default OpenJDK Float/Double parsers FasterXML/jackson-core#778

Closed

Use the default float/double parser as fallback FasterXML/jackson-core#779

Closed

wrandelshofer mentioned this issue Nov 17, 2022

The parser throws StringIndexOutOfBoundsException/ArrayIndexOutOfBoundsException for some inputs #22

Closed

grcevski closed this as completed Nov 23, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

FastDoubleParser doesn't support all input formats as the default OpenJDK Float/Double parsers #19

FastDoubleParser doesn't support all input formats as the default OpenJDK Float/Double parsers #19

grcevski commented Jun 24, 2022

pjfanning commented Jun 24, 2022

wrandelshofer commented Jun 24, 2022

wrandelshofer commented Jun 24, 2022 •

edited

Loading

grcevski commented Jun 24, 2022

pjfanning commented Jun 24, 2022

wrandelshofer commented Jun 24, 2022

pjfanning commented Jun 24, 2022

cowtowncoder commented Jun 24, 2022 •

edited

Loading

wrandelshofer commented Nov 19, 2022

grcevski commented Nov 23, 2022

FastDoubleParser doesn't support all input formats as the default OpenJDK Float/Double parsers #19

FastDoubleParser doesn't support all input formats as the default OpenJDK Float/Double parsers #19

Comments

grcevski commented Jun 24, 2022

pjfanning commented Jun 24, 2022

wrandelshofer commented Jun 24, 2022

wrandelshofer commented Jun 24, 2022 • edited Loading

grcevski commented Jun 24, 2022

pjfanning commented Jun 24, 2022

wrandelshofer commented Jun 24, 2022

pjfanning commented Jun 24, 2022

cowtowncoder commented Jun 24, 2022 • edited Loading

wrandelshofer commented Nov 19, 2022

grcevski commented Nov 23, 2022

wrandelshofer commented Jun 24, 2022 •

edited

Loading

cowtowncoder commented Jun 24, 2022 •

edited

Loading