Parse JSON number to double in full-precision. #137
This should generate the best possible precision (if strtod() is correctly implemented). It needs more unit tests and performance tests. An option for accepting precision error may be added. Otherwise, the LUT in Pow10() can be reduced.
Feedback: I was also running into the precision issues lately (I need to ensure roundtrip behavior for Your PR seems to fix the precision issues but has a severe performance impact: reading of my test datasets (~2800 files, >500MB) went up from
May I know which platform/compiler?
Windows 7; Visual Studio 2012, latest service pack; 64-bit mode
I have confirmed the 3x performance loss on Linux (32-bit, Clang 3.6) by running the How do other parsers handle the precision corner cases? Most parsers I've seen so far go for a simple digit-based approach and should suffer from the same problems, right?
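To make the "simple digit-based approach" concrete, here is a minimal sketch of what such a conversion typically looks like. `NaiveParse` is a hypothetical name, not code from RapidJSON or any other parser, and it only handles plain decimals (optional sign, digits, optional `.`). Once the accumulated digits exceed 2^53, the accumulation rounds, and the final division rounds again, so the result can end up a few ULP away from what a correctly rounded strtod() returns.

```cpp
#include <string>

// Hypothetical digit-by-digit conversion, for illustration only.
// Accepts an optional leading '-', decimal digits and at most one '.'.
double NaiveParse(const std::string& text) {
    double value = 0.0;
    int fractionDigits = 0;
    bool seenPoint = false;
    bool negative = false;
    for (char c : text) {
        if (c == '-') {
            negative = true;
        } else if (c == '.') {
            seenPoint = true;
        } else {
            // Rounds as soon as the running value no longer fits in 53 bits.
            value = value * 10.0 + (c - '0');
            if (seenPoint)
                ++fractionDigits;
        }
    }
    // Powers of ten up to 1e22 are exact in a double.
    double scale = 1.0;
    for (int i = 0; i < fractionDigits; ++i)
        scale *= 10.0;
    value /= scale;  // second rounding step
    return negative ? -value : value;
}
```

Short inputs such as "1.5" come out exact; the trouble starts with 16 or 17 significant digits, like the coordinates in GeoJSON files.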
I suspect that
// ...
"geometry": {"type":"Polygon","coordinates":[[[-65.613616999999977,43.420273000000009],
// ...
For example, I think the first two numbers should actually output as I suggest replacing To @pah, I have not investigated the details of other parsers. Some of them seem to use even less precise conversion (well, actually RapidJSON previously did not use a proper fast path either). Some of them are using To @Kosta-Github, I am investigating
@miloyip have a look here about the number of digits required to enable roundtrip behavior: http://en.cppreference.com/w/cpp/types/numeric_limits/max_digits10. For
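A small sketch of the round-trip check that max_digits10 enables: format a double with `std::numeric_limits<double>::max_digits10` significant digits (17 for IEEE 754 doubles), parse the text back and compare. `RoundTrips` is a hypothetical helper name, and the check only holds if the C library's printf and strtod() are correctly rounded.

```cpp
#include <cstdio>
#include <cstdlib>
#include <limits>

// Hypothetical helper: returns true when `value` survives a
// double -> text -> double round trip using max_digits10 digits.
bool RoundTrips(double value) {
    char buffer[64];
    std::snprintf(buffer, sizeof(buffer), "%.*g",
                  std::numeric_limits<double>::max_digits10, value);
    return std::strtod(buffer, nullptr) == value;
}
```

With fewer digits (for example the 15 given by `digits10`), some doubles map to the same text and the comparison can fail.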
Intermediate Results
After working for a few days, I have implemented a custom The following results are generated by
There is a performance improvement compared with the CRT's
Should fix gcc debug error in Travis. May need further refactoring.
…ion_customstrtod Conflicts: include/rapidjson/internal/dtoa.h test/unittest/readertest.cpp
This is the latest result of what I have been doing. I added another method (DiyFp) to try to parse the number. If it cannot handle the number correctly, it falls back to the BigInteger method. So basically, in full-precision mode, it will try FastPath -> DiyFp -> BigInteger. It should have better performance on average, at the cost of more code size.
I hope to resolve this #120 "bug" and continue to work on a 1.0 RC.
The approach in issue120floatprecision_customstrtod and its performance look good to me. 👍 Can you update the branch
Parse JSON number to double in full-precision with custom strtod. Fix #120
Use kParseFullPrecision to turn on this option at compile time. The implementation of the new option should have no performance impact if the flag is not used.
Implementation details
The full-precision path tries to use the fast path if possible. If the criteria of the fast path cannot be met, it falls back to strtod() to convert the string.
Note that the parser still verifies the JSON syntax of the number as in the normal-precision path.
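The fast path with strtod() fallback can be sketched roughly as follows. This is not the PR's code: `ConvertFullPrecision` and its parameters are hypothetical, and the fast-path criterion used here (significand at most 2^53, decimal exponent within ±22) is the commonly used one, assumed rather than taken from the PR. Under that criterion both operands are exactly representable, so a single multiplication or division yields a correctly rounded result, provided the arithmetic is done in double precision (for example SSE2, not x87 extended precision).

```cpp
#include <cstdint>
#include <cstdlib>
#include <string>

// Hypothetical helper: the number has already been syntax-checked and split
// into a non-negative integer significand and a power of ten.
// `text` is the original spelling, kept only for the fallback.
double ConvertFullPrecision(std::uint64_t significand, int exp10,
                            const std::string& text) {
    // All of these powers of ten are exactly representable as doubles.
    static const double kPow10[] = {
        1e0,  1e1,  1e2,  1e3,  1e4,  1e5,  1e6,  1e7,
        1e8,  1e9,  1e10, 1e11, 1e12, 1e13, 1e14, 1e15,
        1e16, 1e17, 1e18, 1e19, 1e20, 1e21, 1e22};
    const std::uint64_t kMaxExact = std::uint64_t(1) << 53;

    // Fast path (assumed criterion): one exact operand times or divided by
    // another exact operand, so only one rounding happens.
    if (significand <= kMaxExact && exp10 >= -22 && exp10 <= 22) {
        const double d = static_cast<double>(significand);
        return exp10 < 0 ? d / kPow10[-exp10] : d * kPow10[exp10];
    }
    // Slow path: let the C library convert the backed-up characters.
    return std::strtod(text.c_str(), nullptr);
}
```

For "43.420273000000009" the significand 43420273000000009 exceeds 2^53, so the sketch takes the strtod() branch, which is why the parser has to keep the original characters around.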
To fulfill the above requirement, the parser needs to back up the correctly parsed characters from the stream (as some streams cannot read back). A helper template class GenericReader::NumberStream is designed for this. If full precision is set, then backup is required, and a specialized NumberStream will back up the characters during NumberStream::Take() into GenericReader::stack_. That stack_ was previously used only for storing the decoded characters in ParseString().
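A rough sketch of the backup idea, with a template specialized on a bool so that the non-backup case costs nothing extra. All names here (`ForwardStream`, `NumberStreamSketch`, `ConsumeDigits`) are hypothetical, and a `std::string` stands in for the reader's stack_; the real NumberStream in the PR may differ in its details.

```cpp
#include <string>

// Minimal forward-only input stream (stand-in for a real stream type).
struct ForwardStream {
    const char* cursor;
    char Peek() const { return *cursor; }
    char Take() { return *cursor++; }
};

// Primary template: no backup, Take() simply forwards to the stream.
template <typename Stream, bool backup>
class NumberStreamSketch {
public:
    NumberStreamSketch(Stream& s, std::string&) : stream_(s) {}
    char Peek() const { return stream_.Peek(); }
    char Take() { return stream_.Take(); }
private:
    Stream& stream_;
};

// Specialization: every consumed character is also pushed onto the buffer,
// so the number's text is still available after the stream has moved on.
template <typename Stream>
class NumberStreamSketch<Stream, true> {
public:
    NumberStreamSketch(Stream& s, std::string& stack)
        : stream_(s), stack_(stack) {}
    char Peek() const { return stream_.Peek(); }
    char Take() {
        const char c = stream_.Take();
        stack_.push_back(c);
        return c;
    }
private:
    Stream& stream_;
    std::string& stack_;
};

// Consumes a run of digits and returns whatever was backed up.
template <bool fullPrecision>
std::string ConsumeDigits(const char* text) {
    ForwardStream stream{text};
    std::string stack;
    NumberStreamSketch<ForwardStream, fullPrecision> ns(stream, stack);
    while (ns.Peek() >= '0' && ns.Peek() <= '9')
        ns.Take();
    return stack;
}
```

Because the bool is a template parameter, the choice is made at compile time, which matches the goal that the option has no cost when the flag is off.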
Unit test
Added random numbers to test more cases for integer types and double.
Experimental results show that full-precision parsing generates the exact representation (no error), while normal-precision parsing has a maximum error of 3 ULP.
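For reference, an error in ULP between two doubles can be measured by comparing their bit patterns. The sketch below is one common way to do it and is not taken from the PR's unit tests; `OrderedBits` and `UlpDistance` are hypothetical names, and the code assumes IEEE 754 doubles and finite inputs.

```cpp
#include <cstdint>
#include <cstring>
#include <limits>

// Maps a double's bit pattern to an integer that increases monotonically
// with the value (assumes IEEE 754 binary64, finite inputs).
std::int64_t OrderedBits(double d) {
    std::int64_t bits;
    std::memcpy(&bits, &d, sizeof bits);
    // Negative doubles are sign-magnitude: convert to minus the magnitude.
    return bits < 0 ? std::numeric_limits<std::int64_t>::min() - bits : bits;
}

// Number of representable doubles between a and b (0 when equal).
std::uint64_t UlpDistance(double a, double b) {
    const std::int64_t oa = OrderedBits(a);
    const std::int64_t ob = OrderedBits(b);
    // Subtract in unsigned arithmetic to avoid signed overflow.
    const std::uint64_t ua = static_cast<std::uint64_t>(oa);
    const std::uint64_t ub = static_cast<std::uint64_t>(ob);
    return oa > ob ? ua - ub : ub - ua;
}
```

A test loop would then compare `UlpDistance(parsed, expected)` against 0 for full-precision parsing, or against a small bound for normal-precision parsing.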
Denormal numbers () are not tested, as their handling varies among platforms. Implementations of strtod() in standard libraries may also simply flush denormals to zero.
Fix #120