
Loss of precision when serializing <double> #360

Closed
matspetter opened this issue Nov 17, 2016 · 39 comments · Fixed by #915
Assignee: nlohmann
Labels: solution: proposed fix (a fix for the issue has been proposed and waits for confirmation)
Milestone: Release 3.1.0

Comments

@matspetter

It seems that precision is lost when serializing a double. I cannot say why, since std::numeric_limits<double>::digits10 should provide enough digits!?
But if I change that to std::numeric_limits<double>::max_digits10, then I'm not losing anything.
I'm NOT using any "long double" types, only "double".

@nlohmann
Owner

Do you have an example to check?

@gregmarr
Contributor

digits10
number of decimal digits that can be represented without change
that is, any number with this many decimal digits can be converted to a value of type T and back to decimal form, without change due to rounding or overflow.

max_digits10
number of decimal digits necessary to differentiate all values of this type
that is, the number of base-10 digits that are necessary to uniquely represent all distinct values of the type T, such as necessary for serialization/deserialization to text.
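
For the usual IEEE-754 mapping (assuming float is binary32 and double is binary64, which is the common case), these constants work out to 6/9 and 15/17; a quick check:

#include <iostream>
#include <limits>

int main()
{
    // IEEE-754 binary32 / binary64
    std::cout << std::numeric_limits<float>::digits10 << ' '
              << std::numeric_limits<float>::max_digits10 << '\n';   // 6 9
    std::cout << std::numeric_limits<double>::digits10 << ' '
              << std::numeric_limits<double>::max_digits10 << '\n';  // 15 17
}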

@matspetter
Author

matspetter commented Nov 17, 2016

OK, let's see. This code was run on a Mac.
g++ --version

Configured with: --prefix=/Applications/Xcode.app/Contents/Developer/usr --with-gxx-include-dir=/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.12.sdk/usr/include/c++/4.2.1
Apple LLVM version 8.0.0 (clang-800.0.42.1)
Target: x86_64-apple-darwin16.1.0
#include <string>
#include <sstream>
#include <iostream>
#include <limits>

int main(int argc,char** argv)
{
    double v = 100000000000.1236;

    int p1 = std::numeric_limits<double>::digits10;
    int p2 = std::numeric_limits<double>::max_digits10;

    // stream with precision == digits10
    std::stringstream ss;
    ss.precision(p1);
    ss << v;
    std::cout << "digits10     " << p1 << ": " << ss.str() << std::endl;

    // stream with precision == max_digits10
    std::stringstream ss2;
    ss2.precision(p2);
    ss2 << v;
    std::cout  << "max_digits10 " << p2 << ": " << ss2.str() << std::endl;

    // Read back and compare with original
    double v1,v2;
    ss >> v1;
    ss2 >> v2;

    std::cout << "v==v1 : " << ((v==v1)?"true":"false") << std::endl;
    std::cout << "v==v2 : " << ((v==v2)?"true":"false") << std::endl;
}

output:

digits10     15: 100000000000.124
max_digits10 17: 100000000000.1236
v==v1 : false
v==v2 : true

It is not easy with floating point and precision, but this tells me that the streaming seems more correct when using "max_digits10".

@matspetter
Author

It also gives the same result/output on an Ubuntu system:
g++ --version
g++ (Ubuntu 5.4.0-6ubuntu1~16.04.4) 5.4.0 20160609

uname -a
Linux mbergg12-ubuntu15 4.4.0-47-generic #68-Ubuntu SMP Wed Oct 26 19:39:52 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

@TurpentineDistillery

This is not a bug, but a reality of dealing with floating-point representation.
If one uses more than digits10 digits of precision, then string->value->string is not guaranteed to round-trip.

@gregmarr
Contributor

gregmarr commented Nov 19, 2016

@TurpentineDistillery However, using max_digits10 on output guarantees that value->string->value will not change. The difference is that not all strings with max_digits10 digits can be generated by a value->string conversion at max_digits10 precision. So as long as the string was generated by outputting a value, it's safe. The issue comes when the string was just an arbitrary string of digits.
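
A minimal iostream sketch of that distinction (independent of this library): the value->string->value direction round-trips at max_digits10 precision, while the text itself changes.

#include <iostream>
#include <limits>
#include <sstream>

int main()
{
    const int p = std::numeric_limits<double>::max_digits10; // 17 for IEEE double

    double v = 2312.42;
    std::stringstream ss;
    ss.precision(p);
    ss << v;                       // value -> string, e.g. "2312.4200000000001"

    double v2;
    ss >> v2;                      // string -> value
    std::cout << ss.str() << '\n';
    std::cout << (v == v2) << '\n';               // 1: the value round-trips exactly
    std::cout << (ss.str() == "2312.42") << '\n'; // 0: the text does not
}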

@nlohmann
Owner

This issue can be closed, right?

@matspetter
Author

matspetter commented Nov 24, 2016 via email

@nlohmann nlohmann added the solution: invalid the issue is not related to the library label Nov 24, 2016
@nlohmann
Owner

Thanks for the quick response!

@timueller

I don't see how this is not a bug. The OP's example exactly illustrates the problem of value -> string -> value serialization/deserialization. We need to store double precision data in a json string form and read it back without loss of precision and have run into exactly the same problem.

@nlohmann
Owner

What would you propose?

@timueller

Is there a problem if you just switch to using max_digits10 for both dump() and operator<< ?

@nlohmann
Owner

Then numbers like 2312.42 would be round-tripped to 2312.4200000000001.

@abolz
Contributor

abolz commented Oct 11, 2017

I just ran into the same problem. In my opinion, the string->value->string round-trip isn't really relevant, the value->string->value round-trip is. And the (minimum) number of decimal digits to distinguish all floating-point values is max_digits10, not digits10.

Note that currently "2312.4200000000001" does not "round-trip" to "2312.4200000000001". Actually a string->value->string round-trip cannot be guaranteed unless the first string is generated by the same implementation as the second (and then there is an initial value->string->value round-trip required...).

So I think that value->string->value should be guaranteed (when using the same library for serialization/deserialization).

@nlohmann
Owner

See #360 (comment).

@gregmarr
Contributor

@nlohmann My comment actually supports using max_digits10. It was a comment on this statement:

If one uses more than digits10 digits of precision, then string->value->string is not guaranteed to round-trip.

I was pointing out that this is true, but only for strings that were written by a value->string conversion not using the full precision, or were written by hand. As such, I think it's fine for those values to not be preserved exactly.

@abolz
Contributor

abolz commented Oct 11, 2017

I think @gregmarr is saying the same. My expectation is that if I have a json object, assign any double precision value to it, serialize it to disk and later deserialize it again, the values should be exactly equal. This requires using max_digits10.

@nlohmann
Owner

I think one reason for the status quo was the roundtrip results of https://github.com/miloyip/nativejson-benchmark. I'm not sure whether there exists the one right solution, so we need to make a decision.

@abolz
Contributor

abolz commented Oct 28, 2017

I have an implementation of the Grisu2 algorithm for printing floating-point numbers, based on the reference implementation by Florian Loitsch. It works for IEEE float/double (but not long double) and produces a short representation which is guaranteed to round-trip. I could submit a PR if you like, so we have something to start with.
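
For illustration only (this is not the Grisu2 algorithm mentioned above, just a brute-force sketch of the "shortest representation that still round-trips" idea for finite values):

#include <cstdio>
#include <cstdlib>
#include <limits>
#include <string>

// Try increasing precision until parsing the text back yields the exact
// same double; Grisu2 computes such a representation directly, without a retry loop.
std::string shortest_roundtrip(double v)
{
    char buf[64];
    for (int prec = 1; prec <= std::numeric_limits<double>::max_digits10; ++prec)
    {
        std::snprintf(buf, sizeof(buf), "%.*g", prec, v);
        if (std::strtod(buf, nullptr) == v)
            break;
    }
    return buf;
}

int main()
{
    std::puts(shortest_roundtrip(2312.42).c_str());            // "2312.42"
    std::puts(shortest_roundtrip(100000000000.1236).c_str());  // "100000000000.1236"
}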

@nlohmann nlohmann reopened this Oct 28, 2017
@nlohmann nlohmann added state: please discuss please discuss the issue or vote for your favorite option and removed solution: invalid the issue is not related to the library labels Oct 28, 2017
@ojwoodford

ojwoodford commented Oct 30, 2017

I just hit this issue. I store unit test data in JSON and a new unit test is failing because of this loss of precision. Is there any reason why std::setprecision shouldn't work on an ostream I'm passing a json object into?

@gregmarr
Contributor

It doesn't use the ostream formatting for floating point numbers.
https://github.com/nlohmann/json/blob/master/src/json.hpp#L8324
https://github.com/nlohmann/json/blob/develop/src/json.hpp#L6701

If you change these to max_digits10 instead of digits10 it should fix your failures.

@pvleuven

pvleuven commented Nov 17, 2017

I just ran into the same issue as @gregmarr described, and switching to max_digits10 seems to work:

json j1 = 1.0 / 3.0;
json j2 = json::parse( j1.dump() );
bool is_eq = *( j1.get_ptr<json::number_float_t*>() ) == *( j2.get_ptr<json::number_float_t*>() );
// is_eq is false with digits10, true with max_digits10

@nlohmann
Owner

Hi all. I shall change digits10 to max_digits10.

A lot of test cases fail. It seems that they focus on the string->number->string case:

      Start 36: test-inspection_all
36/70 Test #36: test-inspection_all .................***Failed    5.77 sec

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
test-inspection is a Catch v1.9.7 host application.
Run with -? for options

-------------------------------------------------------------------------------
object inspection
  serialization
  dump and floating-point numbers
-------------------------------------------------------------------------------
../test/src/unit-inspection.cpp:235
...............................................................................

../test/src/unit-inspection.cpp:238: FAILED:
  CHECK( s.find("42.23") != std::string::npos )
with expansion:
  18446744073709551615 (0xffffffffffffffff)
  !=
  18446744073709551615 (0xffffffffffffffff)

-------------------------------------------------------------------------------
object inspection
  serialization
  dump and small floating-point numbers
-------------------------------------------------------------------------------
../test/src/unit-inspection.cpp:241
...............................................................................

../test/src/unit-inspection.cpp:244: FAILED:
  CHECK( s.find("1.23456e-78") != std::string::npos )
with expansion:
  18446744073709551615 (0xffffffffffffffff)
  !=
  18446744073709551615 (0xffffffffffffffff)

===============================================================================
test cases:   1 |   0 passed | 1 failed
assertions: 150 | 148 passed | 2 failed
      Start 62: test-regression_all
62/70 Test #62: test-regression_all .................***Failed    5.59 sec

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
test-regression is a Catch v1.9.7 host application.
Run with -? for options

-------------------------------------------------------------------------------
regression tests
  issue #228 - double values are serialized with commas as decimal points
-------------------------------------------------------------------------------
../test/src/unit-regression.cpp:417
...............................................................................

../test/src/unit-regression.cpp:453: FAILED:
  CHECK( j1a.dump() == "2312.42" )
with expansion:
  "2312.4200000000001" == "2312.42"

../test/src/unit-regression.cpp:454: FAILED:
  CHECK( j1b.dump() == "2312.42" )
with expansion:
  "2312.4200000000001" == "2312.42"

../test/src/unit-regression.cpp:462: FAILED:
  CHECK( ss.str() == "4.712,112312.42" )
with expansion:
  "4.712,112312.4200000000001"
  ==
  "4.712,112312.42"

../test/src/unit-regression.cpp:464: FAILED:
  CHECK( ss.str() == "4.712,112312.4247,11" )
with expansion:
  "4.712,112312.420000000000147,11"
  ==
  "4.712,112312.4247,11"

../test/src/unit-regression.cpp:466: FAILED:
  CHECK( j2a.dump() == "23.42" )
with expansion:
  "23.420000000000002" == "23.42"

-------------------------------------------------------------------------------
regression tests
  issue #380 - bug in overflow detection when parsing integers
-------------------------------------------------------------------------------
../test/src/unit-regression.cpp:784
...............................................................................

../test/src/unit-regression.cpp:788: FAILED:
  CHECK( j.dump() == "1.66020696663386e+20" )
with expansion:
  "1.6602069666338596e+20"
  ==
  "1.66020696663386e+20"

===============================================================================
test cases:   1 |   0 passed | 1 failed
assertions: 408 | 402 passed | 6 failed
      Start 66: test-testsuites_all
66/70 Test #66: test-testsuites_all .................***Failed    0.07 sec

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
test-testsuites is a Catch v1.9.7 host application.
Run with -? for options

-------------------------------------------------------------------------------
compliance tests from nativejson-benchmark
  roundtrip
-------------------------------------------------------------------------------
../test/src/unit-testsuites.cpp:281
...............................................................................

../test/src/unit-testsuites.cpp:328: FAILED:
  CHECK( j.dump() == json_string )
with expansion:
  "[1.2344999999999999]" == "[1.2345]"
with messages:
  filename := "test/data/json_roundtrip/roundtrip22.json"
  json_string := "[1.2345]"

../test/src/unit-testsuites.cpp:328: FAILED:
  CHECK( j.dump() == json_string )
with expansion:
  "[-1.2344999999999999]" == "[-1.2345]"
with messages:
  filename := "test/data/json_roundtrip/roundtrip23.json"
  json_string := "[-1.2345]"

../test/src/unit-testsuites.cpp:328: FAILED:
  CHECK( j.dump() == json_string )
with expansion:
  "[2.2250738585071999e-308]"
  ==
  "[2.2250738585072e-308]"
with messages:
  filename := "test/data/json_roundtrip/roundtrip29.json"
  json_string := "[2.2250738585072e-308]"

===============================================================================
test cases:   7 |   6 passed | 1 failed
assertions: 974 | 971 passed | 3 failed

It would be great if you could have a look at these tests and tell me why it's OK to change or ignore them.

nlohmann added a commit that referenced this issue Nov 28, 2017
@ojwoodford

A string is higher precision than a double (e.g. the former can represent 1.2345 exactly; the latter cannot), so converting from string -> double -> string can lead to a change in value, whereas double -> string -> double should not. For this reason, it's not clear to me why you would have exact tests on the former; they should have a tolerance.
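
A small standalone illustration (plain printf, not library code): the 1.2345 from the roundtrip tests has no exact double representation, so printing it at full precision necessarily changes the text.

#include <cstdio>

int main()
{
    double d = 1.2345;              // stored as the nearest representable double
    std::printf("%.17g\n", d);      // 1.2344999999999999
    std::printf("%.15g\n", d);      // 1.2345 (rounding hides the difference)
}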

@gregmarr
Contributor

I think several of these came from an external benchmark that valued the "load and resave a JSON file with exact values" benchmark. I agree that those are not necessarily something that we should care about.

@nlohmann
Owner

The roundtrip tests (string -> JSON -> string) come from here: https://github.com/miloyip/nativejson-benchmark/tree/master/data/roundtrip

@lwinkler

lwinkler commented Dec 7, 2017

Another strange behavior happens with serialization. Here I serialized a double with the DBL_MAX value (1.79769e+308). The resulting string value becomes larger than DBL_MAX and cannot be parsed back. (I post this here as it seems to be related.)

#include <iostream>
#include <sstream>
#include <float.h>
#include <cstdio>
#include "json.hpp"

using nlohmann::json;
using namespace std;


int main() {
	double d = DBL_MAX;
	json js;
	js["max"] = d;

	stringstream ss;
	ss << js["max"].dump();
	json js2 = json::parse(ss.str());
	cout << js2.dump() << endl;
}

This results in:

terminate called after throwing an instance of 'nlohmann::detail::out_of_range'
  what():  [json.exception.out_of_range.406] number overflow parsing '1.79769313486232e+308'
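
What seems to be happening (a standalone sketch, assuming IEEE double): at 15 significant digits the last kept digit of DBL_MAX rounds up, so the printed text denotes a value larger than DBL_MAX and overflows when parsed back, while 17 digits round-trip exactly.

#include <cerrno>
#include <cfloat>
#include <cmath>
#include <cstdio>
#include <cstdlib>

int main()
{
    char buf[64];

    // 15 significant digits (digits10): rounds up past DBL_MAX.
    std::snprintf(buf, sizeof(buf), "%.15g", DBL_MAX);   // "1.79769313486232e+308"
    errno = 0;
    double back = std::strtod(buf, nullptr);
    std::printf("%s overflows: %d\n", buf, errno == ERANGE && back == HUGE_VAL);

    // 17 significant digits (max_digits10): round-trips exactly.
    std::snprintf(buf, sizeof(buf), "%.17g", DBL_MAX);   // "1.7976931348623157e+308"
    std::printf("%s equal: %d\n", buf, std::strtod(buf, nullptr) == DBL_MAX);
}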

@stale

stale bot commented Jan 6, 2018

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the state: stale the issue has not been updated in a while and will be closed automatically soon unless it is updated label Jan 6, 2018
@stale stale bot closed this as completed Jan 13, 2018
@nlohmann nlohmann removed the state: stale the issue has not been updated in a while and will be closed automatically soon unless it is updated label Jan 21, 2018
@nlohmann
Owner

Reopened to check whether #915 fixed this issue.

@nlohmann nlohmann reopened this Jan 21, 2018
@nlohmann
Owner

The example from #360 (comment) works now and outputs 1.7976931348623157e+308.

@nlohmann
Owner

The roundtrips from #360 (comment) work.

@nlohmann
Owner

Roundtripping 2312.42 (#360 (comment)) works now.

@nlohmann
Owner

Roundtripping 100000000000.1236 (#360 (comment)) works now.

@nlohmann
Owner

Thanks to #915 from @abolz, this issue is now fixed. Thanks everybody for your patience!

@nlohmann nlohmann added solution: proposed fix a fix for the issue has been proposed and waits for confirmation and removed state: please discuss please discuss the issue or vote for your favorite option labels Jan 27, 2018
@nlohmann nlohmann self-assigned this Jan 27, 2018
@nlohmann nlohmann added this to the Release 3.1.0 milestone Jan 27, 2018
@ghost

ghost commented Nov 14, 2018

This issue still seems to be there?

Here is some code that reproduces the issue:

#include <nlohmann/json.hpp>

int main()
{
    using Json = ::nlohmann::json;
    std::string json_text{ "{\"spot\": 21898.99}" };
    Json json = Json::parse(json_text);
    auto j = *json.find("spot");
    double val = j.get<double>();
    // val is 21898.990000000002
    // expecting 21898.99
}

Git SHA da81e7b
Windows 10 version 10.0.14939.0
Visual Studio 2017 15.9.0, C++ 19.15.26732.1
Windows SDK 10.0.17134.0

It seems as if this call to std::strtod in lexer.hpp is the problem:

static void strtof(double& f, const char* str, char** endptr) noexcept
{
    f = std::strtod(str, endptr);
}

@nlohmann
Owner

The library stores floating point numbers as double by default (you can change this in a template parameter).

The number 21898.99 will be stored as the double 21898.990000000002. This is the same number the parser comes up with after reading the string 21898.99. You can check in your debugger that both numbers are equal down to the bit.

(screenshot: debugger view showing both values have identical bit patterns)

https://www.exploringbinary.com/floating-point-converter/
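
A quick way to confirm this outside a debugger (standalone check, not library code): the compile-time literal and the value strtod parses from "21898.99" are bit-for-bit identical.

#include <cstdio>
#include <cstdlib>
#include <cstring>

int main()
{
    double from_literal = 21898.99;
    double from_string  = std::strtod("21898.99", nullptr);

    std::printf("%.17g\n", from_literal);   // 21898.990000000002
    std::printf("equal: %d  same bits: %d\n",
                from_literal == from_string,
                std::memcmp(&from_literal, &from_string, sizeof(double)) == 0);
}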

@ghost

ghost commented Nov 15, 2018

@nlohmann Thank you very much!
I found out this is not a problem in your library but in how decimal-to-floating-point conversion takes place, and in the limits of how floating-point numbers are stored in memory. I learnt something today!

@charpty

charpty commented Aug 9, 2021

I learnt something today too, thanks for sharing
