Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

strconv: FormatFloat rounding error #29491

Closed
cyberphone opened this issue Jan 1, 2019 · 17 comments
Closed

strconv: FormatFloat rounding error #29491

cyberphone opened this issue Jan 1, 2019 · 17 comments
Labels
FrozenDueToAge NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made.
Milestone

Comments

@cyberphone
Copy link

What version of Go are you using (go version)?

go version go1.11.4 windows/amd64

Does this issue reproduce with the latest release?

Yes

What operating system and processor architecture are you using (go env)?

go env Output
$ set GOARCH=amd64
set GOBIN=
set GOCACHE=C:\Users\Anders\AppData\Local\go-build
set GOEXE=.exe
set GOFLAGS=
set GOHOSTARCH=amd64
set GOHOSTOS=windows
set GOOS=windows
set GOPATH=C:\Users\Anders\go
set GOPROXY=
set GORACE=
set GOROOT=C:\Go
set GOTMPDIR=
set GOTOOLDIR=C:\Go\pkg\tool\windows_amd64
set GCCGO=gccgo
set CC=gcc
set CXX=g++
set CGO_ENABLED=1
set GOMOD=
set CGO_CFLAGS=-g -O2
set CGO_CPPFLAGS=
set CGO_CXXFLAGS=-g -O2
set CGO_FFLAGS=-g -O2
set CGO_LDFLAGS=-g -O2
set PKG_CONFIG=pkg-config
set GOGCCFLAGS=-m64 -mthreads -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=C:\Users\Anders\AppData\Local\Temp\go-build2129905
62=/tmp/go-build -gno-record-gcc-switches

What did you do?

I'm trying to implement a JSON canonicalization scheme requiring numbers to be expressed in the shortest way. prec -1 seemed like a better alternative than porting V8's solution but unfortunately it seems that the -1 option is broken:

package main

import (
    "strconv"
    "math"
    "fmt"
)

func fpformat(hex string, format byte, precision int, expected string) {
    ieeeU64, err := strconv.ParseUint(hex, 16, 64)
    if err != nil {
        panic(hex)
    }
    ieeeF64 := math.Float64frombits(ieeeU64)
    shortestNumber := strconv.FormatFloat(ieeeF64, format, -1, 64)
    moreDecimals := strconv.FormatFloat(ieeeF64, format, precision, 64)
    plusOneDecimal := strconv.FormatFloat(ieeeF64, format, precision + 1, 64)
    fmt.Printf("\nfloat64hex: %s\n", hex)
    fmt.Printf("   actual using precision[-1]: %s\n", shortestNumber)  // INCORRECT in go1.11.4 windows/amd64
    fmt.Printf(" expected using precision[-1]: %s\n", expected)
    fmt.Printf("   actual using precision[%2d]: %s\n", precision, moreDecimals)
    fmt.Printf("   actual using precision[%2d]: %s\n", precision + 1, plusOneDecimal)
}

func main() {
    fpformat("439babe4b56e8a39", 'f', 0, "498484681984085570")
    fpformat("c4dee27bdef22651", 'g', 17, "-5.8339553793802237e+23")
}

What did you expect to see?

float64hex: 439babe4b56e8a39
   actual using precision[-1]: 498484681984085570
 expected using precision[-1]: 498484681984085570
   actual using precision[ 0]: 498484681984085568
   actual using precision[ 1]: 498484681984085568.0

float64hex: c4dee27bdef22651
   actual using precision[-1]: -5.8339553793802237e+23
 expected using precision[-1]: -5.8339553793802237e+23
   actual using precision[17]: -5.8339553793802237e+23
   actual using precision[18]: -5.83395537938022366e+23

What did you see instead?

float64hex: 439babe4b56e8a39
   actual using precision[-1]: 498484681984085560
 expected using precision[-1]: 498484681984085570
   actual using precision[ 0]: 498484681984085568
   actual using precision[ 1]: 498484681984085568.0

float64hex: c4dee27bdef22651
   actual using precision[-1]: -5.8339553793802236e+23
 expected using precision[-1]: -5.8339553793802237e+23
   actual using precision[17]: -5.8339553793802237e+23
   actual using precision[18]: -5.83395537938022366e+23
@ALTree
Copy link
Member

ALTree commented Jan 1, 2019

I'm not sure I understand what issue you are reporting. The -1 option for prec is documented as

The special precision -1 uses the smallest number of digits necessary such that ParseFloat will return f exactly.

And this seems to hold for both your numbers:

package main

import (
	"fmt"
	"math"
	"strconv"
)

func test(num string) {
	n, err := strconv.ParseUint(num, 16, 64)
	if err != nil {
		panic("main")
	}

	f := math.Float64frombits(n)
	s := strconv.FormatFloat(f, 'f', -1, 64)
	g, err := strconv.ParseFloat(s, 64)
	if err != nil {
		panic("main")
	}

	fmt.Printf("%b\n", f)
	fmt.Printf("%b\n", g)
	fmt.Println(f == g)
}

func main() {
	test("439babe4b56e8a39")
	test("c4dee27bdef22651")
}

Note how f (the number I called FormatFloat on with prec -1) and g (the number returned by ParseFloat when called on the string from f) are bit-per-bit equal.

@ALTree ALTree changed the title strconv.FormatFloat rounding error strconv: FormatFloat rounding error Jan 1, 2019
@cyberphone
Copy link
Author

As you can see in my example using precision 0 for "integers" result in a value which after rounding should give another result while still being compatible with both ParseFloat and V8/EcmaScript.

This is a request rather than a bug. I did port 2000 lines to .NET but that wasn't a pleasure :(

@cyberphone
Copy link
Author

@katiehockman katiehockman added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Jan 2, 2019
@bmkessler
Copy link
Contributor

I'm not sure if this falls into a proposed change since (as noted above) the documentation only states that precision -1 outputs one of the shortest decimal representations that satisfies the round trip property, but I think the expectation is that the -1 precision output is also correctly rounded to the closest decimal representation to the input. Note that issue #2625 was resolved to produce the closest decimal. I believe that closest decimal is what other programming languages typically provide as well. The bug appears to be in this section of roundShortest:

// Okay to round up if upper has a different digit and either upper
// is inclusive or upper is bigger than the result of rounding up.
okup := m != u && (inclusive || m+1 < u || i+1 < upper.nd)

The round up comparison is byte-by-byte and so incorrectly fails for these inputs:

lower: 498484681984085536
d:     498484681984085568
upper: 498484681984085600

49848468198408557 < 49848468198408560 so should have okup == true
but '7' > '0' in the last digit

lower: 583395537938022332891136
d:     583395537938022366445568
upper: 583395537938022400000000

58339553793802237 < 58339553793802240 so should have okup == true
but '7' > '0' in the last digit

Adding @rsc as the author of the conversion routine.

@cyberphone
Copy link
Author

cyberphone commented Jan 5, 2019

The rounding issue also applies to a small set of floating point numbers:

Hex: c4af4eaaa5e19289

g -1: 
-7.3922251744542715e+22

g 17:
-7.3922251744542716e+22

@ALTree ALTree added NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made. and removed NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. labels Jan 8, 2019
@ALTree ALTree added this to the Go1.13 milestone Jan 8, 2019
@bmkessler
Copy link
Contributor

That case is also the same issue with byte-wise comparison to '0'

lower: 73922251744542711611392
d:     73922251744542715805696
upper: 73922251744542720000000

73922251744542716 < 73922251744542720 so should have okup == true
but '6' > '0' in the last digit

@gopherbot
Copy link
Contributor

Change https://golang.org/cl/157697 mentions this issue: strconv: fix rounding in FormatFloat fallback path

@cespare
Copy link
Contributor

cespare commented Jan 13, 2019

Though this bug has been around for a long time, I also happened to independently discover it a few days ago while implementing Ryu (see #15672 (comment)). I sent a CL though it will have to wait for Go 1.13.

Thanks @bmkessler for the very helpful analysis.

@cyberphone
Copy link
Author

@cespare I tried the Java version of Ryu on x0000000000000001 and it returned 4.9e-324 which differs from the current Go implementation which returns 5e-324 (which also is compliant with JavaScript).

@cespare
Copy link
Contributor

cespare commented Jan 13, 2019

@cyberphone that's just a bug in the Java code. See ulfjack/ryu#83. My Ryu code gives 5e-324.

@cyberphone
Copy link
Author

@cespare Do you think I should try to run my 100M test suite against your Ryu implementation?

https://github.com/cyberphone/json-canonicalization/tree/master/testdata#es6-numbers

@ulfjack
Copy link

ulfjack commented Jan 14, 2019

The Java implementation attempts to follow the specification of Java's Double.toString [1], which says that it must return at least two digits. Out of all two-digit numbers, 4.9e-324 is closest to the exact floating point value in this case, so that's what it returns. The problem described in ulfjack/ryu#83 is that it sometimes doesn't return the closest two-digit number because there's a one-digit number that's shorter. The C implementation doesn't do that, and I have not found any differences between it and Grisu over a fairly large number of tests.

[1] https://docs.oracle.com/javase/7/docs/api/java/lang/Double.html#toString(double)

There must be at least one digit to represent the fractional part

@ulfjack
Copy link

ulfjack commented Jan 14, 2019

I updated the README for Ryu, and have a feature request open for the Java implementation to generate shortest output as well at ulfjack/ryu#87. Thanks @cyberphone for bringing this to my attention.

remyoudompheng pushed a commit to remyoudompheng/go that referenced this issue Jan 16, 2019
Float formatting uses a multiprecision fallback path where Grisu3
algorithm fails. This has a bug during the rounding phase: the
difference between the decimal value and the upper bound is examined
byte-by-byte and doesn't properly handle the case where the first
divergence has a difference of 1.

For instance (using an example from golang#29491), for the number
498484681984085570, roundShortest examines the three decimal values:

lower: 498484681984085536
d:     498484681984085568
upper: 498484681984085600

After examining the 16th digit, we know that rounding d up will fall
within the bounds unless all remaining digits of d are 9 and all
remaining digits of upper are 0:

d:     ...855xx
upper: ...856xx

However, the loop forgets that d and upper have already diverged and
then on the next iteration sees that the 17th digit of d is actually
lower than the 17th digit of upper and decides that we still can't round
up:

d:     ...8556x
upper: ...8560x

Thus the original value is incorrectly rounded down to
498484681984085560 instead of the closer (and equally short)
498484681984085570.

Thanks to Brian Kessler for diagnosing this bug.

Fix it by remembering when we've seen divergence in previous digits.

This CL also fixes another bug in the same loop: for some inputs, the
decimal value d or the lower bound may have fewer digits than the upper
bound, yet the iteration through the digits starts at i=0 for each of
them. For instance, given the float64 value 1e23, we have

d:      99999999999999991611392
upper: 100000000000000000000000

but the loop starts by comparing '9' to '1' rather than '0' to '1'.

I haven't found any cases where this second bug causes incorrect output
because when the digit comparison fails on the first loop iteration the
upper bound always has more nonzero digits (i.e., the expression
'i+1 < upper.nd' is always true).

Fixes golang#29491

Change-Id: I58856a7a2e47935ec2f233d9f717ef15c78bb2d0
remyoudompheng pushed a commit to remyoudompheng/go that referenced this issue Jan 19, 2019
Float formatting uses a multiprecision fallback path where Grisu3
algorithm fails. This has a bug during the rounding phase: the
difference between the decimal value and the upper bound is examined
byte-by-byte and doesn't properly handle the case where the first
divergence has a difference of 1.

For instance (using an example from golang#29491), for the number
498484681984085570, roundShortest examines the three decimal values:

lower: 498484681984085536
d:     498484681984085568
upper: 498484681984085600

After examining the 16th digit, we know that rounding d up will fall
within the bounds unless all remaining digits of d are 9 and all
remaining digits of upper are 0:

d:     ...855xx
upper: ...856xx

However, the loop forgets that d and upper have already diverged and
then on the next iteration sees that the 17th digit of d is actually
lower than the 17th digit of upper and decides that we still can't round
up:

d:     ...8556x
upper: ...8560x

Thus the original value is incorrectly rounded down to
498484681984085560 instead of the closer (and equally short)
498484681984085570.

Thanks to Brian Kessler for diagnosing this bug.

Fix it by remembering when we've seen divergence in previous digits.

This CL also fixes another bug in the same loop: for some inputs, the
decimal value d or the lower bound may have fewer digits than the upper
bound, yet the iteration through the digits starts at i=0 for each of
them. For instance, given the float64 value 1e23, we have

d:      99999999999999991611392
upper: 100000000000000000000000

but the loop starts by comparing '9' to '1' rather than '0' to '1'.

I haven't found any cases where this second bug causes incorrect output
because when the digit comparison fails on the first loop iteration the
upper bound always has more nonzero digits (i.e., the expression
'i+1 < upper.nd' is always true).

Fixes golang#29491

Change-Id: I58856a7a2e47935ec2f233d9f717ef15c78bb2d0
@remyoudompheng
Copy link
Contributor

At exponent 64, every 1250th float64 is affected by the issue (and is fixed by the patch)

  for k := 0; k < 1000; k++ {
        mant := (36028797018967+10*k)*125 + 62
        f := math.Ldexp(float64(mant), 12)
        t.Logf("%.19e %v", f, f)
  }

Output:

    fptest_test.go:141: 1.8446744073711357952e+19 1.8446744073711357e+19
    fptest_test.go:141: 1.8446744073716477952e+19 1.8446744073716477e+19
    fptest_test.go:141: 1.8446744073721597952e+19 1.8446744073721597e+19
    fptest_test.go:141: 1.8446744073726717952e+19 1.8446744073726717e+19
    fptest_test.go:141: 1.8446744073731837952e+19 1.8446744073731837e+19
    fptest_test.go:141: 1.8446744073736957952e+19 1.8446744073736957e+19
    fptest_test.go:141: 1.8446744073742077952e+19 1.8446744073742077e+19

@jackhftang
Copy link

Not if it is the same bug, but this one is relatively small.

strconv.FormatFloat(255.255, 'f', 2, 64) 

The above output "255.25", but I expect "255.26".

go version go1.12.1 darwin/amd64

@remyoudompheng
Copy link
Contributor

Not if it is the same bug, but this one is relatively small.

strconv.FormatFloat(255.255, 'f', 2, 64) 

The above output "255.25", but I expect "255.26".

go version go1.12.1 darwin/amd64

No, 255.25 is the correct rounding. float64(255.255) is slightly less than the actual 255.255.

fmt.Printf("%.18f\n", float64(255.255))
-> 255.254999999999995453

@cyberphone
Copy link
Author

Apparently JavaScript agree:

Number.parseFloat("255.255").toPrecision(5)
> "255.25"

The rounding is performed at the binary level which is one reason to why floating point is not suitable for monetary amounts

@golang golang locked and limited conversation to collaborators May 22, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made.
Projects
None yet
Development

No branches or pull requests

9 participants