
Regression: BigDecimal is broken on 2.13.0, works on 2.12.8 #11590

Closed
35VLG84 opened this issue Jun 25, 2019 · 22 comments

@35VLG84

35VLG84 commented Jun 25, 2019

Hello,

The following code works with Scala 2.12.8, but it fails with 2.13.0.

Scala 2.12.8 is ok:

Welcome to Scala 2.12.8 (OpenJDK 64-Bit Server VM, Java 1.8.0_212).
Type in expressions for evaluation. Or try :help.

scala> :paste
// Entering paste mode (ctrl-D to finish)

object BigDecimalTest {
  def main(args: Array[String]): Unit = {
    test()
  }

  def test(): Unit = {

     val bd1 = scala.math.BigDecimal("1000000000000000000000000.1")
     val bd2 = scala.math.BigDecimal("2000000000000000000000.0002")
     val bd3 = scala.math.BigDecimal("3000000000000000000.0000003")
     val bd4 = scala.math.BigDecimal("4000000000000000.0000000004")
     val bd5 = scala.math.BigDecimal("5000000000000.0000000000005")
     val bd6 = scala.math.BigDecimal("6000000000.0000000000000006")
     val bd7 = scala.math.BigDecimal("7000000.0000000000000000007")
     val bd8 = scala.math.BigDecimal("8000.0000000000000000000008")
     val bd9 = scala.math.BigDecimal("9.0000000000000000000000009")


     assert(bd9  == scala.math.BigDecimal("9.0000000000000000000000009"))
     assert(bd8 + bd9 == scala.math.BigDecimal("8009.0000000000000000000008009"))
     assert(bd7 + bd8 + bd9 == scala.math.BigDecimal("7008009.0000000000000000007008009"))
     assert(bd6 + bd7 + bd8 + bd9 == scala.math.BigDecimal("6007008009.0000000000000006007008009"))
     assert(bd5 + bd6 + bd7 + bd8 + bd9 == scala.math.BigDecimal("5006007008009.0000000000005006007008009"))
     assert(bd4 + bd5 + bd6 + bd7 + bd8 + bd9 == scala.math.BigDecimal("4005006007008009.0000000004005006007008009"))
     assert(bd3 + bd4 + bd5 + bd6 + bd7 + bd8 + bd9 == scala.math.BigDecimal("3004005006007008009.0000003004005006007008009"))
     assert(bd2 + bd3 + bd4 + bd5 + bd6 + bd7 + bd8 + bd9 == scala.math.BigDecimal("2003004005006007008009.0002003004005006007008009"))
     assert(bd1 + bd2 + bd3 + bd4 + bd5 + bd6 + bd7 + bd8 + bd9 == scala.math.BigDecimal("1002003004005006007008009.1002003004005006007008009"))
     println("BigDecimal works")
  }
}

// Exiting paste mode, now interpreting.

defined object BigDecimalTest

scala> BigDecimalTest.test
BigDecimal works

But Scala 2.13.0 breaks (it truncates the value and reduces the scale, hence the error):

Welcome to Scala 2.13.0 (OpenJDK 64-Bit Server VM, Java 1.8.0_212).
Type in expressions for evaluation. Or try :help.

scala> :paste
// Entering paste mode (ctrl-D to finish)

object BigDecimalTest {
  def main(args: Array[String]): Unit = {
    test()
  }

  def test(): Unit = {

     val bd1 = scala.math.BigDecimal("1000000000000000000000000.1")
     val bd2 = scala.math.BigDecimal("2000000000000000000000.0002")
     val bd3 = scala.math.BigDecimal("3000000000000000000.0000003")
     val bd4 = scala.math.BigDecimal("4000000000000000.0000000004")
     val bd5 = scala.math.BigDecimal("5000000000000.0000000000005")
     val bd6 = scala.math.BigDecimal("6000000000.0000000000000006")
     val bd7 = scala.math.BigDecimal("7000000.0000000000000000007")
     val bd8 = scala.math.BigDecimal("8000.0000000000000000000008")
     val bd9 = scala.math.BigDecimal("9.0000000000000000000000009")


     assert(bd9  == scala.math.BigDecimal("9.0000000000000000000000009"))
     assert(bd8 + bd9 == scala.math.BigDecimal("8009.0000000000000000000008009"))
     assert(bd7 + bd8 + bd9 == scala.math.BigDecimal("7008009.0000000000000000007008009"))
     assert(bd6 + bd7 + bd8 + bd9 == scala.math.BigDecimal("6007008009.0000000000000006007008009"))
     assert(bd5 + bd6 + bd7 + bd8 + bd9 == scala.math.BigDecimal("5006007008009.0000000000005006007008009"))
     assert(bd4 + bd5 + bd6 + bd7 + bd8 + bd9 == scala.math.BigDecimal("4005006007008009.0000000004005006007008009"))
     assert(bd3 + bd4 + bd5 + bd6 + bd7 + bd8 + bd9 == scala.math.BigDecimal("3004005006007008009.0000003004005006007008009"))
     assert(bd2 + bd3 + bd4 + bd5 + bd6 + bd7 + bd8 + bd9 == scala.math.BigDecimal("2003004005006007008009.0002003004005006007008009"))
     assert(bd1 + bd2 + bd3 + bd4 + bd5 + bd6 + bd7 + bd8 + bd9 == scala.math.BigDecimal("1002003004005006007008009.1002003004005006007008009"))
     println("BigDecimal works")
  }
}

// Exiting paste mode, now interpreting.

defined object BigDecimalTest

scala> BigDecimalTest.test
java.lang.AssertionError: assertion failed
  at scala.Predef$.assert(Predef.scala:267)
  at BigDecimalTest$.test(<console>:22)
  ... 28 elided

It breaks for the first time at step bd6 + bd7 + bd8 + bd9, because the resulting value has reduced scale and precision.

@35VLG84 35VLG84 changed the title Regression: BigDecimal is broken on 2.13, works on 2.12.8 Regression: BigDecimal is broken on 2.13.0, works on 2.12.8 Jun 25, 2019
@dwijnand dwijnand added this to the 2.13.1 milestone Jun 25, 2019
@plokhotnyuk

plokhotnyuk commented Jun 25, 2019

I would say that it is broken in 2.12.8, because it doesn't take into account the math context with its rounding precision and rules, which is MathContext.DECIMAL128 by default.
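You can see the default on any instance; a minimal sketch (mc is the context each scala.math.BigDecimal carries):

import java.math.MathContext

val bd = scala.math.BigDecimal("9.0000000000000000000000009")
// The instance carries the default context: 34 significant digits, HALF_EVEN.
assert(bd.mc == MathContext.DECIMAL128)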

@35VLG84

35VLG84 commented Jun 25, 2019

The test case also works with 2.11.12. So even if 2.11 and 2.12 are broken with respect to the math context, this is a change in default behaviour and could break code with numerical calculations in subtle ways.

35VLG84 added a commit to e257-fi/tackler that referenced this issue Jun 25, 2019
There is behaviour change with Scala 2.13.0:
scala/bug#11590

Signed-off-by: 35V LG84 <35vlg84-x4e6b92@e257.fi>
@NthPortal

NthPortal commented Jun 25, 2019

possibly related: scala/scala#6884, scala/scala#7670

@plokhotnyuk

plokhotnyuk commented Jun 25, 2019

@35VLG84 it is broken behaviour (in both the 2.11 and 2.12 versions) that was ignored until 2.13...

@dwijnand

@plokhotnyuk What changes can a user make to their code to maintain the 2.11/2.12 behaviour (even if it's technically broken)? Is there some import or something they can add at all use-sites of BigDecimal?

@plokhotnyuk

You just need to use MathContext.UNLIMITED for the operations affected by the bug fixes mentioned above.
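For example, a sketch using the overload that takes a MathContext, reusing values from the report above:

import java.math.MathContext

val bd8 = scala.math.BigDecimal("8000.0000000000000000000008", MathContext.UNLIMITED)
val bd9 = scala.math.BigDecimal("9.0000000000000000000000009", MathContext.UNLIMITED)
// With an unlimited context on the operands, addition stays exact on 2.13.0:
assert(bd8 + bd9 == scala.math.BigDecimal("8009.0000000000000000000008009"))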

@dwijnand

For clarity: that's an optional constructor parameter of BigDecimal.

I'm happy to close this issue as "not a bug", having provided that detail for any users needing to mitigate the fix in 2.13.0.

@NthPortal

There has been talk of redesigning it so that the MathContext is an implicit passed to operations rather than a field of the class, but that's not guaranteed to ever happen, and if it does, it will not happen before 2.14.
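Roughly along these lines; a purely hypothetical sketch, nothing like this exists today:

import java.math.MathContext

// Hypothetical: the caller supplies the context per operation,
// instead of each value carrying one around.
final class Dec(val underlying: java.math.BigDecimal) {
  def +(that: Dec)(implicit mc: MathContext): Dec =
    new Dec(underlying.add(that.underlying, mc))
}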

@Ichoran

Ichoran commented Jun 25, 2019

BigDecimal is conceptually broken as it carries along a MathContext which isn't used symmetrically for symmetric operations (i.e. a + b can be different from b + a due to the (hidden) context of a and b). I think for 2.14 we should just throw it out entirely and rework it from scratch, paying careful attention to (1) mathematical correctness; (2) flexibility in cases where there is more than one sensible choice; and (3) performance.
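For instance, a minimal sketch of the asymmetry on 2.13.0, where the left operand's context wins:

import java.math.MathContext

val a = scala.math.BigDecimal("1e40", new MathContext(5))
val b = scala.math.BigDecimal("1", MathContext.UNLIMITED)
val x = a + b // rounded to a's 5 digits: 1.0000E+40
val y = b + a // exact under b's unlimited context
assert(x != y)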

If we can find a set of symmetrizing operations for MathContext that don't have disastrous behavior (e.g. requiring 2^n space after n operations), maybe we can keep the same name. Otherwise we should choose something else to make it clear that it works differently (e.g. BigReal).

So we can label this as "not a bug", I guess, but the entire design is kind of a bug. Everything relating to the current design forces tradeoffs that result in surprising behavior like the above.

@dwijnand

Sounds like a good plan to me.

Closing off this issue, meanwhile.

Do you want to open that directly as an issue here, or go via contributors.scala-lang.org first?

@35VLG84

35VLG84 commented Jun 25, 2019

There is a second problem, with sum:

Welcome to Scala 2.13.0 (OpenJDK 64-Bit Server VM, Java 1.8.0_212).
Type in expressions for evaluation. Or try :help.

scala> :paste
// Entering paste mode (ctrl-D to finish)

import java.math.MathContext

object BigDecimalTest {
  def main(args: Array[String]): Unit = {
    test()
  }

  def test(): Unit = {

     val bds = List(
     scala.math.BigDecimal("1000000000000000000000000.1", MathContext.UNLIMITED),
     scala.math.BigDecimal("9.0000000000000000000000009", MathContext.UNLIMITED))

     assert(bds.sum == scala.math.BigDecimal("1000000000000000000000009.1000000000000000000000009", MathContext.UNLIMITED))

     // Below line works with scala 2.13.0
     //assert(bds.foldLeft(BigDecimal(0, MathContext.UNLIMITED))(_ + _) == scala.math.BigDecimal("1000000000000000000000009.1000000000000000000000009", MathContext.UNLIMITED))
     println("BigDecimal works")
  }
}

// Exiting paste mode, now interpreting.

import java.math.MathContext
defined object BigDecimalTest

scala> BigDecimalTest.test
java.lang.AssertionError: assertion failed
  at scala.Predef$.assert(Predef.scala:267)
  at BigDecimalTest$.test(<console>:14)
  ... 28 elided

This is probably caused by the wrong math context on num.zero here: https://github.com/scala/scala/blob/6b4d32c3f518d21a798e8d3cf4a8c35866afa8e2/src/library/scala/collection/IterableOnce.scala#L915

I can't stress enough that this change in 2.13 is highly surprising, even though the original implementation in 2.11 and 2.12 was broken.

P.S. The above code works with 2.12.8.

@Ichoran

Ichoran commented Jun 25, 2019

Yeah, arguably the sum implementation should be reduceOption(_ + _).getOrElse(zero) (and similarly for product). This is something we can fix now.
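Until then, the same shape works as a user-side workaround; a sketch reusing the values from the report above:

import java.math.MathContext

val bds = List(
  scala.math.BigDecimal("1000000000000000000000000.1", MathContext.UNLIMITED),
  scala.math.BigDecimal("9.0000000000000000000000009", MathContext.UNLIMITED))

// reduceOption never introduces a foreign zero, so the operands' context is used;
// foldLeft also works if the seed's context is chosen explicitly.
val viaReduce = bds.reduceOption(_ + _).getOrElse(scala.math.BigDecimal(0))
val viaFold   = bds.foldLeft(scala.math.BigDecimal(0, MathContext.UNLIMITED))(_ + _)
assert(viaReduce == viaFold)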

@dwijnand

There is a second problem, with sum:

Created #11592 for that. Thanks for the report, @35VLG84

@zapov

zapov commented Feb 3, 2021

The reason you don't want to use the MathContext for addition is that addition does not change the number of decimals in the operation.
Multiplication and division do change them, so it is expected that they use the MC, but using the MC for addition just leads to subtle bugs ;(

We were trying to upgrade an app and this was detected in several places. The suggested workarounds in this thread are not viable at all.
The most common problem we have is that numbers which were previously
0.005000000000000000000000000000000
are now represented as
0.004999999999999999999999999999981

which trips up rounding to 2 decimals ;(
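The absorption effect behind this is easy to reproduce under the 2.13 defaults; a minimal sketch:

val big  = BigDecimal("1e35")
val tiny = BigDecimal("0.005")
// The exact sum has 39 significant digits; rounded to the default 34, the 0.005 vanishes:
assert((big + tiny) - big == BigDecimal(0))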

@Ichoran

Ichoran commented Feb 3, 2021

@zapov - I don't understand how you could be suffering the problems you state.

scala> val mc = new java.math.MathContext(40)
mc: java.math.MathContext = precision=40 roundingMode=HALF_UP

scala> val a = BigDecimal("0.004999999999999999999999999999981", mc)
a: scala.math.BigDecimal = 0.004999999999999999999999999999981

scala> val r = a.round(new java.math.MathContext(2))
r: scala.math.BigDecimal = 0.0050

This seems to round correctly. Can you be more precise about how that value is being (incorrectly) generated, and how your rounding fails?

@zapov

zapov commented Feb 4, 2021

They are the result of a different addition operation, due to the rounding error being used in such operations.
Basically something along the lines of:

val numbers = roundWithDenomination(bd1 + bd2 + bd_previous_rounding_error)

Keep in mind that this denomination does not need to be 0.01.
It can also be 0.25 or something similar, so it's not just rounding to a decimal place.

While I did fix this particular place by adding BigDecimal(0, UNLIMITED) at the beginning, BigDecimals are added and subtracted all over the codebase and it's not really realistic to go through each of them and check whether they need special handling.

The 0.004999999999999999999999999999981 is the result of precision loss during addition, so while I could do rounding like you suggested, it would also incorrectly round numbers which really are less than 0.005.

It seems to me that the original PR which changed this (scala/scala#6884) was the result of trying to unify code which is not really understood. Reasoning that * and / use the MC, so + and - should too, is not really good enough.
Especially since the documentation (https://www.scala-lang.org/api/2.13.4/scala/math/BigDecimal.html) explicitly states that "BigDecimal maintains a MathContext that determines the rounding that is applied to certain calculations" and that "Rounding will be applied on those mathematical operations that can dramatically change the number of digits in a full representation, namely multiplication, division, and powers".

I've tried reading through the various explanations of the change and didn't really find one. Even if there is one (please do tell), how can you change such fundamental behavior in the language?

@Ichoran

Ichoran commented Feb 4, 2021

First of all, this isn't "fundamental behavior in the language". It's a library, and you can take it and rewrite it (call it BigDec, perhaps) and put in any logic you please. So whatever this library happens to do, a fix is only a few hours of work away. You can grep for "BigDecimal" in your code, replace it with "BigDec" (which could just be the code taken from 2.12, with everything renamed "BigDec"), and you're back to the old behavior.

Because it really is just a library, not a language-level feature.

Secondly, the old behavior was problematic because the MathContext was breaking symmetry and associativity.

When you only use the MathContext in multiplication, not addition, it is not the case that a*(big + tiny) is the same as a*big + a*tiny. So you can't be sure a computation worked unless you never first add things and then multiply them. Furthermore, since the MathContext comes along with the value, and only one is chosen, you can't even be sure that a*b is the same as b*a. Finally, if you had really huge or tiny numbers and tried to add them together, you'd get a crash.

So the old way had inconsistent imprecision. The new way has consistent imprecision. Consistency is generally better. Apparently the docs didn't get updated properly. If you relied on the old inconsistencies--I guess it matters to you that 0.005 gets rounded up while 0.004999....981 is rounded down--then fork the library under a new name and you're good to go.

@zapov

zapov commented Feb 4, 2021

I would consider standard library types with aliases in Predef quite fundamental to the language, regardless of whether the language wants to consider everything a library.
We will definitely look into moving away from BigDecimal if there is no reason to revert the behavior to the old, correct one. But just for context: there are more than 13k usages of BigDecimal in manually written code here, and more than 4k usages of BigDecimal at endpoints with other libraries. Suggesting a search-and-replace for that shows a lack of understanding of the problem :(

If you want imprecise math, you use Double. BigDecimal was the only way to retain correct math for various operations. You can't just introduce UNLIMITED into the MC, because all the other stuff will break (just for context, our code did not use an MC once).
Even the problem I mentioned above is the result of setScale rounding, which is now broken due to the loss at addition.

The reason I pointed out the documentation was to show that BD behaved as it did on purpose. It was a well-defined behavior which had an inconsistent implementation (by carrying the context around). Again, the change was made with someone saying that it makes sense to them to unify this. It does not make sense to others who consume BigDecimal, as shown by the test in this issue.

Please do ask yourself: if such a test had existed prior to the change, would you have made the change?

@Ichoran

Ichoran commented Feb 4, 2021

It's always dangerous to change any aspect of the behavior of library code, because someone may have (probably has) relied upon its particular behavior. However, the original behavior was incredibly fragile, despite being documented--if you happen to create any of your values by multiplication, your precision will disappear.

So if you had such a change, where teensy values were getting added to produce a value that was supposed to land exactly on a rounding boundary, and the change made it round the wrong way, I personally would still advise both that (1) the library may as well change, and (2) the original code is suspect and should change.

That you even get a 10^-32-scale error in your code, and that you rely upon not having it, is a troubling feature of your code base. Where, exactly, does this error come from? It is similar to the default precision of BigDecimal anyway, which suggests that you are doing an inaccurate division somewhere, and then propagating this inaccuracy through several steps, where you have carefully conspired to have it exactly cancel out (because at this point you're exactly precise). Alternatively, you may be using the BigDecimal.binary version to import a floating-point number, which is a bad idea because you'll get stuff like

scala> val inexact = BigDecimal.binary(17.33)
inexact: scala.math.BigDecimal = 17.32999999999999829469743417575955

scala> val also = new java.math.BigDecimal(17.33)
also: java.math.BigDecimal = 17.3299999999999982946974341757595539093017578125

scala> val better = BigDecimal(17.33)
better: scala.math.BigDecimal = 17.33

So I am sorry that the library change has made things more difficult for you, and I agree that it's not 100% clear that this was the right choice given that there is always the chance to break working code. But

(1) You can fork the library and regain the old behavior (or some other behavior you like), and with grep and search-and-replace this is actually not a difficult thing to do (if you use libraries, adding implicit conversions both ways makes interop easy too), and
(2) You can't count on every other library canceling out inaccurate computations the way your code does, so you already can't trust any of them anyway, and
(3) This does not seem an advisable way to write code whose outputs are to be relied upon, because apparently you're already doing an inaccurate computation somewhere.

One advantage of having your own class is that you can forbid certain operations as inherently dangerous (e.g. division), and then the compiler will tell you everywhere that the dangerous operation is being used.

For instance,

val bill = BigDecimal("17.35")
val people = Array("Mary", "Jose", "Bill", "Kiko", "Anne", "Sura")
val share = bill / people.length
val pay = share.setScale(2, scala.math.BigDecimal.RoundingMode.HALF_EVEN)

looks okay until you realize that only $17.34 is paid if you do it that way. All sorts of other things are slightly off, too, at that point...even under 2.12, bill - 5*share isn't the same as share.

However, if division isn't allowed, then you can write sensible alternate rules about how to handle things like this.

import scala.math.BigDecimal.RoundingMode.HALF_EVEN
def distribute(
  amount: BigDecimal, n: Int, already: List[BigDecimal] = Nil
): List[BigDecimal] =
  if (n <= 1) amount :: already
  else {
    val part = (amount / n).setScale(2, HALF_EVEN)
    distribute(amount - part, n - 1, part :: already)
  }
val shares = distribute(bill, 6)
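By construction the remainders telescope, so the shares recombine exactly; for 17.35 split six ways this works out to five shares of 2.89 and one of 2.90:

assert(shares.reduce(_ + _) == bill)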

Anyway, I understand it's frustrating when working code stops working due to what ought to be an innocuous change. But from what you've described so far--maybe if you describe more I'll see that I'm mistaken--it seems to me that in the long run, you would benefit from a code base that doesn't rely upon this particular feature of BigDecimal. It would be more robust regardless.

@zapov

zapov commented Feb 6, 2021

Before I explain what the code does that ran into the new problems, some disclaimers first:

  • I agree that relying on precision after the 30th digit is asking for trouble, and I would design the code differently. But I only inherited what was written 7-8 years ago and am trying to salvage what can be salvaged
  • I wasn't really expecting problems due to loss after the 30th digit; the problems come from imprecise math in addition and subtraction, which causes incorrect behavior of setScale rounding
  • While I understand that changing anything will have people raising pitchforks, it is really not advisable to change such an operation for the worse. Nothing good will come out of having a consistent implementation with broken usage. If you wanted to make BigDecimal consistent, you should have left the current behavior until a new implementation was ready
  • My first comment is that you should not use the MC in addition and subtraction, since they do not change the number of decimals. This is in line with the documentation. This is in line with expected behavior. Yes, there are certainly problems because of that, but you deal with them on a case-by-case basis, not by introducing imprecise math into addition and subtraction

Now to explain why the problem happens in our codebase.
The original devs had a great (read: stupid) idea to capture and accumulate really small numbers until they add up to cents. So they invented something called a rounding error and propagated it around. This is not really how you should do that kind of math, but it works in almost all cases.

For this to work, when you have a BigDecimal number you split it into a rounded number and a rounding error.
If addition or subtraction of that rounding error (which can be really small) does not give you an exact number, you will be in trouble.
And one kind of trouble is that setScale will be applied to numbers like the one I posted above, instead of to exact numbers.
Yes, I commented that it's perfectly fine that multiplication is not fully correct, although one could argue what fully correct means.
I would say that if multiplication is correct while maintaining 34 digits, it is fully correct within 34 digits, which is more than enough.

Also, we resolved this in our codebase by duplicating BigDecimal and reverting the + and - behavior to the correct one. I'm just leaving you my thoughts on the matter, as I'm sure there were others affected by this :(

@Ichoran

Ichoran commented Feb 6, 2021

The new implementation is more consistent than the old one: every operation obeys the MathContext. You always know exactly what your precision is: it's the default 34 digits unless you change it, at which point it's whatever the first argument's MathContext is in any binary operation.
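A quick sketch of that rule:

import java.math.MathContext

val five = new MathContext(5)
val a = BigDecimal("1.23456789", five) // rounded to 1.2346 on construction
val b = BigDecimal("1")                // default DECIMAL128 context
assert((a + b).mc == five)             // the left operand's context is kept
assert((b + a).mc == MathContext.DECIMAL128)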

It's also slightly less likely to cause program failure: if you have enormous + 1/enormous and unbounded precision on addition, you allocate a spectacular amount of memory, probably completely by accident.

However, the new way is less accurate in some calculations, and apparently your codebase requires accuracy there. How it doesn't still have mistakes is something of a mystery to me, given how hard it is to mix precise and imprecise operations.

It also occurs to me that I overstated how easy a switch is, because equality is cooperative, and nothing knows how to cooperate with a new variant of BigDecimal. Everything except equality handles it fine. But that's a big caveat. 2 == BigDecimal(2) returns true, but 2 == MyDecimal(2) must return false. (For what it's worth, 2 == (new java.math.BigDecimal(2)) also returns false.)

Anyway, I'm sorry that the change has caused you difficulty. I hope there aren't lingering issues with equality. And hopefully someday we'll have a better story than MathContext for handling precision.

As I said, I think the new way is more consistent and therefore overall better, but I'm not sure it was wise to change the behavior given that there were ways to take advantage of the old behavior.

@SethTisue

As I said, I think the new way is more consistent and therefore overall better, but I'm not sure it was wise to change the behavior given that there were ways to take advantage of the old behavior.

I think this was 1) a good change, and 2) totally fair game to change in a major Scala version (2.13).

However, scala/scala#6884 should have been listed in the 2.13.0 release notes. I will go rectify that right now.
