
Incorrect Decimal Division Coercion #6828

Closed
tustvold opened this issue Jul 3, 2023 · 10 comments · Fixed by #6832
Labels
bug Something isn't working

Comments

@tustvold
Contributor

tustvold commented Jul 3, 2023

Describe the bug

Currently, when performing decimal division, coercion_decimal_mathematics_type, as called by BinaryExpr::evaluate, coerces both inputs to the wider precision type. The division kernel then scales the left hand side by the output scale.

To see why this is an issue consider

1 DECIMAL(38, 20) / 5 DECIMAL(38, 0)

This computation shouldn't overflow, as it can simply compute 10^20 / 5 on the unscaled values

The issue is that the coercion logic scales the right hand side to 5·10^20, which in turn requires scaling the left hand side to 10^40, which overflows the 128-bit representation
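The overflow can be sketched with plain i128 arithmetic on the unscaled values. This is a simplified model of the behaviour described above, not the actual arrow-rs kernel; scale_by is a hypothetical helper:

```rust
/// Hypothetical helper: multiply an unscaled decimal value by 10^pow,
/// returning None on overflow (a sketch, not the arrow-rs kernel).
fn scale_by(v: i128, pow: u32) -> Option<i128> {
    v.checked_mul(10i128.checked_pow(pow)?)
}

fn main() {
    // 1 at DECIMAL(38, 20) is stored as the unscaled value 10^20.
    let lhs = 10i128.pow(20);
    // After coercion, the kernel scales lhs by the output scale (20):
    // 10^20 * 10^20 = 10^40, which exceeds i128::MAX (~1.7 * 10^38).
    assert_eq!(scale_by(lhs, 20), None);
    // Dividing at the input scales needs no rescaling at all:
    // 10^20 / 5 = 2 * 10^19, i.e. 0.2 at scale 20.
    assert_eq!(lhs / 5, 2 * 10i128.pow(19));
}
```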

To Reproduce

❯ create table foo (a DECIMAL(38, 20), b DECIMAL(38, 0));
0 rows in set. Query took 0.001 seconds.
❯ insert into foo VALUES (1, 1000000000000);
+-------+
| count |
+-------+
| 1     |
+-------+
1 row in set. Query took 0.003 seconds.
❯ select a / b from foo;
Decimal128(38, 20) vs Decimal128(38, 20)
+------------------------------------------+
| foo.a / foo.b                            |
+------------------------------------------+
| 0.00000000000000000000000000000000743222 |
+------------------------------------------+
1 row in set. Query took 0.004 seconds.

Expected behavior

No response

Additional context

#6794

@tustvold tustvold added the bug Something isn't working label Jul 3, 2023
@viirya
Member

viirya commented Jul 3, 2023

The division kernel will then scale the left hand side by the output scale.

This scaling should allow precision loss instead of overflow directly.

@tustvold
Contributor Author

tustvold commented Jul 3, 2023

This scaling should allow precision loss instead of overflow directly.

The point here is that no overflow should occur; the only reason it does is that the type coercion machinery isn't respecting the input types. We don't need to perform fixed-point multiplication here if we're sensible about preserving the input scale.

Edit: see https://github.com/apache/arrow-rs/pull/4465/files#diff-12bc5282f3ea1d5a1e2efeacbede6aa8ac7308498533f527d43705398cdb56eaR501

@viirya
Member

viirya commented Jul 3, 2023

Read through the diff quickly.

let mul_pow = result_scale - (s2 - s1);

I think result_scale is min(38, 20 + 38 + 1) = 38?

Hmm, for the described case, isn't mul_pow equal to 58 (38 - (0 - 20))? As I assume s1, s2 are original scales of lhs, rhs arrays.

The issue is that the coercion logic will scale the right hand side to 5e20 requiring it to also scale the left hand side to 10e40 which will overflow

So scaling the left hand side to 10e58 won't overflow? Am I missing any point here? 🤔
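For concreteness, the arithmetic being questioned works out like this (a sketch assuming s1, s2 are the input scales and 38 is the Decimal128 precision cap):

```rust
fn main() {
    let (s1, s2) = (20i32, 0i32);           // scales of lhs and rhs
    let result_scale = 38.min(s1 + 38 + 1); // capped at the max precision
    let mul_pow = result_scale - (s2 - s1); // 38 - (0 - 20)
    assert_eq!(result_scale, 38);
    assert_eq!(mul_pow, 58); // lhs would be scaled by 10^58
}
```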

@tustvold
Contributor Author

tustvold commented Jul 3, 2023

isn't mul_pow equal to 58 (38 - (0 - 20))

Yeah, I need to keep iterating on this; it's all still a WIP. But it seems inherently wrong that dividing by a value greater than 1 could result in overflow and require fixed-point multiplication.

@tustvold
Contributor Author

tustvold commented Jul 3, 2023

Interestingly, DuckDB appears not to support decimal division at all; it falls back to a double result:

>>> duckdb.sql('select cast(5 as DECIMAL(38, 10)) / cast(10 as DECIMAL(38, 10))')
┌──────────────────────────────────────────────────────────┐
│ (CAST(5 AS DECIMAL(38,10)) / CAST(10 AS DECIMAL(38,10))) │
│                          double                          │
├──────────────────────────────────────────────────────────┤
│                                                      0.5 │
└──────────────────────────────────────────────────────────┘

MySQL appears to just increment the left-hand scale by 4:

DECIMAL(38, 10) / DECIMAL(38, 10) -> DECIMAL(52, 14)
DECIMAL(11, 10) / DECIMAL(2, 0) -> DECIMAL(15, 14)
DECIMAL(11, 10) / DECIMAL(13, 12) -> DECIMAL(27, 14)
DECIMAL(11, 10) / DECIMAL(17, 16) -> DECIMAL(31, 14)
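The pattern in the four examples above is consistent with a result type of DECIMAL(p1 + s2 + 4, s1 + 4). This is an inference from the examples, not MySQL's documented formula, and mysql_div_type is a hypothetical helper:

```rust
/// Hypothetical helper inferring MySQL's division result type from the
/// examples above: precision = p1 + s2 + 4, scale = s1 + 4.
fn mysql_div_type(p1: u32, s1: u32, _p2: u32, s2: u32) -> (u32, u32) {
    (p1 + s2 + 4, s1 + 4)
}

fn main() {
    assert_eq!(mysql_div_type(38, 10, 38, 10), (52, 14));
    assert_eq!(mysql_div_type(11, 10, 2, 0), (15, 14));
    assert_eq!(mysql_div_type(11, 10, 13, 12), (27, 14));
    assert_eq!(mysql_div_type(11, 10, 17, 16), (31, 14));
}
```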

Postgres appears to do something similar - https://github.com/postgres/postgres/blob/29cf61ade3f245aa40f427a1d6345287ef77e622/src/interfaces/ecpg/pgtypeslib/numeric.c#L1047

Interestingly the Hive specification states

But one thing is clear, for scale resulting from a division, the scale of the result is s1 plus a
system-wide increment, which has a default 4

But then goes on to show a table with something different

[image omitted: table from the Hive specification]

I think using a fixed increment of the dividend's scale makes a whole lot more sense than a value computed from the divisor's precision, which just seems to be a recipe for overflow.

Edit: I have updated apache/arrow-rs#4465 to do this

@comphead
Contributor

Adding more test cases to be validated on DF

Looks like there is a series of bugs in decimal multiplication/division in DF; this query in PG gives me all the same numbers:

select cast(6.4053151420411946063694043751862251568 as decimal(38,37)) / 1,
       cast(6.4053151420411946063694043751862251568 as decimal(38,37)) / 1.0,
       cast(6.4053151420411946063694043751862251568 as decimal(38,37)) * 1,
       cast(6.4053151420411946063694043751862251568 as decimal(38,37)) * 1.0

6.40531514204119460636940437518622515680

In DF, the division doesn't happen at all, and multiplication gives different results:

❯ select cast(6.4053151420411946063694043751862251568 as decimal(38,37)) / 1;
Optimizer rule 'simplify_expressions' failed
caused by
Arrow error: Compute error: Overflow happened on: 64053151420411946063694043751862251568 * 100000000000000000000000000000000000000
❯ select cast(6.4053151420411946063694043751862251568 as decimal(38,37)) / 1.0;
Optimizer rule 'simplify_expressions' failed
caused by
Arrow error: Compute error: Overflow happened on: 64053151420411946063694043751862251568 * 100000000000000000000000000000000000000
❯ select cast(6.4053151420411946063694043751862251568 as decimal(38,37)) * 1;
+---------------------------------------------------------------------------+
| Decimal128(Some(64053151420411946063694043751862251568),38,37) * Int64(1) |
+---------------------------------------------------------------------------+
| 6.4053151420411946063694043751862251568                                   |
+---------------------------------------------------------------------------+
1 row in set. Query took 0.002 seconds.
❯ select cast(6.4053151420411946063694043751862251568 as decimal(38,37)) * 1.0;
+-------------------------------------------------------------------------------------------+
| Decimal128(Some(64053151420411946063694043751862251568),38,37) * Decimal128(Some(10),2,1) |
+-------------------------------------------------------------------------------------------+
| -0.40033219637757466289808777344913907232                                                 |
+-------------------------------------------------------------------------------------------+
1 row in set. Query took 0.002 seconds.

@viirya
Member

viirya commented Jul 30, 2023

The overflow could be fixed after #6832, I think. Currently decimal multiplication with scalar doesn't handle overflow correctly.

@westonpace
Member

westonpace commented Aug 3, 2023

I haven't followed the entire discussion but, FWIW, Substrait currently agrees with the max(6, s1 + p2 + 1) formulation. SQL server also uses this formula. Let me also preface this description with the fact that this topic confuses me all the time so there is a chance I'm way off base here.

Postgres appears to do something similar - https://github.com/postgres/postgres/blob/29cf61ade3f245aa40f427a1d6345287ef77e622/src/interfaces/ecpg/pgtypeslib/numeric.c#L1047

I'm not sure this bears out empirically. Using https://www.db-fiddle.com/f/4jyoMCicNSZpjMt4jFYoz5/9659 I get CAST(1 AS DECIMAL(3, 0)) / CAST(1000000 AS DECIMAL(7, 0)) = 0.000001000000000000000000, which has a scale of at least 7, greater than 0 + 4.

I think using a fixed increment of the dividend's scale makes a whole lot more sense than a value computed based on the divisor's precision, which just seems to be a recipe for overflow.

I think the idea of involving the divisor's precision is for cases like:

1 / 1000000000
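A sketch of why: with only a dividend-based increment, small quotients vanish entirely. This is plain integer arithmetic on unscaled values, not engine code:

```rust
fn main() {
    // Unscaled quotient of 1 / 1_000_000_000 at a given result scale.
    let quot = |scale: u32| (1i128 * 10i128.pow(scale)) / 1_000_000_000;
    // Fixed increment only: s1 + 4 = 4 fractional digits -> result is 0.
    assert_eq!(quot(4), 0);
    // A scale accounting for the divisor's digits keeps the value:
    // unscaled 10, i.e. 0.0000000010 at scale 10.
    assert_eq!(quot(10), 10);
}
```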

which just seems to be a recipe for overflow.
This scaling should allow precision loss instead of overflow directly.

Agreed. In both SQL server and Substrait there is a second step to help with the overflow problem. This second step applies to all decimal arithmetic operations. The rule is basically like this:

If, after following the formulas, the resulting precision/scale is out of bounds, then sacrifice scale (i.e. throw away the really insignificant digits on the far right) but keep at least 6 digits of scale. If you can't keep at least 6 digits of scale (e.g. the integral part requires 33 digits or more), then overflow. The 6 is absolutely arbitrary, but it works.

Note, that the SQL server rule is slightly more complex, but more or less the same thing:

In addition and subtraction operations, we need max(p1 - s1, p2 - s2) places to store the integral part of the decimal number. If there isn't enough space to store them (that is, max(p1 - s1, p2 - s2) < min(38, precision) - scale), the scale is reduced to provide enough space for the integral part. The resulting scale is min(precision, 38) - max(p1 - s1, p2 - s2), so the fractional part might be rounded to fit into the resulting scale.

In multiplication and division operations, we need precision - scale places to store the integral part of the result. The scale might be reduced using the following rules:

  • The resulting scale is reduced to min(scale, 38 - (precision-scale)) if the integral part is less than 32, because it can't be greater than 38 - (precision-scale). The result might be rounded in this case.
  • The scale isn't changed if it's less than 6 and if the integral part is greater than 32. In this case, an overflow error might be raised if it can't fit into decimal(38, scale).
  • The scale is set to 6 if it's greater than 6 and if the integral part is greater than 32. In this case, both the integral part and scale would be reduced and resulting type is decimal(38, 6). The result might be rounded to 6 decimal places, or the overflow error is thrown if the integral part can't fit into 32 digits.
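The bounding step described above can be sketched as follows. This is a paraphrase of the quoted SQL Server rules, not engine code, and bound_result_type is a hypothetical helper:

```rust
/// Bound an ideal (precision, scale) to Decimal(38, _) per the quoted
/// rules: shed fractional digits first, but keep at least 6 of them
/// (unless the input scale was already below 6).
fn bound_result_type(p: u32, s: u32) -> (u32, u32) {
    if p <= 38 {
        return (p, s);
    }
    let integral = p - s; // digits needed left of the decimal point
    let scale = if integral < 32 {
        s.min(38 - integral) // result may be rounded
    } else if s > 6 {
        6 // clamp; the integral part may still overflow at runtime
    } else {
        s // unchanged; may overflow at runtime
    };
    (38, scale)
}

fn main() {
    assert_eq!(bound_result_type(52, 14), (38, 6));  // integral 38 > 32
    assert_eq!(bound_result_type(40, 20), (38, 18)); // shed 2 scale digits
    assert_eq!(bound_result_type(30, 10), (30, 10)); // already in bounds
}
```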

@westonpace
Member

Also, if it helps, here are the calcite rules, which should be the same as, and were the motivation for, the substrait rules.

@tustvold
Contributor Author

tustvold commented Aug 3, 2023

I've attempted to adjust this in apache/arrow-rs#4640 PTAL
