Is your feature request related to a problem or challenge? Please describe what you are trying to do.
Currently, decimal arithmetic may return an overflow error if an operation produces an output that exceeds the maximum precision of the storage type.
Describe the solution you'd like
Spark and similar systems instead truncate the precision of the output if it would exceed the maximum of the storage type - https://github.com/apache/arrow/blob/36ddbb531cac9b9e512dfa3776d1d64db588209f/java/gandiva/src/main/java/org/apache/arrow/gandiva/evaluator/DecimalTypeUtil.java#L83
To achieve this, the arithmetic kernels would need to detect this case and instead perform the arithmetic on double-width decimal primitives, before truncating the result down to an appropriate scale/precision.
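For reference, Spark caps an overflowed result type at the storage maximum and sacrifices fractional digits first, while preserving a minimum scale (the linked Gandiva DecimalTypeUtil follows the same rule). Below is a minimal sketch of that adjustment, assuming Spark's constants (maximum precision 38, minimum adjusted scale 6); the function name is illustrative, not an arrow-rs API:

```rust
// Illustrative constants matching Spark's Decimal128 limits.
const MAX_PRECISION: u8 = 38;
const MIN_ADJUSTED_SCALE: i8 = 6;

/// Clamp a (precision, scale) pair that overflowed the storage precision,
/// Spark-style: keep the integral digits where possible by giving up
/// fractional digits, but never reduce the scale below MIN_ADJUSTED_SCALE.
fn adjust_precision_scale(precision: u8, scale: i8) -> (u8, i8) {
    if precision <= MAX_PRECISION {
        (precision, scale)
    } else {
        let int_digits = precision as i32 - scale as i32;
        // Keep at least min(scale, MIN_ADJUSTED_SCALE) fractional digits.
        let min_scale = (scale as i32).min(MIN_ADJUSTED_SCALE as i32);
        let adjusted_scale = (MAX_PRECISION as i32 - int_digits).max(min_scale);
        (MAX_PRECISION, adjusted_scale as i8)
    }
}

fn main() {
    // Fits the storage type: returned unchanged.
    assert_eq!(adjust_precision_scale(20, 5), (20, 5));
    // Overflows by 2 digits: scale gives way, integral digits are preserved.
    assert_eq!(adjust_precision_scale(40, 20), (38, 18));
    // Heavily overflowed: scale is clamped to the minimum of 6.
    assert_eq!(adjust_precision_scale(45, 10), (38, 6));
}
```

Note the result is lossy by construction: the caller must then round the wide intermediate value to the adjusted scale, which is where the double-width arithmetic comes in.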
Describe alternatives you've considered
We could continue to return an error, but this is likely to be surprising for users.
Additional context
#4640 proposes increasing the decimal output scale, which in turn would make this error case more likely.
The fundamental primitive necessary for this is N-digit division, which will likely be implemented as part of #4663
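The double-width idea can be illustrated one size down: two i64 mantissas multiplied in i128, with the intermediate divided back down by a power of ten using half-up rounding. Doing the same for Decimal256 needs 256/512-bit division, which is why N-digit division is the fundamental primitive. A toy sketch (names illustrative, not an arrow-rs API):

```rust
/// Multiply two decimal mantissas, widening to i128, then drop
/// `drop_digits` fractional digits with half-up rounding.
/// Returns None if the rescaled result still overflows i64.
fn mul_and_truncate(a: i64, b: i64, drop_digits: u32) -> Option<i64> {
    let wide = a as i128 * b as i128; // cannot overflow: i64 * i64 fits in i128
    let div = 10i128.pow(drop_digits);
    // Half-up rounding, away from zero for negative values.
    let rounded = if wide >= 0 {
        (wide + div / 2) / div
    } else {
        (wide - div / 2) / div
    };
    i64::try_from(rounded).ok()
}

fn main() {
    // 2.0 * 3.5 at scale 9: the exact product has scale 18;
    // dropping 9 digits brings it back to scale 9, i.e. 7.0.
    assert_eq!(mul_and_truncate(2_000_000_000, 3_500_000_000, 9), Some(7_000_000_000));
    // 1.5 * 1.5 = 2.25 at scale 9.
    assert_eq!(mul_and_truncate(1_500_000_000, 1_500_000_000, 9), Some(2_250_000_000));
}
```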
Having spent a significant amount of time working on this, I'm inclined not to move forward with it.
Decimal256 supports up to 76 decimal digits, which is more than most commercial systems, and makes me think that supporting precision-loss arithmetic, i.e. arithmetic whose exact results would require even greater precision than this, is of limited practical utility:
MySQL - 65
MariaDB - 65
SQL Server - 38
DuckDB - 38
Oracle - 38
Redshift - 38
Spark - 38
Postgres - 131072!! (postgres decimals appear to be rather special)
Silently truncating precision is potentially surprising and is inconsistent with the rest of the arithmetic kernels
Users can easily increase/decrease the precision of a calculation by casting the inputs
Performant precision-loss arithmetic is extremely complex
Ensuring good test coverage is very hard given the size of the numbers involved
I personally find it hard to justify spending significant amounts of time on decimals when they don't show up in my workloads
TLDR: the current decimal support, whilst likely not perfect, covers most bases and is at least as feature-full as many commercial DBMSs, so I struggle to justify spending more time working on it.