Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FEA] Implement a more accurate float to decimal conversion that supports rounding instead of truncation #16155

Open
ttnghia opened this issue Jul 1, 2024 · 3 comments
Labels
feature request New feature or request

Comments

@ttnghia
Copy link
Contributor

ttnghia commented Jul 1, 2024

Currently, there not exist any accurate float-to-decimal conversion. The closest operation to it is cudf::round which can produce some results that are not correct all the time. As such, we have issues like NVIDIA/spark-rapids#9682 and NVIDIA/spark-rapids#10809.

A new dedicated conversion code in #15905 supposes to add some special handling for float-decimal conversion. Unfortunately, it performs truncation instead of rounding. It should be great to support an optional flag to that code, allowing to do either truncation or rounding depending on the applications.

@ttnghia ttnghia added the feature request New feature or request label Jul 1, 2024
@github-project-automation github-project-automation bot moved this to In Progress in cuDF/Dask/Numba/UCX Jul 1, 2024
@ttnghia ttnghia changed the title [FEA] Implement a more accurate float to decimal conversion that support rounding instead of flooring [FEA] Implement a more accurate float to decimal conversion that supports rounding instead of truncation Jul 1, 2024
@pmattione-nvidia
Copy link
Contributor

I'm working on composing a solution for this. A question in the meantime: if the result under- or overflows, what behavior is needed to match spark-rapids?

E.g. for floating -> decimal, should we return one of 0 / INT_MIN / INT_MAX? for decimal -> floating do we set 0 / +-inf? Or do we null the field entirely?

@pmattione-nvidia
Copy link
Contributor

pmattione-nvidia commented Jul 9, 2024

The code in the new conversion PR has been modified so that the cuDF-specific code wraps around the core of the conversion algorithm. In spark-rapids-jni, we can similarly wrap around this core to perform the spark-specific rounding that we need.

This cuDF draft PR has the code for the spark-specific rounding that we need, wrapping around the core of the cuDF conversion code. This code is just in cuDF for my ease of testing, and should be migrated to spark-rapids-jni for full integration.

@pmattione-nvidia
Copy link
Contributor

The cuDF conversion PR has been merged.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature request New feature or request
Projects
Status: In Progress
Development

No branches or pull requests

2 participants