Unified number types #1070

Merged
merged 2 commits into from
Feb 21, 2025

Conversation

Jolanrensen
Collaborator

@Jolanrensen Jolanrensen commented Feb 19, 2025

Fixes #1068
Helps #961

Fixed the getCommonNumberType and commonNumberClass functions, which are currently used only by sum.

I introduced a future-proof rewrite of the function and added support for unsigned and big numbers. I moved it to a separate file and added tests. We will reuse this logic in more places later. I created a small DAG implementation for this, since that is lighter than pulling in yet another dependency.

           BigDecimal
           /       \
     BigInteger     \
       /    \        \
   ULong   Long    Double
..   |    /   |   /   |  \..
  \  |   /    |  /    |
    UInt     Int    Float
..   |    /   |   /      \..
  \  |   /    |  /
   UShort   Short
     |    /   |
     |   /    |
   UByte     Byte

The idea is that a number can be converted losslessly to a higher number type in the graph, so that, say, UInt and Float can both be safely auto-converted to Double at runtime. Currently this is only done when collecting numbers across multiple columns and summing them, but I intend to reuse the logic in other statistical functions where needed, in parsing, or in JSON.
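As a rough illustration of the lookup described above, here is a minimal Kotlin sketch, not the code from this PR. The names `widensTo`, `reachableFrom`, and `unifiedNumberClass` are hypothetical. The edges are read off the diagram, and the UInt -> Double and UShort -> Float edges are an interpretation of the ".." wrap-around marks. Some pairs (for example UShort and Short) have two incomparable candidates, Int and Float; this sketch breaks such ties arbitrarily, and the PR text does not say how the real implementation handles them.

```kotlin
import java.math.BigDecimal
import java.math.BigInteger
import kotlin.reflect.KClass

// Direct "can be widened losslessly to" edges, read off the diagram.
// Assumption: the ".." marks denote UInt -> Double and UShort -> Float.
val widensTo: Map<KClass<*>, List<KClass<*>>> = mapOf(
    UByte::class to listOf(UShort::class, Short::class),
    Byte::class to listOf(Short::class),
    UShort::class to listOf(UInt::class, Int::class, Float::class),
    Short::class to listOf(Int::class, Float::class),
    UInt::class to listOf(ULong::class, Long::class, Double::class),
    Int::class to listOf(Long::class, Double::class),
    Float::class to listOf(Double::class),
    ULong::class to listOf(BigInteger::class),
    Long::class to listOf(BigInteger::class),
    Double::class to listOf(BigDecimal::class),
    BigInteger::class to listOf(BigDecimal::class),
    BigDecimal::class to emptyList(),
)

// The type itself plus every type reachable by following edges upward.
fun reachableFrom(type: KClass<*>): Set<KClass<*>> {
    val seen = linkedSetOf(type)
    val queue = ArrayDeque(listOf(type))
    while (queue.isNotEmpty()) {
        for (next in widensTo[queue.removeFirst()].orEmpty()) {
            if (seen.add(next)) queue.addLast(next)
        }
    }
    return seen
}

// Lowest type that every input can be widened to, or null for empty input.
fun unifiedNumberClass(types: Collection<KClass<*>>): KClass<*>? {
    if (types.isEmpty()) return null
    // Types that all inputs can reach.
    val candidates = types.map(::reachableFrom).reduce { acc, s -> acc intersect s }
    // The lowest candidate reaches the most types above it.
    // Ties between incomparable candidates are broken arbitrarily here.
    return candidates.maxByOrNull { reachableFrom(it).size }
}
```

With these edges, `unifiedNumberClass(listOf(UInt::class, Float::class))` yields `Double::class`, matching the example in the description, and ULong with Long yields `BigInteger::class`.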

NumbersAggregator now converts the numbers in its input to a common number type before aggregating, and no longer relies on smart-casts.
To avoid heavy reflection calls, the types can be supplied to aggregateMixed() if you already know them.
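The convert-then-aggregate idea can be sketched as follows. This is a simplified illustration and does not reproduce the actual NumbersAggregator or aggregateMixed() signatures: `sumMixed` is a hypothetical helper, it covers only four primitive types, and the caller passes the already-known unified class, so no reflection over the values is needed.

```kotlin
import kotlin.reflect.KClass

// Hypothetical helper: convert every value to the given unified class,
// then sum in that single type instead of relying on smart-casts.
fun sumMixed(values: List<Number>, unified: KClass<out Number>): Number =
    when (unified) {
        Int::class -> values.fold(0) { acc, v -> acc + v.toInt() }
        Long::class -> values.fold(0L) { acc, v -> acc + v.toLong() }
        Float::class -> values.fold(0f) { acc, v -> acc + v.toFloat() }
        Double::class -> values.fold(0.0) { acc, v -> acc + v.toDouble() }
        else -> throw IllegalArgumentException("Unsupported unified type: $unified")
    }
```

For instance, `sumMixed(listOf(1, 2.5f), Double::class)` converts the Int and the Float to Double before adding them and returns 3.5.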

@Jolanrensen Jolanrensen marked this pull request as ready for review February 19, 2025 18:38
Collaborator

@AndreiKingsley AndreiKingsley left a comment

Looks nice! We need to test this new logic once we get started on the other statistical functions.

Collaborator

@zaleslaw zaleslaw left a comment

Really interesting innovation, and extensible. Thanks for the graph theory puzzle here!

@Jolanrensen Jolanrensen changed the title Common number types Unifying number types Feb 21, 2025
@Jolanrensen Jolanrensen changed the title Unifying number types Unifyied number types Feb 21, 2025
@Jolanrensen Jolanrensen changed the title Unifyied number types Unified number types Feb 21, 2025
@Jolanrensen
Collaborator Author

I'll rename "common number type" to "unified number type" before merging, because I think that better reflects the idea. A "common type" of Int and Double could be interpreted in terms of Kotlin's type system, in which case it would be Number. A "unified type", however, sounds like a new concept, which is what we are introducing here.

…d numbers", added central doc template with graph
@Jolanrensen Jolanrensen added bug Something isn't working enhancement New feature or request labels Feb 21, 2025
@Jolanrensen Jolanrensen self-assigned this Feb 21, 2025
@Jolanrensen Jolanrensen merged commit ed58e48 into master Feb 21, 2025
5 of 6 checks passed
Development

Successfully merging this pull request may close these issues.

rowSum() breaks for Int + Float
3 participants