Skip to content
This repository has been archived by the owner on Feb 18, 2024. It is now read-only.

Added wrapping_cast to cast kernels #254

Merged
merged 10 commits into from
Aug 8, 2021

Conversation

sundy-li
Copy link
Collaborator

@sundy-li sundy-li commented Aug 5, 2021

fixes #252

@sundy-li
Copy link
Collaborator Author

sundy-li commented Aug 5, 2021

It's not backwards compatible, but it follows the ways as the modern number cast did.

@codecov
Copy link

codecov bot commented Aug 5, 2021

Codecov Report

Merging #254 (647fddb) into main (e92f47a) will increase coverage by 0.03%.
The diff coverage is 57.64%.

Impacted file tree graph

@@            Coverage Diff             @@
##             main     #254      +/-   ##
==========================================
+ Coverage   77.31%   77.35%   +0.03%     
==========================================
  Files         232      232              
  Lines       19949    20094     +145     
==========================================
+ Hits        15424    15543     +119     
- Misses       4525     4551      +26     
Impacted Files Coverage Δ
src/compute/cast/mod.rs 43.42% <56.50%> (+8.27%) ⬆️
src/compute/cast/dictionary_to.rs 47.61% <66.66%> (ø)
src/compute/cast/primitive_to.rs 58.42% <100.00%> (+2.47%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update e92f47a...647fddb. Read the comment docs.

Copy link
Owner

@jorgecarleitao jorgecarleitao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense. Thanks a lot. 4 observations:

  1. could you point me to somewhere describing "the ways as the modern number cast did."? (I feel I should read on that :))
  2. could we add a test for other cases (I think we need 4 csaes: (ignore_overflow = true and not overflowing, ignore_overflow = true and overflowing, etc.)
  3. For all cases where we up-cast (e.g. i32 -> i64), shouldn't we ignore the option and always hit the fast path since it is guaranteed to not overflow?
  4. I wonder if the cast option could somehow follow an equivalent notation as Rust's own notation. If I understand correctly, ignore_overflow is equivalent to using as, which is equivalent to a saturating the cast

src/compute/cast/mod.rs Outdated Show resolved Hide resolved
src/compute/cast/mod.rs Outdated Show resolved Hide resolved
src/compute/cast/primitive_to.rs Outdated Show resolved Hide resolved
@sundy-li
Copy link
Collaborator Author

sundy-li commented Aug 5, 2021

could you point me to somewhere describing "the ways as the modern number cast did."

Emm, this is just c-style cast or default cast behavior in MySQL/ClickHouse.

In c, cpp, rust, we use as keyword to cast the numbers to ignore overflow.
In MySQL/ClickHouse, it's safe to cast a wider number to the shorter one. Users afford the overflow and accurate problem

sundy-li and others added 2 commits August 6, 2021 12:27
Co-authored-by: Jorge Leitao <jorgecarleitao@gmail.com>
Copy link
Owner

@jorgecarleitao jorgecarleitao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This last change was an awesome idea; it keeps it very clear (and even backward incompatible) 👍

Left some final minor comments, but the design is there.

Thanks a lot, @sundy-li , great work here!

src/compute/cast/mod.rs Outdated Show resolved Hide resolved
src/compute/cast/mod.rs Outdated Show resolved Hide resolved
src/compute/cast/mod.rs Outdated Show resolved Hide resolved
src/compute/cast/mod.rs Outdated Show resolved Hide resolved
sundy-li and others added 2 commits August 6, 2021 14:47
Co-authored-by: Jorge Leitao <jorgecarleitao@gmail.com>
@jorgecarleitao jorgecarleitao added the feature A new feature label Aug 6, 2021
@sundy-li
Copy link
Collaborator Author

sundy-li commented Aug 6, 2021

Is it a big pain for DataFuse to not have the option?

Not yet, we just depend on the performance of cast. And I don't know whether we will have more options in cast kernel.
As SortOptions did, such like cast_null_as_default: bool, overflow_as_default: bool, so making it pub seems to be much more convinient.

@jorgecarleitao
Copy link
Owner

pulled locally and PRed minor changes. Sorry, with 2 I meant the two variations of the functions, not the 2 APIs, to keep it aligned with the other functions :)

…dictionary_keys wrapping_dictionary_to_dictionary_values
@jorgecarleitao jorgecarleitao changed the title [cast] introduce CastOptions to cast kernels Added wrapping_cast to cast kernels Aug 8, 2021
@jorgecarleitao jorgecarleitao merged commit c2c07f2 into jorgecarleitao:main Aug 8, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
feature A new feature
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Improve the performance in cast kernel using AsPrimitive trait in generic dispatch
2 participants