ARROW-10816: [Rust][DF] Operations with Intervals #9434

Closed
ovr wants to merge 2 commits

Contributor

@ovr ovr commented Feb 6, 2021

This is a draft; I have only just started working on it. The DRAFT status is there to indicate that work is in progress.

@github-actions

github-actions bot commented Feb 6, 2021

@Dandandan
Contributor

@ovr cool :) could you make a new JIRA story for this PR?

@ovr
Contributor Author

ovr commented Feb 9, 2021

@Dandandan I don't think I should create a new JIRA issue, because I have not implemented the calculation between timestamp and interval.

Btw:

There is a problem: DF executes BinaryExpr by casting the left and right sides, which is not needed for timestamp - interval. This calculation is an exception and should be done without casting. I am thinking about how best to resolve it.

There are two different ideas for how to solve it:

  1. Rewrite execution for BinaryExpr
  2. Wrap operations that don't need a cast in a function on the logical plan
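
To make the first idea concrete, here is a minimal sketch of a dispatch step that lets timestamp and interval operands bypass coercion while everything else still goes through a common-type cast. It is not DataFusion code: `DataType`, `Operator`, `Evaluation`, `plan_binary`, and the kernel names are all hypothetical stand-ins invented for illustration.

```rust
/// Hypothetical, simplified stand-ins for the real data types.
#[derive(Debug, Clone, Copy, PartialEq)]
enum DataType {
    Int32,
    Int64,
    TimestampMillisecond,
    IntervalDayTime,
}

/// Hypothetical subset of binary operators.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Operator {
    Plus,
    Minus,
}

/// How a binary expression would be evaluated in this sketch.
#[derive(Debug, PartialEq)]
enum Evaluation {
    /// Cast both sides to a common type, then run the generic kernel.
    CastBoth(DataType),
    /// Skip coercion and call a dedicated kernel on the original types.
    /// The kernel name is illustrative only.
    DedicatedKernel(&'static str),
    /// No rule applies to this combination.
    Unsupported,
}

/// Decide how to evaluate `left op right` (assumed rules, not the real ones).
fn plan_binary(left: DataType, op: Operator, right: DataType) -> Evaluation {
    use DataType::*;
    match (left, op, right) {
        // The exception: timestamp +/- interval keeps both types as they are.
        (TimestampMillisecond, Operator::Plus, IntervalDayTime) => {
            Evaluation::DedicatedKernel("timestamp_add_interval")
        }
        (TimestampMillisecond, Operator::Minus, IntervalDayTime) => {
            Evaluation::DedicatedKernel("timestamp_sub_interval")
        }
        // The default path: coerce numeric operands to a common type.
        (Int32, _, Int32) => Evaluation::CastBoth(Int32),
        (Int32, _, Int64) | (Int64, _, Int32) | (Int64, _, Int64) => {
            Evaluation::CastBoth(Int64)
        }
        _ => Evaluation::Unsupported,
    }
}
```

With these assumed rules, `plan_binary(DataType::TimestampMillisecond, Operator::Minus, DataType::IntervalDayTime)` selects the dedicated kernel, while `Int32 + Int64` still takes the cast path. The second idea would make the same decision earlier, on the logical plan, by wrapping the expression in a function call.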


make_type!(
IntervalYearMonthType,
i32,
Member

@jorgecarleitao jorgecarleitao Feb 17, 2021

Note that this comment has nothing to do with changes from this PR; this is something already in master.

FWIW, I do not think this is correct, and I opened an issue for this. IMO this should be [i32; 2] or something like that.

As it stands, we will be trying to read an i32 offset by 4 bytes from an i64, which requires us to write masks and similar bit manipulation to get the values.

So, I think that we should hold this PR until we have the interval implementation correct; otherwise this may end up implementing operations under the assumption of a different physical representation.
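
As an illustration of the masking concern, here is a minimal sketch of two 32-bit fields packed into one i64, next to a two-element array that needs no bit manipulation. The field order (days in the low 32 bits, milliseconds in the high 32 bits) is an assumption made for this sketch only, and `pack`, `unpack`, and `DayTimePair` are hypothetical names, not part of the arrow crate.

```rust
/// Pack two i32 fields into one i64.
/// Assumed layout for this sketch only: days in the low 32 bits,
/// milliseconds in the high 32 bits.
fn pack(days: i32, millis: i32) -> i64 {
    // `days as u32 as i64` zero-extends, so a negative `days` does not
    // overwrite the high half with sign bits.
    ((millis as i64) << 32) | (days as u32 as i64)
}

/// Recover both fields from the packed i64 using a truncation and a shift.
fn unpack(value: i64) -> (i32, i32) {
    let days = value as i32; // keeps only the low 32 bits
    let millis = (value >> 32) as i32; // moves the high half down, then truncates
    (days, millis)
}

/// The alternative suggested above: two plain i32 values side by side.
/// Index 0 holds days and index 1 holds milliseconds (again an assumption).
type DayTimePair = [i32; 2];

/// Reading the fields of the pair is plain indexing, with no masks.
fn pair_fields(pair: DayTimePair) -> (i32, i32) {
    (pair[0], pair[1])
}
```

Every operation written against the packed form has to agree on the shift and sign handling, which is why the physical representation should be settled before interval arithmetic is built on top of it.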

Contributor Author

@jorgecarleitao You are right, that will be better.

@ovr
Contributor Author

ovr commented Feb 25, 2021

@alamb @andygrove @nevi-me @jorgecarleitao

Btw:

There is a problem: DF executes BinaryExpr by casting the left and right sides, which is not needed for timestamp - interval. This calculation is an exception and should be done without casting. I am thinking about how best to resolve it.

There are two different ideas for how to solve it:

  1. Rewrite execution for BinaryExpr
  2. Wrap operations that don't need a cast in a function on the logical plan

Which is better to use? Thanks.

@alamb
Contributor

alamb commented Feb 25, 2021

> There is a problem: DF executes BinaryExpr by casting the left and right sides, which is not needed for timestamp - interval. This calculation is an exception and should be done without casting. I am thinking about how best to resolve it.

I think the "casts" in this case are very cheap: no data is actually copied.

For example, the Int32 -> Time32 cast at https://github.com/apache/arrow/blob/master/rust/arrow/src/compute/kernels/cast.rs#L597

calls cast_array_data (https://github.com/apache/arrow/blob/master/rust/arrow/src/compute/kernels/cast.rs#L871-L885), which basically just reuses the same underlying data and reinterprets it as the desired type.

So I am not sure whether having a cast is a major problem if the cast is effectively doing no real work.
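
The idea of a cast that only relabels the data can be sketched as follows. This is not the arrow crate's cast_array_data; `LogicalType`, `ArrayData`, and `reinterpret_cast` here are simplified, hypothetical types that only show how a shared buffer can be given a new type tag without copying the values.

```rust
use std::sync::Arc;

/// Hypothetical type tags for two types that share an i32 physical layout.
#[derive(Debug, Clone, Copy, PartialEq)]
enum LogicalType {
    Int32,
    Time32Second,
}

/// Simplified array: a type tag plus a reference-counted value buffer.
#[derive(Debug, Clone)]
struct ArrayData {
    logical_type: LogicalType,
    values: Arc<[i32]>,
}

/// "Cast" by attaching a new type tag to the same buffer.
/// Only the Arc reference count changes; the values are not copied.
fn reinterpret_cast(array: &ArrayData, to: LogicalType) -> ArrayData {
    ArrayData {
        logical_type: to,
        values: Arc::clone(&array.values),
    }
}
```

After such a cast, `Arc::ptr_eq` on the two `values` fields returns true, so the cost is independent of the array length. Whether every cast on the timestamp and interval path is of this kind is a separate question that this sketch does not answer.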

@alamb
Contributor

alamb commented Apr 19, 2021

The Apache Arrow Rust community is moving the Rust implementation into its own dedicated GitHub repositories, arrow-rs and arrow-datafusion. It is likely we will not merge this PR into this repository.

Please see the mailing-list thread for more details.

We expect the process to take a few days and will follow up with a migration plan for the in-flight PRs.

@alamb
Contributor

alamb commented May 3, 2021

#10096 has removed the arrow implementation from this repository (it now resides in https://github.com/apache/arrow-rs and https://github.com/apache/arrow-datafusion) in the hopes of streamlining the development process.

Please re-target this PR (let us know if you need help doing so) to one or both of the new repositories.

Thank you for understanding and helping to make arrow-rs and datafusion better.
