Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve error message for timestamp queries outside supported range #5730

Merged
merged 4 commits into from
May 16, 2024

Conversation

Abdi-29
Copy link
Contributor

@Abdi-29 Abdi-29 commented May 6, 2024

Which issue does this PR close?

Closes #5581

Rationale for this change

This change improves the error message handling for timestamp queries outside the supported range in Datafusion. Currently, users receive confusing error messages about nanosecond overflow, which do not clearly explain the limitation of timestamp queries. This change enhances the user experience and reduces confusion by providing clearer guidance.

What changes are included in this PR?

This PR updates the error message handling for timestamp queries to provide clearer guidance when timestamps fall outside the supported range.

Are there any user-facing changes?

Users will receive more informative error messages when executing timestamp queries outside the supported range.

@github-actions github-actions bot added the arrow Changes to the arrow crate label May 6, 2024
@Abdi-29 Abdi-29 force-pushed the enhance-timestamp-error-message branch from 4cc559d to 4b65f4e Compare May 6, 2024 22:57
@Abdi-29
Copy link
Contributor Author

Abdi-29 commented May 8, 2024

Hi @tustvold,
do you know what the next step it's going to be?

Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @Abdi-29 -- this looks good to me ❤️ Nice work on the tests

I left a few comments, but I think we could merge this as is as well

Just let me know

T::make_value(naive).ok_or_else(|| {
ArrowError::CastError(format!(
T::make_value(naive).ok_or_else(|| match T::UNIT {
TimeUnit::Nanosecond => ArrowError::CastError(format!(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if you would be willing to improve the messages for the other types (like Microsecond, etc)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes sure, I can work on it. Thanks for the feedback

Copy link
Contributor Author

@Abdi-29 Abdi-29 May 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alamb I did some research about the range of the other types but I couldn't find anything. Do you have any tips for me on where I can search for them?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you could create an array with T::Min of the underlying type

something like (untested):

let min_date = DateTimeArray::from(vec![i32::MIN, i32:MAX]));
pretty_print(min_date)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks, I'll try it now

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried

    let  a  = Arc::new(TimestampMicrosecondArray::from(vec![Some(i64::MIN), Some(i64::MAX)]));
    println!("{}", pretty_format_columns("foo", &[a]).unwrap());

And I got

+--------------------------------------------------------------------------------------------------------+
| foo                                                                                                    |
+--------------------------------------------------------------------------------------------------------+
| ERROR: Cast error: Failed to convert -9223372036854775808 to datetime for Timestamp(Microsecond, None) |
| ERROR: Cast error: Failed to convert 9223372036854775807 to datetime for Timestamp(Microsecond, None)  |
+--------------------------------------------------------------------------------------------------------+

So if you can figure out the min/max values that were allowed for that type and then that code would show you the string value I think

However, to be honest what we have in this PR is better than main, I think we could merge it as is

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I see, thanks. I'll open another issue to fix this

arrow-cast/src/cast/mod.rs Outdated Show resolved Hide resolved
@Abdi-29 Abdi-29 requested a review from alamb May 14, 2024 13:53
Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the effort @Abdi-29 -- I think this PR is better than what is on main, even though it could be improved for other types as well

Thanks again

@alamb alamb merged commit fa2ba9e into apache:master May 16, 2024
25 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
arrow Changes to the arrow crate
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Improve Error Message for Timestamps Outside Supported Range
2 participants