-
Notifications
You must be signed in to change notification settings - Fork 161
feat(go/adbc/driver/snowflake): New setting to set the maximum timestamp precision to microseconds #2917
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Based on the response to snowflakedb/gosnowflake#1430, it didn't sound like a fix was going to happen in the gosnowflake driver, so I added this setting to work around the known issue with dates. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't feel especially qualified to review the Go code, but have left some feedback.
docs/source/driver/snowflake.rst
Outdated
|
|
||
| ``adbc.snowflake.sql.client_option.use_max_microseconds_precision`` | ||
| When ``true``, nanoseconds will be converted to microseconds | ||
| to avoid the overflow of the timestamp type. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These two lines use tabs where everything else in this file is spaces. I suspect this is the cause of the checkin test failure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This does not affect time values, only timestamp values. It might be worth calling that out here in the documentation.
| public bool UseHighPrecision { get; set; } = true; | ||
|
|
||
| /// <summary> | ||
| /// The Snowflake setting to only have a max timestamp precision of microseconds |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| /// The Snowflake setting to only have a max timestamp precision of microseconds | |
| /// The Snowflake setting to have a max timestamp precision of only microseconds |
|
@zeroshade or @lidavidm - any other input on this one? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it correct to summarize this issue as "Snowflake returns nanosecond-precision timestamps, but in a representation with a range beyond that of Arrow timestamp[ns]"?
Is there a way we can detect overflow in the nanosecond case? Silent data corruption isn't great, it would be nice if we could return StatusInvalidData and instruct the user to change the option
I started down this path. The challenge is you specify the schema details before you know the values that would cause the overflow, and I wasn’t finding a good way to go back and change the schema details. |
I'm not talking about automatically changing the schema, just detecting overflow and erroring |
|
If that's not possible that's not possible - but if we could do that that would be a better experience than silent corruption, IMO. (Or again, this is why I ask if the option should be a boolean or something more general - maybe the user is OK with disabling overflow checks) |
I added this in the latest push. I decided that default behavior should be to not throw an error, since that's what the driver does today, but to give the option to do the enforcement. |
|
So again, this is why I keep asking if the option should really be a boolean :) Since the new option is only effective when we are asking for nanoseconds, it's effectively 3 possible states modeled with 4 possible configurations. Either we should always be checking for overflow regardless of the type or we should have a single option for nanoseconds, nanoseconds (but error on overflow), or microseconds. |
Overflow only applies to large date ranges (before year 1677 or after 2262) that use nanoseconds. The native Go behavior is to just overflow to the wrong value (which is weird, tbh) so the only options are: 1.) Leave things alone |
|
So I think we agree then: there are only 3 possibilities, but it's being modeled with two booleans (4 possibilities) |
ok. I will rework it to an enum |
|
anything else needed here @lidavidm ? |
|
@zeroshade any final comments? |
|
can we merge this? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for the lack of response here. This looks good, @lidavidm already covered everything I was thinking of.
Introduces a new setting to set the maximum timestamp precision to Microsecond. Setting this value will convert the default Nanosecond value to Microsecond to avoid the overflow that occurs when a date is before the year 1678 or after 2262.
Provides a fix for #2811 by creating a workaround that can be set by the caller.