-
Notifications
You must be signed in to change notification settings - Fork 752
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add toYYYYMM function #1726
Add toYYYYMM function #1726
Conversation
Thanks for the contribution! Please review the labels and make any necessary changes. |
/review @sundy-li |
Take the reviewer to sundy-li |
Codecov Report
@@ Coverage Diff @@
## master #1726 +/- ##
======================================
- Coverage 72% 72% -1%
======================================
Files 573 574 +1
Lines 33938 34002 +64
======================================
+ Hits 24642 24644 +2
- Misses 9296 9358 +62
Continue to review full report at Codecov.
|
@@ -16,3 +16,9 @@ select today() - 1 = yesterday(); | |||
select today() - yesterday() = 1; | |||
select today() + 1 = tomorrow(); | |||
select tomorrow() - today() = 1; | |||
|
|||
select toYYYYMM(toDateTime(1630320462)); | |||
select toYYYYMM(today() - 30) = toYYYYMM(today()) - 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if today is the 2021-10-31?
t: PhantomData<T>, | ||
} | ||
|
||
pub trait NumberResultFunction { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great, we can even make it more generic using NumberResultFunction<T>
, this can be addressed in another pr to support toYYYYMMDDhhmmss
, which returns UInt64.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will address this in next pr.
Ok(false) | ||
} | ||
|
||
fn eval(&self, _columns: &DataColumnsWithField, _input_rows: usize) -> Result<DataColumn> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
_columns -> columns
d329a59
to
00c6eea
Compare
select toYYYYMM(today() + 30) = toYYYYMM(today()) + 1; | ||
select toYYYYMM(today()) - toYYYYMM(today() - 30) = 1; | ||
select toYYYYMM(today() + 30) - toYYYYMM(today()) = 1; | ||
select toYYYYMM(today() - 31) = toYYYYMM(today()) - 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's not correct too. What if today is 03-01
, toYYYYMM(today() - 31)
will be toYYYYMM(today()) - 2
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sundy-li Yeah, you are right.
For the case of 03-01, all the four eq expression may return false.
As datafuse does not support inserting date16/date32/datetime32 sql like following now.
mysql> CREATE TABLE test(a UInt8, b date) engine = Memory;
Query OK, 0 rows affected (0.01 sec)
mysql> desc test;
+-------+--------+------+
| Field | Type | Null |
+-------+--------+------+
| a | UInt8 | NO |
| b | Date16 | NO |
+-------+--------+------+
2 rows in set (0.00 sec)
mysql> insert into test(a, b) values(1, toDate(1630320462));
ERROR 1105 (HY000): Code: 10, displayText = create_deserializer does not support type 'Date16'.
mysql> insert into test(a, b) values(1, '2021-08-10');
ERROR 1105 (HY000): Code: 10, displayText = create_deserializer does not support type 'Date16'.
```
How can I add the stateless test? Please give some idea.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As datafuse does not support inserting date16/date32/datetime32 sql like following now.
Yes, we will add this in another pr.
Currently
select toUInt32(now()) returns `1630833797`
select toUInt16(today()) returns `18875`
How about
select toYYYYMM(toDateTime(1630833797) ) = 202109;
select toYYYYMM(toDate(18875) ) = 202109;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok.
DataType::Date16 | DataType::Date32 => { | ||
let number_vec_result: Vec<Option<u32>> = columns[0] | ||
.column() | ||
.to_values()? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
to_values
has poor performance than native array iter.
How about using let col = columns[0].u16()?
and let col = columns[0].u32()?
.
Then we can have col.iter().map(...)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the u16() method in SeriesTrait?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes
00c6eea
to
922c86a
Compare
} | ||
|
||
fn eval(&self, columns: &DataColumnsWithField, _input_rows: usize) -> Result<DataColumn> { | ||
if columns.is_empty() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No need to check, already checked in num_arguments
, but it's trival.
let data_type = columns[0].data_type(); | ||
let number_vec: Vec<Option<u32>> = match data_type { | ||
DataType::Date16 => { | ||
let number_vec_result = columns[0].column().to_array()?.u16()?.iter().map(|value|{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great, but there are also much more spaces to improve this function.
- Constant path to check, which may help in
select toYYYYMM( 123455 ) from number(xxx)
- Put the whole eval function into
execute
function, this can be done intoYYYYMMDDhhmmss
, but it's return value is u64.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Constant path to check, which may help in select toYYYYMM( 123455 ) from number(xxx)
It is nice to do this, but there are somethings I am not sure:
- If I understand correctly ,
toYYYYMM(123455)
is the same asselect toYYYYMM(toDate(123455))
ortoYYYYMM(toDateTime(123455))
, so which one to choose? toDateTime? Constant
may be negative value , what should return for this situation?- After this,
select toYYYYMM(number) from number(xxx)
will also work. Is it ok?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is the same as select toYYYYMM(toDate(123455)) or toYYYYMM(toDateTime(123455))
It's related to DataType::Date16
or DataType::DateTime
..
Constant may be negative value
Actually, if its logical type is Date16/DateTime32, which physical type is u16
and u32
, the constant value will never be negative.
toYYYYMM(number) will never work, we just need to make toYYYYMM(Date/DateTime) work, the inner argument is physically constant.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's use
let c = columns[0].column().to_array()?.u16()?;
c.apply_cast_numeric(..)
instead, it takes consideration of the null values.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's use
....
instead, it takes consideration of the null values.
Got it.
Constant path to check, which may help in select toYYYYMM( 123455 ) from number(xxx)
As DataColumn.to_array()
has coverage the Constant
case as Series, what did you mean Constant path to check
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Constant path can have better performance. You are right, you can simply use to_array().
But it will build an array with the same number which has poor performance because you need to calculate each row.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, got it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks @ygf11
CI Passed |
Thanks for the contribution! Please review the labels and make any necessary changes. |
I hereby agree to the terms of the CLA available at: https://datafuse.rs/policies/cla/
Summary
impl date16/date32/dateTime32 to YYYYMM UInt32 number.
Changelog
Related Issues
Related #853
Test Plan
Unit Tests
Stateless Tests