Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add toYYYYMM function #1726

Merged
merged 4 commits into from
Sep 5, 2021
Merged

Conversation

ygf11
Copy link
Contributor

@ygf11 ygf11 commented Sep 4, 2021

I hereby agree to the terms of the CLA available at: https://datafuse.rs/policies/cla/

Summary

impl date16/date32/dateTime32 to YYYYMM UInt32 number.

Changelog

  • New Feature

Related Issues

Related #853

Test Plan

Unit Tests

Stateless Tests

@databend-bot
Copy link
Member

Thanks for the contribution!
I have applied any labels matching special text in your PR Changelog.

Please review the labels and make any necessary changes.

@databend-bot databend-bot added the pr-feature this PR introduces a new feature to the codebase label Sep 4, 2021
@PsiACE
Copy link
Member

PsiACE commented Sep 4, 2021

/review @sundy-li

@databend-bot
Copy link
Member

Take the reviewer to sundy-li

@codecov-commenter
Copy link

codecov-commenter commented Sep 4, 2021

Codecov Report

Merging #1726 (922c86a) into master (0d10503) will decrease coverage by 0%.
The diff coverage is 1%.

Impacted file tree graph

@@          Coverage Diff           @@
##           master   #1726   +/-   ##
======================================
- Coverage      72%     72%   -1%     
======================================
  Files         573     574    +1     
  Lines       33938   34002   +64     
======================================
+ Hits        24642   24644    +2     
- Misses       9296    9358   +62     
Impacted Files Coverage Δ
...mon/functions/src/scalars/dates/number_function.rs 0% <0%> (ø)
common/functions/src/scalars/dates/date.rs 90% <100%> (+1%) ⬆️
store/src/meta_service/meta_service_impl.rs 76% <0%> (+1%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 0d10503...922c86a. Read the comment docs.

@@ -16,3 +16,9 @@ select today() - 1 = yesterday();
select today() - yesterday() = 1;
select today() + 1 = tomorrow();
select tomorrow() - today() = 1;

select toYYYYMM(toDateTime(1630320462));
select toYYYYMM(today() - 30) = toYYYYMM(today()) - 1;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if today is the 2021-10-31?

t: PhantomData<T>,
}

pub trait NumberResultFunction {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great, we can even make it more generic using NumberResultFunction<T> , this can be addressed in another pr to support toYYYYMMDDhhmmss, which returns UInt64.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will address this in next pr.

Ok(false)
}

fn eval(&self, _columns: &DataColumnsWithField, _input_rows: usize) -> Result<DataColumn> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_columns -> columns

@ygf11 ygf11 force-pushed the feat_toYYYYMM_function branch from d329a59 to 00c6eea Compare September 5, 2021 03:38
select toYYYYMM(today() + 30) = toYYYYMM(today()) + 1;
select toYYYYMM(today()) - toYYYYMM(today() - 30) = 1;
select toYYYYMM(today() + 30) - toYYYYMM(today()) = 1;
select toYYYYMM(today() - 31) = toYYYYMM(today()) - 1;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's not correct too. What if today is 03-01, toYYYYMM(today() - 31) will be toYYYYMM(today()) - 2

Copy link
Contributor Author

@ygf11 ygf11 Sep 5, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sundy-li Yeah, you are right.
For the case of 03-01, all the four eq expression may return false.
As datafuse does not support inserting date16/date32/datetime32 sql like following now.

mysql> CREATE TABLE test(a UInt8, b date) engine = Memory;
Query OK, 0 rows affected (0.01 sec)

mysql> desc test;
+-------+--------+------+
| Field | Type   | Null |
+-------+--------+------+
| a     | UInt8  | NO   |
| b     | Date16 | NO   |
+-------+--------+------+
2 rows in set (0.00 sec)

mysql> insert into test(a, b) values(1, toDate(1630320462));
ERROR 1105 (HY000): Code: 10, displayText = create_deserializer does not support type 'Date16'.
mysql> insert into test(a, b) values(1, '2021-08-10');
ERROR 1105 (HY000): Code: 10, displayText = create_deserializer does not support type 'Date16'.
```
How can I add the stateless test? Please give some idea.

Copy link
Member

@sundy-li sundy-li Sep 5, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As datafuse does not support inserting date16/date32/datetime32 sql like following now.

Yes, we will add this in another pr.
Currently

select toUInt32(now()) returns `1630833797`
select toUInt16(today()) returns `18875`

How about

select toYYYYMM(toDateTime(1630833797)  )  =  202109;
select toYYYYMM(toDate(18875)  )  =  202109;

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok.

DataType::Date16 | DataType::Date32 => {
let number_vec_result: Vec<Option<u32>> = columns[0]
.column()
.to_values()?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to_values has poor performance than native array iter.

How about using let col = columns[0].u16()? and let col = columns[0].u32()?.
Then we can have col.iter().map(...)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the u16() method in SeriesTrait?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes

@ygf11 ygf11 force-pushed the feat_toYYYYMM_function branch from 00c6eea to 922c86a Compare September 5, 2021 12:26
}

fn eval(&self, columns: &DataColumnsWithField, _input_rows: usize) -> Result<DataColumn> {
if columns.is_empty() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need to check, already checked in num_arguments, but it's trival.

let data_type = columns[0].data_type();
let number_vec: Vec<Option<u32>> = match data_type {
DataType::Date16 => {
let number_vec_result = columns[0].column().to_array()?.u16()?.iter().map(|value|{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great, but there are also much more spaces to improve this function.

  1. Constant path to check, which may help in select toYYYYMM( 123455 ) from number(xxx)
  2. Put the whole eval function into execute function, this can be done in toYYYYMMDDhhmmss, but it's return value is u64.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Constant path to check, which may help in select toYYYYMM( 123455 ) from number(xxx)

It is nice to do this, but there are somethings I am not sure:

  1. If I understand correctly , toYYYYMM(123455) is the same as select toYYYYMM(toDate(123455)) or toYYYYMM(toDateTime(123455)), so which one to choose? toDateTime?
  2. Constant may be negative value , what should return for this situation?
  3. After this, select toYYYYMM(number) from number(xxx) will also work. Is it ok?

Copy link
Member

@sundy-li sundy-li Sep 6, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is the same as select toYYYYMM(toDate(123455)) or toYYYYMM(toDateTime(123455))

It's related to DataType::Date16 or DataType::DateTime ..

Constant may be negative value

Actually, if its logical type is Date16/DateTime32, which physical type is u16 and u32, the constant value will never be negative.

toYYYYMM(number) will never work, we just need to make toYYYYMM(Date/DateTime) work, the inner argument is physically constant.

Copy link
Member

@sundy-li sundy-li Sep 6, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's use

let c = columns[0].column().to_array()?.u16()?;
c.apply_cast_numeric(..)

instead, it takes consideration of the null values.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's use
....
instead, it takes consideration of the null values.

Got it.

Constant path to check, which may help in select toYYYYMM( 123455 ) from number(xxx)

As DataColumn.to_array() has coverage the Constant case as Series, what did you mean Constant path to check?

Copy link
Member

@sundy-li sundy-li Sep 6, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Constant path can have better performance. You are right, you can simply use to_array().
But it will build an array with the same number which has poor performance because you need to calculate each row.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, got it.

Copy link
Member

@sundy-li sundy-li left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks @ygf11

@databend-bot
Copy link
Member

CI Passed
Reviewer Approved
Let's Merge

@mergify
Copy link
Contributor

mergify bot commented Mar 16, 2022

Thanks for the contribution!
I have applied any labels matching special text in your PR Changelog.

Please review the labels and make any necessary changes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
pr-feature this PR introduces a new feature to the codebase
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants