@MaxGekk
Member

@MaxGekk MaxGekk commented Aug 17, 2023

What changes were proposed in this pull request?

In the PR, I propose to use the DateFormatClass expression for formatting a datetime expression to string in the to_char/to_varchar functions, and the ToCharacter expression in other cases.
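
The routing described above can be pictured with a small, self-contained sketch. This is only an illustrative model, not Spark's implementation: the function name `choose_to_char_expression` is hypothetical, and Python value types stand in for the SQL data type of the input expression, which is presumably what Spark inspects.

```python
from datetime import date, datetime
from decimal import Decimal


def choose_to_char_expression(value):
    """Return the name of the expression that would format `value`.

    Hypothetical helper: it only models the rule stated in the PR
    description (datetime inputs go to DateFormatClass, everything
    else goes to ToCharacter).
    """
    # datetime is a subclass of date, so one check covers both types.
    if isinstance(value, date):
        return "DateFormatClass"
    return "ToCharacter"


print(choose_to_char_expression(datetime(2018, 11, 17, 13, 33, 33)))
print(choose_to_char_expression(Decimal("454.00")))
```

The two calls print `DateFormatClass` and `ToCharacter` respectively, mirroring the two branches named in the description.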

Why are the changes needed?

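To achieve feature parity with other systems and make migration from them easier:

Snowflake: https://docs.snowflake.com/en/sql-reference/functions/to_char
PostgreSQL: https://www.postgresql.org/docs/current/functions-formatting.html
MariaDB: https://mariadb.com/kb/en/to_char/
DB2: https://www.ibm.com/docs/en/db2/11.5?topic=sf-char
Exasol: https://docs.exasol.com/db/latest/sql_references/functions/alphabeticallistfunctions/to_char%20(datetime).htm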

Does this PR introduce any user-facing change?

No, unless a user's query depends on errors raised by to_char.

How was this patch tested?

By running the affected test suites:

$ SPARK_GENERATE_GOLDEN_FILES=1 PYSPARK_PYTHON=python3 build/sbt "sql/testOnly org.apache.spark.sql.SQLQueryTestSuite"
$ build/sbt "test:testOnly *StringExpressionsSuite"
$ build/sbt "sql/test:testOnly org.apache.spark.sql.expressions.ExpressionInfoSuite"

@github-actions github-actions bot added the SQL label Aug 17, 2023
@MaxGekk MaxGekk changed the title from "[WIP][SQL] Use the DateFormatClass expression to format a datetime in to_char" to "[WIP][SPARK-44868][SQL] Convert datetime to string by to_char/to_varchar" Aug 18, 2023
@MaxGekk MaxGekk changed the title from "[WIP][SPARK-44868][SQL] Convert datetime to string by to_char/to_varchar" to "[SPARK-44868][SQL] Convert datetime to string by to_char/to_varchar" Aug 18, 2023
@MaxGekk MaxGekk marked this pull request as ready for review August 18, 2023 12:48
@HyukjinKwon
Member

Merged to master.

@cloud-fan
Contributor

late LGTM

LuciferYang added a commit that referenced this pull request Aug 21, 2023
…te` for Java 21

### What changes were proposed in this pull request?

SPARK-44507 (#42130) updated `try_arithmetic.sql.out` and `numeric.sql.out`, and SPARK-44868 (#42534) updated `datetime-formatting.sql.out`, but neither PR took the Java 21 test run into account. This PR regenerates the golden files `try_arithmetic.sql.out.java21`, `numeric.sql.out.java21`, and `datetime-formatting.sql.out.java21` of `SQLQueryTestSuite` so that the suite can be tested with Java 21 again.
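
To make the naming convention above concrete, here is a minimal sketch of how a Java-version-specific golden file could be selected. It is an assumption-laden illustration: the function `golden_file_name` and the fallback rule (use the `.javaNN` variant if present, otherwise the plain `.out` file) are hypothetical, and only the file names themselves come from this PR.

```python
def golden_file_name(test_file, java_feature_version, available):
    """Pick the golden file for `test_file` on a given Java version.

    Hypothetical rule: prefer `<test>.out.java<version>` when such a
    file exists in `available`, otherwise fall back to `<test>.out`.
    """
    base = f"{test_file}.out"
    candidate = f"{base}.java{java_feature_version}"
    return candidate if candidate in available else base


files = {"try_arithmetic.sql.out", "try_arithmetic.sql.out.java21"}
print(golden_file_name("try_arithmetic.sql", 21, files))
print(golden_file_name("try_arithmetic.sql", 17, files))
```

Under this assumed rule, a stale or missing `.java21` variant would explain why a change to the plain `.out` file alone leaves the Java 21 run failing.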

### Why are the changes needed?
To restore the ability to test `SQLQueryTestSuite` with Java 21.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
- Pass GitHub Actions
- Manually checked:

```
java -version
openjdk version "21-ea" 2023-09-19
OpenJDK Runtime Environment Zulu21+69-CA (build 21-ea+28)
OpenJDK 64-Bit Server VM Zulu21+69-CA (build 21-ea+28, mixed mode, sharing)
```

```
SPARK_GENERATE_GOLDEN_FILES=0 build/sbt "sql/testOnly org.apache.spark.sql.SQLQueryTestSuite"
```

**Before**

```
...
[info] - datetime-formatting.sql *** FAILED *** (316 milliseconds)
[info]   datetime-formatting.sql
[info]   Array("-- Automatically generated by SQLQueryTestSuite
[info]   ", "create temporary view v as select col from values
[info]    (timestamp '1582-06-01 11:33:33.123UTC+080000'),
[info]    (timestamp '1970-01-01 00:00:00.000Europe/Paris'),
[info]    (timestamp '1970-12-31 23:59:59.999Asia/Srednekolymsk'),
[info]    (timestamp '1996-04-01 00:33:33.123Australia/Darwin'),
[info]    (timestamp '2018-11-17 13:33:33.123Z'),
[info]    (timestamp '2020-01-01 01:33:33.123Asia/Shanghai'),
[info]    (timestamp '2100-01-01 01:33:33.123America/Los_Angeles') t(col)
[info]   ", "struct<>
[info]   ", "
[info]
[info]
[info]   ", "select col, date_format(col, 'G GG GGG GGGG') from v
[info]   ", "struct<col:timestamp,date_format(col, G GG GGG GGGG):string>
[info]   ", "1582-05-31 19:40:35.123	AD AD AD Anno Domini
[info]   1969-12-31 15:00:00	AD AD AD Anno Domini
[info]   1970-12-31 04:59:59.999	AD AD AD Anno Domini
[info]   1996-03-31 07:03:33.123	AD AD AD Anno Domini
[info]   2018-11-17 05:33:33.123	AD AD AD Anno Domini
[info]   2019-12-31 09:33:33.123	AD AD AD Anno Domini
[info]   2100-01-01 01:33:33.123	AD AD AD Anno Domini
[info]
[info]
[info]   ", "select col, date_format(col, 'y yy yyy yyyy yyyyy yyyyyy') from v
[info]   ", "struct<col:timestamp,date_format(col, y yy yyy yyyy yyyyy yyyyyy):string>
[info]   ", "1582-05-31 19:40:35.123	1582 82 1582 1582 01582 001582
[info]   1969-12-31 15:00:00	1969 69 1969 1969 01969 001969
[info]   1970-12-31 04:59:59.999	1970 70 1970 1970 01970 001970
[info]   1996-03-31 07:03:33.123	1996 96 1996 1996 01996 001996
[info]   2018-11-17 05:33:33.123	2018 18 2018 2018 02018 002018
[info]   2019-12-31 09:33:33.123	2019 19 2019 2019 02019 002019
[info]   2100-01-01 01:33:33.123	2100 00 2100 2100 02100 002100
[info]
...
[info] - postgreSQL/numeric.sql *** FAILED *** (35 seconds, 848 milliseconds)
[info]   postgreSQL/numeric.sql
[info]   Expected "...rg.apache.spark.sql.[]AnalysisException
[info]   {
[info]   ...", but got "...rg.apache.spark.sql.[catalyst.Extended]AnalysisException
[info]   {
[info]   ..." Result did not match for query #544
[info]   SELECT '' AS to_number_2,  to_number('-34,338,492.654,878', '99G999G999D999G999') (SQLQueryTestSuite.scala:876)
[info]   org.scalatest.exceptions.TestFailedException:
...
[info] - try_arithmetic.sql *** FAILED *** (314 milliseconds)
[info]   try_arithmetic.sql
[info]   Expected "...rg.apache.spark.sql.[]AnalysisException
[info]   {
[info]   ...", but got "...rg.apache.spark.sql.[catalyst.Extended]AnalysisException
[info]   {
[info]   ..." Result did not match for query #20
[info]   SELECT try_add(interval 2 year, interval 2 second) (SQLQueryTestSuite.scala:876)
[info]   org.scalatest.exceptions.TestFailedException:
```

**After**
```
[info] Run completed in 9 minutes, 10 seconds.
[info] Total number of tests run: 572
[info] Suites: completed 1, aborted 0
[info] Tests: succeeded 572, failed 0, canceled 0, ignored 59, pending 0
[info] All tests passed.
```
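
For readers puzzling over the `y yy yyy yyyy yyyyy yyyyyy` rows in the failing log above, the following sketch reproduces those year columns in plain Python. It is a simplified model inferred from the logged rows only: the helper names are hypothetical, and years outside the four-digit positive range (where sign handling matters) are not modeled.

```python
def format_year(year: int, count: int) -> str:
    """Format `year` for a run of `count` pattern letters `y`."""
    if count == 2:
        # Two letters keep only the last two digits, zero-padded.
        return f"{year % 100:02d}"
    # Any other count zero-pads the full year to at least `count` digits.
    return str(year).zfill(count)


def format_year_columns(year: int) -> str:
    """Join the results for 'y' through 'yyyyyy', as in the test query."""
    return " ".join(format_year(year, n) for n in range(1, 7))


print(format_year_columns(2018))  # 2018 18 2018 2018 02018 002018
print(format_year_columns(2100))  # 2100 00 2100 2100 02100 002100
```

The printed values match the corresponding rows of the `datetime-formatting.sql` output shown in the log.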

### Was this patch authored or co-authored using generative AI tooling?
No

Closes #42580 from LuciferYang/SPARK-44888.

Authored-by: yangjie01 <yangjie01@baidu.com>
Signed-off-by: yangjie01 <yangjie01@baidu.com>
valentinp17 pushed a commit to valentinp17/spark that referenced this pull request Aug 24, 2023
### What changes were proposed in this pull request?
In the PR, I propose to use the `DateFormatClass` expression for formatting a datetime expression to string in the `to_char`/`to_varchar` functions, and the `ToCharacter` expression in other cases.

### Why are the changes needed?
To achieve feature parity with other systems and make migration from them easier:
- Snowflake: https://docs.snowflake.com/en/sql-reference/functions/to_char
- PostgreSQL: https://www.postgresql.org/docs/current/functions-formatting.html
- MariaDB: https://mariadb.com/kb/en/to_char/
- DB2: https://www.ibm.com/docs/en/db2/11.5?topic=sf-char
- Exasol: https://docs.exasol.com/db/latest/sql_references/functions/alphabeticallistfunctions/to_char%20(datetime).htm

### Does this PR introduce _any_ user-facing change?
No, unless a user's query depends on errors raised by `to_char`.

### How was this patch tested?
By running the affected test suites:
```
$ SPARK_GENERATE_GOLDEN_FILES=1 PYSPARK_PYTHON=python3 build/sbt "sql/testOnly org.apache.spark.sql.SQLQueryTestSuite"
$ build/sbt "test:testOnly *StringExpressionsSuite"
$ build/sbt "sql/test:testOnly org.apache.spark.sql.expressions.ExpressionInfoSuite"
```

Closes apache#42534 from MaxGekk/replace-exprs.

Authored-by: Max Gekk <max.gekk@gmail.com>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
valentinp17 pushed a commit to valentinp17/spark that referenced this pull request Aug 24, 2023
…te` for Java 21

MaxGekk added a commit that referenced this pull request Sep 6, 2023
…`to_char`/`to_varchar`

### What changes were proposed in this pull request?
In the PR, I propose to document the recent changes related to the `format` of the `to_char`/`to_varchar` functions:
1. binary formats added by #42632
2. datetime formats introduced by #42534

### Why are the changes needed?
To inform users about recent changes.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
By CI.

### Was this patch authored or co-authored using generative AI tooling?
No.

Closes #42801 from MaxGekk/doc-to_char-api.

Authored-by: Max Gekk <max.gekk@gmail.com>
Signed-off-by: Max Gekk <max.gekk@gmail.com>