-
Notifications
You must be signed in to change notification settings - Fork 29k
[SPARK-44868][SQL] Convert datetime to string by to_char/to_varchar
#42534
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Closed
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
DateFormatClass expression to format a datetime in to_charto_char/to_varchar
to_char/to_varcharto_char/to_varchar
HyukjinKwon
approved these changes
Aug 20, 2023
Member
|
Merged to master. |
Contributor
|
late LGTM |
LuciferYang
added a commit
that referenced
this pull request
Aug 21, 2023
…te` for Java 21 ### What changes were proposed in this pull request? SPARK-44507(#42130) updated `try_arithmetic.sql.out` and `numeric.sql.out`, SPARK-44868(#42534) updated `datetime-formatting.sql.out`, but these PRs didn’t pay attention to the test health on Java 21. So this PR has regenerated the golden files `try_arithmetic.sql.out.java21`, `numeric.sql.out.java21`, and `datetime-formatting.sql.out.java21` of `SQLQueryTestSuite` so that `SQLQueryTestSuite` can be tested with Java 21. ### Why are the changes needed? Restore `SQLQueryTestSuite` to be tested with Java 21. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? - Pass GitHub Actions - Manual checked: ``` java -version openjdk version "21-ea" 2023-09-19 OpenJDK Runtime Environment Zulu21+69-CA (build 21-ea+28) OpenJDK 64-Bit Server VM Zulu21+69-CA (build 21-ea+28, mixed mode, sharing) ``` ``` SPARK_GENERATE_GOLDEN_FILES=0 build/sbt "sql/testOnly org.apache.spark.sql.SQLQueryTestSuite" ``` **Before** ``` ... [info] - datetime-formatting.sql *** FAILED *** (316 milliseconds) [info] datetime-formatting.sql [info] Array("-- Automatically generated by SQLQueryTestSuite [info] ", "create temporary view v as select col from values [info] (timestamp '1582-06-01 11:33:33.123UTC+080000'), [info] (timestamp '1970-01-01 00:00:00.000Europe/Paris'), [info] (timestamp '1970-12-31 23:59:59.999Asia/Srednekolymsk'), [info] (timestamp '1996-04-01 00:33:33.123Australia/Darwin'), [info] (timestamp '2018-11-17 13:33:33.123Z'), [info] (timestamp '2020-01-01 01:33:33.123Asia/Shanghai'), [info] (timestamp '2100-01-01 01:33:33.123America/Los_Angeles') t(col) [info] ", "struct<> [info] ", " [info] [info] [info] ", "select col, date_format(col, 'G GG GGG GGGG') from v [info] ", "struct<col:timestamp,date_format(col, G GG GGG GGGG):string> [info] ", "1582-05-31 19:40:35.123 AD AD AD Anno Domini [info] 1969-12-31 15:00:00 AD AD AD Anno Domini [info] 1970-12-31 04:59:59.999 AD AD AD Anno Domini [info] 1996-03-31 07:03:33.123 AD AD AD Anno Domini [info] 2018-11-17 05:33:33.123 AD AD AD Anno Domini [info] 2019-12-31 09:33:33.123 AD AD AD Anno Domini [info] 2100-01-01 01:33:33.123 AD AD AD Anno Domini [info] [info] [info] ", "select col, date_format(col, 'y yy yyy yyyy yyyyy yyyyyy') from v [info] ", "struct<col:timestamp,date_format(col, y yy yyy yyyy yyyyy yyyyyy):string> [info] ", "1582-05-31 19:40:35.123 1582 82 1582 1582 01582 001582 [info] 1969-12-31 15:00:00 1969 69 1969 1969 01969 001969 [info] 1970-12-31 04:59:59.999 1970 70 1970 1970 01970 001970 [info] 1996-03-31 07:03:33.123 1996 96 1996 1996 01996 001996 [info] 2018-11-17 05:33:33.123 2018 18 2018 2018 02018 002018 [info] 2019-12-31 09:33:33.123 2019 19 2019 2019 02019 002019 [info] 2100-01-01 01:33:33.123 2100 00 2100 2100 02100 002100 [info] ... [info] - postgreSQL/numeric.sql *** FAILED *** (35 seconds, 848 milliseconds) [info] postgreSQL/numeric.sql [info] Expected "...rg.apache.spark.sql.[]AnalysisException [info] { [info] ...", but got "...rg.apache.spark.sql.[catalyst.Extended]AnalysisException [info] { [info] ..." Result did not match for query #544 [info] SELECT '' AS to_number_2, to_number('-34,338,492.654,878', '99G999G999D999G999') (SQLQueryTestSuite.scala:876) [info] org.scalatest.exceptions.TestFailedException: ... [info] - try_arithmetic.sql *** FAILED *** (314 milliseconds) [info] try_arithmetic.sql [info] Expected "...rg.apache.spark.sql.[]AnalysisException [info] { [info] ...", but got "...rg.apache.spark.sql.[catalyst.Extended]AnalysisException [info] { [info] ..." Result did not match for query #20 [info] SELECT try_add(interval 2 year, interval 2 second) (SQLQueryTestSuite.scala:876) [info] org.scalatest.exceptions.TestFailedException: ``` **After** ``` [info] Run completed in 9 minutes, 10 seconds. [info] Total number of tests run: 572 [info] Suites: completed 1, aborted 0 [info] Tests: succeeded 572, failed 0, canceled 0, ignored 59, pending 0 [info] All tests passed. ``` ### Was this patch authored or co-authored using generative AI tooling? No Closes #42580 from LuciferYang/SPARK-44888. Authored-by: yangjie01 <yangjie01@baidu.com> Signed-off-by: yangjie01 <yangjie01@baidu.com>
valentinp17
pushed a commit
to valentinp17/spark
that referenced
this pull request
Aug 24, 2023
### What changes were proposed in this pull request? In the PR, I propose to use the `DateFormatClass` expression for formatting a datetime expression to string in the `to_char`/`to_varchar` functions, and the `ToCharacter` expression in other cases. ### Why are the changes needed? To achieve feature parity with other systems, and make the migration from them eaiser: - Snowflake: https://docs.snowflake.com/en/sql-reference/functions/to_char - PostgreSQL: https://www.postgresql.org/docs/current/functions-formatting.html - MariaDB: https://mariadb.com/kb/en/to_char/ - DB2: https://www.ibm.com/docs/en/db2/11.5?topic=sf-char - Exasol: https://docs.exasol.com/db/latest/sql_references/functions/alphabeticallistfunctions/to_char%20(datetime).htm ### Does this PR introduce _any_ user-facing change? No, but it can if an user's query depends on errors from `to_char`. ### How was this patch tested? By running the affected test suites: ``` $ SPARK_GENERATE_GOLDEN_FILES=1 PYSPARK_PYTHON=python3 build/sbt "sql/testOnly org.apache.spark.sql.SQLQueryTestSuite" $ build/sbt "test:testOnly *StringExpressionsSuite" $ build/sbt "sql/test:testOnly org.apache.spark.sql.expressions.ExpressionInfoSuite" ``` Closes apache#42534 from MaxGekk/replace-exprs. Authored-by: Max Gekk <max.gekk@gmail.com> Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
valentinp17
pushed a commit
to valentinp17/spark
that referenced
this pull request
Aug 24, 2023
…te` for Java 21 ### What changes were proposed in this pull request? SPARK-44507(apache#42130) updated `try_arithmetic.sql.out` and `numeric.sql.out`, SPARK-44868(apache#42534) updated `datetime-formatting.sql.out`, but these PRs didn’t pay attention to the test health on Java 21. So this PR has regenerated the golden files `try_arithmetic.sql.out.java21`, `numeric.sql.out.java21`, and `datetime-formatting.sql.out.java21` of `SQLQueryTestSuite` so that `SQLQueryTestSuite` can be tested with Java 21. ### Why are the changes needed? Restore `SQLQueryTestSuite` to be tested with Java 21. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? - Pass GitHub Actions - Manual checked: ``` java -version openjdk version "21-ea" 2023-09-19 OpenJDK Runtime Environment Zulu21+69-CA (build 21-ea+28) OpenJDK 64-Bit Server VM Zulu21+69-CA (build 21-ea+28, mixed mode, sharing) ``` ``` SPARK_GENERATE_GOLDEN_FILES=0 build/sbt "sql/testOnly org.apache.spark.sql.SQLQueryTestSuite" ``` **Before** ``` ... [info] - datetime-formatting.sql *** FAILED *** (316 milliseconds) [info] datetime-formatting.sql [info] Array("-- Automatically generated by SQLQueryTestSuite [info] ", "create temporary view v as select col from values [info] (timestamp '1582-06-01 11:33:33.123UTC+080000'), [info] (timestamp '1970-01-01 00:00:00.000Europe/Paris'), [info] (timestamp '1970-12-31 23:59:59.999Asia/Srednekolymsk'), [info] (timestamp '1996-04-01 00:33:33.123Australia/Darwin'), [info] (timestamp '2018-11-17 13:33:33.123Z'), [info] (timestamp '2020-01-01 01:33:33.123Asia/Shanghai'), [info] (timestamp '2100-01-01 01:33:33.123America/Los_Angeles') t(col) [info] ", "struct<> [info] ", " [info] [info] [info] ", "select col, date_format(col, 'G GG GGG GGGG') from v [info] ", "struct<col:timestamp,date_format(col, G GG GGG GGGG):string> [info] ", "1582-05-31 19:40:35.123 AD AD AD Anno Domini [info] 1969-12-31 15:00:00 AD AD AD Anno Domini [info] 1970-12-31 04:59:59.999 AD AD AD Anno Domini [info] 1996-03-31 07:03:33.123 AD AD AD Anno Domini [info] 2018-11-17 05:33:33.123 AD AD AD Anno Domini [info] 2019-12-31 09:33:33.123 AD AD AD Anno Domini [info] 2100-01-01 01:33:33.123 AD AD AD Anno Domini [info] [info] [info] ", "select col, date_format(col, 'y yy yyy yyyy yyyyy yyyyyy') from v [info] ", "struct<col:timestamp,date_format(col, y yy yyy yyyy yyyyy yyyyyy):string> [info] ", "1582-05-31 19:40:35.123 1582 82 1582 1582 01582 001582 [info] 1969-12-31 15:00:00 1969 69 1969 1969 01969 001969 [info] 1970-12-31 04:59:59.999 1970 70 1970 1970 01970 001970 [info] 1996-03-31 07:03:33.123 1996 96 1996 1996 01996 001996 [info] 2018-11-17 05:33:33.123 2018 18 2018 2018 02018 002018 [info] 2019-12-31 09:33:33.123 2019 19 2019 2019 02019 002019 [info] 2100-01-01 01:33:33.123 2100 00 2100 2100 02100 002100 [info] ... [info] - postgreSQL/numeric.sql *** FAILED *** (35 seconds, 848 milliseconds) [info] postgreSQL/numeric.sql [info] Expected "...rg.apache.spark.sql.[]AnalysisException [info] { [info] ...", but got "...rg.apache.spark.sql.[catalyst.Extended]AnalysisException [info] { [info] ..." Result did not match for query apache#544 [info] SELECT '' AS to_number_2, to_number('-34,338,492.654,878', '99G999G999D999G999') (SQLQueryTestSuite.scala:876) [info] org.scalatest.exceptions.TestFailedException: ... [info] - try_arithmetic.sql *** FAILED *** (314 milliseconds) [info] try_arithmetic.sql [info] Expected "...rg.apache.spark.sql.[]AnalysisException [info] { [info] ...", but got "...rg.apache.spark.sql.[catalyst.Extended]AnalysisException [info] { [info] ..." Result did not match for query apache#20 [info] SELECT try_add(interval 2 year, interval 2 second) (SQLQueryTestSuite.scala:876) [info] org.scalatest.exceptions.TestFailedException: ``` **After** ``` [info] Run completed in 9 minutes, 10 seconds. [info] Total number of tests run: 572 [info] Suites: completed 1, aborted 0 [info] Tests: succeeded 572, failed 0, canceled 0, ignored 59, pending 0 [info] All tests passed. ``` ### Was this patch authored or co-authored using generative AI tooling? No Closes apache#42580 from LuciferYang/SPARK-44888. Authored-by: yangjie01 <yangjie01@baidu.com> Signed-off-by: yangjie01 <yangjie01@baidu.com>
MaxGekk
added a commit
that referenced
this pull request
Sep 6, 2023
…`to_char`/`to_varchar` ### What changes were proposed in this pull request? In the PR, I propose to document the recent changes related to the `format` of the `to_char`/`to_varchar` functions: 1. binary formats added by #42632 2. datetime formats introduced by #42534 ### Why are the changes needed? To inform users about recent changes. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? By CI. ### Was this patch authored or co-authored using generative AI tooling? No. Closes #42801 from MaxGekk/doc-to_char-api. Authored-by: Max Gekk <max.gekk@gmail.com> Signed-off-by: Max Gekk <max.gekk@gmail.com>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
What changes were proposed in this pull request?
In the PR, I propose to use the
DateFormatClassexpression for formatting a datetime expression to string in theto_char/to_varcharfunctions, and theToCharacterexpression in other cases.Why are the changes needed?
To achieve feature parity with other systems, and make the migration from them eaiser:
Does this PR introduce any user-facing change?
No, but it can if an user's query depends on errors from
to_char.How was this patch tested?
By running the affected test suites: