[SPARK-35998][SQL] Make from_csv/to_csv to handle year-month intervals properly #33210
cc: @MaxGekk
Kubernetes integration test starting
Kubernetes integration test status success
    checkAnswer(toCsvDF, Row(toCsvExpected))

    DataTypeTestUtils.yearMonthIntervalTypes.foreach { fromCsvDtype =>
      val fromJsonDF = toCsvDF
fromJsonDF -> fromCsvDF
How embarrassing... Thanks.
+1, LGTM. The last commit is trivial, and tests passed on the previous one. Merging to master/3.2.
Thank you, @sarutak .
…s properly

### What changes were proposed in this pull request?

This PR fixes an issue where `from_csv`/`to_csv` do not handle year-month intervals properly.

`from_csv` throws an exception if year-month interval types are given.

```
spark-sql> select from_csv("interval '1-2' year to month", "a interval year to month");
21/07/03 04:32:24 ERROR SparkSQLDriver: Failed in [select from_csv("interval '1-2' year to month", "a interval year to month")]
java.lang.Exception: Unsupported type: interval year to month
	at org.apache.spark.sql.errors.QueryExecutionErrors$.unsupportedTypeError(QueryExecutionErrors.scala:775)
	at org.apache.spark.sql.catalyst.csv.UnivocityParser.makeConverter(UnivocityParser.scala:224)
	at org.apache.spark.sql.catalyst.csv.UnivocityParser.$anonfun$valueConverters$1(UnivocityParser.scala:134)
```

Also, `to_csv` does not handle year-month interval types properly, although no exception is thrown. The result of `to_csv` for year-month interval types is not in ANSI interval-compliant form.

```
spark-sql> select to_csv(named_struct("a", interval '1-2' year to month));
14
```

The result above should be `INTERVAL '1-2' YEAR TO MONTH`.

### Why are the changes needed?

Bug fix.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

New tests.

Closes #33210 from sarutak/csv-yminterval.

Authored-by: Kousuke Saruta <sarutak@oss.nttdata.com>
Signed-off-by: Max Gekk <max.gekk@gmail.com>
(cherry picked from commit f4237af)
Signed-off-by: Max Gekk <max.gekk@gmail.com>
Test build #140632 has finished for PR 33210 at commit
Test build #140655 has finished for PR 33210 at commit
What changes were proposed in this pull request?
This PR fixes an issue where `from_csv`/`to_csv` do not handle year-month intervals properly.
`from_csv` throws an exception if year-month interval types are given.
Also, `to_csv` does not handle year-month interval types properly, although no exception is thrown. The result of `to_csv` for year-month interval types is not in ANSI interval-compliant form: for `interval '1-2' year to month` it returns `14`, whereas the result should be `INTERVAL '1-2' YEAR TO MONTH`.
Why are the changes needed?
Bug fix.
Does this PR introduce any user-facing change?
No.
How was this patch tested?
New tests.
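
For readers wondering where the `14` in the `to_csv` output comes from: it is consistent with the interval being held as a total month count (1 year and 2 months is 14 months). The following is a minimal, Spark-free sketch of the round trip between such a month count and the ANSI form. The object and method names (`YearMonthIntervalSketch`, `toAnsiString`, `fromAnsiString`) are hypothetical and are not the helpers this PR uses; the sign placement for negative values is likewise an assumption.

```scala
import scala.util.Try

// Illustrative only: hypothetical helpers, not Spark's implementation.
object YearMonthIntervalSketch {
  private val MonthsPerYear = 12

  // Accepts e.g. "INTERVAL '1-2' YEAR TO MONTH" (case-insensitive, optional sign).
  private val Literal =
    """(?i)INTERVAL\s+'([+-]?)(\d+)-(\d+)'\s+YEAR\s+TO\s+MONTH""".r

  /** Formats a total month count as an ANSI-style year-month interval string. */
  def toAnsiString(totalMonths: Int): String = {
    val sign = if (totalMonths < 0) "-" else ""
    // Widen to Long so that Int.MinValue has a valid absolute value.
    val magnitude = math.abs(totalMonths.toLong)
    val years = magnitude / MonthsPerYear
    val months = magnitude % MonthsPerYear
    s"INTERVAL '$sign$years-$months' YEAR TO MONTH"
  }

  /** Parses the ANSI-style string back to a month count; None if it does not match. */
  def fromAnsiString(s: String): Option[Int] = s.trim match {
    case Literal(sign, years, months) =>
      Try {
        val m = months.toInt
        require(m < MonthsPerYear, "month field must be in 0..11")
        // Exact arithmetic so that overflow becomes None instead of a wrong value.
        val total = Math.addExact(Math.multiplyExact(years.toInt, MonthsPerYear), m)
        if (sign == "-") -total else total
      }.toOption
    case _ => None
  }
}
```

With this sketch, `YearMonthIntervalSketch.toAnsiString(14)` yields `INTERVAL '1-2' YEAR TO MONTH`, and parsing that string gives back `Some(14)`, whereas the bare string `14` does not parse.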