-
Notifications
You must be signed in to change notification settings - Fork 29k
[SPARK-9794] [SQL] Fix datetime parsing in SparkSQL. #8396
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
See https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark for how to name pull requests, which would automatically link them. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This might bear explanation -- why does it implementing parsing correctly and why is GMT special-cased?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because XML dates are ISO8601 so the XML date parser parses ISO8601 dates. The reason why GMT is special cased is because this method handles a number of date formats so I preserved the special casing from the previous version.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, that's guaranteed by the method in the sense that XML dates must be in ISO8601 format? Do we have tests for this?
Nits: import DatatypeConverter and I think you can avoid the return keyword in the code you added, for consistency.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is no return, do you still want it to be imported?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I meant the return in the first branch of the if-else above; it's not related to imports. Yes I think the right default is to import classes that are used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I have updated the PR with those two fixes.
|
@kevincox can you follow up on this, adding tests (or else point out the tests I missed)? I think this deserves some basic coverage. |
|
Sorry, had a busy week. I can definitely add a couple of tests. The existing tests also verify that this is working with the previously tested formats (assuming that there were tests). I'll update this PR and ping you as soon as I get a chance. |
|
Tests added (there weren't any previously). I tried to ensure they covered all supported cases. |
|
ok to test |
|
Test build #42161 has finished for PR 8396 at commit
|
|
Test build #1736 has finished for PR 8396 at commit
|
|
OK this looks pretty good. @marmbrus any objections? the logic looks reasonable as do the tests. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shouldn't include the comma here - but it's ok i will fix it after merging it.
This fixes https://issues.apache.org/jira/browse/SPARK-9794 by using a real ISO8601 parser. (courtesy of the xml component of the standard java library)
cc: @angelini