[SPARK-10287] [SQL] Fixes JSONRelation refreshing on read path #8469
@liancheng Maybe it is better to make JSON, Parquet, and ORC consistent instead of fixing JSON's refresh problem.

LGTM. We should mention this in the release note and migration guide.

Test build #41650 has finished for PR 8469 at commit
I will test it with my partitioned JSON table.

It works. I will update doc.

In the release note, we need to add that the JSON data source will not automatically load new files that are created by other applications (i.e., files that are not inserted into the dataset through Spark SQL). [SPARK-10287]
Test build #41705 has finished for PR 8469 at commit
Test build #1698 has finished for PR 8469 at commit
I am merging it to master and branch 1.5.
https://issues.apache.org/jira/browse/SPARK-10287

After porting json to HadoopFsRelation, it seems hard to keep the behavior of picking up new files automatically for JSON. This PR removes this behavior, so JSON is consistent with others (ORC and Parquet).

Author: Yin Huai <yhuai@databricks.com>

Closes #8469 from yhuai/jsonRefresh.

(cherry picked from commit b3dd569)
Signed-off-by: Yin Huai <yhuai@databricks.com>
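
For readers of the release note, the behavior change can be pictured with a toy model. The sketch below is plain Python, not Spark code: `CachedFileListing`, `files`, `refresh`, and `demo` are hypothetical names chosen for illustration, and the caching logic is an assumption about the general pattern, not a copy of how HadoopFsRelation is implemented. It only shows the user-visible effect: a file added by another application is not seen until the listing is refreshed explicitly.

```python
import os
import tempfile


class CachedFileListing:
    """Toy stand-in (hypothetical) for a file-based relation that caches its file listing."""

    def __init__(self, directory):
        self.directory = directory
        self._files = None  # filled lazily on the first read

    def _list_files(self):
        # Scan the directory for JSON files.
        return sorted(f for f in os.listdir(self.directory) if f.endswith(".json"))

    def files(self):
        # Reads reuse the cached listing; they do not rescan the directory.
        if self._files is None:
            self._files = self._list_files()
        return self._files

    def refresh(self):
        # Explicit refresh: drop the cache so the next read rescans the directory.
        self._files = None


def demo():
    with tempfile.TemporaryDirectory() as d:
        open(os.path.join(d, "part-0.json"), "w").close()
        rel = CachedFileListing(d)
        before = list(rel.files())

        # Simulate another application writing a new file into the directory.
        open(os.path.join(d, "part-1.json"), "w").close()
        stale = list(rel.files())  # still the old listing

        rel.refresh()
        after = list(rel.files())  # the new file is visible only after the refresh
        return before, stale, after
```

Calling `demo()` returns the listing before the external write, the unchanged listing after it, and the updated listing after the explicit refresh. In Spark itself, the analogous step would be an explicit refresh of the table's metadata.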