@liancheng
Contributor

Review on Reviewable

@AmplabJenkins

Merged build triggered.

@AmplabJenkins

Merged build started.

@SparkQA

SparkQA commented May 12, 2015

Test build #32524 has started for PR 6091 at commit 8ff07e8.

@yhuai
Contributor

yhuai commented May 12, 2015

Also cc @marmbrus

@marmbrus
Contributor

Should we be ignoring anything that starts with _?

@SparkQA

SparkQA commented May 12, 2015

Test build #32524 has finished for PR 6091 at commit 8ff07e8.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • s"FileOutputCommitter or its subclass is expected, but got a $
    • trait FSBasedRelationProvider
    • abstract class OutputWriter

@AmplabJenkins

Merged build finished. Test PASSed.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/32524/

@liancheng
Contributor Author

@marmbrus No, because there can be dynamic partition column names starting with _ or . (e.g. /path/to/_1=1/_2=2).
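To illustrate the point, here is a minimal Python sketch (not the Scala code in this patch) of reading partition columns from a path. The function names `parse_partition_path` and `drop_prefixed_segments` are hypothetical and exist only for this example; the sketch assumes each partition directory is encoded as `name=value`.

```python
def parse_partition_path(path):
    """Return (column, value) pairs for every 'name=value' segment of path."""
    pairs = []
    for segment in path.strip("/").split("/"):
        name, sep, value = segment.partition("=")
        # Only segments that contain '=' with a non-empty name are partitions.
        if sep and name:
            pairs.append((name, value))
    return pairs


def drop_prefixed_segments(path):
    """Hypothetical blanket rule: discard any segment starting with '_' or '.'."""
    segments = path.strip("/").split("/")
    return [s for s in segments if not s.startswith(("_", "."))]


partitions = parse_partition_path("/path/to/_1=1/_2=2")
survivors = drop_prefixed_segments("/path/to/_1=1/_2=2")
```

With the blanket rule, both `_1=1` and `_2=2` are discarded, so the partition columns found by `parse_partition_path` would be lost.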

@liancheng
Contributor Author

For now, we can check whether there is an = in the directory name. If the directory name

  1. doesn't contain =, and
  2. starts with _ or .

then we can ignore it. But we may want to support customized partition discovery strategies later. For example, one strategy that some users have asked for is to leave partition column names out of directory paths (i.e., path/1/2 instead of path/a=1/b=2). In that case, we can't make any assumptions about partition directory names; even special-casing _temporary is not 100% safe.
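The two-condition check described above can be sketched as follows. This is an illustrative Python version under the assumptions stated in the comment, not the implementation merged in this PR; `should_ignore` is a hypothetical name.

```python
def should_ignore(dir_name):
    """Ignore a directory only if it has no '=' AND starts with '_' or '.'."""
    has_partition_marker = "=" in dir_name
    looks_hidden = dir_name.startswith(("_", "."))
    return (not has_partition_marker) and looks_hidden


# Directory names taken from the discussion, plus a hypothetical hidden one.
candidates = ["_temporary", "_1=1", ".hidden", "a=1", "1"]
ignored = [name for name in candidates if should_ignore(name)]
```

Here `_1=1` is kept because it contains `=`, while `_temporary` is ignored. The caveat from the comment still applies: under a layout such as path/1/2, where column names are not encoded, a partition value that happens to be named `_temporary` would be ignored by this check as well.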

@SparkQA

SparkQA commented May 18, 2015

Test build #819 has started for PR 6091 at commit 8ff07e8.

@yhuai
Contributor

yhuai commented May 18, 2015

LGTM

@SparkQA

SparkQA commented May 18, 2015

Test build #819 has finished for PR 6091 at commit 8ff07e8.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

asfgit pushed a commit that referenced this pull request May 18, 2015

Author: Cheng Lian <lian@databricks.com>

Closes #6091 from liancheng/spark-7570 and squashes the following commits:

8ff07e8 [Cheng Lian] Ignores _temporary during partition discovery

(cherry picked from commit 010a1c2)
Signed-off-by: Michael Armbrust <michael@databricks.com>
@asfgit asfgit closed this in 010a1c2 May 18, 2015
@liancheng liancheng deleted the spark-7570 branch May 19, 2015 02:46
jeanlyn pushed a commit to jeanlyn/spark that referenced this pull request May 28, 2015
jeanlyn pushed a commit to jeanlyn/spark that referenced this pull request Jun 12, 2015
nemccarthy pushed a commit to nemccarthy/spark that referenced this pull request Jun 19, 2015