
Conversation

@Sephiroth-Lin
Contributor

Now the code creates an unnecessary extra directory under the local root directory, and this directory is not deleted after the application exits.
For example:
before, the temp dir was created as "/tmp/spark-UUID";
now, the temp dir is created as "/tmp/spark-UUID/spark-UUID";
so the outer dir "/tmp/spark-UUID" is treated as a local root directory and is never deleted.
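
A minimal sketch of the problem (the helper is illustrative, not the actual Utils.scala code):

```scala
import java.io.File
import java.util.UUID

// Illustrative helper: create a "spark-UUID" directory under `parent`.
def mkSparkDir(parent: String): File = {
  val dir = new File(parent, s"spark-${UUID.randomUUID()}")
  dir.mkdirs()
  dir
}

val root = mkSparkDir("/tmp")        // /tmp/spark-UUID -- the new local root dir
val tmp  = mkSparkDir(root.getPath)  // /tmp/spark-UUID/spark-UUID
// Only the inner dir is cleaned up on shutdown, so the outer
// /tmp/spark-UUID is left behind after the application exits.
```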

@srowen
Member

srowen commented Feb 16, 2015

Since this appears to be exactly the same as https://issues.apache.org/jira/browse/SPARK-5801, could you reassociate this PR with that JIRA, and consider whether it solves the deeper nesting reported there?

@srowen
Member

srowen commented Feb 16, 2015

ok to test

@SparkQA

SparkQA commented Feb 16, 2015

Test build #27553 timed out for PR 4620 at commit 26670d8 after a configured wait of 120m.

@andrewor14
Contributor

retest this please

@SparkQA

SparkQA commented Feb 16, 2015

Test build #27560 has finished for PR 4620 at commit 26670d8.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@Sephiroth-Lin
Contributor Author

@srowen yes, this is the same as SPARK-5801. In standalone mode, the Worker creates temp directories for each executor, so if we create an unnecessary directory at the local root level, every temp directory we create ends up under extra nested directories.

The PR "Make sure only owner can read / write to directories created for the …" creates a subdirectory under the local root dir, but this causes nested directories, and the subdirectory under the local root dir is never deleted, which leaves too many empty directories in the local root dir.

@SparkQA

SparkQA commented Feb 17, 2015

Test build #27604 has finished for PR 4620 at commit 673e4e3.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Feb 17, 2015

Test build #27611 has finished for PR 4620 at commit 88e4b05.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Feb 17, 2015

Test build #27623 has finished for PR 4620 at commit c0635f1.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@srowen
Member

srowen commented Feb 17, 2015

@Sephiroth-Lin yes I think this should be directed at SPARK-5801 then. SPARK-5830 is a duplicate.
CC @kayousterhout

Does this correct the many levels of extra temp dirs? It sounds like you're addressing a case where there is one extra, but I probably misunderstand.

@Sephiroth-Lin
Contributor Author

@srowen The function getOrCreateLocalRootDirs creates a subdirectory under each local root dir, so anything that calls getOrCreateLocalRootDirs directly (such as getLocalDir) gets one of those subdirectories. In the current master branch, creating a temp dir calls getLocalDir first, so the temp dir ends up nested. And in standalone mode the Worker also creates a temp dir first when launching an executor, so in total it creates 4 levels of directories; in other modes it creates 2 levels of directories for every temp dir.
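
To illustrate the chain (a hedged sketch; the helper and the exact directory names are made up, the real logic lives in Utils.scala):

```scala
import java.io.File
import java.util.UUID

// Stand-in for the spark-UUID directory creation done in Utils.scala.
def newSparkDir(parent: String): String =
  new File(parent, s"spark-${UUID.randomUUID()}").getPath

// Worker JVM (standalone mode):
val workerRoot  = newSparkDir("/tmp")        // getOrCreateLocalRootDirs
val executorDir = newSparkDir(workerRoot)    // temp dir created when launching the
                                             // executor, handed to it as its local dir
// Executor JVM:
val executorRoot = newSparkDir(executorDir)  // getOrCreateLocalRootDirs again
val tempDir      = newSparkDir(executorRoot) // every temp dir is now 4 levels deep

println(tempDir) // e.g. /tmp/spark-a/spark-b/spark-c/spark-d
```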

@srowen
Member

srowen commented Feb 18, 2015

Sounds good. I'd love to get a check from @growse and/or @vanzin on this change, since, as you can see, there was a stated purpose for creating the subdirectory before, which may now be moot after #4509.

@Sephiroth-Lin
Contributor Author

@srowen ok, thank you. If this subdirectory is really needed, maybe we can add code to delete it after the JVM exits or sc.stop() is called.

@growse
Contributor

The default value of spark.local.dir is /tmp, which should exist on all Linux systems at least. If I've read the patch right, that means that getOrCreateLocalRootDirs will simply return /tmp, which is usually readable by everyone. Presumably, the Spark worker will then just write a load of stuff into /tmp. It seems that the whole purpose of the previous behaviour was to create an isolated directory inside spark.local.dir, because it was assumed that this value was a shared directory.

Is this not desirable behaviour, or have I missed something?

@srowen
Member

srowen commented Feb 18, 2015

Yeah, the problem was that it was creating a cascade of .../spark-xxxx/spark-xxxx/spark-xxxx/... directories. The desired outcome is to create exactly one spark-xxxx subdirectory, I think. @Sephiroth-Lin I'm not sure that's quite what this does; this would not create any subdirectories, I think. Is there a way to directly and cleanly fix just the problem of extra nested dirs?

@vanzin
Contributor

vanzin commented Feb 18, 2015

Sorry, but this patch is not correct. As @growse mentions, when SPARK_LOCAL_DIRS is not set, this code will try to change the permissions of /tmp on Unix machines. It will also use /tmp/ as the local dir for the driver in client mode, which was the exact thing the original change was trying to avoid.

The correct fix here, if you really care about cleaning up the extra directory, is to export a different env variable from the Worker (here) and handle that variable specially in getOrCreateLocalRootDirs. When that new env variable is set, the code would behave just like the isRunningInYarnContainer() case above the change you're making.
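
Roughly, something like this (the env variable name and the surrounding code are assumptions for illustration, not an actual patch):

```scala
// Worker side (sketch): pass along the per-executor dirs it already manages, e.g.
//   builder.environment().put("SPARK_EXECUTOR_DIRS", executorLocalDirs.mkString(","))

// Sketch of Utils.getOrCreateLocalRootDirs with the suggested branch added.
def getOrCreateLocalRootDirs(conf: SparkConf): Array[String] = {
  if (isRunningInYarnContainer(conf)) {
    // YARN gives each container its own isolated dirs; use them directly.
    getYarnLocalDirs(conf).split(",")
  } else if (sys.env.contains("SPARK_EXECUTOR_DIRS")) {
    // Dirs the Worker created (and cleans up) for this executor: use them
    // as-is instead of nesting another spark-UUID level underneath.
    sys.env("SPARK_EXECUTOR_DIRS").split(",")
  } else {
    // Driver / other deploy modes: create one isolated spark-UUID subdir per root.
    conf.get("spark.local.dir", System.getProperty("java.io.tmpdir"))
      .split(",")
      .map(root => Utils.createDirectory(root).getAbsolutePath)
  }
}
```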

@srowen the current code shouldn't create a cascade of directories, but it does create a two-level-deep "spark-xxxx" hierarchy for executors in standalone mode.

@vanzin
Contributor

vanzin commented Feb 18, 2015

Ah, wait, there's a second problem (which would result in the cascading directories, I think). getLocalDir should cache the local directory it returns, to avoid having to recreate it. (And should probably be made synchronized in the process.)

@vanzin
Contributor

vanzin commented Feb 18, 2015

Hi, me again, sorry for the spam. Regarding my last comment, it's probably better if getOrCreateLocalRootDirs() caches its return value instead of getLocalDir(), since the former is called in several places, and doing that would also cover the getLocalDir() case.
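
Something like this, perhaps (a minimal sketch; the field name is made up, and createLocalRootDirs stands for the existing creation logic factored out):

```scala
// In object Utils (sketch): create the local root dirs once and reuse them.
private var localRootDirs: Array[String] = _

def getOrCreateLocalRootDirs(conf: SparkConf): Array[String] = synchronized {
  if (localRootDirs == null) {
    localRootDirs = createLocalRootDirs(conf)  // existing creation logic
  }
  localRootDirs
}
```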

@srowen
Member

srowen commented Feb 24, 2015

I think we should close this PR, on the grounds that the proposed change apparently goes a step too far to remove all nested directories. SPARK-5830 is a duplicate of SPARK-5801, which concerns creating more than 1 nested spark-xxx directory. I think a PR should be reopened for SPARK-5801 that specifically addresses that issue.

@Sephiroth-Lin
Contributor Author

@srowen Since PR #4747 will cache the local root directories, we can close this PR. For PR #4747, I think we also need to remove the local root directories after the application exits or the SparkContext is stopped; otherwise we will still leave too many empty directories behind.

@srowen
Member

srowen commented Mar 2, 2015

Mind closing this PR?

@Sephiroth-Lin
Contributor Author

@srowen OK, please help close this.

@srowen
Member

srowen commented Mar 3, 2015

@Sephiroth-Lin you will have to close it yourself

@Sephiroth-Lin deleted the SPARK-5830 branch May 15, 2016