@vanzin
Contributor

vanzin commented Jun 4, 2015

Even with all the efforts to clean up the temp directories created by
unit tests, Spark leaves a lot of garbage in /tmp after a test run.
This change overrides java.io.tmpdir to place those files under the
build directory instead.

After an sbt full unit test run, I was left with > 400 MB of temp
files. Since they're now under the build dir, it's much easier to
clean them up.

Also make a slight change to a unit test to make it not pollute the
source directory with test data.
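
To check where temp files actually end up, a small probe along these lines can help. This is only a sketch and not part of the patch: the class name `TmpDirCheck` is hypothetical, and it assumes the test JVM is launched with `-Djava.io.tmpdir=<dir>` pointing at an existing directory under the build directory (for example `target/tmp`). Passing the property on the command line matters because common JVMs read it once and cache it.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of Spark: reports where default temp files land.
final class TmpDirCheck {

    /** Directory the JVM was configured to use for temp files. */
    static Path configuredTmpDir() {
        return Paths.get(System.getProperty("java.io.tmpdir"))
                .toAbsolutePath()
                .normalize();
    }

    /** Creates a default temp file and checks that it sits under java.io.tmpdir. */
    static boolean tempFileLandsInConfiguredDir() throws IOException {
        Path file = Files.createTempFile("spark-test-", ".tmp");
        try {
            return file.toAbsolutePath().normalize().startsWith(configuredTmpDir());
        } finally {
            // Clean up so the probe itself leaves no garbage behind.
            Files.deleteIfExists(file);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("java.io.tmpdir = " + configuredTmpDir());
        System.out.println("temp file under it: " + tempFileLandsInConfiguredDir());
    }
}
```

Running it as `java -Djava.io.tmpdir=target/tmp TmpDirCheck.java` (after creating `target/tmp`) should print the overridden location instead of `/tmp`.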

@JoshRosen
Contributor

Nice catch. This plays nicely with some Jenkins tuning work that I'm doing, where we're experimenting with using ramdisks to store the Jenkins workspaces for the pull request builders.

Contributor

${project.build.directory}/target/tmp ?

Contributor Author

${project.build.directory} is already the target directory.

Contributor

ah, ok!

@harishreedharan
Contributor

+1

@SparkQA

SparkQA commented Jun 4, 2015

Test build #34212 has finished for PR 6653 at commit aa92944.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Jun 4, 2015

Test build #883 has finished for PR 6653 at commit aa92944.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@JoshRosen
Contributor

Update: ramdisk didn't make a difference, so we're not using it.

@SparkQA

SparkQA commented Jun 5, 2015

Test build #34219 has finished for PR 6653 at commit 31e2dd5.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • public final class UnsafeRow extends BaseMutableRow
    • public abstract class BaseMutableRow extends BaseRow implements MutableRow
    • public abstract class BaseRow implements Row
    • protected class CodeGenContext
    • abstract class BaseMutableProjection extends MutableProjection
    • class SpecificProjection extends $
    • class BaseOrdering extends Ordering[Row]
    • class SpecificOrdering extends $
    • abstract class Predicate
    • class SpecificPredicate extends $
    • abstract class BaseProject extends Projection
    • class SpecificProjection extends $
    • final class SpecificRow extends $

@srowen
Member

srowen commented Jun 5, 2015

Looks OK to me.

@JoshRosen
Contributor

We should probably merge this into branch-1.4 and 1.3 as well, since those are the more active backport branches; I think that Shane will appreciate the reduction in /tmp garbage on the Jenkins boxes :)

asfgit pushed a commit that referenced this pull request Jun 5, 2015
Even with all the efforts to clean up the temp directories created by
unit tests, Spark leaves a lot of garbage in /tmp after a test run.
This change overrides java.io.tmpdir to place those files under the
build directory instead.

After an sbt full unit test run, I was left with > 400 MB of temp
files. Since they're now under the build dir, it's much easier to
clean them up.

Also make a slight change to a unit test to make it not pollute the
source directory with test data.

Author: Marcelo Vanzin <vanzin@cloudera.com>

Closes #6653 from vanzin/unit-test-tmp and squashes the following commits:

31e2dd5 [Marcelo Vanzin] Fix tests that depend on each other.
aa92944 [Marcelo Vanzin] [minor] [build] Use custom temp directory during build.

(cherry picked from commit b16b543)
Signed-off-by: Sean Owen <sowen@cloudera.com>
@asfgit asfgit closed this in b16b543 Jun 5, 2015
asfgit pushed a commit that referenced this pull request Jun 5, 2015
@vanzin vanzin deleted the unit-test-tmp branch June 5, 2015 16:49
@JoshRosen
Contributor

Looks like this may have broken a SparkSubmitSuite test: https://amplab.cs.berkeley.edu/jenkins/job/Spark-Master-Maven-pre-YARN/2654/

We may need to ensure that the temp directory exists before we run tests, or something along those lines; I haven't had time to investigate. Can someone look at this and either revert this patch or push a hotfix?
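
If a missing directory does turn out to be the cause (this thread does not confirm it), one way to guard against it is to create the directory before any test touches it. The following is a minimal sketch under that assumption; `EnsureTmpDir` and `ensureExists` are hypothetical names, not existing Spark or build-tool APIs.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical guard, not part of Spark: creates the temp directory up front.
final class EnsureTmpDir {

    /**
     * Creates the given directory, including missing parents.
     * Does nothing if the directory already exists.
     */
    static Path ensureExists(String dir) throws IOException {
        Path path = Paths.get(dir).toAbsolutePath().normalize();
        return Files.createDirectories(path);
    }

    public static void main(String[] args) throws IOException {
        // The JVM does not create java.io.tmpdir on its own, so do it here.
        Path tmp = ensureExists(System.getProperty("java.io.tmpdir"));
        System.out.println("temp directory ready: " + tmp);
    }
}
```

The same effect could be achieved in the build itself by creating the directory before the test phase; the sketch only illustrates the check.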

@vanzin
Contributor Author

vanzin commented Jun 5, 2015

Looking. Worked for me locally but may have been a test ordering thing.

@andrewor14
Contributor

I reverted this. Could we file a JIRA for this actually? This is a borderline minor change that would be good to track. The implications of the changes here are not super obvious from the diff.

vanzin pushed a commit to vanzin/spark that referenced this pull request Jun 5, 2015
@pwendell
Contributor

pwendell commented Jun 6, 2015

@vanzin @srowen this broke the Spark 1.3 builds, so I reverted it in the Spark 1.3 branch.

jeanlyn pushed a commit to jeanlyn/spark that referenced this pull request Jun 12, 2015
nemccarthy pushed a commit to nemccarthy/spark that referenced this pull request Jun 19, 2015