@tmnd1991
Contributor

@tmnd1991 tmnd1991 commented Jun 4, 2016

What changes were proposed in this pull request?

The "big model load / save" test in Word2VecSuite has recently been failing with OOM errors.
We therefore made the partitioning adaptive (no longer tied to Spark's default "spark.kryoserializer.buffer.max" value) and now test it with a small buffer size, so that partitioning is triggered without allocating too much memory for the test.

How was this patch tested?

It was tested by running the following unit test:
org.apache.spark.mllib.feature.Word2VecSuite

@srowen
Member

srowen commented Jun 4, 2016

@tmnd1991 changed the title from "SPARK-15740" to "SPARK-15740 MLLIB" on Jun 4, 2016
@tmnd1991 changed the title from "SPARK-15740 MLLIB" to "[SPARK-15740] [MLLIB] Word2VecSuite "big model load / save" caused OOM in maven jenkins builds" on Jun 4, 2016
@tmnd1991
Contributor Author

tmnd1991 commented Jun 4, 2016

I noticed a Scala style error; please wait for a new commit before triggering a Jenkins build.

@tmnd1991
Contributor Author

Can anyone verify this?

@rxin
Contributor

rxin commented Jun 15, 2016

I triggered multiple test runs.

@SparkQA

SparkQA commented Jun 15, 2016

Test build #3112 has finished for PR 13509 at commit dfcd850.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Jun 15, 2016

Test build #3113 has finished for PR 13509 at commit dfcd850.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Jun 15, 2016

Test build #3111 has finished for PR 13509 at commit dfcd850.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@tmnd1991
Contributor Author

The only thing I don't like is that "64m" is hard-coded, but I couldn't find where the default Spark confs are stored!

// est. size of this model, given the formula:
// (floatSize * vectorSize + 15) * numWords
// (4 * 10 + 15) * 10 = 550
// therefore it should generate 12 partitions
Member

"12 partitions" --> "multiple partitions" (The exact number isn't important.)
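
For reference, the arithmetic in the quoted comment can be reproduced with a small standalone sketch. The object and method names below are hypothetical and not part of Spark. The 50-byte buffer and the "full buffers plus one" rule are assumptions, chosen only because they are consistent with the 550-byte estimate and the 12 partitions mentioned above; the patch itself may compute this differently.

```scala
// Hypothetical helper, not Spark's API: a sketch of the size estimate
// quoted in the test comment and of one way to turn it into a partition count.
object PartitionEstimate {
  // (floatSize * vectorSize + 15) * numWords, with floatSize = 4 bytes.
  // The constant 15 is taken from the quoted formula.
  def approxModelSize(vectorSize: Int, numWords: Int): Long =
    (4L * vectorSize + 15L) * numWords

  // Assumed rule: number of full buffers the model would fill, plus one,
  // so that the result is always at least 1.
  def numPartitions(approxSize: Long, bufferSizeBytes: Long): Int = {
    require(bufferSizeBytes > 0, "buffer size must be positive")
    ((approxSize / bufferSizeBytes) + 1).toInt
  }

  def main(args: Array[String]): Unit = {
    val size = approxModelSize(vectorSize = 10, numWords = 10)
    // 50 bytes is an assumed test buffer size, not a value stated in this thread.
    val partitions = numPartitions(size, bufferSizeBytes = 50L)
    println(s"estimated size = $size bytes, partitions = $partitions")
  }
}
```

Under these assumptions the estimate is 550 bytes and the model is split into 12 partitions, while the same model with a 64 MiB buffer would stay in a single partition. This is why a tiny buffer lets the test exercise partitioning without building a large model.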

@jkbradley
Member

I don't think you can access the default confs in this case. The class KryoSerializer seems to store those privately.

@tmnd1991
Contributor Author

I corrected the style errors you pointed out. If you say I cannot retrieve the default values, I will leave the 64m hard-coded as it is.
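
To illustrate what a hard-coded fallback of this kind amounts to, here is a minimal sketch that does not depend on Spark. The configuration is modelled as a plain `Map`, and `BufferConf`, `sizeToBytes` and `bufferMaxBytes` are hypothetical names introduced for this example. The size parser is a simplified stand-in for Spark's own internal utility; treating a bare number as bytes is a simplifying assumption.

```scala
// Hypothetical sketch, not Spark code: look up a buffer size in a
// configuration map and fall back to a hard-coded default when it is unset.
object BufferConf {
  val BufferMaxKey = "spark.kryoserializer.buffer.max"
  // Hard-coded fallback, mirroring the "64m" literal discussed in this thread.
  val DefaultBufferMax = "64m"

  // Simplified parser for size strings such as "50b", "64k", "64m" or "1g".
  // Assumption: a bare number without a unit is interpreted as bytes.
  def sizeToBytes(s: String): Long = {
    val normalized = s.trim.toLowerCase
    val (digits, unit) = normalized.span(_.isDigit)
    require(digits.nonEmpty, s"no numeric part in '$s'")
    val multiplier = unit match {
      case "" | "b"   => 1L
      case "k" | "kb" => 1L << 10
      case "m" | "mb" => 1L << 20
      case "g" | "gb" => 1L << 30
      case other      => throw new IllegalArgumentException(s"unknown unit '$other'")
    }
    digits.toLong * multiplier
  }

  // Returns the configured value if present, otherwise the hard-coded default.
  def bufferMaxBytes(conf: Map[String, String]): Long =
    sizeToBytes(conf.getOrElse(BufferMaxKey, DefaultBufferMax))
}
```

With an empty map the lookup falls back to 64 MiB; when the test sets the key to a small value such as "50b", that value is used instead.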

@jkbradley
Member

I verified locally that the test creates a model file with multiple partitions, so LGTM

I'll merge once tests run again.

Thanks!

@SparkQA

SparkQA commented Jul 6, 2016

Test build #3164 has finished for PR 13509 at commit 909b6e1.

  • This patch fails Spark unit tests.
  • This patch does not merge cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Jul 6, 2016

Test build #3166 has finished for PR 13509 at commit 909b6e1.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@jkbradley
Member

Merging with master and branch-2.0
Thank you!

asfgit pushed a commit that referenced this pull request Jul 6, 2016
… in maven jenkins builds

Author: tmnd1991 <antonio.murgia2@studio.unibo.it>

Closes #13509 from tmnd1991/SPARK-15740.

(cherry picked from commit 040f6f9)
Signed-off-by: Joseph K. Bradley <joseph@databricks.com>
@asfgit asfgit closed this in 040f6f9 Jul 6, 2016