@zsxwing
Member

@zsxwing zsxwing commented Feb 28, 2015

In val copyForMemory = ByteBuffer.allocate(bytes.limit), when the StorageLevel is MEMORY_AND_DISK_SER, the content is copied from the file into memory and then put into MemoryStore.

              val copyForMemory = ByteBuffer.allocate(bytes.limit)
              copyForMemory.put(bytes)
              memoryStore.putBytes(blockId, copyForMemory, level)
              bytes.rewind()

However, if the file is bigger than the free memory, an OOM will occur. A better approach is to test whether there is enough memory first. If there is not, copyForMemory should not be created, since caching the bytes in memory is an optional operation.
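
A minimal sketch of the idea described above, not the code from this patch: the store checks free space before it asks for the copy, so the buffer is only allocated when it will fit. ToyMemoryStore, its putBytes signature, and the block id are hypothetical names made up for illustration; they stand in for Spark's real classes.

```scala
import java.nio.ByteBuffer
import scala.collection.mutable

// Hypothetical toy store for illustration; this is not Spark's MemoryStore.
class ToyMemoryStore(maxMemory: Long) {
  private val entries = mutable.Map.empty[String, ByteBuffer]
  private var used = 0L

  def freeMemory: Long = maxMemory - used

  // Calls makeBytes only after confirming that `size` bytes fit.
  def putBytes(blockId: String, size: Long, makeBytes: () => ByteBuffer): Boolean = {
    if (size > freeMemory) {
      false // not enough room, so the copy is never allocated
    } else {
      entries(blockId) = makeBytes()
      used += size
      true
    }
  }

  def contains(blockId: String): Boolean = entries.contains(blockId)
}

object LazyCopyExample {
  def main(args: Array[String]): Unit = {
    val store = new ToyMemoryStore(maxMemory = 16)
    // Pretend these 32 bytes were just read back from disk.
    val bytes = ByteBuffer.wrap(Array.fill[Byte](32)(1))

    var allocations = 0
    val cached = store.putBytes("block-1", bytes.limit(), () => {
      allocations += 1
      val copyForMemory = ByteBuffer.allocate(bytes.limit())
      copyForMemory.put(bytes)
      copyForMemory.rewind()
      bytes.rewind()
      copyForMemory
    })
    println(s"cached=$cached allocations=$allocations")
  }
}
```

Because the block (32 bytes) is larger than the toy store's capacity (16 bytes), the function is never invoked and no second buffer is created. The caller can still serve the bytes it read from disk.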

@SparkQA

SparkQA commented Feb 28, 2015

Test build #28118 has started for PR 4827 at commit 0cc0257.

  • This patch merges cleanly.

@SparkQA

SparkQA commented Feb 28, 2015

Test build #28118 has finished for PR 4827 at commit 0cc0257.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/28118/

@srowen
Member

srowen commented Mar 2, 2015

CC @andrewor14

@rxin
Contributor

rxin commented Mar 18, 2015

CC @andrewor14 again in case it was missed.

@andrewor14
Contributor

Sorry for the delay. I was away for a little more than a week and I will look at this later today.

Contributor

Could you add a // comment here that links to this JIRA? Otherwise it's not clear why we need to do this lazily.

@andrewor14
Contributor

@zsxwing the approach looks reasonable. I'm wondering if there's a more readable way to do the same thing. In particular, this currently uses call-by-name (tryToPut, dropFromMemory...) and chained parameters (your new MemoryStore#putBytes), both of which we discourage in the Databricks style guide: https://github.com/databricks/scala-style-guide. How large a change would it be to rewrite this patch according to the style guide instead?

@zsxwing
Member Author

zsxwing commented Mar 21, 2015

@andrewor14 would it be acceptable to change call-by-name to () => ByteBuffer?

@zsxwing
Member Author

zsxwing commented Mar 23, 2015

@andrewor14 I updated to use () => T.
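
For readers unfamiliar with the distinction discussed above, here is a small sketch contrasting a call-by-name parameter with an explicit () => T parameter. putByName, putByThunk, and expensive are hypothetical names for illustration and do not appear in the patch.

```scala
object LazinessStyles {
  // Call-by-name: `value` is evaluated only where it is referenced,
  // but the call site looks like an ordinary eager argument.
  def putByName(hasRoom: Boolean, value: => Array[Byte]): Option[Array[Byte]] =
    if (hasRoom) Some(value) else None

  // Explicit function: the caller must write `() => ...`,
  // so the deferred evaluation is visible at the call site.
  def putByThunk(hasRoom: Boolean, value: () => Array[Byte]): Option[Array[Byte]] =
    if (hasRoom) Some(value()) else None

  def main(args: Array[String]): Unit = {
    var evaluations = 0
    def expensive(): Array[Byte] = { evaluations += 1; new Array[Byte](4) }

    putByName(hasRoom = false, expensive())        // reads as eager, but is not evaluated
    putByThunk(hasRoom = false, () => expensive()) // visibly deferred, not evaluated
    println(evaluations)

    putByThunk(hasRoom = true, () => expensive())  // evaluated once
    println(evaluations)
  }
}
```

Both forms defer the allocation. The difference is readability: with () => T, someone reading the call site can see that the argument may never be evaluated.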

@SparkQA

SparkQA commented Mar 23, 2015

Test build #28976 has started for PR 4827 at commit 1100a54.

  • This patch merges cleanly.

@SparkQA

SparkQA commented Mar 23, 2015

Test build #28976 has finished for PR 4827 at commit 1100a54.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/28976/

Contributor

It's a little odd for this to be () => bytes everywhere. Can you create an alias that looks something like:

private def tryToPut(blockId: BlockId, value: Any, ...): ResultWithDroppedBlocks = {
  tryToPut(blockId, () => value, ...)
}

so that we only use the lazy version when we have to? Same for dropFromMemory.
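
A self-contained sketch of the alias pattern suggested above, under simplified assumptions: the String block id, the hasRoom flag, and the Option return type are hypothetical stand-ins for the elided parameters and for ResultWithDroppedBlocks.

```scala
import java.nio.ByteBuffer

object EagerAlias {
  // Lazy version: materializes the bytes only when there is room.
  def tryToPut(blockId: String, bytes: () => ByteBuffer, hasRoom: Boolean): Option[ByteBuffer] =
    if (hasRoom) Some(bytes()) else None

  // Eager alias: callers that already hold the bytes do not need to write `() => bytes`.
  // Overloaded methods that call each other need an explicit return type.
  def tryToPut(blockId: String, bytes: ByteBuffer, hasRoom: Boolean): Option[ByteBuffer] =
    tryToPut(blockId, () => bytes, hasRoom)

  def main(args: Array[String]): Unit = {
    val inHand = ByteBuffer.allocate(4)
    // Eager call site: the value already exists, so no thunk is written.
    println(tryToPut("block-1", inHand, hasRoom = true).isDefined)
    // Lazy call site: allocation is skipped because there is no room.
    println(tryToPut("block-2", () => ByteBuffer.allocate(1024), hasRoom = false).isDefined)
  }
}
```

The overloads are distinguished by parameter type, so only call sites that genuinely need deferred allocation pay the readability cost of the function literal.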

@andrewor14
Contributor

Thanks @zsxwing, I left one more code style comment and I think it's ready after that.

@SparkQA
Copy link

SparkQA commented Mar 25, 2015

Test build #29137 has started for PR 4827 at commit 7d25545.

  • This patch merges cleanly.

Contributor

Instead of duplicating the javadocs here, I would just refer to the other method. I can fix this when I merge, don't worry.

@andrewor14
Contributor

LGTM. I will merge this once tests pass, thanks.

@SparkQA

SparkQA commented Mar 25, 2015

Test build #29137 has finished for PR 4827 at commit 7d25545.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/29137/

@andrewor14
Contributor

Merged this into master.

@asfgit asfgit closed this in 883b7e9 Mar 25, 2015
@zsxwing zsxwing deleted the SPARK-6076 branch March 26, 2015 09:07