Skip to content

Conversation

@JoshRosen
Copy link
Contributor

read() may return fewer bytes than requested; when this occurred, the old code would silently return less data than requested, which might cause stream corruption errors. skip() faces similar issues, too.

This patch fixes several cases where we mis-handle these methods' return values.

read() may return fewer bytes than requested; when this occurred, the old
code would silently return less data than requested, which might cause stream
corruption errors.
@SparkQA
Copy link

SparkQA commented Oct 28, 2014

Test build #22318 has started for PR 2969 at commit db985ed.

  • This patch merges cleanly.

@pwendell
Copy link
Contributor

Jenkins, test this please.

@SparkQA
Copy link

SparkQA commented Oct 28, 2014

Test build #22319 has started for PR 2969 at commit db985ed.

  • This patch merges cleanly.

@SparkQA
Copy link

SparkQA commented Oct 28, 2014

Test build #22318 has finished for PR 2969 at commit db985ed.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

This is a less critical issue since this code was
only called from the log viewer, but it’s still wrong.
@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/22318/
Test PASSed.

In this case, we might unnecessarily fail to read
a block due to a partial read().
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@haoyuan In addition to improper use of read(), I think this method could have potentially returned Some(null) when is == null (which should never happen, but still...).

Can you verify that these changes are correct?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I checked the source code for all releases back until 0.4.0 (which is the first one Spark supports), and it's true that is cannot be null.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@SparkQA
Copy link

SparkQA commented Oct 28, 2014

Test build #22324 has started for PR 2969 at commit b9265d2.

  • This patch merges cleanly.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

getName does not return the full path, we should probably use the path instead

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch; I've updated this to use getAbsolutePath.

@AmplabJenkins
Copy link

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/22323/
Test FAILed.

@JoshRosen
Copy link
Contributor Author

Jenkins, retest this please.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In here (and FileServerSuite), I think that the bug is that this should be nRead >= 0. If nRead is less than the length of file but greater than 0, then I think this would exit the loop without having copied the whole file.

@SparkQA
Copy link

SparkQA commented Oct 28, 2014

Test build #22327 has started for PR 2969 at commit cbc03ce.

  • This patch merges cleanly.

@JoshRosen JoshRosen changed the title Fix unsafe usage of FileChannel.read() [SPARK-4107] Fix incorrect usage of FileChannel.read() Oct 28, 2014
@AmplabJenkins
Copy link

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/22325/
Test FAILed.

@JoshRosen
Copy link
Contributor Author

Found more potential problems: we also appear to ignore the return value of skip() in a few places.

@JoshRosen JoshRosen changed the title [SPARK-4107] Fix incorrect usage of FileChannel.read() [SPARK-4107] Fix incorrect usage of Channel.read() and Stream.skip() Oct 28, 2014
@JoshRosen JoshRosen changed the title [SPARK-4107] Fix incorrect usage of Channel.read() and Stream.skip() [SPARK-4107] Fix incorrect handling of read() and skip() return values Oct 28, 2014
@SparkQA
Copy link

SparkQA commented Oct 28, 2014

Test build #22330 has started for PR 2969 at commit e724a9f.

  • This patch merges cleanly.

@SparkQA
Copy link

SparkQA commented Oct 28, 2014

Test build #22319 timed out for PR 2969 at commit db985ed after a configured wait of 120m.

@AmplabJenkins
Copy link

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/22319/
Test FAILed.

@SparkQA
Copy link

SparkQA commented Oct 28, 2014

Test build #22324 has finished for PR 2969 at commit b9265d2.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/22324/
Test PASSed.

@SparkQA
Copy link

SparkQA commented Oct 28, 2014

Test build #22327 has finished for PR 2969 at commit cbc03ce.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/22327/
Test PASSed.

@SparkQA
Copy link

SparkQA commented Oct 28, 2014

Test build #22330 has finished for PR 2969 at commit e724a9f.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/22330/
Test PASSed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what is the problem with the old code here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

http://docs.oracle.com/javase/7/docs/api/java/io/FileInputStream.html#skip(long):

Skips over and discards n bytes of data from the input stream.

The skip method may, for a variety of reasons, end up skipping over some smaller number of bytes, possibly 0. If n is negative, an IOException is thrown, even though the skip method of the InputStream superclass does nothing in this case. The actual number of bytes skipped is returned.

This method may skip more bytes than are remaining in the backing file. This produces no exception and the number of bytes skipped may include some number of bytes that were beyond the EOF of the backing file. Attempting to read from the stream after skipping past the end will result in -1 indicating the end of the file.

@rxin
Copy link
Contributor

rxin commented Oct 28, 2014

BTW this LGTM.

@rxin
Copy link
Contributor

rxin commented Oct 28, 2014

I merged it in master.

Can you also create a patch for branch-1.1?

@asfgit asfgit closed this in 46c6341 Oct 28, 2014
JoshRosen added a commit to JoshRosen/spark that referenced this pull request Oct 28, 2014
`read()` may return fewer bytes than requested; when this occurred, the old code would silently return less data than requested, which might cause stream corruption errors.  `skip()` faces similar issues, too.

This patch fixes several cases where we mis-handle these methods' return values.

Author: Josh Rosen <joshrosen@databricks.com>

Closes apache#2969 from JoshRosen/file-channel-read-fix and squashes the following commits:

e724a9f [Josh Rosen] Fix similar issue of not checking skip() return value.
cbc03ce [Josh Rosen] Update the other log message, too.
01e6015 [Josh Rosen] file.getName -> file.getAbsolutePath
d961d95 [Josh Rosen] Fix another issue in FileServerSuite.
b9265d2 [Josh Rosen] Fix a similar (minor) issue in TestUtils.
cd9d76f [Josh Rosen] Fix a similar error in Tachyon:
3db0008 [Josh Rosen] Fix a similar read() error in Utils.offsetBytes().
db985ed [Josh Rosen] Fix unsafe usage of FileChannel.read():

Conflicts:
	core/src/main/scala/org/apache/spark/network/ManagedBuffer.scala
	core/src/main/scala/org/apache/spark/shuffle/IndexShuffleBlockManager.scala
	core/src/main/scala/org/apache/spark/storage/DiskStore.scala
	core/src/test/scala/org/apache/spark/FileServerSuite.scala
@JoshRosen
Copy link
Contributor Author

Thanks! I've opened a new pull request for backporting to branch-1.1.

@JoshRosen JoshRosen deleted the file-channel-read-fix branch October 28, 2014 07:23
asfgit pushed a commit that referenced this pull request Oct 28, 2014
…s (branch-1.1 backport)

`read()` may return fewer bytes than requested; when this occurred, the old code would silently return less data than requested, which might cause stream corruption errors.  `skip()` faces similar issues, too.

This patch fixes several cases where we mis-handle these methods' return values.

This is a backport of #2969 to `branch-1.1`.

Author: Josh Rosen <joshrosen@databricks.com>

Closes #2974 from JoshRosen/spark-4107-branch-1.1-backport and squashes the following commits:

d82c05b [Josh Rosen] [SPARK-4107] Fix incorrect handling of read() and skip() return values
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants