[SPARK-4107] Fix incorrect handling of read() and skip() return values #2969
read() may return fewer bytes than requested; when this occurred, the old code would silently return less data than requested, which might cause stream corruption errors.
Test build #22318 has started for PR 2969 at commit
Jenkins, test this please.
Test build #22319 has started for PR 2969 at commit
Test build #22318 has finished for PR 2969 at commit
This is a less critical issue since this code was only called from the log viewer, but it’s still wrong.
Test PASSed.
In this case, we might unnecessarily fail to read a block due to a partial read().
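To illustrate the partial-read problem described above, here is a minimal sketch of a read-until-full loop over a `FileChannel`. It is not the code from this patch; the class name `ChannelReads` and the helper `readFully` are hypothetical names chosen for illustration.

```java
import java.io.EOFException;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper class, not part of Spark.
final class ChannelReads {
    /** Fills buf completely from channel, or throws if EOF arrives first. */
    static void readFully(FileChannel channel, ByteBuffer buf) throws IOException {
        while (buf.hasRemaining()) {
            // A single read() may transfer fewer bytes than remain in the buffer,
            // so keep reading until the buffer is full or the channel reports EOF (-1).
            if (channel.read(buf) == -1) {
                throw new EOFException(
                    "Reached EOF with " + buf.remaining() + " bytes left to read");
            }
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("read-fully", ".bin");
        try {
            Files.write(tmp, new byte[] {1, 2, 3, 4, 5, 6, 7, 8});
            try (FileChannel ch = FileChannel.open(tmp, StandardOpenOption.READ)) {
                ch.position(2);                      // start at an offset, as a block reader might
                ByteBuffer buf = ByteBuffer.allocate(4);
                readFully(ch, buf);
                buf.flip();
                System.out.println("bytes read: " + buf.remaining());
            }
        } finally {
            Files.delete(tmp);
        }
    }
}
```

Throwing on premature EOF, rather than returning a short buffer, makes a truncated file surface as an error at the read site instead of as stream corruption later.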
@haoyuan In addition to the improper use of `read()`, I think this method could have returned `Some(null)` when `is == null` (which should never happen, but still...).
Can you verify that these changes are correct?
I checked the source code for all releases back to 0.4.0 (which is the first one Spark supports), and it's true that `is` cannot be null.
Test build #22324 has started for PR 2969 at commit
getName does not return the full path; we should probably use the path instead.
Good catch; I've updated this to use getAbsolutePath.
Test FAILed.
Jenkins, retest this please.
In here (and FileServerSuite), I think that the bug is that this should be nRead >= 0. If nRead is less than the length of file but greater than 0, then I think this would exit the loop without having copied the whole file.
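As a sketch of the loop shape this comment is asking for (not the actual test code from the patch), the copy below keeps going until `read()` returns -1, so a short read does not end the copy early. `StreamCopy`, `copy`, and `atMost` are hypothetical names; `atMost` exists only to simulate a stream that returns fewer bytes than requested.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical helper class, not part of Spark.
final class StreamCopy {
    /** Copies everything from in to out and returns the number of bytes copied. */
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buf = new byte[8192];
        long total = 0;
        int nRead;
        // Only -1 signals end of stream; any other value is a (possibly short) read.
        while ((nRead = in.read(buf)) != -1) {
            out.write(buf, 0, nRead);   // write only the bytes actually read
            total += nRead;
        }
        return total;
    }

    /** Wraps a stream so that each read returns at most max bytes (simulates short reads). */
    static InputStream atMost(InputStream in, int max) {
        return new FilterInputStream(in) {
            @Override
            public int read(byte[] b, int off, int len) throws IOException {
                return super.read(b, off, Math.min(len, max));
            }
        };
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[100];
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        long copied = copy(atMost(new ByteArrayInputStream(data), 7), out);
        System.out.println("copied " + copied + " of " + data.length + " bytes");
    }
}
```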
Test build #22327 has started for PR 2969 at commit
Test FAILed.
Found more potential problems: we also appear to ignore the return value of |
Test build #22330 has started for PR 2969 at commit
Test build #22319 timed out for PR 2969 at commit |
Test FAILed.
Test build #22324 has finished for PR 2969 at commit
Test PASSed.
Test build #22327 has finished for PR 2969 at commit
Test PASSed.
Test build #22330 has finished for PR 2969 at commit
Test PASSed.
What is the problem with the old code here?
http://docs.oracle.com/javase/7/docs/api/java/io/FileInputStream.html#skip(long):
Skips over and discards n bytes of data from the input stream.
The skip method may, for a variety of reasons, end up skipping over some smaller number of bytes, possibly 0. If n is negative, an IOException is thrown, even though the skip method of the InputStream superclass does nothing in this case. The actual number of bytes skipped is returned.
This method may skip more bytes than are remaining in the backing file. This produces no exception and the number of bytes skipped may include some number of bytes that were beyond the EOF of the backing file. Attempting to read from the stream after skipping past the end will result in -1 indicating the end of the file.
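Given the `skip()` contract quoted above, one common way to handle short skips is to loop until the requested count has been consumed. The sketch below is an illustration under that assumption, not the change made in this patch; `Skips` and `skipFully` are hypothetical names. Note that for a `FileInputStream` specifically, `skip()` can report bytes past EOF as skipped, so this loop cannot detect a truncated file there; it detects EOF only for streams whose `skip()` stops at the end.

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper class, not part of Spark.
final class Skips {
    /** Skips exactly n bytes of in, or throws EOFException if the stream ends first. */
    static void skipFully(InputStream in, long n) throws IOException {
        long remaining = n;
        while (remaining > 0) {
            long skipped = in.skip(remaining);   // may skip fewer bytes than asked, possibly 0
            if (skipped > 0) {
                remaining -= skipped;
            } else if (in.read() == -1) {
                // skip() made no progress and read() confirms end of stream
                throw new EOFException(remaining + " bytes left to skip at EOF");
            } else {
                remaining--;                     // read() consumed one byte on our behalf
            }
        }
    }

    public static void main(String[] args) throws IOException {
        InputStream in = new ByteArrayInputStream(new byte[] {1, 2, 3, 4, 5});
        skipFully(in, 3);
        System.out.println("next byte after skipping: " + in.read());
    }
}
```

Falling back to a single-byte `read()` when `skip()` returns 0 is what distinguishes "no progress yet" from "end of stream", since `skip()` itself has no EOF return value.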
BTW this LGTM.
I merged it into master. Can you also create a patch for branch-1.1?
`read()` may return fewer bytes than requested; when this occurred, the old code would silently return less data than requested, which might cause stream corruption errors. `skip()` faces similar issues, too.

This patch fixes several cases where we mis-handle these methods' return values.

Author: Josh Rosen <joshrosen@databricks.com>

Closes apache#2969 from JoshRosen/file-channel-read-fix and squashes the following commits:

e724a9f [Josh Rosen] Fix similar issue of not checking skip() return value.
cbc03ce [Josh Rosen] Update the other log message, too.
01e6015 [Josh Rosen] file.getName -> file.getAbsolutePath
d961d95 [Josh Rosen] Fix another issue in FileServerSuite.
b9265d2 [Josh Rosen] Fix a similar (minor) issue in TestUtils.
cd9d76f [Josh Rosen] Fix a similar error in Tachyon:
3db0008 [Josh Rosen] Fix a similar read() error in Utils.offsetBytes().
db985ed [Josh Rosen] Fix unsafe usage of FileChannel.read():

Conflicts:
core/src/main/scala/org/apache/spark/network/ManagedBuffer.scala
core/src/main/scala/org/apache/spark/shuffle/IndexShuffleBlockManager.scala
core/src/main/scala/org/apache/spark/storage/DiskStore.scala
core/src/test/scala/org/apache/spark/FileServerSuite.scala
Thanks! I've opened a new pull request for backporting to branch-1.1.
…s (branch-1.1 backport)

`read()` may return fewer bytes than requested; when this occurred, the old code would silently return less data than requested, which might cause stream corruption errors. `skip()` faces similar issues, too.

This patch fixes several cases where we mis-handle these methods' return values.

This is a backport of #2969 to `branch-1.1`.

Author: Josh Rosen <joshrosen@databricks.com>

Closes #2974 from JoshRosen/spark-4107-branch-1.1-backport and squashes the following commits:

d82c05b [Josh Rosen] [SPARK-4107] Fix incorrect handling of read() and skip() return values