HADOOP-19394. Integrate with AAL's readVectored(). #7720
Conversation
|
@steveloughran @mukund-thakur could you please review this initial version, couple of small TODO's coming (more tests, and using the |
| * @throws IOException IOE if any. | ||
| */ | ||
| @Override | ||
| public synchronized void readVectored(List<? extends FileRange> ranges, |
just for my knowledge- why is this a synchronised method?
sorry my miss.. we just discussed, this does not need to be synchronized. will cut
...op-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/impl/streams/AnalyticsStream.java
|
💔 -1 overall
This message was automatically generated. |
|
💔 -1 overall
This message was automatically generated. |
LGTM
|
💔 -1 overall
This message was automatically generated. |
looks good, though you'll need to include the sdk update on this PR, which can then be split off separately
| import org.apache.hadoop.fs.statistics.IOStatistics; | ||
| import org.apache.hadoop.fs.statistics.StreamStatisticNames; | ||
|
|
||
| import org.junit.Test; |
nit: ordering
|
why the SDK upgrade? |
3219341 to
5121093
Compare
|
💔 -1 overall
This message was automatically generated. |
5121093 to
142d1bb
Compare
|
🎊 +1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
another review
hadoop-project/pom.xml
Outdated
| <aws-java-sdk-v2.version>2.29.52</aws-java-sdk-v2.version> | ||
| <amazon-s3-encryption-client-java.version>3.1.1</amazon-s3-encryption-client-java.version> | ||
| <amazon-s3-analyticsaccelerator-s3.version>1.0.0</amazon-s3-analyticsaccelerator-s3.version> | ||
| <amazon-s3-analyticsaccelerator-s3.version>1.2.0</amazon-s3-analyticsaccelerator-s3.version> |
for the final merge, this should be pulled out to its own.
should we try to get into 3.4.2 now?
| import java.util.function.Consumer; | ||
| import java.util.function.IntFunction; | ||
|
|
||
| import org.apache.hadoop.fs.FileRange; |
nit: import block
| } | ||
|
|
||
| /** | ||
| * {@inheritDoc} |
I'd put that at L148 and cut the params/IOE as superfluous
| } | ||
|
|
||
| /** | ||
| * {@inheritDoc} |
same
| return new S3AContract(conf); | ||
| } | ||
|
|
||
| @Override |
add a javadoc explaining why the override
done
| public void testReadVectoredWithAALStatsCollection() throws Exception { | ||
|
|
||
| List<FileRange> fileRanges = createSampleNonOverlappingRanges(); | ||
| try (FSDataInputStream in = openVectorFile()){ |
nit, space before {
|
|
||
| // AAL does not do any range coalescing, so input and combined ranges are the same. | ||
| this.getS3AStreamStatistics().readVectoredOperationStarted(ranges.size(), ranges.size()); | ||
| inputStream.readVectored(objectRanges, allocate, release); |
does this call release on errors? curious -and hopeful
ah no :( release isn't called right now, but will make that change and get it released in the next version
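The release-on-failure behaviour discussed above can be sketched without any Hadoop or AAL dependency. This is only an illustration of the idea, not AAL's implementation: `ReleaseOnFailure`, `RangeSource` and `readRange` are hypothetical names, and only the `allocate`/`release` callback shapes (`IntFunction<ByteBuffer>` and `Consumer<ByteBuffer>`) mirror the call shown in the diff.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.util.function.Consumer;
import java.util.function.IntFunction;

public class ReleaseOnFailure {

  /** Hypothetical byte source that may fail part-way through a read. */
  interface RangeSource {
    void readFully(long offset, ByteBuffer dest) throws IOException;
  }

  /**
   * Reads one range into a buffer obtained from the allocate callback.
   * If the read fails, the buffer is handed back through the release
   * callback before the exception is rethrown, so a pooled buffer is not leaked.
   */
  static ByteBuffer readRange(RangeSource source, long offset, int length,
      IntFunction<ByteBuffer> allocate, Consumer<ByteBuffer> release) throws IOException {
    ByteBuffer buffer = allocate.apply(length);
    try {
      source.readFully(offset, buffer);
      buffer.flip(); // make the bytes just written readable by the caller
      return buffer;
    } catch (IOException | RuntimeException e) {
      release.accept(buffer); // return the buffer to its owner on any failure
      throw e;
    }
  }
}
```

On success the caller owns the returned buffer and is responsible for releasing it later; the callback is invoked here only on the failure path.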
|
|
||
| List<FileRange> fileRanges = createSampleNonOverlappingRanges(); | ||
| try (FSDataInputStream in = openVectorFile()){ | ||
| in.readVectored(fileRanges, getAllocate()); |
Are we not verifying the data after the vectored read?
yes, we should be. it's a lot easier to do faster vector IO if you always return the same array of empty bytes
added in validation
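As a rough illustration of the validation requested in this thread, the sketch below checks each returned buffer against the bytes of a known dataset, so a reader that hands back zero-filled buffers would fail. It is not the code added in the PR: `VectoredReadValidation`, `Range`, `rangesMatch`, `dataset` and `fakeRead` are hypothetical stand-ins (`Range` loosely models Hadoop's `FileRange` as an offset, a length and a future buffer), chosen so the example runs on the JDK alone.

```java
import java.nio.ByteBuffer;
import java.util.List;
import java.util.concurrent.CompletableFuture;

public class VectoredReadValidation {

  /** Hypothetical stand-in for a file range: offset, length and a future buffer. */
  static final class Range {
    final long offset;
    final int length;
    final CompletableFuture<ByteBuffer> data = new CompletableFuture<>();

    Range(long offset, int length) {
      this.offset = offset;
      this.length = length;
    }
  }

  /** Returns true only if every range holds exactly the source bytes at its offset. */
  static boolean rangesMatch(List<Range> ranges, byte[] source) {
    for (Range r : ranges) {
      // duplicate() leaves the position of the caller's buffer untouched
      ByteBuffer buf = r.data.join().duplicate();
      if (buf.remaining() != r.length) {
        return false;
      }
      for (int i = 0; i < r.length; i++) {
        if (buf.get() != source[(int) r.offset + i]) {
          return false;
        }
      }
    }
    return true;
  }

  /** Builds a dataset with no zero bytes, so zero-filled buffers never match. */
  static byte[] dataset(int size) {
    byte[] bytes = new byte[size];
    for (int i = 0; i < size; i++) {
      bytes[i] = (byte) ('a' + i % 26);
    }
    return bytes;
  }

  /** Simulates a correct vectored read by completing each range with its slice. */
  static void fakeRead(List<Range> ranges, byte[] source) {
    for (Range r : ranges) {
      r.data.complete(ByteBuffer.wrap(source, (int) r.offset, r.length).slice());
    }
  }
}
```

A test following this pattern would generate the dataset, write it to the file under test, issue the vectored read, and then assert that the ranges match.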
...test/java/org/apache/hadoop/fs/contract/s3a/ITestS3AContractAnalyticsStreamVectoredRead.java
786dac7 to
6f730d5
Compare
|
thanks @mukund-thakur and @steveloughran for the reviews, addressed the feedback. Will need to cut a PR to AAL for the release() changes 1/ call release on failure and 2/ add in null check so that Don't think we need to block this on the above, i will create a new PR to update once there is a new version of AAL |
|
Addressed the release() changes in: awslabs/analytics-accelerator-s3#337 |
|
🎊 +1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
8aa9a1e to
f334768
Compare
|
@steveloughran @mukund-thakur I've addressed the feedback on this, could you please review? |
|
🎊 +1 overall
This message was automatically generated. |
+1
pending moving those java. imports in ITestS3AContractAnalyticsStreamVectoredRead.java to a new block of imports above everything else.
| import org.junit.jupiter.params.ParameterizedClass; | ||
| import org.junit.jupiter.params.provider.MethodSource; | ||
|
|
||
| import java.util.List; |
nit: prefer the java.* stuff in a block just under line 20.
|
🎊 +1 overall
This message was automatically generated. |
Description of PR
Adds in support for AAL's readVectored().
How was this patch tested?
Testing in progress.