-
Notifications
You must be signed in to change notification settings - Fork 62
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add S3 upload integrity check #26
Conversation
src/test/resources/s3proxy.conf
Outdated
# authorization must be aws-v2 or none | ||
s3proxy.authorization=aws-v2 | ||
s3proxy.authorization=AWS_V2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why change this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was able to easily convert it to AuthenticationType
:
AuthenticationType.valueOf(s3ProxyProperties.getProperty(S3ProxyConstants.PROPERTY_AUTHORIZATION));
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder why this code doesn't call S3Proxy.Builder.fromProperties
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool! I'll try to refactor it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is really messy. In order to keep this PR clean and simple, I suggest to do tests refactoring separately. In this PR I'll write it a way so that s3proxy.authorization=aws-v2
remains unchanged.
if (partNumber1 == partNumber2) { | ||
return 0; | ||
} | ||
return partNumber1 > partNumber2 ? 1 : -1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Replace body with:
return Integer.compare(o1.getPartNumber(), o2.getPartNumber())
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is available since java 1.7. This project uses 1.6.
@@ -1,6 +1,6 @@ | |||
s3proxy.endpoint=https://127.0.0.1:0 | |||
s3proxy.endpoint=http://127.0.0.1:0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When I leave https
I get com.amazonaws.SdkClientException: Unable to execute HTTP request: Unrecognized SSL message, plaintext connection?
. What I am doing wrong. Even without my changes it fails with s3proxy:1.7.0.
// When S3 combines the parts of a multipart upload into the final object, the ETag value is set to the | ||
// hex-encoded MD5 hash of the concatenated binary-encoded (raw bytes) MD5 hashes of each part followed by | ||
// "-" and the number of parts. | ||
MessageDigest md = MessageDigest.getInstance("MD5"); | ||
md.reset(); | ||
for (PartETag partETag : parts) { | ||
md.update(BinaryUtils.fromHex(partETag.getETag())); | ||
} | ||
// Represent byte array as a 32-digit number hexadecimal format followed by "-<partCount>". |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you link to a source for these comments?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wish I could add a link to the official documentation. I can only link some more or less reliable sources such as:
- https://stackoverflow.com/questions/42208998/aws-s3-upload-integrity
- https://www.savjee.be/2015/10/Verifying-Amazon-S3-multi-part-uploads-with-ETag-hash/
- https://teppen.io/2018/06/23/aws_s3_etags/
- https://github.com/lambfrier/calc_s3_etag
- https://stackoverflow.com/a/19304527/794158 and https://github.com/antespi/s3md5
@@ -241,6 +247,23 @@ public StreamTransferManager partSize(long partSize) { | |||
return this; | |||
} | |||
|
|||
/** | |||
* Sets whether data integrity check should be performed during upload. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add some documentation explaining the two ways that integrity is checked, and that the final check is based on undocumented behaviour of S3 that may change, so they may get a false alarm exception. Also just note how this may lead to exceptions in general.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
I added IntegrityCheckException
to improve error handling. But I've also noticed that when a part upload fails, logs do not contain any information related. Sample logs when I deliberately make the request fail:
12:09:25.340 [main] INFO o.g.s.org.eclipse.jetty.util.log - Logging initialized @1232ms
12:09:25.384 [main] INFO o.g.s.o.eclipse.jetty.server.Server - jetty-9.2.z-SNAPSHOT
12:09:25.421 [main] INFO o.g.s.o.e.j.server.ServerConnector - Started ServerConnector@9e1cb34{HTTP/1.1}{127.0.0.1:53212}
12:09:25.422 [main] INFO o.g.s.o.eclipse.jetty.server.Server - Started @1316ms
12:09:26.365 [main] INFO a.m.s3upload.StreamTransferManager - Initiated multipart upload to s3proxy-1209536669/stuff with full ID f8d39bbe-b92e-47ab-bb16-9debbe56ec47
12:09:28.818 [pool-3-thread-2] INFO a.m.s3upload.MultiPartOutputStream - Called close() on [MultipartOutputStream for parts 5001 - 10000]
12:09:28.823 [pool-3-thread-1] INFO a.m.s3upload.MultiPartOutputStream - Called close() on [MultipartOutputStream for parts 1 - 5000]
12:09:31.401 [main] INFO o.g.s.o.e.j.server.ServerConnector - Stopped ServerConnector@9e1cb34{HTTP/1.1}{127.0.0.1:0}
java.lang.NullPointerException
at alex.mojaki.s3upload.ExecutorServiceResultsHandler$1.next(ExecutorServiceResultsHandler.java:64)
at alex.mojaki.s3upload.ExecutorServiceResultsHandler.awaitCompletion(ExecutorServiceResultsHandler.java:96)
at alex.mojaki.s3upload.StreamTransferManager.complete(StreamTransferManager.java:358)
at alex.mojaki.s3upload.test.StreamTransferManagerTest.testTransferManager(StreamTransferManagerTest.java:211)
at alex.mojaki.s3upload.test.StreamTransferManagerTest.testTransferManager(StreamTransferManagerTest.java:156)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you elaborate? What did you do in this case to make the request fail?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
.withPartSize(-1)
or .setMd5Digest("123");
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean I modify UploadPartRequest
in alex.mojaki.s3upload.StreamTransferManager#uploadStreamPart
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, I reproduced this and ensured that a message shows in d3e4e20
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool! Thanks!
Minor improvements to integrity check
Thanks @gkolakowski-ias for your PR, and @gaul for your reviewing help! I have released this in 2.2.0, eventually (it can take quite some time) it will appear on https://mvnrepository.com/artifact/com.github.alexmojaki/s3-stream-upload |
No description provided.