update docs with more s3 details
Update the docs to describe how to configure the s3 client to work
with 3rd party s3 implementations.
pwinckles committed Mar 21, 2024
1 parent 0aace04 commit 9332ace
Showing 4 changed files with 60 additions and 17 deletions.
43 changes: 38 additions & 5 deletions docs/USAGE.md
@@ -186,17 +186,24 @@ multipart uploads and downloads with the CRT client. However, you can
make multipart uploads work with the old client if it's wrapped in a
`MultipartS3AsyncClient`, but multipart downloads will still not work.

-Unfortunately, from our testing, it appears that the CRT client only
-works with the official AWS S3, and it does not work with third party
-implementations. So, if you are using a third party implementation,
-please make sure you wrap your client in a `MultipartS3AsyncClient`.
-Otherwise, you will experience performance degradation.
+Additionally, if you are using a 3rd party S3 implementation, you will
+likely need to disable [object integrity
+checks](https://docs.aws.amazon.com/AmazonS3/latest/userguide/checking-object-integrity.html)
+on the client that is used by the transfer manager. This is because
+most/all 3rd party implementations do not support it, and it causes
+the requests to fail.

If you do not specify a transfer manager when constructing the
`OcflS3Client`, then it will create the default transfer manager using
the S3 client it was provided. When you use the default transfer
manager, you need to be sure to close the `OcflRepository` when you
are done with it, otherwise the transfer manager will not be closed.
+Note that if you construct your own transfer manager, which is
+advisable so that you can configure it to your specifications, it does
+not need to use the same S3 client as the one already specified on
+`OcflS3Client` but it can. For example, maybe you only want to use the
+CRT client in the transfer manager, and you want to run everything
+else through the regular client.

If you are using the CRT client, then you need to add
`software.amazon.awssdk.crt:aws-crt` to your project, and create the
@@ -218,6 +225,32 @@ MultipartS3AsyncClient.create(

Note the use of `MultipartS3AsyncClient`. Very important!

+If you are using a 3rd party S3 implementation and need to disable the
+object integrity check, then you can do so as follows:
+
+``` java
+S3AsyncClient.crtBuilder().checksumValidationEnabled(false).build();
+```
+
+Unfortunately, this is harder to do if you use the Netty client
+wrapped in `MultipartS3AsyncClient`. As of this writing, it must be
+disabled per-request as follows:
+
+``` java
+OcflS3Client.builder()
+        .bucket(bucket)
+        .s3Client(MultipartS3AsyncClient.create(
+                S3AsyncClient.builder().build(),
+                MultipartConfiguration.builder().build()))
+        .putObjectModifier(
+                (key, builder) -> builder.overrideConfiguration(override -> override.putExecutionAttribute(
+                        AwsSignerExecutionAttribute.SERVICE_CONFIG,
+                        S3Configuration.builder()
+                                .checksumValidationEnabled(false)
+                                .build())))
+        .build();
+```
+
### Configuration

Use `OcflStorageBuilder.builder()` to create and configure an
25 changes: 15 additions & 10 deletions ocfl-java-aws/src/main/java/io/ocfl/aws/OcflS3Client.java
@@ -198,10 +198,7 @@ public Future<CloudObjectKey> uploadFileAsync(Path srcPath, String dstPath, Stri
putObjectModifier.accept(dstKey.getKey(), builder);

var upload = transferManager.uploadFile(req -> req.source(srcPath)
-.putObjectRequest(builder.bucket(bucket)
-.key(dstKey.getKey())
-.contentLength(fileSize)
-.build())
+.putObjectRequest(builder.bucket(bucket).key(dstKey.getKey()).build())
.build());

return new UploadFuture(upload, srcPath, dstKey);
@@ -622,17 +619,25 @@ public static class Builder {
/**
* The AWS SDK S3 client. Required.
* <p>
-* <b>Important:</b> You <b>MUST</b> either use the <a href="https://docs.aws.amazon.com/sdk-for-java/latest/developer-guide/crt-based-s3-client.html">CRT client</a>
-* or wrap the regular S3AsyncClient in {@link software.amazon.awssdk.services.s3.internal.multipart.MultipartS3AsyncClient}
-* in order for multipart uploads to work. Otherwise, files will be uploaded in single PUT requests.
+* If a {@link #transferManager(S3TransferManager)} is not specified, then the client specified here will be
+* used to create a default transfer manager. If you specify a transfer manager, it does not need to use the
+* same client as the one specified here. However, when creating a client to be used by the transfer manager,
+* it is important to understand the following gotchas.
* <p>
-* Additionally, only the CRT client supports multipart downloads. However, from what I've seen, the CRT client
-* <i>only</i> works with AWS, and it does <b>not</b> work with third party S3 implementations. In which case,
-* you should use the regular S3AsyncClient with the MultipartS3AsyncClient wrapper.
+* The client used by the transfer manager <b>MUST</b> either be the <a href="https://docs.aws.amazon.com/sdk-for-java/latest/developer-guide/crt-based-s3-client.html">CRT client</a>
+* or the regular S3AsyncClient wrapped in {@link software.amazon.awssdk.services.s3.internal.multipart.MultipartS3AsyncClient}
+* in order for multipart uploads to work. Otherwise, files will be uploaded in single PUT requests. Additionally,
+* only the CRT client supports multipart downloads.
+* <p>
+* If you are using a 3rd party S3 implementation, then you will likely additionally need to disable the
+* <a href="https://docs.aws.amazon.com/AmazonS3/latest/userguide/checking-object-integrity.html">object integrity check</a>
+* as most 3rd party implementations do not support it. This is easy to do on the CRT client builder by setting
+* {@code checksumValidationEnabled()} to {@code false}.
* <p>
* This client is NOT closed when the repository is closed, and the user is responsible for closing it when appropriate.
* <p>
* <pre>{@code
+* // Please refer to the official documentation to properly configure your client.
* // When using the CRT client, create it something like this:
* S3AsyncClient.crtBuilder().build();
*
5 changes: 5 additions & 0 deletions ocfl-java-itest/pom.xml
@@ -126,6 +126,11 @@
<artifactId>s3mock-junit5</artifactId>
<scope>test</scope>
</dependency>
+<dependency>
+<groupId>software.amazon.awssdk.crt</groupId>
+<artifactId>aws-crt</artifactId>
+<scope>test</scope>
+</dependency>
<dependency>
<groupId>io.micrometer</groupId>
<artifactId>micrometer-core</artifactId>
4 changes: 2 additions & 2 deletions pom.xml
@@ -340,14 +340,14 @@
<dependency>
<groupId>software.amazon.awssdk</groupId>
<artifactId>bom</artifactId>
-<version>2.24.13</version>
+<version>2.25.13</version>
<type>pom</type>
<scope>import</scope>
</dependency>
<dependency>
<groupId>software.amazon.awssdk.crt</groupId>
<artifactId>aws-crt</artifactId>
-<version>0.21.9</version>
+<version>0.29.12</version>
</dependency>

<!-- Test -->