Auto-configure S3 TransferManager. #361

Merged on May 4, 2022 (30 commits)

Conversation

krimsz
Contributor

@krimsz krimsz commented May 1, 2022

📢 Type of change

  • Bugfix
  • New feature
  • Enhancement
  • Refactoring

📜 Description

  • Adds autoconfiguration for S3TransferManager. It creates a Bean and instantiates the S3TransferManager implementation of S3OutputStream if the preview dependency is on the classpath
  • Adds a new implementation of S3OutputStream using the S3TransferManager. Since this implementation also needs a temporary file, a small refactoring is included that extracts the common behaviour into an abstract class, BaseTempFileS3OutputStream.
    • In this case I opted to use inheritance (Abstract class) over composition (having some sort of strategy for the upload action) because I felt it was a bit more straightforward.
  • Adds documentation around the new feature and how to activate it
  • Adds sample code for using the S3TransferManager
  • Adds tests (unit and integration) around the new feature
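
A minimal sketch of the inheritance approach described above, assuming a template-method design: the base class buffers writes into a temporary file and, on close, hands the file to an abstract upload step. The class name `TempFileOutputStream` and the `upload(File)` hook are hypothetical and are not taken from the PR's actual BaseTempFileS3OutputStream.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;

// Hypothetical sketch, not the PR's BaseTempFileS3OutputStream.
abstract class TempFileOutputStream extends OutputStream {

    private final File file;
    private final OutputStream delegate;
    private boolean closed;

    protected TempFileOutputStream() throws IOException {
        // All bytes are buffered on disk until the stream is closed.
        this.file = File.createTempFile("upload-sketch", ".tmp");
        this.delegate = new FileOutputStream(file);
    }

    @Override
    public void write(int b) throws IOException {
        delegate.write(b);
    }

    @Override
    public void write(byte[] b, int off, int len) throws IOException {
        delegate.write(b, off, len);
    }

    @Override
    public void close() throws IOException {
        if (closed) {
            return; // closing twice must not upload twice
        }
        closed = true;
        delegate.close();
        try {
            upload(file); // subclass decides how the file is sent
        } finally {
            Files.deleteIfExists(file.toPath());
        }
    }

    // The only step that differs between implementations.
    protected abstract void upload(File file) throws IOException;
}
```

Under this assumption, a subclass only implements `upload`, for example by calling a plain S3 client or the transfer manager, while the temp-file handling stays in one place.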

💡 Motivation and Context

It fixes an open issue

#300

💚 How did you test it?

Unit tests, integration tests and sample app using the new feature

📝 Checklist

  • I reviewed submitted code
  • I added tests to verify changes
  • I updated reference documentation to reflect the change
  • All tests passing
  • No breaking changes

🔮 Next steps

@github-actions bot added the labels "type: dependency-upgrade" (Dependency version bump) and "type: documentation" (Documentation or Samples related issue) on May 1, 2022
Member

@MatejNedic MatejNedic left a comment

Hey @krimsz,
Thanks so much for the contribution!
I left a few comments we should think about.

        AwsRegionProvider awsRegionProvider) {
    return S3TransferManager.builder()
        .s3ClientConfiguration(
            cfg -> cfg.credentialsProvider(credentialsProvider).region(awsRegionProvider.getRegion()))
Member

@MatejNedic MatejNedic May 1, 2022

Let's introduce properties so people can choose:
maxConcurrency, targetThroughputInGbps, region ...

Check https://sdk.amazonaws.com/java/api/latest/software/amazon/awssdk/transfer/s3/S3ClientConfiguration.html

Maybe we should share the S3Client region and endpoint overrides? I'm not sure there will be many use cases for having S3Client and S3TransferManager defined for different regions and endpoints. If that is the case, people can always provide their own bean.

Member

We should also think about adding Cross region support for S3TransferManager.
@maciejwalkowiak wdyt?

Contributor

Agreed on the properties; properties for things that can be set with transferConfiguration(..) should be exposed as well.

Regarding cross-region support: ideally yes, but since I am not sure at this stage whether it's easily doable, and generating these clients is a bit of a hassle, if we decide to do it I can do it myself.

Contributor Author

A couple of questions to get this right:

  1. Would it be better to have a nested hierarchy under something like spring.cloud.aws.s3.transferManager, or should I assume a flat set of properties at the current level? E.g. spring.cloud.aws.s3.minimumPartSizeInBytes vs spring.cloud.aws.s3.transferManager.minimumPartSizeInBytes. I looked at the existing properties classes in this project, but most of them are very slim. In CredentialsProperties, however, the profile is nested, so I was planning to create the hierarchy here too.
  2. For naming the properties to be exposed, I'm a fan of keeping the original property name from the library I wrap. I'm asking because we have quite lengthy properties at hand (e.g. minimumPartSizeInBytes and targetThroughputInGbps). Shall I stick to the original names, or should I create contractions/slimmer names where applicable?
  3. Because I need a @ConditionalOnClass for the new S3TransferManager, it is currently handled through a static class, but since I will need access to the properties bean injected in the main AutoConfiguration class, is it ok if I create an extra AutoConfiguration class and load it before the existing one? (so the one we currently have acts as default for missing Beans etc)

Sorry for the wall of text
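
To illustrate the idea behind the third question, here is a plain-Java sketch of choosing an implementation depending on whether an optional class is on the classpath. The thread relies on Spring's @ConditionalOnClass for this; the helper below (`ClasspathCondition`, `isPresent`, `choose`) is a hypothetical stand-in that only shows the underlying check and is not how Spring implements it.

```java
import java.util.function.Supplier;

// Hypothetical helper; Spring's @ConditionalOnClass is the real mechanism in the PR.
final class ClasspathCondition {

    private ClasspathCondition() {
    }

    // True if the named class can be located, without initializing it.
    static boolean isPresent(String className) {
        try {
            Class.forName(className, false, ClasspathCondition.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    // Picks one of two factories depending on whether the optional class exists.
    static <T> T choose(String className, Supplier<T> whenPresent, Supplier<T> fallback) {
        return isPresent(className) ? whenPresent.get() : fallback.get();
    }
}
```

For example, `ClasspathCondition.isPresent("software.amazon.awssdk.transfer.s3.S3TransferManager")` would be false in an application that does not include the preview dependency, in which case the default output stream implementation would be used.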

Contributor

  1. nested
  2. keep original
  3. 👍

Well done @krimsz!
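
A rough sketch of what the agreed "nested, original names" layout could look like as a properties holder. The class names and field types below are assumptions for illustration; only the property names maxConcurrency, targetThroughputInGbps and minimumPartSizeInBytes come from the discussion. In a Spring Boot application such a class would typically carry @ConfigurationProperties; the annotation is left out so the sketch compiles with the JDK alone.

```java
// Hypothetical holder for keys under spring.cloud.aws.s3.*
final class S3PropertiesSketch {

    // Nested group, corresponding to spring.cloud.aws.s3.transferManager.*
    private final TransferManager transferManager = new TransferManager();

    TransferManager getTransferManager() {
        return transferManager;
    }

    static final class TransferManager {

        // Names mirror the wrapped library; null means "not set" (types are assumed).
        private Integer maxConcurrency;
        private Double targetThroughputInGbps;
        private Long minimumPartSizeInBytes;

        Integer getMaxConcurrency() {
            return maxConcurrency;
        }

        void setMaxConcurrency(Integer maxConcurrency) {
            this.maxConcurrency = maxConcurrency;
        }

        Double getTargetThroughputInGbps() {
            return targetThroughputInGbps;
        }

        void setTargetThroughputInGbps(Double targetThroughputInGbps) {
            this.targetThroughputInGbps = targetThroughputInGbps;
        }

        Long getMinimumPartSizeInBytes() {
            return minimumPartSizeInBytes;
        }

        void setMinimumPartSizeInBytes(Long minimumPartSizeInBytes) {
            this.minimumPartSizeInBytes = minimumPartSizeInBytes;
        }
    }
}
```

Keeping unset values as null lets the configuration code apply only what the user specified and leave everything else to the library's defaults.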

Contributor Author

Alright, this part should be ready for review. Please note that I added an extra subpackage for the property classes, since I felt that having inner classes would make it less readable, but I can change it if needed.
On another note, I kept a single auto-configuration test for both auto-configuration classes because I felt it was clearer than separating it into two. Let me know your preference there.

@maciejwalkowiak
Contributor

I did some small polishing.

@maciejwalkowiak
Contributor

@krimsz thanks! If there's anything confusing in comments let us know. I haven't checked docs and tests too deeply but that's something I can polish once other issues are resolved.

@krimsz
Contributor Author

krimsz commented May 2, 2022

Hey, I will take a look. Just FYI, I might not be able to fix everything today or tomorrow due to some work constraints, but I'll try.

@maciejwalkowiak
Contributor

@krimsz no worries, there is no time pressure with this feature. If it becomes more urgent we will ping you here.

Contributor

@maciejwalkowiak maciejwalkowiak left a comment

@krimsz I did some small polishing and, most importantly, changed the way the client config is created so that it stays consistent with other clients, mainly in that it takes spring.cloud.aws.endpoint into account.

We need better tests for the transfer-manager-based output stream provider. Perhaps having the same set of tests as in S3ResourceIntegrationTests for all output stream providers would make sense, either through abstraction and a separate class for each type of stream provider, or through parameterized tests? Something to try out.

I believe S3UploadDirectoryProperties should be an inner static class of S3TransferManagerProperties.

Splitting auto-configurations: I am not sure here. In principle it sounds good, but on the other hand, if it's more convenient to have a single test class for both auto-configurations, maybe there should actually be one auto-configuration. Or two separate test classes. One way or the other :)

Thanks a lot for your work, it's very nice to receive such a high quality contribution!

@krimsz
Contributor Author

krimsz commented May 4, 2022

Gotcha, I already saw your changes and they indeed make sense. It is my first time developing something here, so some conventions/configurations are still unknown to me (I'm just an occasional user of this library for work).

Will take a look into the points you raised :)

@maciejwalkowiak
Contributor

@krimsz please write here when you've made all the changes you intended and I'll take another look then

@krimsz
Contributor Author

krimsz commented May 4, 2022

@maciejwalkowiak should be ready. One thing I am not fully convinced about is the parameterized tests. On the one hand, we can simply add a factory method for potential new ones and reuse the existing class, which is easy. On the other hand, JUnit forced me to annotate each method individually since I couldn't find a way to parameterize the entire class, so it's a bit cumbersome to have them all annotated. I think it could work like this, but I'm happy to switch to the abstraction route if needed.

@maciejwalkowiak
Contributor

I think I like it more with parameterized tests than with abstraction. Regarding repetition, yes, this is a drawback, but perhaps we can create an inner annotation and meta-annotate it as described here https://junit.org/junit5/docs/current/user-guide/#writing-tests-meta-annotations ?

I haven't tested it, so it may not work.

@krimsz
Contributor Author

krimsz commented May 4, 2022

Alright @maciejwalkowiak, I tried a meta-annotation with a custom value parameter through "@AliasFor" and it did not seem to work. I wanted to avoid hardcoding the MethodSource "value" (method name) in the annotation, as it's not fully explicit and will fail if someone renames the method in the class.
I also tried to re-implement the JUnit annotation MethodSource, but it depends on a package-private class, so no luck there either.

For now I used the new annotation with the hardcoded method name, plus a comment on the corresponding method that returns the list of providers.

@maciejwalkowiak
Contributor

@krimsz I've done some small polishing.

Since the TransferManagerFileController in samples shows nothing more than using S3TransferManager, it's more of an AWS sample than a Spring Cloud AWS sample. I think we can just remove the transfer manager from the samples completely. What do you think @krimsz @MatejNedic

@MatejNedic
Member

I agree, let's not have it in the sample since AWS has it covered on their website.

@maciejwalkowiak added this to the 3.0.0 M1 milestone on May 4, 2022
@maciejwalkowiak added the labels "component: s3" (S3 integration related issue) and "type: feature" (Integration with a new AWS service or bigger change in existing integration), and removed the labels "type: documentation" (Documentation or Samples related issue) and "type: dependency-upgrade" (Dependency version bump) on May 4, 2022
@sonarqubecloud

sonarqubecloud bot commented May 4, 2022

SonarCloud Quality Gate failed.

Bugs: 1 (rating B)
Vulnerabilities: 0 (rating A)
Security Hotspots: 2 (rating E)
Code Smells: 2 (rating A)

Coverage: no coverage information
Duplication: 0.0%

@maciejwalkowiak maciejwalkowiak merged commit 263211e into awspring:main May 4, 2022
@maciejwalkowiak
Contributor

Big thanks @krimsz! It's been a pleasure to work with you on this PR. If there's anything more you would like to contribute, feel free to drop a comment in an issue (or create a new issue if you have ideas).

@krimsz
Contributor Author

krimsz commented May 4, 2022

Likewise :) If I see something around, I might be able to help again :)

@maciejwalkowiak changed the title from "S3 transfermanager preview" to "Auto-configure S3 TransferManager." on May 5, 2022