feat: project setup and upload file #1712

base: s3-tm-main
Conversation
Nice start
```kotlin
public enum class AwsBusinessMetric(public override val identifier: String) : BusinessMetric {
    S3_EXPRESS_BUCKET("J"),
    DDB_MAPPER("d"),
    S3_TRANSFER("G"),
```
There are two new features, S3_TRANSFER_UPLOAD_DIRECTORY and S3_TRANSFER_DOWNLOAD_DIRECTORY, that we should add here.
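Illustratively, the additions might look like this (the single-character identifier values below are placeholders, not the real metric IDs, which come from the business-metrics spec):

```kotlin
S3_TRANSFER_UPLOAD_DIRECTORY("?"),   // placeholder identifier
S3_TRANSFER_DOWNLOAD_DIRECTORY("?"), // placeholder identifier
```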
```diff
 subprojects {
     group = "aws.sdk.kotlin"
-    version = hllPreviewVersion
+    version = if (name == "s3-transfer-manager") sdkVersion else hllPreviewVersion
```
You can override the version in the subproject rather than hacking it in here
Or better, override / set hllPreviewVersion in the submodules which need it, and use sdkVersion as the default
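A minimal sketch of that approach, assuming `sdkVersion` and `hllPreviewVersion` are already visible to the build scripts (the module paths are hypothetical):

```kotlin
// hll/build.gradle.kts: default every subproject to the SDK version
subprojects {
    group = "aws.sdk.kotlin"
    version = sdkVersion
}

// build.gradle.kts of a module still in preview: override the default
version = hllPreviewVersion
```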
```kotlin
description = "S3 Transfer Manager for the AWS SDK for Kotlin"
extra["displayName"] = "AWS :: SDK :: Kotlin :: HLL :: S3TransferManager"
```
You can use spaces in the display name: S3 Transfer Manager.
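That is, something like:

```kotlin
extra["displayName"] = "AWS :: SDK :: Kotlin :: HLL :: S3 Transfer Manager"
```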
```kotlin
commonMain {
    dependencies {
        implementation(project(":aws-runtime:aws-http"))
        implementation(libs.s3)
```
Instead of depending on an already-published version of S3, we should require S3 to be bootstrapped locally for S3TM to build. Like how we did it for DDB Mapper
This was an intentional choice to help prevent changes to S3 from breaking the S3 TM. We don't need to have the latest APIs from S3.
I don't think it would happen often, but I think either approach is fine. The current one is just slightly safer.
Doesn't that mean that S3TM will always depend on the n-1 S3 client? So S3TM-1.2.3 uses S3-1.2.2? That doesn't seem right.
We want to catch changes to S3 which break S3TM and fix them as soon as possible. All of our modules form a comprehensive platform and should use the same, unified set of dependency versions.
```kotlin
S3TransferManager.Companion {
    client = s3Client
}.uploadFile {
    bucket = "aoperez"
```
These tests should either be made generic or not committed yet
Oh I see they have an @Ignore annotation, that seems fine too
```kotlin
/**
 * Builds a low-level S3 upload part request from a high-level upload file request
 * and data from the S3 Transfer Manager.
 */
internal fun buildUploadPartRequest(
```
suggestion: This and the function below might be better as extension functions
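For example, a hypothetical extension-function shape (the signature here is invented for illustration, not the actual one from the PR):

```kotlin
// Sketch: express the conversion as an extension on the high-level request
internal fun UploadFileRequest.toUploadPartRequest(
    uploadId: String,
    partNumber: Int,
    body: ByteStream,
): UploadPartRequest = UploadPartRequest {
    bucket = this@toUploadPartRequest.bucket
    key = this@toUploadPartRequest.key
    this.uploadId = uploadId
    this.partNumber = partNumber
    this.body = body
}
```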
```kotlin
/**
 * An interceptor that emits the S3 Transfer Manager business metric
 */
internal object BusinessMetricInterceptor : HttpInterceptor {
```
naming: add S3 / S3 Transfer Manager somewhere in the name for better discoverability in IDE search
```kotlin
internal fun build(): S3TransferManager =
    S3TransferManager(
        client = client?.withConfig { interceptors += BusinessMetricInterceptor } ?: error("client must be set"),
```
correctness: the underlying client will emit a business metric for every request, whether the S3TM was used or not. Is that what we want?
withConfig makes a copy of the client with some overrides as far as I understand
Oh right
```kotlin
public suspend fun uploadFile(uploadFileRequest: UploadFileRequest): Deferred<UploadFileResponse> = coroutineScope {
    val multiPartUpload = uploadFileRequest.contentLength >= multipartUploadThreshold
    val uploadedParts = mutableListOf<CompletedPart>()
    var mpuUploadId = "null"
```
this feels better as a lateinit var
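i.e.:

```kotlin
lateinit var mpuUploadId: String // fails fast if read before assignment, instead of a sentinel "null" string
```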
```kotlin
operationHook(TransferInitiated) {
    if (multiPartUpload) {
        context.response = client.createMultipartUpload(context.request as CreateMultipartUploadRequest)
        mpuUploadId = (context.response as CreateMultipartUploadResponse).uploadId ?: throw Exception("Missing upload id in create multipart upload response")
```
Let's throw a better Exception here and elsewhere
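For example (a sketch; the exact exception type and message are open):

```kotlin
mpuUploadId = (context.response as CreateMultipartUploadResponse).uploadId
    ?: throw IllegalStateException("CreateMultipartUploadResponse is missing an uploadId")
```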
This isn't a scenario the user can cause themselves, is it? This could only happen because of a bug in our S3TM implementation it seems. If so, I'm generally fine with non-specific errors since we cannot expect users to catch them and act on them.
```kotlin
public val checksumCrc32: String?,
public val checksumCrc32C: String?,
public val checksumCrc64Nvme: String?,
public val checksumSha1: String?,
```
Correctness: The builder shouldn't be passing all of its values to the class's constructor. The constructor should accept a builder object and set its own fields that way. This results in less boilerplate:

```kotlin
public class UploadFileRequest(builder: Builder) {
    public val acl = builder.acl
    public val body = builder.body
    public val bucket = builder.bucket
    // ...etc...

    public companion object {
        public operator fun invoke(block: Builder.() -> Unit): UploadFileRequest =
            Builder().apply(block).build()
    }

    public class Builder {
        public var acl: ObjectCannedAcl? = null
        public var body: ByteStream? = null
        public var bucket: String? = null
        // ...etc...

        internal fun build(): UploadFileRequest = UploadFileRequest(this)
    }
}
```

We follow this pattern in our codegenned shapes and many of our handwritten ones.
Comment applies to other types in this PR.
```kotlin
public var tagging: String? = null
public var websiteRedirectLocation: String? = null

public fun build(): UploadFileRequest =
```
Nit: Generally our build methods are internal because we don't want users to think they have to invoke it manually inside of a DSL block:
```kotlin
val req = UploadFileRequest {
    bucket = "foo"
    key = "bar"
    build() // <-- this is unnecessary
}
```

```kotlin
import aws.smithy.kotlin.runtime.content.ByteStream
import aws.smithy.kotlin.runtime.time.Instant

public class UploadFileRequest private constructor(
```
Question: These classes, their builders, their converters, etc. all seem like a lot of repeated boilerplate. Why aren't we code-generating these based on the list of fields from the spec?
```kotlin
/**
 * Determines the actual part size to use for a multipart S3 upload.
 *
 * This function calculates the part size based on the total size
 * of the file and the requested part size. If the requested part size is
 * too small to allow the upload to fit within S3's 10,000-part limit, the
 * part size will be automatically increased so that exactly 10,000 parts
 * are uploaded.
 */
internal fun resolvePartSize(uploadFileRequest: UploadFileRequest, tm: S3TransferManager, logger: Logger): Long {
```
Nit: This function doesn't feel like it belongs in this file. Nothing else in UploadFile.kt uses it, only S3TransferManager.kt.
```kotlin
val targetNumberOfParts = uploadFileRequest.contentLength / tm.targePartSize
return if (targetNumberOfParts > MAX_NUMBER_PARTS) {
    ceilDiv(uploadFileRequest.contentLength, MAX_NUMBER_PARTS).also {
        logger.debug { "Target part size is too small to meet the 10,000 S3 part limit. Increasing part size to $it" }
```
Nit: 10000 is already a constant in this file. Reuse that constant in this string rather than duplicating it. This'll reduce the likelihood that the values get out of sync.
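i.e., something like:

```kotlin
logger.debug { "Target part size is too small to meet the $MAX_NUMBER_PARTS S3 part limit. Increasing part size to $it" }
```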
```kotlin
/**
 * High level utility for managing transfers to Amazon S3.
 */
public class S3TransferManager private constructor(
```
Style: This class has too much logic inside of it and that won't be sustainable once we add more operations and variants. We should refactor this, possibly so that individual operations are located in a single file and maybe even represented by a class.
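One possible shape, purely as a sketch (the class and file names are invented here):

```kotlin
// Sketch: one class per operation, each in its own file (e.g. UploadFileOperation.kt);
// S3TransferManager would delegate to it
internal class UploadFileOperation(private val client: S3Client) {
    suspend fun execute(request: UploadFileRequest): UploadFileResponse {
        TODO("single-part vs. multipart upload logic moves here")
    }
}
```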
```kotlin
 * all parts to fit within S3's limit of 10,000 parts, the part size will be
 * automatically increased so that exactly 10,000 parts are uploaded.
 */
public suspend fun uploadFile(uploadFileRequest: UploadFileRequest): Deferred<UploadFileResponse> = coroutineScope {
```
Style: Method is too long and should be refactored.
Correctness: This method shouldn't return Deferred<UploadFileResponse>, it should return UploadFileResponse just like our low-level APIs do. If users want this to happen asynchronously, they can wrap it in async { } themselves.
Note that coroutineScope does not return until all the work inside is completed (including child coroutines) so the Deferred value would already be filled anyway.
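A sketch of the caller side under that change, assuming `uploadFile` returns `UploadFileResponse` directly (`transferManager` and `request` are placeholder names):

```kotlin
// Callers who want asynchrony can opt in themselves
coroutineScope {
    val deferred = async { transferManager.uploadFile(request) }
    // ... do other work concurrently ...
    val response = deferred.await()
}
```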
```kotlin
public interface TransferInterceptor {
    // Transfer initialization hooks
    public fun readBeforeTransferInitiated(context: TransferContext) {}
    public fun modifyBeforeTransferInitiated(context: TransferContext): TransferContext = context
    public fun readAfterTransferInitiated(context: TransferContext) {}
    public fun modifyAfterTransferInitiated(context: TransferContext): TransferContext = context

    // Byte transferring hooks
    public fun readBeforeBytesTransferred(context: TransferContext) {}
    public fun modifyBeforeBytesTransferred(context: TransferContext): TransferContext = context
    public fun readAfterBytesTransferred(context: TransferContext) {}
    public fun modifyAfterBytesTransferred(context: TransferContext): TransferContext = context

    // File transfer hooks
    public fun readBeforeFileTransferred(context: TransferContext) {}
    public fun modifyBeforeFileTransferred(context: TransferContext): TransferContext = context
    public fun readAfterFileTransferred(context: TransferContext) {}
    public fun modifyAfterFileTransferred(context: TransferContext): TransferContext = context

    // Transfer completion hooks
    public fun readBeforeTransferCompleted(context: TransferContext) {}
    public fun modifyBeforeTransferCompleted(context: TransferContext): TransferContext = context
    public fun readAfterTransferCompleted(context: TransferContext) {}
    public fun modifyAfterTransferCompleted(context: TransferContext): TransferContext = context
}
```
Question: Why do all of these hooks have the same TransferContext parameter? Our other interceptors have unique context parameters per hook because we gradually accumulate more context over the lifetime of the request. For instance, readBeforeFileTransferred can never have an upload ID but readAfterFileTransferred can.
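For illustration only, per-hook contexts might accumulate state like this (the interface and property names are invented for the sketch):

```kotlin
// Sketch: later hooks expose more state than earlier ones
public interface BeforeFileTransferredContext {
    public val request: UploadFileRequest
}

public interface AfterFileTransferredContext : BeforeFileTransferredContext {
    public val uploadId: String // only available once the multipart upload exists
}
```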
```kotlin
jvmTest {
    dependencies {
        implementation(libs.smithy.kotlin.test.jvm)
        implementation(libs.smithy.kotlin.testing.jvm)
        implementation(libs.smithy.kotlin.aws.signing.common)
    }
}
```
Correctness: We should not have JVM-only tests in most cases. S3TM must work across all targets and our test dependencies should be available on all targets.
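If the tests and their dependencies are multiplatform-ready, the block might move to `commonTest` (the catalog aliases below are assumptions, not verified entries):

```kotlin
commonTest {
    dependencies {
        implementation(libs.kotlin.test)           // assumed alias for kotlin-test
        implementation(libs.smithy.kotlin.testing) // assumed common (non -jvm) variant
    }
}
```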
Issue #
N/A
Description of changes
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.