Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[aws-cpp-sdk-s3-crt]: TransferManager for S3CrtClient #2380

Open
wants to merge 9 commits into
base: main
Choose a base branch
from

Conversation

grrtrr
Copy link
Contributor

@grrtrr grrtrr commented Mar 7, 2023

This is a self-contained implementation of the TransferManager for s3-crt, taken from working production code.
It solves the problem of handling bulk transfers in parallel.
The API looks mostly similar (hence keeping the copyright note), but there are several differences.

Main differences

The existing TransferManager uses thread pools and thread management, which are not required for s3-crt.

1. Catch local write errors

In order to avoid corruption (e.g. due to a full filesystem), it was important for us to also monitor local write errors, and declare a transfer as failed when file data could not be written out.

This is implemented via an error callback which affects the state of the transfer. It also necessitates an intermediate FAILING state, to properly handle state transition when the final .shutdown_callback is called.

2. Atomic file creation

It was important for us that files are created in a transaction-like fashion: a destination file should only exist if the transfer and the local writes succeed.

To accomplish this, data is written to a "partial" file with a suffix (<fileName>.partial-xxx), and renamed into <fileName> on successful completion. On failure, the "partial" file is removed by the destructor.

3. Cancel-on-first-failure policy

This is similar to (2). Bulk-transfer files are often related so that users are likely not interested to obtain a subset if one or more transfers fail.

Implemented by adding an optional "cancel on first failure" policy.

4. Support for S3 storage tags and object metadata

Implemented by adding these to the corresponding requests.

Not covered by this PR

  1. This is from a Linux system, Windows support will need a bit more work.
  2. We added some unit tests, but left converting out end-to-end or integration tests for now.
  3. Metadata key/value pairs need to be RFC 2047 - encoded. We left out the library to do this.
  4. Instead of Bucket/Key, the TransferManager uses s3://<Bucket>/<Key>. This could be changed easily.

AWS Checklist

Check all that applies:

  • Did a review by yourself.
  • Added proper tests to cover this PR. (If tests are not applicable, explain.) - See notes above.
  • Checked if this PR is a breaking (APIs have been changed) change.
  • Checked if this PR will not introduce cross-platform inconsistent behavior.
  • Checked if this PR would require a ReadMe/Wiki update.

Check which platforms you have built SDK on to verify the correctness of this PR.

  • Linux
  • Windows
  • Android
  • MacOS
  • IOS
  • Other Platforms

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@jmklix jmklix added the needs-review This issue or pull request needs review from a core team member. label Mar 24, 2023
@jmklix jmklix removed the needs-review This issue or pull request needs review from a core team member. label Aug 9, 2024
@jmklix
Copy link
Member

jmklix commented Aug 9, 2024

#1888

@grrtrr
Copy link
Contributor Author

grrtrr commented Aug 9, 2024

#1888

@jmklix - is that the right issue, it does not refer to TransferManager?

@jmklix
Copy link
Member

jmklix commented Aug 10, 2024

@grrtrr yes, thats the correct issue. One of the changes we plan on including in the CMake rework is to allow the choice to pick between s3 and s3crt based transfermanager

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants