S3 multipart upload with `Length::UpTo` appears to be broken #821
Comments
Hey @jdisanti, I'm new to Rust but looking to contribute to a project and increase my Rust knowledge. This seems like a good issue to start with; do you think I can work on this? Thanks!
@RAJAGOPALAN-GANGADHARAN, you can have a go at this, but we may be slow to review your changes because we're quite busy. A good place to start would be writing a failing integration test and then updating …
Hi @Velfi, thanks for responding. So I have set up the project and made a sample unit test to reproduce the issue. As part of this I want to set up a faster debug-run loop. Currently I do:

This seems to rebuild everything: when I make a change to the test in s3, I run assemble, and my assumption is that while building s3 it also has to rebuild sts etc., which triggers a rebuild of every other component while running tests. I would like to run tests for s3 specifically. How can I achieve that? Thanks!
You may not need to be including …
You need `./gradlew :aws:sdk:assemble -P aws.services=+sts,+sso,+s3,+ssooidc && (cd ../sdk-snapshot/sdk/s3 && cargo check --all-features --all-targets)`. I have …
Ah, I was trying them both individually but not together, as the names were slightly confusing. But yeah, let me try this. Thanks Velfi and rcoh, I will try them out.
When I do

```
cargo test --all-features --package aws-sdk-s3
```

I get:

```
   Compiling aws-smithy-async v0.56.1 (/home/raj/OpenSource/smithy-rs/aws/sdk/build/aws-sdk/sdk/aws-smithy-async)
   Compiling aws-types v0.56.1 (/home/raj/OpenSource/smithy-rs/aws/sdk/build/aws-sdk/sdk/aws-types)
   Compiling aws-smithy-types v0.56.1 (/home/raj/OpenSource/smithy-rs/aws/sdk/build/aws-sdk/sdk/aws-smithy-types)
   Compiling aws-smithy-xml v0.56.1 (/home/raj/OpenSource/smithy-rs/aws/sdk/build/aws-sdk/sdk/aws-smithy-xml)
   Compiling aws-smithy-runtime-api v0.56.1 (/home/raj/OpenSource/smithy-rs/aws/sdk/build/aws-sdk/sdk/aws-smithy-runtime-api)
   Compiling aws-smithy-eventstream v0.56.1 (/home/raj/OpenSource/smithy-rs/aws/sdk/build/aws-sdk/sdk/aws-smithy-eventstream)
   Compiling aws-smithy-json v0.56.1 (/home/raj/OpenSource/smithy-rs/aws/sdk/build/aws-sdk/sdk/aws-smithy-json)
   Compiling aws-smithy-query v0.56.1 (/home/raj/OpenSource/smithy-rs/aws/sdk/build/aws-sdk/sdk/aws-smithy-query)
   Compiling aws-smithy-http v0.56.1 (/home/raj/OpenSource/smithy-rs/aws/sdk/build/aws-sdk/sdk/aws-smithy-http)
   Compiling aws-credential-types v0.56.1 (/home/raj/OpenSource/smithy-rs/aws/sdk/build/aws-sdk/sdk/aws-credential-types)
   Compiling aws-smithy-protocol-test v0.56.1 (/home/raj/OpenSource/smithy-rs/aws/sdk/build/aws-sdk/sdk/aws-smithy-protocol-test)
   Compiling aws-sigv4 v0.56.1 (/home/raj/OpenSource/smithy-rs/aws/sdk/build/aws-sdk/sdk/aws-sigv4)
   Compiling aws-smithy-checksums v0.56.1 (/home/raj/OpenSource/smithy-rs/aws/sdk/build/aws-sdk/sdk/aws-smithy-checksums)
   Compiling aws-http v0.56.1 (/home/raj/OpenSource/smithy-rs/aws/sdk/build/aws-sdk/sdk/aws-http)
   Compiling aws-smithy-runtime v0.56.1 (/home/raj/OpenSource/smithy-rs/aws/sdk/build/aws-sdk/sdk/aws-smithy-runtime)
   Compiling aws-runtime v0.56.1 (/home/raj/OpenSource/smithy-rs/aws/sdk/build/aws-sdk/sdk/aws-runtime)
   Compiling aws-sdk-sso v0.0.0-local (/home/raj/OpenSource/smithy-rs/aws/sdk/build/aws-sdk/sdk/sso)
   Compiling aws-sdk-ssooidc v0.0.0-local (/home/raj/OpenSource/smithy-rs/aws/sdk/build/aws-sdk/sdk/ssooidc)
   Compiling aws-sdk-sts v0.0.0-local (/home/raj/OpenSource/smithy-rs/aws/sdk/build/aws-sdk/sdk/sts)
   Compiling aws-sdk-s3 v0.0.0-local (/home/raj/OpenSource/smithy-rs/aws/sdk/build/aws-sdk/sdk/s3)
   Compiling aws-config v0.56.1 (/home/raj/OpenSource/smithy-rs/aws/sdk/build/aws-sdk/sdk/aws-config)
error[E0425]: cannot find function `join_all` in module `future`
   --> sdk/s3/tests/concurrency.rs:221:31
    |
221 |     let res: Vec<_> = future::join_all(futures).await;
    |                               ^^^^^^^^ not found in `future`

For more information about this error, try `rustc --explain E0425`.
error: could not compile `aws-sdk-s3` (test "concurrency") due to previous error
warning: build failed, waiting for other jobs to finish...
```

I think some test-specific dependency is missing for s3, which gets resolved when building at the workspace level. However, I realized that to get the tests running I can just modify the tests directly in the build directory and let the incremental build take care of this. Combined with running specific tests, my reproduction setup is now faster: it recompiles only my changed file. This might not work when we want to regenerate the SDK (maybe some sort of selective copying into the build directory would work there).
This is fixed in the latest version
I tried to reproduce the initial bug report with the following snippet that I stitched together from various AWS examples:

```rust
use aws_sdk_s3::config::Credentials;
use aws_sdk_s3::operation::create_multipart_upload::CreateMultipartUploadOutput;
use aws_sdk_s3::types::CompletedPart;
use aws_sdk_s3::Config;
use aws_sdk_s3::{config::Region, Client as S3Client};
use aws_smithy_types::byte_stream::{ByteStream, Length};
use rand::distributions::Alphanumeric;
use rand::{thread_rng, Rng};
use std::fs::File;
use std::io::prelude::*;
use std::path::Path;

const CHUNK_SIZE: u64 = 1024 * 1024 * 5;
const MAX_CHUNKS: u64 = 10000;

#[tokio::main]
pub async fn main() {
    let access_key_id = "..."; // E.g. "XXXXXXXXXXXXXXXXXXXX"
    let secret_access_key = "..."; // E.g. "XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"
    let region = "..."; // E.g. "us-east-1"
    let bucket_name = "..."; // E.g. "my-bucket"
    let endpoint_url = "..."; // E.g. &format!("https://{bucket_name}.s3.{region}.amazonaws.com");
    let key = "..."; // E.g. "my-key"

    let client = S3Client::from_conf(
        Config::builder()
            .behavior_version_latest()
            .endpoint_url(endpoint_url)
            .region(Region::from_static(region))
            .force_path_style(true)
            .credentials_provider(Credentials::new(
                access_key_id,
                secret_access_key,
                None,
                None,
                "",
            ))
            .build(),
    );

    let multipart_upload_res: CreateMultipartUploadOutput = client
        .create_multipart_upload()
        .bucket(bucket_name)
        .key(key)
        .send()
        .await
        .unwrap();
    let upload_id = multipart_upload_res.upload_id().unwrap();
    println!("upload_id: {upload_id}");

    let mut file = File::create(key).expect("Could not create sample file.");
    while file.metadata().unwrap().len() <= CHUNK_SIZE * 4 {
        let rand_string: String = thread_rng()
            .sample_iter(&Alphanumeric)
            .take(256)
            .map(char::from)
            .collect();
        let return_string: String = "\n".to_string();
        file.write_all(rand_string.as_ref())
            .expect("Error writing to file.");
        file.write_all(return_string.as_ref())
            .expect("Error writing to file.");
    }
    println!("File created");

    let path = Path::new(&key);
    let file_size = tokio::fs::metadata(path).await.unwrap().len();
    let chunk_count = (file_size / CHUNK_SIZE) + 1;
    if file_size == 0 {
        panic!("Bad file size.");
    }
    if chunk_count > MAX_CHUNKS {
        panic!("Too many chunks! Try increasing your chunk size.")
    }

    let mut upload_parts: Vec<CompletedPart> = Vec::new();
    for chunk_index in 0..chunk_count {
        let stream = ByteStream::read_from()
            .path(path)
            .offset(chunk_index * CHUNK_SIZE)
            .length(Length::UpTo(CHUNK_SIZE))
            .build()
            .await
            .unwrap();
        let part_number = (chunk_index as i32) + 1;
        let upload_part_res = client
            .upload_part()
            .key(key)
            .bucket(bucket_name)
            .upload_id(upload_id)
            .body(stream)
            .part_number(part_number)
            .send()
            .await
            .unwrap();
        let e_tag = upload_part_res.e_tag.unwrap_or_default();
        println!("part_number: {part_number}, e_tag: {e_tag}");
        upload_parts.push(
            CompletedPart::builder()
                .e_tag(e_tag)
                .part_number(part_number)
                .build(),
        );
    }
}
```

When running it, I get essentially the same error:
@rcoh could you point me to a release where this is fixed, or how to correct my example?
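In the meantime, a workaround that sidesteps the bug is to advertise the exact number of bytes each part will carry rather than an upper bound. Below is a minimal sketch of the loop body, reusing `CHUNK_SIZE`, `file_size`, `path`, and `chunk_index` from the snippet above; the clamping logic is an illustration, not something the SDK requires:

```rust
// Clamp the final chunk to the bytes actually remaining in the file, then
// declare the length exactly so the advertised Content-Length matches the body.
let offset = chunk_index * CHUNK_SIZE;
let this_chunk_size = CHUNK_SIZE.min(file_size - offset);
let stream = ByteStream::read_from()
    .path(path)
    .offset(offset)
    .length(Length::Exact(this_chunk_size))
    .build()
    .await
    .unwrap();
```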
Ah, my comment referred to only the …
…le file length (#3797)

## Motivation and Context
awslabs/aws-sdk-rust#821

## Description
This PR addresses an issue with the [Length::UpTo](https://docs.rs/aws-smithy-types/1.2.2/aws_smithy_types/byte_stream/enum.Length.html) usage in `FsBuilder`. Previously, if a value specified for `UpTo` exceeded the remaining file length, `FsBuilder` would incorrectly accept the value. This discrepancy led to failures in subsequent request dispatches, as the actual body size did not match the advertised `Content-Length`, as explained [here](awslabs/aws-sdk-rust#821 (comment)) (thank you @pablosichert for a self-contained reproducer and problem analysis!).

## Testing
- Added a unit test for `FsBuilder` verifying the `Length::UpTo` usage
- Successfully ran a customer-provided [reproducer](awslabs/aws-sdk-rust#821 (comment)) with the code changes in this PR (with an added call to `complete_multipart_upload` at the end, it also succeeded in uploading the object):

```
upload_id: cTDSngbubD25cOoFCNgjpG55o0hAMQNjO16dNFyNTKjg9PEtkcrKG5rTGzBns7CXoO8T.Qm9GpNj6jgwJTKcXDpsca95wSMWMDfPF0DBhmbk3OAGHuuGM1E70spk2suW
File created
part_number: 1, e_tag: "5648ddf58c7c90a788d7f16717a61b08"
part_number: 2, e_tag: "a6bdad6d65d18d842ef1d57ca4673bc3"
part_number: 3, e_tag: "f518f6b19b255ec49b61d511288554fc"
part_number: 4, e_tag: "1496524801eb1d0a7cfbe608eb037d9c"
part_number: 5, e_tag: "21340de04927ce1bed58ad0375c03e01"
```

## Checklist
- [x] For changes to the smithy-rs codegen or runtime crates, I have created a changelog entry Markdown file in the `.changelog` directory, specifying "client," "server," or both in the `applies_to` key.
- [x] For changes to the AWS SDK, generated SDK code, or SDK runtime crates, I have created a changelog entry Markdown file in the `.changelog` directory, specifying "aws-sdk-rust" in the `applies_to` key.

----
_By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice._
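Conceptually, the fix means the length advertised for an `UpTo` request can never exceed the bytes remaining in the file after the offset. Here is a rough sketch of that invariant with a hypothetical helper, not the actual `FsBuilder` internals:

```rust
// Hypothetical helper illustrating the corrected invariant; not FsBuilder's code.
fn advertised_len(file_len: u64, offset: u64, up_to: u64) -> u64 {
    let remaining = file_len.saturating_sub(offset);
    // The advertised Content-Length is capped at what the body can deliver.
    up_to.min(remaining)
}

fn main() {
    // Numbers from this issue: a 12345-byte file read from offset 12000 with
    // UpTo(1000) should advertise 345 bytes, not 1000.
    assert_eq!(advertised_len(12345, 12000, 1000), 345);
}
```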
Comments on closed issues are hard for our team to see.
Describe the bug
When doing a multipart upload to S3, I should be able to use ByteStream's `Length::UpTo`, but that fails on the last chunk.

Expected Behavior

It should upload the last part of the file even though it's not a full-sized part.
Current Behavior
In my test, I was uploading a file with 12345 bytes in parts of 1000 bytes, so the last part should only be 345 bytes long. The 655 in the error below is 1000 - 345:
Note: The minimum part size is 5 MiB, so my example part sizes above won't succeed in the CompleteMultipartUpload call, but the individual UploadPart calls should succeed.
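As a worked check of those numbers (the snippet below is just arithmetic, not SDK code):

```rust
fn main() {
    let file_size: u64 = 12345;
    let part_size: u64 = 1000;
    let full_parts = file_size / part_size;             // 12 full parts
    let last_part = file_size - full_parts * part_size; // 345 bytes left over
    // With Length::UpTo(1000), the last part advertised a Content-Length of
    // 1000 but the body carried only 345 bytes, leaving it 655 bytes short.
    assert_eq!(part_size - last_part, 655);
}
```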
Reproduction Steps
Possible Solution
No response
Additional Information/Context
No response
Version
Environment details (OS name and version, etc.)
macOS Ventura
Logs
No response