Multipart etag issues with filesystem and transient storage providers #338

Closed
bsfarrell opened this issue Oct 29, 2020 · 4 comments
Labels
jclouds Requires jclouds changes

Comments

@bsfarrell

I'm using S3ProxyRule in a JUnit test (it's really nice!) and I noticed that if I perform a multipart upload and then read back the resulting object metadata, the etag from the HEAD request doesn't match the etag from the multipart complete request.

Here's an example testcase:

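import java.io.ByteArrayInputStream;
import java.util.Collections;

import com.amazonaws.auth.AWSStaticCredentialsProvider;
import com.amazonaws.auth.BasicAWSCredentials;
import com.amazonaws.client.builder.AwsClientBuilder;
import com.amazonaws.regions.Regions;
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3ClientBuilder;
import com.amazonaws.services.s3.model.*;
import org.gaul.s3proxy.junit.S3ProxyRule;
import org.junit.Assert;
import org.junit.Rule;
import org.junit.Test;

public class S3ProxyTest {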
  @Rule
  public S3ProxyRule s3Proxy = S3ProxyRule.builder()
      .withCredentials("access", "secret")
      // .withBlobStoreProvider("transient")
      .build();

  @Test
  public void test() {
    final AmazonS3 s3Client = AmazonS3ClientBuilder
        .standard()
        .withCredentials(
            new AWSStaticCredentialsProvider(
                new BasicAWSCredentials(this.s3Proxy.getAccessKey(), this.s3Proxy.getSecretKey())))
        .withEndpointConfiguration(
            new AwsClientBuilder.EndpointConfiguration(this.s3Proxy.getUri().toString(),
                Regions.US_EAST_1.getName()))
        .build();
    s3Client.createBucket("test-bucket");

    final InitiateMultipartUploadResult initiateResult = s3Client
        .initiateMultipartUpload(new InitiateMultipartUploadRequest("test-bucket", "test_object"));
    final UploadPartResult uploadPartResult = s3Client.uploadPart(new UploadPartRequest()
        .withBucketName("test-bucket")
        .withKey("test_object")
        .withUploadId(initiateResult.getUploadId())
        .withPartNumber(1)
        .withInputStream(new ByteArrayInputStream("test".getBytes())));

    final CompleteMultipartUploadResult completeResult = s3Client.completeMultipartUpload(
        new CompleteMultipartUploadRequest(
            "test-bucket",
            "test_object",
            initiateResult.getUploadId(),
            Collections.singletonList(uploadPartResult.getPartETag())));

    final String eTag = completeResult.getETag();

    final ObjectMetadata metadata = s3Client.getObjectMetadata("test-bucket", "test_object");
    Assert.assertEquals(eTag, metadata.getETag());
  }
}

This fails with

org.junit.ComparisonFailure: 
Expected :59adb24ef3cdbe0297f05b395827453f-1
Actual   :d41d8cd98f00b204e9800998ecf8427e

with both the filesystem and transient providers.

I'm running this on a Mac, and it sounds like the lack of extended attributes support might be the cause of the etag mismatch for the filesystem provider. So if you want to say that's working as designed, that's fair. But the transient provider would work just fine for my purposes. Is there any chance it can be enhanced to save the multipart etag rather than recalculating it (incorrectly) on read? Thanks!
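
For readers comparing the two values in the failure above: S3-style multipart ETags are commonly observed to be the MD5 of the concatenated binary MD5 digests of the parts, followed by `-` and the part count, whereas a single-part ETag is typically the plain MD5 of the content. The sketch below illustrates that convention in Java. The class and method names are hypothetical, and the convention itself is an assumption based on widely observed behavior, not something stated in this thread.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.List;

/** Hypothetical helper illustrating the commonly observed multipart ETag form. */
final class MultipartETagSketch {
    private MultipartETagSketch() {}

    /** Creates a fresh MD5 digest; MD5 is a standard JDK algorithm. */
    private static MessageDigest newMd5() {
        try {
            return MessageDigest.getInstance("MD5");
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Returns the raw 16-byte MD5 digest of the given bytes. */
    static byte[] md5(byte[] data) {
        return newMd5().digest(data);
    }

    /** Lower-case hex encoding of a byte array. */
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    /**
     * Assumed convention: MD5 over the concatenated binary part digests,
     * then "-" and the number of parts.
     */
    static String multipartETag(List<byte[]> parts) {
        MessageDigest outer = newMd5();
        for (byte[] part : parts) {
            outer.update(md5(part));
        }
        return hex(outer.digest()) + "-" + parts.size();
    }
}
```

Under this convention, recomputing a plain MD5 over the combined object cannot reproduce the `-N` form, so a provider would need to keep the value produced at completion time in order to return it on a later HEAD request.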

@gaul
Owner

gaul commented Nov 15, 2020

The underlying Apache jclouds LocalBlobStore converts multi-part uploads into single-part uploads in completeMultipartUpload. This allows filesystem access to read large files uploaded via S3Proxy but does not give the expected behavior for tests. I think the best we can do here is add a configuration flag to jclouds. Would you like to submit a pull request to do this?

@gaul
Owner

gaul commented Jun 6, 2021

I think I misinterpreted this issue. S3Proxy/jclouds should preserve the MPU-style ETag even when it internally converts the multiparts into a single part. There is no need for a configuration flag. @bsfarrell @timuralp do either of you want to work on this?

@timuralp
Collaborator

timuralp commented Jun 6, 2021

I can look into the change in jclouds to store the MPU ETag using extended attributes (where supported). I believe this would resolve this issue and help improve the filesystem blobstore S3 compatibility.
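
As a rough illustration of the extended-attribute idea, the following sketch stores an ETag string on a file with the standard `java.nio` `UserDefinedFileAttributeView` and reads it back. It is not the jclouds implementation: the class name and the attribute name `example-mpu-etag` are hypothetical, and treating any I/O failure as "not supported" is a simplifying assumption made for this example.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.UserDefinedFileAttributeView;
import java.util.Optional;

/** Hypothetical helper: keeps an ETag in a user-defined file attribute. */
final class ETagAttributeSketch {
    // Attribute name invented for this sketch; not taken from jclouds.
    static final String ATTRIBUTE = "example-mpu-etag";

    private ETagAttributeSketch() {}

    /** Returns the attribute view, or null when the file store lacks support. */
    private static UserDefinedFileAttributeView viewOf(Path file) throws IOException {
        if (!Files.getFileStore(file)
                .supportsFileAttributeView(UserDefinedFileAttributeView.class)) {
            return null;
        }
        return Files.getFileAttributeView(file, UserDefinedFileAttributeView.class);
    }

    /** Stores the ETag; returns false when attributes are unavailable. */
    static boolean store(Path file, String eTag) {
        try {
            UserDefinedFileAttributeView view = viewOf(file);
            if (view == null) {
                return false;
            }
            view.write(ATTRIBUTE, StandardCharsets.UTF_8.encode(eTag));
            return true;
        } catch (IOException e) {
            // Simplification: treat any failure as "not supported".
            return false;
        }
    }

    /** Loads a previously stored ETag, if one is present. */
    static Optional<String> load(Path file) {
        try {
            UserDefinedFileAttributeView view = viewOf(file);
            if (view == null || !view.list().contains(ATTRIBUTE)) {
                return Optional.empty();
            }
            ByteBuffer buffer = ByteBuffer.allocate(view.size(ATTRIBUTE));
            view.read(ATTRIBUTE, buffer);
            buffer.flip();
            return Optional.of(StandardCharsets.UTF_8.decode(buffer).toString());
        } catch (IOException e) {
            return Optional.empty();
        }
    }
}
```

A caller could try `load` first and fall back to computing an MD5 over the file contents when it returns an empty `Optional`, which matches the "where supported" caveat above.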

@gaul
Owner

gaul commented Aug 5, 2021

References apache/jclouds#118. Upgrading to jclouds 2.4.0 will resolve this issue. ETA: September.

gaul added the "jclouds" (Requires jclouds changes) label and removed the "help wanted" label on Sep 7, 2021
gaul closed this as completed in 2521db9 on Sep 10, 2021