Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handle S3 virtual host request type #2782

Merged
merged 10 commits into from
Oct 1, 2022
Merged

Handle S3 virtual host request type #2782

merged 10 commits into from
Oct 1, 2022

Conversation

askoa
Copy link
Contributor

@askoa askoa commented Sep 27, 2022

Which issue does this PR close?

Closes #2777

Are there any user-facing changes?

Additional input parameter to determine if the virtual host request type should be use for aws S3.

@github-actions github-actions bot added the object-store Object Store Interface label Sep 27, 2022
@tustvold
Copy link
Contributor

This could be my mistake, but running the integration tests with this change I get an error

called `Result::unwrap()` on an `Err` value: Generic { store: "S3", source: ListRequest { source: Error { retries: 0, message: "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<Error><Code>NoSuchKey</Code><Message>The specified key does not exist.</Message><Key>raphael-test-bucket-e27e29e3</Key><RequestId>579ZMFJ7XFMM0CYV</RequestId><HostId>gdpmDPN2NW2o7FBJ6VG8xM4PTSd5xtuVWXlE8WT8LdznRBUAgTB3zlE7KM66YsU3r2ge2jRHCto=</HostId></Error>", source: reqwest::Error { kind: Status(404), url: Url { scheme: "https", cannot_be_a_base: false, username: "", password: None, host: Some(Domain("raphael-test-bucket-e27e29e3.s3.eu-west-2.amazonaws.com")), port: None, path: "/raphael-test-bucket-e27e29e3", query: Some("list-type=2"), fragment: None } } } } }

The bucket called raphael-test-bucket-e27e29e3 does exist in eu-west-2 so I'm not sure what is going on here?

@askoa
Copy link
Contributor Author

askoa commented Sep 28, 2022

This could be my mistake, but running the integration tests with this change I get an error

called `Result::unwrap()` on an `Err` value: Generic { store: "S3", source: ListRequest { source: Error { retries: 0, message: "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<Error><Code>NoSuchKey</Code><Message>The specified key does not exist.</Message><Key>raphael-test-bucket-e27e29e3</Key><RequestId>579ZMFJ7XFMM0CYV</RequestId><HostId>gdpmDPN2NW2o7FBJ6VG8xM4PTSd5xtuVWXlE8WT8LdznRBUAgTB3zlE7KM66YsU3r2ge2jRHCto=</HostId></Error>", source: reqwest::Error { kind: Status(404), url: Url { scheme: "https", cannot_be_a_base: false, username: "", password: None, host: Some(Domain("raphael-test-bucket-e27e29e3.s3.eu-west-2.amazonaws.com")), port: None, path: "/raphael-test-bucket-e27e29e3", query: Some("list-type=2"), fragment: None } } } } }

The bucket called raphael-test-bucket-e27e29e3 does exist in eu-west-2 so I'm not sure what is going on here?

The error message says NoSuchKey. From the error message I see that the string raphael-test-bucket-e27e29e3 is used both as bucket and key. Are you passing different argument for key? Does that key exist? The format is https://bucket-name.s3.region-code.amazonaws.com/key-name. If you are passing a different key argument and if that exist then let me know. There might be some problem with the code.

@tustvold
Copy link
Contributor

tustvold commented Sep 28, 2022

I'm just running the standard integration tests, with the new environment variable set to "true"... Is it possible that the request URI is still including the bucket when it doesn't need to?

@askoa
Copy link
Contributor Author

askoa commented Sep 28, 2022

I'm just running the standard integration tests, with the new environment variable set to "true"... Is it possible that the request URI is still including the bucket when it doesn't need to?

Yes. It looks like I have to update the code in multiple places. I'll revise this PR.

@askoa askoa marked this pull request as draft September 28, 2022 13:02
@askoa
Copy link
Contributor Author

askoa commented Sep 28, 2022

@tustvold when an endpoint is given, should we assume it to comply with virtual_hosted_request_style? More precisely, if the endpoint is given and virtual_hosted_request_style is set to true, should we append bucket name at the end?

/// Sets the endpoint for communicating with AWS S3. Default value
/// is based on region.
///
/// For example, this might be set to `"http://localhost:4566:`
/// for testing against a localstack instance.
pub fn with_endpoint(mut self, endpoint: impl Into<String>) -> Self {
self.endpoint = Some(endpoint.into());
self
}

Edit: I am updating the PR assuming if the input endpoint is provided then we assume path style and append bucket at the end. Let me know if this needs to change.

@askoa askoa marked this pull request as ready for review September 28, 2022 16:38
@tustvold
Copy link
Contributor

tustvold commented Sep 30, 2022

I am updating the PR assuming if the input endpoint is provided then we assume path style and append bucket at the end

I would expect the following behaviour:

  • If an endpoint is provided and virtual_hosted_request_style - assume bucket already present in provided endpoint
  • If no endpoint is provided and virtual_hosted_request_style - construct bucket_endpoint with bucket name prefix
  • If an endpoint is provided and not virtual_hosted_request_style - include bucket in path

@askoa
Copy link
Contributor Author

askoa commented Sep 30, 2022

I would expect the following behaviour:

  • If an endpoint is provided and virtual_hosted_request_style - assume bucket already present in provided endpoint
  • If no endpoint is provided and virtual_hosted_request_style - construct bucket_endpoint with bucket name prefix
  • If an endpoint is provided and not virtual_hosted_request_style - include bucket in path

@tustvold I updated the code. Can you please take a look?

Copy link
Contributor

@tustvold tustvold left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you, just some minor nits

let bucket_endpoint: String;

//If `endpoint` is provided then its assumed to be consistent with
// `virutal_hosted_style_request`. i.e. if `virtual_hosted_request_style` is true then
Copy link
Contributor

@tustvold tustvold Oct 1, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// `virutal_hosted_style_request`. i.e. if `virtual_hosted_request_style` is true then
// `virtual_hosted_request_style`. i.e. if `virtual_hosted_request_style` is true then

Copy link
Contributor Author

@askoa askoa Oct 1, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

virtual_hosted_style_request directly comes form AWS page defining it. See https://docs.aws.amazon.com/AmazonS3/latest/userguide/VirtualHosting.html. I am not inclined to make this change.

Edit: I see the problem now. I think I have to change the field name. I think it should be virtual_hosted_style_request in all places. Will make that change.

object_store/src/aws/mod.rs Outdated Show resolved Hide resolved
@tustvold tustvold merged commit 0052d25 into apache:master Oct 1, 2022
@ursabot
Copy link

ursabot commented Oct 1, 2022

Benchmark runs are scheduled for baseline = 3999c77 and contender = 0052d25. 0052d25 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Skipped ⚠️ Benchmarking of arrow-rs-commits is not supported on ec2-t3-xlarge-us-east-2] ec2-t3-xlarge-us-east-2
[Skipped ⚠️ Benchmarking of arrow-rs-commits is not supported on test-mac-arm] test-mac-arm
[Skipped ⚠️ Benchmarking of arrow-rs-commits is not supported on ursa-i9-9960x] ursa-i9-9960x
[Skipped ⚠️ Benchmarking of arrow-rs-commits is not supported on ursa-thinkcentre-m75q] ursa-thinkcentre-m75q
Buildkite builds:
Supported benchmarks:
ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
test-mac-arm: Supported benchmark langs: C++, Python, R
ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
object-store Object Store Interface
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Object Store S3 Alibaba Cloud OSS support
3 participants