
Unable to use a custom base_url with delete_object #189

Closed
amoeba opened this issue Jan 11, 2018 · 20 comments

@amoeba

amoeba commented Jan 11, 2018

I was writing a function to delete an object from a DigitalOcean Space (which follows the AWS S3 API) using aws.s3::delete_object and was getting an error about a bucket not being found. A quick debug and a look at the source of delete_object revealed the cause: a superfluous call to

regionname <- get_region(bucket)

It's superfluous in that the result isn't used later in the function body and, because I'm querying a custom endpoint and the function appears hard-coded to only support AWS, I was getting an error.

Since this line of code isn't doing anything, I'm going to submit a PR so that, if you agree with my change, this can be closed out quickly.

@kaneplusplus
Contributor

kaneplusplus commented Feb 19, 2018

Please note, this is also fixed in pull request #194.

@amoeba
Author

amoeba commented Feb 19, 2018

Glad to hear it, @kaneplusplus! I guess I'll leave this open for now, though I'm okay with a maintainer closing it if needed.

@leeper
Member

leeper commented Mar 19, 2018

I think this is now fixed, as it is essentially the same problem as #191. Let me know if it's not working and I will reopen. You'll need to set the AWS_S3_ENDPOINT environment variable.

@leeper leeper closed this as completed Mar 19, 2018
@amoeba
Author

amoeba commented Mar 19, 2018

Great, thanks @leeper. That fix looks like it'll work for my use case so I'll give it a run through soon and let you know if I run into any issues.

@leeper
Member

leeper commented Mar 19, 2018

Great. Thanks!

@amoeba
Author

amoeba commented Apr 16, 2018

Hey @leeper, now that I sit down to test this, it isn't working.

I spent some time yesterday and today attempting to debug my issue and I've had no luck. I get a SignatureDoesNotMatch error seemingly no matter what type of request I make:

> Sys.setenv(AWS_S3_ENDPOINT="nyc.digitaloceanspaces.com",
           AWS_ACCESS_KEY_ID = MY_DO_SPACES_KEY,
           AWS_SECRET_ACCESS_KEY = MY_DO_SPACES_SECRET)

> aws.s3::s3HTTP(verb = "GET")
List of 3
 $ Code     : chr "SignatureDoesNotMatch"
 $ RequestId: chr "tx0000000000000000d753a-005ad3e457-39dd70-nyc3a"
 $ HostId   : chr "39dd70-nyc3a-nyc"
 - attr(*, "headers")=List of 6
  ..$ content-length           : chr "190"
  ..$ x-amz-request-id         : chr "tx0000000000000000d753a-005ad3e457-39dd70-nyc3a"
  ..$ accept-ranges            : chr "bytes"
  ..$ content-type             : chr "application/xml"
  ..$ date                     : chr "Sun, 15 Apr 2018 23:46:31 GMT"
  ..$ strict-transport-security: chr "max-age=15552000; includeSubDomains; preload"
  ..- attr(*, "class")= chr [1:2] "insensitive" "list"
 - attr(*, "class")= chr "aws_error"
 - attr(*, "request_canonical")= chr "GET\n/\n\nhost:nyc3.digitaloceanspaces.com\nx-amz-date:20180415T234631Z\n\nhost;x-amz-date\ne3b0c44298fc1c149af"| __truncated__
 - attr(*, "request_string_to_sign")= chr "AWS4-HMAC-SHA256\n20180415T234631Z\n20180415/us-east-1/s3/aws4_request\nc7bd088d33d593ec90cb8ca81e13d1d5bf6cde0"| __truncated__
 - attr(*, "request_signature")= chr "AWS4-HMAC-SHA256 Credential=PKY6K4DJPR25XGGG5VQO/20180415/us-east-1/s3/aws4_request, SignedHeaders=host;x-amz-d"| __truncated__
NULL

Upon inspection, I noticed the local variable region in s3HTTP is set to us-east-1 (which you can see in my output above). Adjusting this doesn't fix the issue, however. I looked at an example auth header from the DigitalOcean Spaces docs:

Authorization: AWS4-HMAC-SHA256
Credential=II5JDQBAN3JYM4DNEB6C/20170710/nyc3/s3/aws4_request,
SignedHeaders=host;x-amz-acl;x-amz-content-sha256;x-amz-date,
Signature=6cab03bef74a80a0441ab7fd33c829a2cdb46bba07e82da518cdb78ac238fda5

It looks like maybe the Authorization header is indicating a different set of signed headers than what is actually being sent, but I haven't confirmed whether that's the case just yet.

Any insight here would be greatly appreciated!

@leeper leeper reopened this Apr 16, 2018
@leeper
Member

leeper commented May 19, 2018

I can revisit this at some point. Those docs have a full example, so we should be able to add a test to aws.signature to ensure it's working at that level.

@leeper
Member

leeper commented May 19, 2018

I've added a basic test to aws.signature. Unfortunately their examples don't give enough detail to provide a full test but, from looking at it, this seems like an aws.s3 rather than an aws.signature problem.

Two ideas:

  1. Could be region. Is it supposed to be nyc or something? I think you're best off setting the AWS_DEFAULT_REGION environment variable.
  2. Bucket name and url_style. Looks like they may require url_style = "virtual" (which is not the default). That affects how the bucket name is attached to the endpoint URL. (It also looks like you may be trying to specify the region and/or bucket name in the endpoint, but this should be a generic API URL.)
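The url_style distinction can be sketched in a few lines (a language-neutral Python illustration with a hypothetical helper name, not aws.s3's actual internals; the real logic lives in setup_s3_url):

```python
def s3_url(bucket, base_url, url_style="path"):
    """Build an S3-compatible request URL for a bucket.

    "path" style puts the bucket in the path:      https://<base_url>/<bucket>/
    "virtual" style puts it in the hostname:       https://<bucket>.<base_url>/
    """
    if url_style == "virtual":
        return f"https://{bucket}.{base_url}/"
    return f"https://{base_url}/{bucket}/"

# DigitalOcean Spaces expects virtual-hosted style:
print(s3_url("analogsea", "nyc3.digitaloceanspaces.com", url_style="virtual"))
# https://analogsea.nyc3.digitaloceanspaces.com/
```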

@amoeba
Author

amoeba commented May 19, 2018

Awesome, thanks @leeper. Nice to narrow it down a bit. Both your points make sense (I think) and I'll take a look at fixing our code and report back.

@amoeba
Author

amoeba commented May 28, 2018

Made some slight debugging progress. I now get this HTTP 400:

[1] <Code>XAmzContentSHA256Mismatch</Code>
[2] <BucketName>asdfasdfasd</BucketName>
[3] <RequestId>tx00000000000000996dd9c-005b0b6173-5c29c8-nyc3a</RequestId>
[4] <HostId>5c29c8-nyc3a-nyc</HostId>

when I run aws.s3::put_bucket(bucket = "analogsea") against a build of aws.s3 master, with AWS_DEFAULT_REGION, AWS_ACCESS_KEY_ID, and AWS_SECRET_ACCESS_KEY set to my DO Spaces region, access key, and secret key. I also commented out the section that decides url_style and re-built the package so I'd get the right URL.

Here's the example for creating a Space (bucket) from DigitalOcean:

PUT / HTTP/1.1

Host: static-images.nyc3.digitaloceanspaces.com
x-amz-acl: public-read
x-amz-content-sha256: c6f1fc479f5f690c443b73a258aacc06ddad09eca0b001e9640ff2cd56fe5710
x-amz-date: 20170710T173143Z
Authorization: AWS4-HMAC-SHA256 Credential=II5JDQBAN3JYM4DNEB6C/20170710/nyc3/s3/aws4_request,SignedHeaders=host;x-amz-acl;x-amz-content-sha256;x-amz-date,Signature=6cab03bef74a80a0441ab7fd33c829a2cdb46bba07e82da518cdb78ac238fda5

<CreateBucketConfiguration>
  <LocationConstraint>nyc3</LocationConstraint>
</CreateBucketConfiguration>

Compared to what I end up sending:

HOST https://test.nyc3.digitaloceanspaces.com/

* x-amz-acl: private
* x-amz-date: 20180528T020035Z
* x-amz-content-sha256: 9b5bac799e64c876a137f1a521853789811af62e13cdcf4743a6dcb6c29290bd
* Authorization: AWS4-HMAC-SHA256 Credential=KSED6MLEXFXYZGPDN4KH/20180528/nyc3.digitaloceanspaces.com/s3/aws4_request, SignedHeaders=host;x-amz-acl;x-amz-date, Signature=fec8143dcfa41048547499b00c8412ea1238b55463026e892d44ff2caeff2e39

<CreateBucketConfiguration xmlns=\"http://s3.amazonaws.com/doc/2006-03-01/\">
  <LocationConstraint>nyc3.digitaloceanspaces.com</LocationConstraint>
</CreateBucketConfiguration>

The first difference I see is that the request body has a different LocationConstraint. So I put a browser() into aws.s3::s3HTTP and overrode the request_body variable before the PUT was sent, and I got the same error.

I also see that the Credential part of the Authorization header has the full URL rather than just the region name nyc3. So I edited that with a browser() while the function body was executing and got a

[1] <Code>SignatureDoesNotMatch</Code>
[2] <RequestId>tx00000000000000995b18c-005b0b6600-5c29c3-nyc3a</RequestId>
[3] <HostId>5c29c3-nyc3a-nyc</HostId>

which makes total sense.

The last thing I see is the difference in x-amz-acl (private vs. public-read). But that seems less related.
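For reference, the Credential component of a Signature Version 4 Authorization header has the scope format key/date/region/service/aws4_request, where the region component must be the bare region name. A minimal sketch (hypothetical helper name, for illustration only):

```python
def credential_scope(access_key, datestamp, region, service="s3"):
    # SigV4 credential scope: the region component must be the bare
    # region name (e.g. "nyc3"), never a hostname like
    # "nyc3.digitaloceanspaces.com".
    return f"{access_key}/{datestamp}/{region}/{service}/aws4_request"

print(credential_scope("II5JDQBAN3JYM4DNEB6C", "20170710", "nyc3"))
# II5JDQBAN3JYM4DNEB6C/20170710/nyc3/s3/aws4_request
```

This matches the Credential value in the DigitalOcean docs example above, whereas the request I sent had the full hostname in the region slot.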

@amoeba
Author

amoeba commented May 28, 2018

Also noticed that x-amz-content-sha256 is in the SignedHeaders list in the docs but not in the SignedHeaders of the request I sent, even though I do send that header.
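In Signature Version 4, the SignedHeaders value must list, lowercased and sorted, exactly the headers that were folded into the canonical request; signing a different set than what's sent is enough to cause a signature mismatch. A sketch (hypothetical helper, not aws.signature's code):

```python
def signed_headers(header_names):
    # SignedHeaders: lowercase, lexicographically sorted, semicolon-joined.
    # The list must match the canonical request's header set exactly.
    return ";".join(sorted(name.lower() for name in header_names))

print(signed_headers(["Host", "x-amz-acl", "x-amz-content-sha256", "x-amz-date"]))
# host;x-amz-acl;x-amz-content-sha256;x-amz-date
```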

@leeper
Member

leeper commented Jul 28, 2018

@amoeba What did you set AWS_DEFAULT_REGION to? Just nyc? I suspect that's the only problem. You can change acl in put_object() with put_object(acl = "public-read").

@amoeba
Author

amoeba commented Jul 29, 2018

I'm not totally sure. Let me try this again and post a more reproducible report.

@amoeba
Author

amoeba commented Jul 29, 2018

I'm not sure what I had these set to before, but I just put together a script that uses the latest build of aws.s3 from the master branch to create a Space (bucket). Pardon the verbosity, but I wanted to run through this from top to bottom again (mostly to jog my memory):

I set my env vars like:

Sys.setenv(
  "AWS_S3_ENDPOINT" = "digitaloceanspaces.com",
  "AWS_DEFAULT_REGION" = "nyc3",
  "AWS_ACCESS_KEY_ID" = "...elided...",
  "AWS_SECRET_ACCESS_KEY" = "...elided..."
)

When calling put_bucket, I can't find a configuration of region and base_url that works. If I call, for example,

aws.s3::put_bucket("analogsea-nyc3-test-two",
                   region = "nyc3",
                   key = Sys.getenv("DO_SPACES_ACCESS_KEY"),
                   secret = Sys.getenv("DO_SPACES_SECRET_KEY"),
                   base_url = "digitaloceanspaces.com")

(which is the code I'd ideally like to write), I get a Method Not Allowed (HTTP 405). The URL that the PUT goes to is https://digitaloceanspaces.com/{space_name}, which is missing both the region and the space name in the subdomain.

If I try to work around it by setting base_url to nyc3.digitaloceanspaces.com,

aws.s3::put_bucket("analogsea-nyc3-test-two",
                   region = "nyc3",
                   key = Sys.getenv("DO_SPACES_ACCESS_KEY"),
                   secret = Sys.getenv("DO_SPACES_SECRET_KEY"),
                   base_url = "nyc3.digitaloceanspaces.com")

I get a Bad Request (HTTP 400). The URL the PUT is being sent to in this case is https://nyc3.digitaloceanspaces.com/{space_name} which is nearly correct.

Verbose output from the above is

> aws.s3::put_bucket("analogsea-nyc3-test-two",
+                    region = "nyc3",
+                    key = Sys.getenv("DO_SPACES_ACCESS_KEY"),
+                    secret = Sys.getenv("DO_SPACES_SECRET_KEY"),
+                    base_url = "nyc3.digitaloceanspaces.com",
+                    verbose = TRUE)
Checking for credentials in user-supplied values
Using user-supplied value for AWS Access Key ID
Using user-supplied value for AWS Secret Access Key
Using user-supplied value for AWS Region ('nyc3')
S3 Request URL: https://nyc3.digitaloceanspaces.com/analogsea-nyc3-test-two/
Executing request with AWS credentials
Checking for credentials in user-supplied values
Using user-supplied value for AWS Access Key ID
Using user-supplied value for AWS Secret Access Key
Using user-supplied value for AWS Region ('nyc3')
Checking for credentials in user-supplied values
Using user-supplied value for AWS Secret Access Key
Using user-supplied value for AWS Region ('nyc3')
Parsing AWS API response
Client error: (400) Bad Request
List of 10
 $ url        : chr "https://nyc3.digitaloceanspaces.com/analogsea-nyc3-test-two/"
 $ status_code: int 400
 $ headers    :List of 3
  ..$ date          : chr "Sun, 29 Jul 2018 03:09:13 GMT"
  ..$ content-length: chr "242"
  ..$ content-type  : chr "text/xml; charset=utf-8"
  ..- attr(*, "class")= chr [1:2] "insensitive" "list"
 $ all_headers:List of 1
  ..$ :List of 3
  .. ..$ status : int 400
  .. ..$ version: chr "HTTP/1.1"
  .. ..$ headers:List of 3
  .. .. ..$ date          : chr "Sun, 29 Jul 2018 03:09:13 GMT"
  .. .. ..$ content-length: chr "242"
  .. .. ..$ content-type  : chr "text/xml; charset=utf-8"
  .. .. ..- attr(*, "class")= chr [1:2] "insensitive" "list"
 $ cookies    :'data.frame':	0 obs. of  7 variables:
  ..$ domain    : logi(0) 
  ..$ flag      : logi(0) 
  ..$ path      : logi(0) 
  ..$ secure    : logi(0) 
  ..$ expiration: 'POSIXct' num(0) 
  ..$ name      : logi(0) 
  ..$ value     : logi(0) 
 $ content    : raw [1:242] 3c 3f 78 6d ...
 $ date       : POSIXct[1:1], format: "2018-07-29 03:09:13"
 $ times      : Named num [1:6] 0 0.000057 0.00006 0.000133 0.161529 ...
  ..- attr(*, "names")= chr [1:6] "redirect" "namelookup" "connect" "pretransfer" ...
 $ request    :List of 7
  ..$ method    : chr "PUT"
  ..$ url       : chr "https://nyc3.digitaloceanspaces.com/analogsea-nyc3-test-two/"
  ..$ headers   : Named chr [1:6] "application/json, text/xml, application/xml, */*" "" "private" "20180729T030908Z" ...
  .. ..- attr(*, "names")= chr [1:6] "Accept" "Content-Type" "x-amz-acl" "x-amz-date" ...
  ..$ fields    : NULL
  ..$ options   :List of 5
  .. ..$ useragent    : chr "libcurl/7.54.0 r-curl/3.2 httr/1.3.1"
  .. ..$ post         : logi TRUE
  .. ..$ postfieldsize: int 148
  .. ..$ postfields   : raw [1:148] 3c 43 72 65 ...
  .. ..$ customrequest: chr "PUT"
  ..$ auth_token: NULL
  ..$ output    : list()
  .. ..- attr(*, "class")= chr [1:2] "write_memory" "write_function"
  ..- attr(*, "class")= chr "request"
 $ handle     :Class 'curl_handle' <externalptr> 
 - attr(*, "class")= chr "aws_error"
 - attr(*, "headers")=List of 3
  ..$ date          : chr "Sun, 29 Jul 2018 03:09:13 GMT"
  ..$ content-length: chr "242"
  ..$ content-type  : chr "text/xml; charset=utf-8"
  ..- attr(*, "class")= chr [1:2] "insensitive" "list"
 - attr(*, "request_canonical")= chr "PUT\n/analogsea-nyc3-test-two/\n\nhost:nyc3.digitaloceanspaces.com\nx-amz-acl:private\nx-amz-date:20180729T0309"| __truncated__
 - attr(*, "request_string_to_sign")= chr "AWS4-HMAC-SHA256\n20180729T030908Z\n20180729/nyc3/s3/aws4_request\naa684522d294e89c941afa692608f5a8b44441ef8a81"| __truncated__
 - attr(*, "request_signature")= chr "AWS4-HMAC-SHA256 Credential=PQX2IDA54KISDM4CFF4T/20180729/nyc3/s3/aws4_request, SignedHeaders=host;x-amz-acl;x-"| __truncated__
NULL

If I override setup_s3_url so that it sets the URL to https://analogsea-nyc3-test-two.nyc3.digitaloceanspaces.com/ I still get an HTTP 400. As before, if I dump the raw response from the API, I get this document:

<?xml version="1.0" encoding="UTF-8"?>
<Error>
  <Code>XAmzContentSHA256Mismatch</Code>
  <BucketName>analogsea-nyc3-test-two</BucketName>
  <RequestId>tx000000000000000dada75-005b5d320c-ac4154-nyc3a</RequestId>
  <HostId>ac4154-nyc3a-nyc</HostId>
</Error>

@leeper
Member

leeper commented Jul 29, 2018

Thanks again for the really detailed info. Give version 0.3.18 (current on GitHub) a try. There was some inflexible code that seems to have been generating these errors. The SHA256 issue might be a separate bug, or it might just be a reflection of hacking around with the internals and will go away once the URL parsing is fixed.

@amoeba
Author

amoeba commented Jul 29, 2018

Thanks for the fix. No luck with the latest build.

Running:

Sys.setenv(
...
           "AWS_S3_ENDPOINT" = "digitaloceanspaces.com",
           "AWS_DEFAULT_REGION" = "nyc3",
...
)

aws.s3::put_bucket("analogsea-nyc3-test-two-two",
                   region = "nyc3",
                   key = Sys.getenv("DO_SPACES_ACCESS_KEY"),
                   secret = Sys.getenv("DO_SPACES_SECRET_KEY"),
                   base_url = "digitaloceanspaces.com",
                   verbose = TRUE,
                   url_style = "virtual")

I get an HTTP 400 w/

<Error>
  <Code>XAmzContentSHA256Mismatch</Code>
  <BucketName>analogsea-nyc3-test-two-two</BucketName>
  <RequestId>tx0000000000000003f8c93-005b5e3b46-ad3dec-nyc3a</RequestId>
  <HostId>ad3dec-nyc3a-nyc</HostId>
</Error>
> devtools::session_info()
...
aws.s3          0.3.18  2018-07-29 local

Would it be helpful for debugging if I shared my DO Spaces API keys with you?

@leeper
Member

leeper commented Jul 29, 2018

If you're okay with that, yes. My gmail is thosjleeper.

@leeper
Member

leeper commented Jul 30, 2018

I can confirm with the edits just pushed that the following works:

Sys.setenv("AWS_S3_ENDPOINT" = "digitaloceanspaces.com")
Sys.setenv("AWS_DEFAULT_REGION" = "nyc3")
Sys.setenv("DO_SPACES_ACCESS_KEY" = "something")
Sys.setenv("DO_SPACES_SECRET_KEY" = "something")

put_bucket("analogsea-nyc3-test-two-two",
           location_constraint = NULL,
           key = Sys.getenv("DO_SPACES_ACCESS_KEY"),
           secret = Sys.getenv("DO_SPACES_SECRET_KEY"))
## [1] TRUE
bucketlist(key = Sys.getenv("DO_SPACES_ACCESS_KEY"),
           secret = Sys.getenv("DO_SPACES_SECRET_KEY"))
##                        Bucket             CreationDate
## 1                redactedname 2018-07-29T02:34:49.009Z
## 2 analogsea-nyc3-test-two-two 2018-07-30T00:00:37.504Z

@amoeba
Author

amoeba commented Jul 30, 2018

Works for me too! Nice work, @leeper!! I'll test other API methods shortly and close this unless I find anything.

@amoeba
Author

amoeba commented Aug 4, 2018

This looks golden, thanks a million, @leeper. I was able to make some minor tweaks and open a PR against analogsea just tonight.

@amoeba amoeba closed this as completed Aug 4, 2018