Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for basic digital ocean spaces API #138

Merged
merged 19 commits into from
Oct 17, 2017

Conversation

rpodcast
Copy link
Contributor

@rpodcast rpodcast commented Oct 7, 2017

Description

This PR brings preliminary support for utilizing the digital ocean spaces object storage API. The spaces API is backward-compatible with the Amazon S3 object storage API, so I leverage key functions from the aws.s3 package to work with the API. Since the authentication is controlled by separate keys from the normal digital ocean API, the user is expected to create these keys within their DO spaces account, and the user can pass the key ID and secret directly in the functions or define them as environment variables (DO_SPACES_ACCESS_KEY and DO_SPACES_SECRET_KEY). The main functions introduced are the following:

  • spaces() : Obtain all spaces in your account. Each one will be given a class called space with custom print and summary S3 methods.
  • spaces_GET() : Supporting function that wraps aes.se::s3HTTP() to obtain the spaces information using the API.
  • space_info() : Obtain metadata around all files contained in a space. Note that if the space contains more than 1,000 files that this operation can take a bit of time, since the S3 API only allows for up to 1,000 files to be queried at a time.

Related Issue

#136

Example

# obtain the spaces in your DO account
my_spaces <- spaces()

# print summary information of the space called "test-space"
summary(my_spaces[["rpodcast-space3"]])

<space_detail>rpodcast-space3
  Size (GB):     37.54697 
  Files:        17331
  Created at:    2017-08-27T19:01:05.737Z 

# create a new space in your account
space_create("my-new-space")

Copy link
Collaborator

@sckott sckott left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for this!

have some questions in line, and also:

  • would be good to add some examples in the documentation for use of each fxn

R/spaces.R Outdated

res <- put_bucket(name,
region = NULL,
acl = acl,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this acl thing is undefined, what is it supposed to be?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is one of the parameters used by put_bucket function from aws.s3 to determine if the bucket is made private or public upon creating. On default new buckets are made as private, and it appears to also be true when I use it to createi new spaces. I tried specifying this to be public in my testing but the spaces were still being created as private. Until I can get this resolved I will remove references to it.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sounds good, thx

R/spaces.R Outdated
base_url = spaces_base,
...)

if (res) message(sprintf("New space %s created successfully"))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

doesn't sprintf expect an object passed to it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, I will fix that. It was supposed to be passing name. That's what I get for coding late 😞

R/spaces.R Outdated
#' @importFrom aws.s3 bucketlist
#' @export
#' @rdname spaces
spaces_GET <- function(spaces_key = NULL, spaces_secret = NULL, ...) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

did you mean to export spaces_GET ? if so, what does it do? from the name of the fxn it seems like it's meant to be an internal fxn maybe

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was also supposed to be an internal function, since the spaces() function is the one intended to be user facing. I'll make sure this is internal in my revisions.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, thx

R/spaces.R Outdated
base_url = spaces_base,
...)

if (res) message(sprintf("New space %s created successfully"))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will this error well if the space isn't created? curious what that looks like. i assume that error handling is in aws pkg

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So far I haven't had any errors occur when creating a new space. But I am going to add a simple check to ensure the user does not have an existing space with the same name.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

okay, something to explore later, though may be handled well already in aws.s3 pkg

R/spaces.R Outdated
#' Spaces storage operations
#'
#' \describe{
#' \item{space}{List space contents}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i don't see a space function, though maybe it's something to be added

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah, maybe space == space_info

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I went back and forth with this, but now I'm going to omit a space() function since spaces() will grab everything and it is straight forward for the user to grab whichever space they want to work with. They all have a special space class so we'll be able to add many more methods and operations as time goes on I'll delete references to a space function in my next PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also I've made space_info() an internal function

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sounds good

@rpodcast
Copy link
Contributor Author

rpodcast commented Oct 8, 2017

Thanks for your review @sckott! I've added comments above and pushed the updates.

@sckott
Copy link
Collaborator

sckott commented Oct 10, 2017

  • It's not a great idea to use internal variables that are the same name as the function they are within, e.g,. spaces within spaces function

Copy link
Collaborator

@sckott sckott left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few more comments. also:

  • i haven't done it in every file yet, but we try to keep to 80 chars in width for docs/comments/code

R/spaces.R Outdated
#' @rdname spaces
spaces <- function(spaces_key = NULL, spaces_secret = NULL, ...) {
res <- spaces_GET(spaces_key = spaces_key, spaces_secret = spaces_secret, ...)
spaces <- lapply(res$Buckets, structure, class = "space")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

try doing unname(res$Buckets) because if you have only 1 item returned, it fails, try it out you'll see what i mean

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was rather nefarious, but when only one space is present the structure of the result changed such that only the Name and CreationDate are populated in res$Buckets, instead of each space occupying a Bucket object (list) containing the Name and CreationDate. I've added a check of this via 35ff528.

R/spaces.R Outdated
spaces_secret <- check_space_secret(spaces_secret)

# check if space name already exists in user's account
spaces_existing <- spaces()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think i might rather have the digital ocean API handle this than us handle it client side. that is, they handle both cases where a name already exists in the users set of spaces or anyone else's spaces - one reason is to have consistent error behavior, another is that the function will be faster, as it's one less HTTP request

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good point, why make additional HTTP requests when it's not necessary. I was initially put off by the verbose printing of the result by aws.s3, but it is getting the right status and message in these situations. I removed my manual check via cea751a.

#' @importFrom aws.s3 put_bucket
#' @export
#' @rdname spaces
space_create <- function(name, spaces_key = NULL, spaces_secret = NULL, ...) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't know if that makes sense for spaces, def. feel free to not change it

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm inclined to not give name a default because like Amazon S3, a space name must be unique across all spaces defined in digital ocean, not just the spaces within a user's account. While the actual chance of users getting the same randomly generated name are quite small, I'm still weary of generating a random name as default.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

@rpodcast
Copy link
Contributor Author

I finally got a chance to make updates and address your latest comments. Please take a look and let me know if you see other areas that need revisions.

@sckott
Copy link
Collaborator

sckott commented Oct 17, 2017

Looks good @Thercast

Let's merge this, then we can go back to #136 and discuss how to divy up tasks

@sckott sckott merged commit b55f1e3 into pachadotdev:master Oct 17, 2017
@sckott sckott added this to the v0.7 milestone Oct 17, 2017
@amoeba amoeba mentioned this pull request Oct 18, 2017
27 tasks
@sckott sckott added the spaces label May 19, 2018
@sckott sckott modified the milestones: v0.7, v0.8 Jun 3, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants