
Start only resumable sessions to do browser-based uploading #1240

Closed
ernestoalejo opened this issue Nov 23, 2015 · 12 comments

@ernestoalejo

Start a resumable upload session on the server and then send the session URL to the client so it can finish the upload.

Documentation: https://cloud.google.com/storage/docs/json_api/v1/how-tos/upload#resumable

Related issue in gcloud-node: googleapis/google-cloud-node#641

@dhermes dhermes added the api: storage Issues related to the Cloud Storage API. label Nov 23, 2015
@tseaver (Contributor) commented Dec 11, 2015

Is the intent here to avoid doing the stream_file call as part of upload_from_file, perhaps returning the upload object to the caller?

@ernestoalejo (Author)

It is a little more involved than that. The implementation should be able to:

  • Change the endpoint URL: the session-initiation URL is not the one the current code generates.
  • Add the X-Upload-Content-Type and X-Upload-Content-Length headers as usual, to limit what the user can upload.
  • Add the Origin header so the upload can be continued from a browser without the last chunk failing.
  • Return the raw Location response header to the caller so it can be used later.

@ernestoalejo (Author)

An example extracted from my code:

import json

import httplib2
from oauth2client.client import SignedJwtAssertionCredentials


def start_resumable_upload(content_type, content_length, filename, bucket_name, origin):
    # The X-Upload-Content-* headers pin down what the session will
    # accept; Origin sets up CORS so the browser can finish the upload.
    headers = {
        'X-Upload-Content-Type': content_type,
        'X-Upload-Content-Length': str(content_length),
        'Origin': origin,
        'Content-Type': 'application/json; charset=utf-8',
    }
    body = {
        'name': filename,
    }
    uri = 'https://www.googleapis.com/upload/storage/v1/b/%s/o?uploadType=resumable' % bucket_name

    credentials = SignedJwtAssertionCredentials(...)
    http = credentials.authorize(httplib2.Http())
    response, content = http.request(uri=uri, method='POST', body=json.dumps(body), headers=headers)

    if response.status != 200:
        raise ...

    # The Location header (lowercased by httplib2) holds the session URL.
    return response['location']
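
For illustration, a possible call to the function above; the bucket name, filename, and origin here are placeholder values:

session_url = start_resumable_upload(
    content_type='image/png',
    content_length=2097152,
    filename='avatars/user-42.png',
    bucket_name='my-bucket',
    origin='https://example.com')
# Hand session_url to the browser; it uploads the file bytes there directly.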

@tseaver (Contributor) commented Dec 11, 2015

The Upload class (forked from apitools) already does resumable requests for files larger than 5 MB: the difference is that the client doesn't have to handle the secondary requests (those to the redirect URL with the uploadId parameter). Is your goal to allow the client to do that work explicitly, e.g. so that it can log progress?

@ernestoalejo (Author)

Yes: log progress, and avoid transferring the file to my server first and then again to Cloud Storage. With this method I can upload the file directly from the browser (using XHR2, for example).
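
To make the client's side concrete, here is a minimal sketch of finishing an upload from the session URL. It uses the requests library to stand in for a browser XHR2 PUT, and sends the whole payload in a single request (a chunked upload would add Content-Range headers instead):

import requests

def finish_resumable_upload(session_url, data, content_type):
    # PUT the full payload to the resumable session URL in one request.
    # A browser would issue the same PUT via XMLHttpRequest.
    response = requests.put(
        session_url,
        data=data,
        headers={'Content-Type': content_type})
    response.raise_for_status()
    return response.json()  # metadata of the finished object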

@tseaver tseaver added the type: feature request ‘Nice-to-have’ improvement, new feature or different behavior or design. label Dec 11, 2015
@tseaver (Contributor) commented Dec 11, 2015

OK, I've marked this as an enhancement, since it isn't a report that the existing resumable upload is broken (as I first read it).

@dhermes (Contributor) commented Dec 14, 2015

👍

@theacodes (Contributor)

For more context here:

I'd like to see this happen so that I can write a sample demonstrating it.

@theacodes (Contributor)

A quick implementation of this, using Blob.upload_from_file as a guide:

from gcloud import storage
from gcloud.storage.blob import _UrlBuilder, _UploadConfig
from gcloud.streaming.http_wrapper import Request
from gcloud.streaming.transfer import RESUMABLE_UPLOAD
from gcloud.streaming.transfer import Upload


class BlobWithResumable(storage.Blob):
    def start_resumable_upload(
            self,
            content_type=None,
            content_length=None,
            origin=None,
            client=None):
        client = self._require_client(client)

        # Use the private ``_connection`` rather than the public
        # ``.connection``, since the public connection may be a batch. A
        # batch wraps a client's connection, but does not store the `http`
        # object. The rest (API_BASE_URL and build_api_url) are also defined
        # on the Batch class, but we just use the wrapped connection since
        # it has all three (http, API_BASE_URL and build_api_url).
        connection = client._connection
        content_type = (content_type or self._properties.get('contentType') or
                        'application/octet-stream')

        headers = {
            'Accept': 'application/json',
            'Accept-Encoding': 'gzip, deflate',
            'User-Agent': connection.USER_AGENT,
            # The Origin header is specific to client-side uploads; it
            # determines the origins allowed for CORS.
            'Origin': origin
        }

        upload = Upload(
            None, content_type, total_size=content_length, auto_transfer=False)

        url_builder = _UrlBuilder(
            bucket_name=self.bucket.name,
            object_name=self.name)
        upload_config = _UploadConfig()

        # Temporary URL, configure_request will determine the full URL.
        base_url = connection.API_BASE_URL + '/upload'
        upload_url = connection.build_api_url(
            api_base_url=base_url,
            path=self.bucket.path + '/o')

        # Use apitools 'Upload' facility.
        request = Request(upload_url, 'POST', headers)

        # Force resumable upload.
        upload.strategy = RESUMABLE_UPLOAD
        upload.configure_request(upload_config, request, url_builder)

        # Rebuild the request URL with the query params that
        # configure_request filled in on the URL builder.
        query_params = url_builder.query_params
        request.url = connection.build_api_url(
            api_base_url=base_url,
            path=self.bucket.path + '/o',
            query_params=query_params)

        response = upload.initialize_upload(request, connection.http)

        # The location header contains the session URL. This can be used
        # to continue the upload.
        resumable_upload_session_url = response.headers['location']

        return resumable_upload_session_url
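
A rough usage sketch for the class above; the bucket name, object name, and origin are placeholders, not part of the implementation:

from gcloud import storage

client = storage.Client()
bucket = client.bucket('my-bucket')
blob = BlobWithResumable('uploads/photo.png', bucket=bucket)

session_url = blob.start_resumable_upload(
    content_type='image/png',
    content_length=2097152,
    origin='https://example.com')
# The browser can now PUT the file bytes directly to session_url.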

@dhermes (Contributor) commented May 11, 2016

@jonparrott What is the delta from the original code?

@theacodes (Contributor)

It's a net new function. I used upload_from_file as a guide. I don't presently think that they can use the same thing, but I could be wrong.


@theacodes (Contributor)

Bump, we want to show this in the new Cloud Storage documentation. I've added a label to that end.
