tusd cloud run GCS #700

Closed
AngeloGiurano opened this issue Jul 2, 2024 · 8 comments

@AngeloGiurano

Question
Hi,
I'm running the latest tusd binary image on Cloud Run to upload to GCS. I can call the endpoints through tus-js-client, but no progress is reported back. The only flags I have configured are -gcs-bucket, -gcs-object-prefix and -behind-proxy.

If I switch to your demo server as the endpoint, progress is correctly reported through the onProgress callback.

Setup details
Please provide following details, if applicable to your situation:

  • Runtime environment: Browser
  • Used tus-js-client version: 4.1.0
  • Used tus server software: tusd:latest
@Acconut
Member

Acconut commented Jul 2, 2024

Please share more details. Your report is not sufficient to help you. Would you be able to generate a HAR file and share it with us (see https://toolbox.googleapps.com/apps/har_analyzer/)? HAR files contain the network traffic from a website, so we can inspect the sent and received headers more easily. If you don't want to share this file publicly, you can also email me at marius@transloadit.com.

@AngeloGiurano
Author

Sure, thanks @Acconut. I will generate the HAR file and send it to your email. Thank you for your quick reply and help.

@AngeloGiurano
Author

AngeloGiurano commented Jul 3, 2024

@Acconut, unfortunately the HAR file does not contain the PATCH request that uploads the file data, probably because it is still pending. I'm wondering if this has to do with Cloud Run's maximum HTTP request size: https://cloud.google.com/run/quotas#request_limits

One more finding: if I use chunking and set the chunk size to a value lower than 32 MiB, which is the maximum HTTP/1 request size, the upload works correctly.

@Acconut
Member

Acconut commented Jul 3, 2024

The request limit is likely the cause, yes, and setting tus-js-client's chunkSize helps. 32 MiB is a very low request limit in my opinion. It would not be a problem for smaller files (< 100 MiB), but larger uploads (~1 GiB) will likely be slowed down by this setting, as the upload has to be spread across multiple requests.

I am not sure what they mean by:

32 MiB if using HTTP/1 server. No limit if using HTTP/2 server.

tusd supports HTTP/2 if you supply it with a TLS certificate (Go requires TLS encryption for HTTP/2). Maybe that would allow you to use HTTP/2 and circumvent the 32 MiB limit. However, I am unsure if they mean that the connection from browser to Google Cloud must be HTTP/2 (which is not always the case) or if the connection from Google's load balancer to tusd must be HTTP/2.

@AngeloGiurano
Author

AngeloGiurano commented Jul 3, 2024

Hi @Acconut, thanks for the support and help here, I'll share my findings:

We took a look into this, and it looks like the GCP load balancer that sits in front of Cloud Run terminates TLS and sends cleartext HTTP/2 to the container. However, tusd's Go implementation only allows HTTP/2 over a TLS connection. Unfortunately, the load balancer is abstracted away from us and isn't configurable.

We've looked at your implementation of tusd, and it looks like you use the Go standard library, which only activates HTTP/2 when receiving HTTPS traffic: https://pkg.go.dev/net/http#hdr-HTTP_2

We could fork tusd and open a PR to use the h2c lib instead, which doesn't require TLS, but we need to know if this is something you might approve and get implemented into tusd or not.

@Acconut
Member

Acconut commented Jul 4, 2024

GCP terminating TLS for you is not directly relevant, as long as the communication from the load balancer to tusd again goes through an encrypted HTTP/2 connection. If GCP supported this mode, no modifications to tusd would be necessary. However, from quickly scanning over https://cloud.google.com/run/docs/configuring/http2#before_you_configure, it does not appear that they support encrypted H2 to the origin server:

Your Cloud Run service must handle requests in HTTP/2 cleartext (h2c) format. Google's frontend-serving infrastructure terminates TLS and then forwards the h2c traffic

Too bad.

We could fork tusd and open a PR to use the h2c lib instead, which doesn't require TLS, but we need to know if this is something you might approve and get implemented into tusd or not.

Yes, I would be open to adding an option to enable h2c in tusd, as long as the implementation is not vastly more complex than the example in https://pkg.go.dev/golang.org/x/net/http2/h2c#example-NewHandler. Thank you for proposing this. I have thought about experimenting with H2 for the connection between proxies and tusd, but never got around to doing it.

I was a bit concerned by the comment in https://pkg.go.dev/golang.org/x/net/http2/h2c#NewHandler:

The first request on an h2c connection is read entirely into memory before the Handler is called. To limit the memory consumed by this request, wrap the result of NewHandler in an http.MaxBytesHandler.

Reading an entire request body into memory can be devastating for tusd. Imagine if the first request is a large PATCH request and memory usage suddenly skyrockets. However, in golang/go#67816 (comment) (and the following two comments), Damien Neil confirms that the request body is not read into memory if H2 with prior knowledge is used. I believe this should be the case with Google Cloud Run, so you should be fine. That being said, we should add a notice about this caveat to the corresponding documentation.

@AngeloGiurano
Author

Hi @Acconut, first of all thanks once again for the detailed response and the support.

We solved the issue by placing an nginx proxy in front of tusd. It receives HTTP/2 from Cloud Run and communicates with the tusd server over HTTP/1.1.

As much as we would like to help and open a PR adding an option to enable h2c in tusd, we are very short on resources right now and have some deadlines to meet. I will put it on my to-do list, and maybe one day we will get to it.

Thanks once again!

@kristophM

Hey @Acconut let me know if the above PR meets your requirements. We tested it out and it works really well on Cloud Run.
