Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

API upload object without using tmp file #4848

Merged
merged 3 commits into from
Dec 22, 2022

Conversation

nopcoder
Copy link
Contributor

@nopcoder nopcoder commented Dec 22, 2022

Fix #4849

Multipart request read content directly to block adapter.
The issue is that the Go implementation uses local temporary file when accessing content size >32MB.
In order to support reading the data directly in this case we stream the request data into the adapter's Put request without specify the content size.

To support Put with reader without size (stream), updated the S3 to use of s3manager upload in case the size == -1.
All other adapters ignore the size in the request to upload.

Tested manually with file size ~170MB

Multipart request read content directly to block adapter.
The Go implementation use local temporary file to enable read from
content.
@nopcoder nopcoder added area/API Improvements or additions to the API include-changelog PR description should be included in next release changelog labels Dec 22, 2022
@nopcoder nopcoder self-assigned this Dec 22, 2022
@nopcoder nopcoder marked this pull request as ready for review December 22, 2022 12:53
Copy link
Contributor

@arielshaqed arielshaqed left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Neat!

I think that this can change behaviour with some S3 "compatible" storage backends, since we're no longer uploading data with a length. We might need to add an option to turn this off.

Also, the current implementation will be dangerous if we ever add another part to UploadObject in swagger.yml. Let's add a comment on its definition not to do that!

Comment on lines 2144 to 2147
if !strings.HasPrefix(mt, "multipart/") {
writeError(w, r, http.StatusInternalServerError, http.ErrNotMultipart)
return
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure this works for any multipart Content-Type, perhaps we should limit ourselves to a list of known types? (Package http probably limits itself in some way, so if we do the same would we be safe?)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point - I'll limit it to "multipart/form-data". The Go's request implementation support form-data and optional mixed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, the current implementation will be dangerous if we ever add another part to UploadObject in swagger.yml. Let's add a comment on its definition not to do that!

Currently we added the information in the swagger description: Only a single file per upload which must be named "content".

writeError(w, r, http.StatusInternalServerError, err)
return
}
contentUploaded = true
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not break here? Nothing remains that could be useful (I think).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will fix - wanted to complete the body processing - but I see that Close will do it for me!

pkg/api/controller.go Show resolved Hide resolved
pkg/api/controller.go Outdated Show resolved Hide resolved
@nopcoder nopcoder enabled auto-merge (squash) December 22, 2022 14:54
@nopcoder
Copy link
Contributor Author

I think that this can change behaviour with some S3 "compatible" storage backends, since we're no longer uploading data with a length. We might need to add an option to turn this off.

The the backend will get the size as the s3manager upload operation works with a single put or multi-part upload.

@nopcoder nopcoder merged commit 7fc357a into master Dec 22, 2022
@nopcoder nopcoder deleted the feature/api-upload-without-tmp branch December 22, 2022 17:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/API Improvements or additions to the API include-changelog PR description should be included in next release changelog
Projects
None yet
Development

Successfully merging this pull request may close these issues.

API UploadObject uses tmp file for large objects
3 participants