Skip to content
This repository has been archived by the owner on Dec 13, 2022. It is now read-only.

feat(graphql) Add file upload support. #22

Merged
merged 4 commits into from
Mar 20, 2022
Merged

Conversation

Schartey
Copy link
Contributor

@Schartey Schartey commented Mar 15, 2022

Problem

Outserv does currently not support uploading files as multipart-form. For general usage it might be of interest to be able to upload binary data in form of files. Also for my next planned feature WASM Lambda it would be ideal to upload the compiled lambda script as binary instead of base64 encoded string.

Solution

With this pull request it is possible to add use the new type Upload within the admin and public schema.
Example:

type FileUpload {
    id: ID!
    upload: Upload!
}

type MultiUpload {
    id: ID!
    uploads: [Upload!]!
}

The uploaded data is stored in binary. When querying for the files they are sent back base64 encoded.

Uploading data is done based on this spec. Batch-Upload as shown in the spec is not supported yet. I will do some testing to check if most commonly used clients use this spec.

Curl Example based on schema above

Single File Upload:

curl localhost:8080/graphql \
-F operations='{ "query": "mutation($file: Upload!) { addFileUpload(input: {upload: $file }) { fileUpload { id, upload } }}", "variables": { "file": null } }' \
-F map='{ "0": ["variables.file"]}' \
-F 0=@somefile.txt

Multiple Mutations with File Arrays:

curl localhost:8080/graphql \
-F operations='{ "query": "mutation($file: Upload!, $files: [Upload!]!) { addFileUpload(input: {upload: $file }) { fileUpload { id, upload } } addMultiUpload(input: { uploads: $files }) { multiUpload { id, uploads }}}", "variables": { "file": null, "files": null }}' \
-F map='{"0":["0.variables.file"], "1": ["1.variables.files.0"], "2": ["1.variables.files.1"]}' \
-F 0=@somefile.txt \
-F 1=@somefile2.txt \
-F 2=@somefile3.txt

This change is Reviewable

@Schartey
Copy link
Contributor Author

Schartey commented Mar 16, 2022

I've come to realise that batching is not possible with the current graphql implementation of Outserv, but we support multiple mutations in the same mutation operation.

@Schartey
Copy link
Contributor Author

Added Curl examples.

Copy link
Contributor

@manishrjain manishrjain left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR!

I'm curious about the curl command. Why does it have use indirection to specify the file name in variable. Why not: "variables": { "file": a.txt } }?

Reviewed 2 of 16 files at r1, all commit messages.
Reviewable status: 2 of 16 files reviewed, 8 unresolved discussions (waiting on @manishrjain and @Schartey)


graphql/admin/http.go, line 348 at r2 (raw file):

			if err = r.ParseMultipartForm(x.Config.UploadMemoryLimit); err != nil {
				if strings.Contains(err.Error(), "request body too large") {
					return nil, errors.Wrap(err, "failed to parse multipart form, request body too large")

over 100 chars.

Do we need two flags here? max upload size makes sense, because that would need to be stored in Badger. So we must ensure that it doesn't exceed Badger's value size limitation.

Memory limit sounds like a temporary usage? If so, it would make sense to hard code it to a generous number, like 4*max-upload-size.


graphql/admin/http.go, line 364 at r2 (raw file):

			variables := make(map[string][][]byte)

remove vertical space


graphql/admin/http.go, line 367 at r2 (raw file):

			for key, paths := range uploadsMap {
				if len(paths) == 0 {
					return nil, errors.Wrapf(err, "invalid empty operations paths list for key %s", key)

100 chars. here and elsewhere.


outserv/cmd/alpha/run.go, line 208 at r2 (raw file):

		Flag("max-splits", "How many splits can a single key have, before it is forbidden. "+
			"Also known as Jupiter key.").
		Flag("upload-max-size", "What is the maximum upload size that can be uploaded with a multi-part file upload.").

100 chars

max-upload-size-mb . So, the units are clear.


outserv/cmd/alpha/run.go, line 209 at r2 (raw file):

			"Also known as Jupiter key.").
		Flag("upload-max-size", "What is the maximum upload size that can be uploaded with a multi-part file upload.").
		Flag("upload-memory-limit", "What is the maximum memory consumption when uploading a multi-part file.").

As commented above, I wonder if this can be skipped. We can use a multiple of max-upload-size-mb for memory consumption limit.


protos/pb.proto, line 304 at r2 (raw file):

    STRING = 9;
    OBJECT = 10;
    UPLOAD = 11;

should this just be stored under binary?


protos/pb.proto, line 561 at r2 (raw file):

    string password_val = 10;
    uint64 uid_val=11;
    string upload_val = 12;

I think bytes_val could be reused here. Uploading as a file is a mechanism. But, storage is still bytes.


worker/server_state.go, line 34 at r2 (raw file):

	LimitDefaults   = `mutations=allow; query-edge=1000000; normalize-node=10000; ` +
		`mutations-nquad=1000000; disallow-drop=false; query-timeout=0ms; txn-abort-after=5m;` +
		`max-pending-queries=64;  max-retries=-1; shared-instance=false; max-splits=1000; upload-max-size=20000; upload-memory-limit=20000`

100 chars.

@Schartey
Copy link
Contributor Author

Schartey commented Mar 18, 2022

I'm curious about the curl command. Why does it have use indirection to specify the file name in variable. Why not: "variables": { "file": a.txt } }?

I'm not clear on that too. According to this issue from the spec, it's to allow async uploads. But I don't see how that makes a difference. Maybe it's for the batch uploading.

Changes

  • removed the memory limit flag and set it to 4*max-upload-size-mb*1000 as requested.
  • reused binary value type for upload. That makes sense of course.
  • fixed an issue with mapping. Now files can be reused for different variables.
  • linting - I tried to fix some stuff, but when I run the linter it does not highlight lines being too long. I hope it's ok now.

@manishrjain
Copy link
Contributor

One more thing, could you please confirm that you accept the ICLA: https://github.com/outcaste-io/outserv/blob/main/ICLA.md

Generally, there's a bot which does this, but it doesn't work on private repos. So, you can just acknowledge here that you accept the ICLA before I merge this PR.

@manishrjain manishrjain marked this pull request as ready for review March 18, 2022 15:29
Copy link
Contributor

@manishrjain manishrjain left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm: Looks good!

Reviewable status: 1 of 16 files reviewed, 8 unresolved discussions (waiting on @manishrjain and @Schartey)

@Schartey
Copy link
Contributor Author

One more thing, could you please confirm that you accept the ICLA: https://github.com/outcaste-io/outserv/blob/main/ICLA.md

Generally, there's a bot which does this, but it doesn't work on private repos. So, you can just acknowledge here that you accept the ICLA before I merge this PR.

I hereby accept the ICLA.

@manishrjain manishrjain merged commit 96d45d0 into main Mar 20, 2022
@manishrjain
Copy link
Contributor

Thanks for the contribution, @Schartey !

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants