Skip to content
This repository has been archived by the owner on Aug 20, 2020. It is now read-only.

Feature: File upload? #6

Closed
stubailo opened this issue Jul 7, 2017 · 18 comments · Fixed by #14
Closed

Feature: File upload? #6

stubailo opened this issue Jul 7, 2017 · 18 comments · Fixed by #14

Comments

@stubailo
Copy link
Contributor

stubailo commented Jul 7, 2017

I think apollo-fetch has an opportunity to be the only simple GraphQL client library out there with support for file uploads. Looks like apollo-upload-client is quite simple implementation-wise, so it would be fine to build that stuff right in!

@jaydenseric @evanshauser @jbaxleyiii let's talk here, I think this would be a big win not only for Apollo Client having file upload capability, but also for apollo-fetch as a standalone utility!

@stubailo
Copy link
Contributor Author

stubailo commented Jul 7, 2017

(Note: the final decision on the best path forward here is up to @evanshauser and @jbaxleyiii, I'm just facilitating!)

@jbaxleyiii
Copy link
Contributor

@jaydenseric I 💯 think we should include your work in apollo-fetch. Let me know how I can help you out!

@jaydenseric
Copy link
Contributor

Cool, I'll get familiar with the codebase and the middleware/afterware system.

@stubailo
Copy link
Contributor Author

stubailo commented Jul 7, 2017

@jaydenseric
Copy link
Contributor

Is query batching a consideration here?

@evans
Copy link
Contributor

evans commented Jul 10, 2017

@jaydenseric Currently, batching could be done in a custom fetch function(which merges the http specific headers) or in another stand-alone library(middleware/afterware have access to the batch). Does either option make file upload easier for you now or for a later transition?

For now though, servicing a single upload works great! I envision the manipulations would be made before the middleware, so that they have the ability to modify the full request. Or file upload could be a middleware that is exported with the apollo-fetch. Let me know if that fits with your thoughts or you like something better.

CC @jbaxleyiii

@jaydenseric
Copy link
Contributor

jaydenseric commented Jul 10, 2017

We have a choice: Run middleware before, or after the upload process transforms the request to FormData for a multipart upload.

If middleware runs before, it will be able to access the request as an object whether uploads are present or not. Middleware would not have the last say on what gets sent.

If it runs after, authors will need to account for the request being a FormData instance sometimes. Middleware would have the last say.

I think it's best to run middleware before uploads are processed.

As the codebase is currently, it would not be possible to implement uploads via middleware. The request body is forcibly converted to a JSON string. We could change that to first check if the request is an instance of FormData or just not use middleware to transform the request.

To get client part done here, the easiest approach is to make it compatible with apollo-upload-server so we can see it working. The way the request is transformed for transport between client and server is really tricky, particularly factoring in batching. I will try to document the transformations with examples here so we can be sure everyone is happy because it will become a bit of a defacto standard, like query batching with Apollo server is.

@jaydenseric
Copy link
Contributor

jaydenseric commented Jul 10, 2017

Noting here the current spec for request transforms between apollo-upload-client and apollo-upload-server.

This is just a JSON representation of what's going on each step; I have tidied the order of some of the properties, omitted others that we discard anyway (File fields in FormData have a lot going on), and used JSON to represent instances.

You can use jaydenseric/apollo-upload-examples to log this behavior. I am working on a new apollo-fetch branch.

No query batching

Normal request body

For context, here is a normal POST request body, as per graphql-server docs:

{
  "operationName": "aTest",
  "query": "query aTest($arg1: String!) { test(who: $arg1) }",
  "variables": { "arg1": "me" }
}

Upload request body before apollo-upload-server

This is a multipart FormData stream request body – the top level properties are a mix of regular and file multipart form fields.

The pattern is that the operations field contains the JSON encoded GraphQL operation request. It may contain a single operation, or an array of batched operations. All file fields have their original object path within operations as their field name.

I suggest once the client work is done, we update graphql-server to follow this pattern for multipart requests, and update the POST requests documentation here.

{
  "operations": "{\"query\":\"mutation multipleUpload($files: [Upload!]!) {\\n  multipleUpload(files: $files) {\\n    id\\n    name\\n    type\\n    size\\n    path\\n    __typename\\n  }\\n}\\n\",\"variables\":{\"files\":[null,null]},\"operationName\":\"multipleUpload\"}",
  "variables.files.0": {
    "name": "test-file-1.png",
    "type": "image/png",
    "size": 44631
  },
  "variables.files.1": {
    "name": "test-file-2.pdf",
    "type": "application/pdf",
    "size": 13472
  }
}

Upload request body after apollo-upload-server

The request body is now an object suitable for graphql-server to consume. The operations field was used to seed the request. Files from each file field of the multipart form are reinserted in the right places using the paths stored in the field names.

{
  "operationName": "multipleUpload",
  "query": "mutation multipleUpload($files: [Upload!]!) {\n  multipleUpload(files: $files) {\n    id\n    name\n    type\n    size\n    path\n    __typename\n  }\n}\n",
  "variables": {
    "files": [
      {
        "name": "test-file-1.png",
        "type": "image/png",
        "size": 44631,
        "path": "/tmp/apollo-upload-examples/upload_ca4fb54e9e8edf655b90764c97c5e28d"
      },
      {
        "name": "test-file-2.pdf",
        "type": "application/pdf",
        "size": 13472,
        "path": "/tmp/apollo-upload-examples/upload_6cece2a2285305d340a1d174af544ec6"
      }
    ]
  }
}

With query batching

Batched normal request body

For context, here is a normal POST request body, as per graphql-server docs:

[
  { "query": "{ testString }" },
  { "query": "query q2{ test(who: \"you\" ) }" }
]

Batched upload request body before apollo-upload-server

{
  "operations": "[{\"query\":\"mutation multipleUpload($files: [Upload!]!) {\\n  multipleUpload(files: $files) {\\n    id\\n    name\\n    type\\n    size\\n    path\\n    __typename\\n  }\\n}\\n\",\"variables\":{\"files\":[null,null]},\"operationName\":\"multipleUpload\"}]",
  "0.variables.files.0": {
    "name": "test-file-1.png",
    "type": "image/png",
    "size": 44631
  },
  "0.variables.files.1": {
    "name": "test-file-2.pdf",
    "type": "application/pdf",
    "size": 13472
  }
}

Batched upload request body after apollo-upload-server

[
  {
    "operationName": "multipleUpload",
    "query": "mutation multipleUpload($files: [Upload!]!) {\n  multipleUpload(files: $files) {\n    id\n    name\n    type\n    size\n    path\n    __typename\n  }\n}\n",
    "variables": {
      "files": [
        {
          "name": "test-file-1.png",
          "type": "image/png",
          "size": 44631,
          "path": "/tmp/apollo-upload-examples/upload_c6fa87e18003c9a5013d6ae33779aa6d"
        },
        {
          "name": "test-file-2.pdf",
          "type": "application/pdf",
          "size": 13472,
          "path": "/tmp/apollo-upload-examples/upload_dfe8315d7b78b9011998ec07a43217e3"
        }
      ]
    }
  }
]

@jaydenseric
Copy link
Contributor

@stubailo @evanshauser @jbaxleyiii I just did a lot of edits to my above comment regarding a spec for multipart post requests to graphql-server, that borrows from current behavior of apollo-upload-client and apollo-upload-server. Please let me know your thoughts so I can get stuck into implementing – I have cleared my calendar for the next few days to work on it 🙂

@stubailo
Copy link
Contributor Author

stubailo commented Jul 10, 2017

It's quite late here, so we won't be able to say anything useful until tomorrow. I agree this might not make sense to implement as a middleware right now.... but...

As one more question, would it make possible to expand the middleware functionality to enable file upload to be implemented in middleware form? Somehow make it possible for the request options set by the middleware contain multipart form data?

If we think there might be multiple ways of encoding and sending files, that might be an advantage. On the other hand, it might also be useful to "enforce" a standard way of sending files so that everyone can be using the same approach.

@jaydenseric
Copy link
Contributor

There are a few moving parts here, but the more I think about it, the best thing to start with is to standardize a structure for multipart POST requests between a client and graphql-server. Essentially the structure described under the "before apollo-upload-server" headings in my earlier comment.

Once that is agreed upon, client and server implementations can be decided separately and worked on immediately.

The details of how a client generates the multipart request (using middleware, etc.) can be worked out. The details of what a server does after receiving the multipart form (store files to a temp dir like apollo-upload-server, or try to provide them in resolver arguments as streams, etc.) can be worked out later or even be pluggable.

In the short term it would be a win for users if they only had to use apollo-upload-client, and graphql-server would just work. Or the other way around, use apollo-upload-server and apollo-fetch would just work.

@evans
Copy link
Contributor

evans commented Jul 11, 2017

I really like this idea that we can do batching, since query is just a string! We should also think through standardizing how the variables are associated with their specific query.

In terms of standardizing the single multipart POST request format, I hesitate to use the structure of "before apollo-upload-server" over the nesting that "after apollo-upload server" exhibits(minus the temporary paths). In my mind, this nesting feels more natural with apollo-fetch's current interface:

{
  "operationName": "multipleUpload",
  "query": "mutation multipleUpload($files: [Upload!]!) {\n  multipleUpload(files: $files) {\n    id\n    name\n    type\n    size\n    path\n    __typename\n  }\n}\n",
  "variables": {
    "files": [
      {
        "name": "test-file-1.png",
        "type": "image/png",
        "size": 44631,
      },
    ],
  }
}

I'm not convinced it is better, since it means we wouldn't be immediately compatible with apollo-upload-server and apollo-upload-client and we would either need to reserve some keywords in the query string or provide a fourth argument sent to the server, such as a context that indicates a file upload:

{
  "operationName": "multipleUpload",
  "query": "mutation multipleUpload($files: [Upload!]!) {\n  multipleUpload(files: $files) {\n    id\n    name\n    type\n    size\n    path\n    __typename\n  }\n}\n",
  "variables": {
    "files": [
      {
        "name": "test-file-1.png",
        "type": "image/png",
        "size": 44631,
      },
    ],
  }
  "context": {
    upload: true,
  }
}

@jaydenseric what do you think about the standard/does it provide the information the server needs? Would it be reasonable for the upload client and server to follow a similar pattern?

Also curious what @jbaxleyiii thinks

CC @stubailo

@jaydenseric
Copy link
Contributor

jaydenseric commented Jul 11, 2017

@evanshauser multipart forms are limited in that the actual files being uploaded can only be top level fields. They have to be extracted from the operation for transport then put back in the right place at the other end. Keep in mind that the file does not actually look like the nice JSON representation when a field in a multipart form, it is a big blob with all the binary data.

There is a lot to explain about how to efficiently process the multipart form stream on the server in relation to resolver arguments, but that can be another conversation.

Here is a nice explanation for how multipart forms work.

@jaydenseric
Copy link
Contributor

jaydenseric commented Jul 11, 2017

When batching, the files from each batch together have to occupy top level multipart form fields. That is why in the example the field name 0.variables.files.0 has a leading zero – the re-assembly path begins with the batch index.

@evans
Copy link
Contributor

evans commented Jul 12, 2017

Thank you for the pointer! I'll take a look at it in the morning and I have to think about the impact of adding fields to the operation passed to fetch, so I should get back to you in a couple hours.

Our future plan for apollo-fetch in addition to being a lightweight fetcher is to underlie an http link, which follows the specification of apollo-link. This prompts the thought to make a file upload into a link that ends a chain, similar to httpLink. I think it comes down to where we want to put the file upload complexity. Your thoughts on the best place, inside a link or apollo-fetch, would be really informative and most likely guide the decision.

The usage would look like this:

const networkInterface = Links.split(
  isFileUpload,
  new FileUploadLink({uri}),
  new HttpLink({uri}),
);

@jaydenseric
Copy link
Contributor

I expect to have a PR for apollo-fetch ready in the next day or so. I am currently breaking out some apollo-upload-client logic that would be useful here and in other clients into a seperate utility npm package.

@jaydenseric
Copy link
Contributor

jaydenseric commented Jul 13, 2017

I have published extract-files and released apollo-upload-client v5.1.0.

The logic (and ReactNativeFile class) to do with extracting files from the variables object has been pulled out from apollo-upload-client into extract-files and is now tested.

It should be pretty easy now to implement it here. I never use Typescript though so that will really slow me down. Lets see.

jaydenseric added a commit to jaydenseric/apollo-fetch that referenced this issue Jul 13, 2017
jaydenseric added a commit to jaydenseric/apollo-fetch that referenced this issue Jul 26, 2017
@evans evans closed this as completed in #14 Aug 10, 2017
evans pushed a commit that referenced this issue Aug 10, 2017
A work in progress. Fixes #6.
@vladinator1000
Copy link

vladinator1000 commented Feb 11, 2018

Hey guys, anyone got this working with TypeScript? I'm getting this error: TS7016: Could not find a declaration file for module 'extract-files'.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants