How should we handle file uploads? #246

ghost opened this issue Jul 15, 2014 · 92 comments
Labels
extension Related to existing and proposed extensions as well as extensions in general

Comments

@ghost

ghost commented Jul 15, 2014

Part of the API that I'm working on allows for files to be uploaded. Is this something that the spec could / should / does already cover?

@steveklabnik
Contributor

I don't think there's anything that makes file uploads particularly special. If you can think of some reason why there's something missing, we can talk about it, but we'd need more specifics.

@Fivell

Fivell commented Nov 16, 2015

@steveklabnik, should the media type be different in this case or not?

@NuckChorris

@steveklabnik What makes file uploads special?

Let's assume you have a profile with an avatar field. In the JSON API it's exposed as a URL in a text field. You ideally want a symmetry where the client can set that field to a binary blob and they'll get back the new URL. Some APIs (Stripe, for instance) use an out-of-band upload system, but this brings with it a whole slew of timing and cleanup issues, and removes the symmetry.

This means we basically have three options: you can submit them as base64 in a JSON body (and eat the 25% overhead plus additional processing), you can submit them as multipart/form-data and ditch the whole JSON-API thing, or you can set them up as a sub-resource which can be edited by uploading to it, like PUT /users/17/avatar (which means you have to handle it separately from your standard download-modify-upload lifecycle).
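
As a rough check on the base64 figure, here is a small sketch added for illustration (it is not part of the original comment). Base64 emits 4 output bytes for every 3 input bytes, so the encoded form is about one third larger than the raw file, which is the same as saying the raw file is 25% smaller than its encoded form. The function name and the payload size are arbitrary assumptions.

```python
import base64


def base64_overhead(raw_size: int) -> float:
    """Return the fractional size increase from base64-encoding raw_size bytes."""
    raw = bytes(raw_size)              # dummy payload made of zero bytes
    encoded = base64.b64encode(raw)    # standard base64, with padding
    return (len(encoded) - raw_size) / raw_size


if __name__ == "__main__":
    # 3,000,000 raw bytes become 4,000,000 encoded bytes.
    print(f"{base64_overhead(3_000_000):.1%}")  # prints 33.3%
```

The JSON string escaping and parsing cost comes on top of this and is not measured here.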

I think this qualifies as complex enough and encourages bikeshedding in teams enough that JSON-API ought to tackle it, even if only in an extension.

@patrykorwat

Note that the 25% only concerns the amount of data sent over the network. Current web server implementations (at least in Java) handle multipart/form-data with algorithms that stream the data through a single buffer (O(1) memory complexity), whereas with JSON the whole file has to be processed in memory (O(n) memory complexity).
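
To make the memory argument concrete, here is a minimal sketch of the streaming side, assuming the request body is available as a file-like object. It only ever holds one chunk in memory, regardless of the upload size. The function name and chunk size are hypothetical, and this is not taken from any particular web server.

```python
import io
from typing import BinaryIO


def stream_copy(src: BinaryIO, dst: BinaryIO, chunk_size: int = 64 * 1024) -> int:
    """Copy src to dst one chunk at a time and return the number of bytes copied."""
    total = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:  # empty bytes means end of stream
            break
        dst.write(chunk)
        total += len(chunk)
    return total


if __name__ == "__main__":
    upload = io.BytesIO(b"x" * 200_000)  # stands in for a request body
    stored = io.BytesIO()                # stands in for a file on disk
    print(stream_copy(upload, stored))
```

A base64 string inside a JSON document, by contrast, is usually only available after the parser has read the whole document, unless a streaming JSON parser is used.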

@NuckChorris

@meshuga The 25% is for base64 embedded in JSON, not multipart. Multipart should send it over the wire as plain binary, I believe, with just the added overhead of the headers and dividers.

You're absolutely right that Multipart is O(1) memory and data: is O(n), and this just goes to prove my point: file uploads are friggin confusing and should be within the scope of JSONAPI or an extension

@johnnncodes

+1. Would be great if jsonapi spec would cover this.

@ethanresnick ethanresnick reopened this Apr 27, 2016
@ersinakinci

+1.

@gponsu

gponsu commented Aug 29, 2016

+1

@saneshark

+1

@shannontan-addepar

+1

@wendelin

wendelin commented Oct 6, 2016

+1

@nickreynke

+1

@joseftw

joseftw commented Nov 1, 2016

+1

@alexndlm

alexndlm commented Nov 1, 2016

+1

@hauleth

hauleth commented Nov 15, 2016

Ping

@Vistur

Vistur commented Nov 17, 2016

+1

@alikhan-io

+1

@mvrkljan

+1

@sstine

sstine commented Nov 23, 2016

+1

@vasilenko

+1

@TrevorHinesley

+1

@senid231

+1

@twitchard

+1

@DanBradbury

+1

@rokde

rokde commented Feb 7, 2017

+1

@zezic

zezic commented Feb 13, 2017

+1

@gabesullice
Contributor

@d4rky-pl, comparing Apollo to the JSON:API spec is comparing apples to oranges. Apollo is a server which supports an implementation of the GraphQL query language specification, which itself says nothing about file uploads. JSON:API is a specification which servers can implement.

You'll notice that the Apollo team is not entirely prescriptive about how to handle file uploads; they offer a number of best practices and describe their trade-offs in this blog post.

@x-yuri asks: "Are you not allowed to handle requests, not covered by the spec?"

This question implies the "answer" to this issue. You are perfectly within your rights to create server endpoints to handle file uploads that do not use the JSON:API media type (application/vnd.api+json). A server is not constrained to serving only JSON:API requests.

The spec says:

Clients MUST send all JSON:API data in request documents with the header Content-Type: application/vnd.api+json without any media type parameters.

Since a file is not JSON:API data and since binary data probably do not belong in a request document (base64'd or multipart), the implication is that you shouldn't be using the JSON:API media type to upload files.

Moreover, since the specification is not prescriptive about URL structure, there is simply no way for it to say "your file upload endpoints should live under this URL and should have these side effects on your data."
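
To illustrate the point that one server can serve both kinds of requests, here is a minimal sketch of dispatching on the Content-Type header. The handler names are hypothetical, and any stricter checks on media type parameters are assumed to happen inside the JSON:API handler.

```python
JSON_API = "application/vnd.api+json"
OCTET_STREAM = "application/octet-stream"


def pick_handler(content_type_header: str) -> str:
    """Return the name of the (hypothetical) handler for a request's Content-Type."""
    # Drop any media type parameters and normalise case before comparing.
    media_type = content_type_header.split(";", 1)[0].strip().lower()
    if media_type == JSON_API:
        return "jsonapi_handler"
    if media_type == OCTET_STREAM:
        return "upload_handler"
    # A server would typically answer 415 Unsupported Media Type here.
    return "unsupported_media_type"


if __name__ == "__main__":
    print(pick_handler("application/vnd.api+json"))
    print(pick_handler("application/octet-stream"))
```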

@wimleers offered up Drupal's implementation as a recommendation/best practice.

So, you suggest doing two requests. First to create a file, the other to attach it to some object. What happens if the second request never arrives?

His original link offers an option for doing uploads in a single request. The link you shared does not document a JSON:API implementation but Drupal's "homegrown" HTTP API. Here is another link that further describes Drupal's JSON:API-adjacent file uploads. Flow 1 uses a single request. Flow 2 uses two requests. The reason two requests are needed is due to Drupal's implementation details. I think newly created servers can probably get away with only Flow 1.

Here are my recommendations for handling file uploads. If you would like to open a PR to add to our recommendations section, I'd be happy to help with that :)


If you want to "attach" a file to an existing resource object you can do it in a single request:

POST /users/{uuid}/headshot HTTP/1.1
Content-Type: application/octet-stream
Accept: application/vnd.api+json
Content-Disposition: file; filename="headshot.jpg"

[… binary file data …]

Your server can then accept the file upload and update the /users/{uuid} resource accordingly, all in a single request. It is up to you if the response is a 204 No Content or a 303 See Other with a Location header pointing to a resource of your choice (probably a representation of the file or the user resource, in some cases maybe a relationship endpoint).
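
On the client side, a request like the one above could be built as in the following sketch, which uses only Python's standard library. The URL and the placeholder bytes are assumptions for illustration, and the request is built but not sent.

```python
import urllib.request


def build_upload_request(url: str, filename: str, data: bytes) -> urllib.request.Request:
    """Build (but do not send) a raw-body upload request."""
    headers = {
        "Content-Type": "application/octet-stream",
        "Accept": "application/vnd.api+json",
        # Disposition syntax copied from the example request in this comment.
        "Content-Disposition": f'file; filename="{filename}"',
    }
    return urllib.request.Request(url, data=data, headers=headers, method="POST")


if __name__ == "__main__":
    req = build_upload_request(
        "https://api.example.com/users/1234/headshot",  # hypothetical URL
        "headshot.jpg",
        b"\xff\xd8\xff",  # placeholder bytes, not a real image
    )
    print(req.get_method(), req.full_url)
    # urllib.request.urlopen(req) would actually send it.
```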

If you want to upload a file without "attaching" it to anything and you don't want to have a JSON representation of it, then you don't need JSON:API for that endpoint at all:

PUT /uploads/headshot.jpg HTTP/1.1
Content-Type: application/octet-stream

[… binary file data …]

If you don't want to attach, but you do want a JSON representation of your file:

POST /files HTTP/1.1
Content-Type: application/octet-stream
Accept: application/vnd.api+json
Content-Disposition: file; filename="headshot.jpg"

[… binary file data …]

HTTP/1.1 201 Created
Location: /files/new-uuid-here

{
  "type": "file",
  "id": "new-uuid-here",
  "attributes": {
    "filename": "headshot.jpg",
    "bytes": 42042
  },
  "links": {
    "self": "/files/new-uuid-here",
    "enclosure": "/uploads/headshot.jpg"
  }
}
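
A minimal sketch of producing such a response body on the server, assuming Python's json module. The function name and parameters are hypothetical. Unlike the abbreviated example above, it wraps the resource object in a top-level "data" member, which is where a JSON:API document carries its primary data.

```python
import json


def file_document(file_id: str, filename: str, size: int, download_url: str) -> str:
    """Serialise a JSON:API document describing an uploaded file."""
    resource = {
        "type": "file",
        "id": file_id,
        "attributes": {"filename": filename, "bytes": size},
        "links": {"self": f"/files/{file_id}", "enclosure": download_url},
    }
    # The primary resource goes under the top-level "data" member.
    return json.dumps({"data": resource}, indent=2)


if __name__ == "__main__":
    print(file_document("new-uuid-here", "headshot.jpg", 42042, "/uploads/headshot.jpg"))
```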

I'm sure there are other permutations of attach vs. not attach, representation vs. no representation etc. The gist is this: you can and probably should use the Content-Type: application/octet-stream and Content-Disposition: file; filename="headshot.jpg" headers to upload a file. Encoding files in JSON is a bad practice. If you want to have uploads feel well integrated with the rest of your API, your clients should use the Accept: application/vnd.api+json header to get a useful response. Exactly how you respond to an upload and which URLs you choose for your uploads is implementation-specific.

@AnrDaemon

@x-yuri , read my last reply again:

Having different APIs to handle each resource is an overcomplication, maintenance overhead and at the end of the day just plain stupid.

Explanation is this:

The JSON:API specification explicitly forbids a Content-Type other than application/json, which precludes multipart/form-data or www-form-urlencoded, which means we can't use simple, robust, well-tested and widely supported upload techniques. Which means, if we still want to use them, we would need to set up a separate API service specifically to handle uploads.
Why the… ? Is this API only useful to transfer metadata, but not actual data?

@gabesullice
Contributor

@AnrDaemon , I'm not sure what you're saying. The spec does not forbid a server from serving requests with alternate Content-Type header values and definitely not anything "other than application/json" (the JSON:API media type is application/vnd.api+json). As my last comment said, it's fine to have endpoints that handle requests that don't contain "JSON:API data"... such as a file.

@AnrDaemon

If you read the topic from the start, you'll understand it better. The problem is that API spec should cover binary uploads. At least uploads.

@TrevorHinesley

TrevorHinesley commented Aug 5, 2020

@gabesullice while I definitely understand your point here, and the JSONAPI spec certainly has no obligation to implement this, it's strange to have basically every kind of request/response handled by the spec except file uploads, which is a core part of today's web. It also puts the burden on the client-side to implement more than one Content-Type for requests, which sort-of breaks the 4th wall, killing much of the standardization benefits that come with using a spec like JSONAPI.

@x-yuri

x-yuri commented Aug 7, 2020

If you read the topic from the start, you'll understand it better. The problem is that API spec should cover binary uploads. At least uploads.

I did. Let me read your mind a bit. So that people have a suggested way to handle uploads?

Here are my recommendations for handling file uploads. If you would like to open a PR to add to our recommendations section, I'd be happy to help with that :)


it's strange to have basically every kind of request/response handled by the spec except file uploads, which is a core part of today's web

The only possible answer I could find is:

Since a file is not JSON:API data and since binary data probably do not belong in a request document (base64'd or multipart), the implication is that you shouldn't be using the JSON:API media type to upload files.

Which to me sounds like an ideological reason not backed by practical considerations. At least not out loud. I might guess that it might make the spec harder to understand particularly for implementors (users). Or that it would require a major overhaul of the specification.

But it's not like the specification should cover uploads. You can put all kinds of stuff in an HTML document, yet the HTML specification doesn't cover it all. Am I missing something?


It also puts the burden on the client-side to implement more than one Content-Type for requests

Is that that big of a burden? If yes, I'm not sure I'm following, can you elaborate?

which sort-of breaks the 4th wall, killing much of the standardization benefits that come with using a spec like JSONAPI.

Again, what benefits?

@AnrDaemon

You're comparing JSON:API to HTML, where in fact this is a specification for API, which is a data exchange protocol, i.e. it is more comparable to HTTP.

Again, what benefits?

Quite clear: standardization leading to interoperability. See also http://xkcd.com/927/

If you ask my opinion, the multipart/form-data or multipart/related content types could easily be part of the API, where a part containing a JSON:API document describes the rest of the payloads.

@x-yuri

x-yuri commented Aug 7, 2020

You're comparing JSON:API to HTML, where in fact this is a specification for API, which is a data exchange protocol, i.e. it is more comparable to HTTP.

Okay, but again multipart/form-data is not covered by the HTTP RFC, is it?

https://en.wikipedia.org/wiki/Hypertext_Transfer_Protocol
https://en.wikipedia.org/wiki/MIME#Form-Data

Quite clear: standardization leading to interoperability. See also http://xkcd.com/927/

In place of creating an alternative to JSON:API, one can draft a standard governing file uploads as part of a REST API. Which is not an outright bad thing, you know, divide and conquer.

If you ask my opinion, the multipart/form-data or multipart/related content types could easily be part of the API, where a part containing a JSON:API document describes the rest of the payloads.

That is, easy to add to the specification? You might be right, or not. I never looked at it from the other side of the fence. You might try to make a PR.

Anyway, @gabesullice, can you possibly provide us with some practical (not ideological) reasons for not including file uploads in the spec?

@gabesullice
Contributor

gabesullice commented Aug 11, 2020

This conversation is getting too personal, IMO. Let's reset it and work together.

To start, let's ignore for a moment where the rules for file uploads will live (either in the spec itself, our recommendations, an extension, or somewhere else entirely) and instead decide what a "universal" set of rules should be (if there can be such a set of rules).

In this comment, I gave some examples of what file upload requests and responses might look like if they were provided by a server that also implements JSON:API endpoints. In each request, I used the application/octet-stream media type and the request payloads were the raw files themselves, not a file encapsulated in JSON. This seems to be the first point of contention.

@x-yuri and @TrevorHinesley, you both seem to be opposed to the idea of sending files to the server like this. Is that true? If so, why?

@TrevorHinesley , for you specifically, I wonder why you said:

It also puts the burden on the client-side to implement more than one Content-Type for requests

Having to send requests with a different content type seems like it would be a fairly trivial thing to implement. What burden am I not accounting for?


From my perspective, the advantage of application/octet-stream is its simplicity. It is fairly straightforward in any programming language for clients and servers to stream into and out of the request payload. It has no overhead or special structure. IOW, the payload is the file. Whereas multipart/form-data is not so simple. The payload would be a mix of textual boundaries and binary data, making the file difficult to send/receive without language level or framework level helpers. This first concern applies equally to multipart/related.

Moreover, the multipart/form-data media type is specifically designed for submitting HTML form data and it appears that this has burdened its specification with lots of baggage (multipart/related does not suffer this baggage). So, given that our specification is about APIs using JSON, not HTML, it seems like we might be trying to fit a square peg in a round hole by trying to make our spec and RFC 7578 compatible with each other. This of course can't be proven without trying, it's simply my gut feeling.


I think once we decide on the right media type, everything else will become clear.

@AnrDaemon

A simple PUT call is the easiest form of upload, I agree. However, the problem is that it does not carry any metadata. You'll have to invent ways to transfer it somehow, if needed.

@x-yuri

x-yuri commented Aug 11, 2020

@x-yuri and @TrevorHinesley, you both seem to be opposed to the idea of sending files to the server like this. Is that true? If so, why?

@gabesullice Did I say that? I expected multipart/form-data. Well, a habit probably. The first thing that comes to mind when I hear "uploading files." But uploading files your way seems reasonable, especially after the explanation.

however, the problem is that it does not carry any metadata. You'll have to invent ways to transfer it somehow, if needed.

@AnrDaemon What kind of metadata? Can you give us a real life example? I guess in that case you can just define another resource?

@TrevorHinesley

TrevorHinesley commented Aug 12, 2020

In my opinion, this is pretty cut and dry... an API specification should cover the bases of the atomic operations possible through an API, and file uploads is one of those. For instance, since we've been comparing HTML/JSON/etc, Swagger/OpenAPI has file uploads as part of its spec: https://swagger.io/docs/specification/describing-request-body/file-upload/

@gabesullice my point about putting that burden on the client is that it's the same reason we don't require a client to change the Content-Type for, say, each HTTP verb that an API allows requests with... standardization.

Even if a different media type is ultimately decided upon for this, uploading files isn't a fringe capability that should be sidelined to an extension. It's a standard part of application interfaces, and IMO, should be covered by the spec.

@AnrDaemon

@x-yuri, I'd like to retract my statement in favor of, let's call it, "binary documents".
The project I worked on that used JSON:API for data exchange required clients to supply confirmation documents as images (photocopies). Each stored image had to have data associated with it: kind of document, page number, client ID, date of submission, etc. We basically had two access points, one of which served the metadata as a compound document that referenced the URLs of the images themselves as related documents.
We had to massage the front-end part a bit to make it work the way we wanted, but it was quite straightforward in the end.

@Doqnach

Doqnach commented Aug 12, 2020

The way I see it is using https://tools.ietf.org/html/rfc2387 multipart/related. JSON:API can specify that exactly one 'part' must be application/vnd.api+json (and e.g. type="Application/X-FileUpload"), and some way to refer from the JSON:API objects to documents in other parts. This could be very similar to relationships. I have no idea yet on how that reference could be done though...

To me multipart/related does not have some of the downsides that multipart/form-data has.
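
For discussion purposes, here is a rough sketch of what a two-part multipart/related body could look like when assembled by hand. Everything about the layout is an assumption: neither the Content-ID values nor the "cid:" reference in the document are defined by JSON:API, and the helper name is hypothetical. The caller is responsible for choosing a boundary that does not occur in the file bytes.

```python
import json

CRLF = b"\r\n"


def build_multipart_related(document: dict, file_bytes: bytes,
                            file_type: str, boundary: str) -> tuple[str, bytes]:
    """Return (Content-Type header value, body) for a two-part multipart/related request."""
    json_part = CRLF.join([
        b"Content-Type: application/vnd.api+json",
        b"Content-ID: <document>",
        b"",
        json.dumps(document).encode("utf-8"),
    ])
    file_part = CRLF.join([
        b"Content-Type: " + file_type.encode("ascii"),
        b"Content-ID: <file-1>",
        b"",
        file_bytes,  # raw bytes, no base64
    ])
    delimiter = b"--" + boundary.encode("ascii")
    body = CRLF.join([delimiter, json_part, delimiter, file_part, delimiter + b"--", b""])
    content_type = (
        f'multipart/related; boundary="{boundary}"; '
        'type="application/vnd.api+json"; start="<document>"'
    )
    return content_type, body


if __name__ == "__main__":
    doc = {"data": {"type": "file",
                    "attributes": {"filename": "headshot.jpg"},
                    "meta": {"content": "cid:file-1"}}}  # hypothetical reference convention
    header, body = build_multipart_related(doc, b"\xff\xd8\xff", "image/jpeg", "example-boundary")
    print(header)
    print(len(body))
```

The sketch also shows the cost mentioned earlier in the thread: the file is no longer the whole payload, so both sides need boundary handling.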

@seek-x2y

seek-x2y commented Sep 1, 2021

+1

@auvipy

auvipy commented Sep 1, 2021

this should be added to a milestone IMHO

@h0rn3t

h0rn3t commented Oct 30, 2024

How is this going? The ticket was created in 2014 but is still open.

@jelhan
Contributor

jelhan commented Oct 30, 2024

How is this going? The ticket was created in 2014 but is still open.

I agree that file uploads are not a concern of the specification. Like authentication, they are a common requirement of HTTP APIs, but nevertheless not a problem that should be solved by this specification. Different ways to integrate file uploads with a JSON:API have been discussed in this thread. The specification is not blocking them. Personally, I think this issue can be closed. But I haven't discussed it with the other editors yet.

@h0rn3t

h0rn3t commented Oct 30, 2024

How is this going? The ticket was created in 2014 but is still open.

I agree that file uploads are not a concern of the specification. Like authentication, they are a common requirement of HTTP APIs, but nevertheless not a problem that should be solved by this specification. Different ways to integrate file uploads with a JSON:API have been discussed in this thread. The specification is not blocking them. Personally, I think this issue can be closed. But I haven't discussed it with the other editors yet.

It would be helpful to add a few lines in the specification to explicitly permit scenarios where a file is uploaded together with meta.json metadata (e.g., for an image cropping service) or with a request.json and digital signature file, esign.p7s. Including this in the spec would clarify that these cases are not restricted, acknowledging their importance and demand.

@jelhan
Contributor

jelhan commented Oct 31, 2024

It would be helpful to add a few lines in the specification to explicitly permit scenarios where a file is uploaded

What section of the spec may lead to the wrong impression that the spec restricts APIs from supporting file uploads?

@lode
Contributor

lode commented Oct 31, 2024

What section of the spec may lead to the wrong impression that the spec restricts APIs from supporting file uploads?

I think the general way of writing with a set of strict rules gives that feeling, not per se a specific sentence.

Thus a general note saying that it is allowed to run another style of API next to this one on the same server (for auth, for uploads, for custom edge cases, etc.) would help.

@AnrDaemon

We all understand that there's a possibility to serve different APIs under different paths from the same server.
However, why would I use two different APIs, when I could use one? And given present constraints, it won't be this one.

@jelhan
Contributor

jelhan commented Oct 31, 2024

I think the general way of writing with a set of strict rules gives that feeling, not per se a specific sentence.

Thus a general note saying that it is allowed to run another style of API next to this one on the same server (for auth, for uploads, for custom edge cases, etc.) would help.

Thanks for your quick response. I gave it a try in #1772. Would be great if you could review and share your thoughts.

@jelhan
Contributor

jelhan commented Oct 31, 2024

However, why would I use two different APIs, when I could use one?

It's not about using two different APIs. It's about composing multiple standards based on the needs of an API. I don't think trying to address all potential needs with one standard is helpful. Instead standards should be combined. This is also what I see in practice: JSON:API is often combined with OAuth 2.0 for authentication and OpenAPI for documentation.

@elb98rm
Contributor

elb98rm commented Oct 31, 2024

I've been following this thread for a while, and I'm a bit disappointed by that attitude to be honest.

The fact that OAuth and OpenAPI are seen in the wild alongside JSONAPI is not a statement that this is a good state of affairs. It's not good at all. This reduces standardisation and thus:

  • increases the amount users and developers have to learn
  • increases the complexity of any project

If there was some obscure and custom request, not including it in the JSONAPI spec would be understandable. However, file uploads are fundamental to CRUD. CRUD is effectively the oldest design "pattern/requirement" on the internet and is used the most. I cannot think of a system that would not have some form of use for it. By not supporting it in API spec, you're just marginalising JSONAPI as an approach... The thing designed to stop bikeshedding, literally is leading to bikeshedding.

To be clear, I've had clients refuse the spec because they'd just have to do multiple implementations for file uploads. This is, I repeat for emphasis, one of the most common requirements for a system on the internet.

While this might not seem important to you... Standardising your API approaches is a fundamental idea of an API standard. I'm not being glib when I say "the clue is in the name".

@NuckChorris

As others have stated, it would be nice to get an optional jsonapi upload extension after all these years. GraphQL doesn't have an official spec for file uploads but apollo built something and now everybody uses that same protocol. It's clear there's demand for this and I hope it gets done eventually, but I fear it's too little too late for me at least — I've jumped ship to graphql over the decade I was waiting. I highly encourage giving it a try, it's weird but at least they have file uploads and link pagination figured out 🙂

@jelhan
Contributor

jelhan commented Oct 31, 2024

it would be nice to get an optional jsonapi upload extension

More community experiments on file upload and other requested features would be great.

Personally I agree with the Apollo team regarding file uploads:

Apollo recommends handling the file upload itself out-of-band of your GraphQL server for the sake of simplicity and security. This "signed URL" approach allows your client to retrieve a URL from your GraphQL server (via S3 or other storage service) and upload the file to the URL rather than to your GraphQL server.

Source: https://www.apollographql.com/docs/apollo-server/v3/data/file-uploads

I think the same applies to any JSON:API.

Nevertheless, it would be great to standardize 1) how to request a signed URL for uploads and 2) how to set attributes and relationships for the uploaded file. I think that would be a problem worth solving in an extension.
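
To make the "signed URL" idea concrete, here is a minimal sketch of how a server might issue and later verify such a URL with an HMAC. This is a hypothetical scheme for illustration only, not the one used by S3 or any other storage service, and all names and parameters are assumptions.

```python
import hashlib
import hmac
from urllib.parse import urlencode


def sign_upload_url(base_url: str, object_key: str, expires_at: int, secret: bytes) -> str:
    """Return an upload URL whose query string carries an expiry and an HMAC signature."""
    message = f"{object_key}\n{expires_at}".encode("utf-8")
    signature = hmac.new(secret, message, hashlib.sha256).hexdigest()
    query = urlencode({"expires": expires_at, "signature": signature})
    return f"{base_url}/{object_key}?{query}"


def verify_upload(object_key: str, expires_at: int, signature: str,
                  secret: bytes, now: int) -> bool:
    """Check the signature in constant time and reject expired URLs."""
    message = f"{object_key}\n{expires_at}".encode("utf-8")
    expected = hmac.new(secret, message, hashlib.sha256).hexdigest()
    return now <= expires_at and hmac.compare_digest(expected, signature)


if __name__ == "__main__":
    url = sign_upload_url("https://uploads.example.com", "headshot.jpg",
                          expires_at=1_700_000_000, secret=b"demo-secret")
    print(url)
```

The client would receive this URL from a JSON:API endpoint and then send the raw file to it. How the uploaded file's attributes and relationships are set afterwards is the part that would still need standardizing.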

@jelhan jelhan added the extension Related to existing and proposed extensions as well as extensions in general label Nov 5, 2024