Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Specify Content-Type and Content-Disposition usage in the media repo #1935

Merged
merged 5 commits into from
Sep 2, 2024

Conversation

zecakeh
Copy link
Contributor

@zecakeh zecakeh commented Aug 20, 2024

Supersedes #1758.

As per MSC2701 and MSC2702.

Pull Request Checklist

Preview: https://pr1935--matrix-spec-previews.netlify.app

Co-authored-by: Travis Ralston <travisr@matrix.org>
Signed-off-by: Kévin Commaille <zecakeh@tedomum.fr>
@zecakeh zecakeh requested a review from a team as a code owner August 20, 2024 14:12
Signed-off-by: Kévin Commaille <zecakeh@tedomum.fr>
Signed-off-by: Kévin Commaille <zecakeh@tedomum.fr>
Signed-off-by: Kévin Commaille <zecakeh@tedomum.fr>
@zecakeh
Copy link
Contributor Author

zecakeh commented Aug 20, 2024

I am not sure how to write up #1758 (comment) as I don't know where it's explained why it is safe.

@tulir
Copy link
Member

tulir commented Aug 22, 2024

My understanding on that topic is:

  • When a Content-Type header is provided, browsers and such will only render the data using that type. For example, if you serve a HTML file with image/png, browsers will just say it's corrupted, they won't render it as a HTML page. Therefore, there's no risk of trusting the user-defined content type, as long as the Content-Disposition is calculated based on the same type.
  • If you were to calculate Content-Disposition with a different type than what is served to downloaders in the Content-Type header, it may open up vulnerabilities: for example, if someone manages to craft a HTML file which your sniffer thinks is a PNG, you'd sniff that it's safe to send as inline. If you then serve that with the uploader's text/html type and inline disposition based on the sniffed content type, browsers will render it as HTML. As long as the Content-Disposition is derived from the type served to clients, it's safe.

@Benjamin-L
Copy link

Benjamin-L commented Aug 22, 2024

edit: I wrote this and submitted it before seeing tulir's comment. We're saying effectively the same thing, and their explanation is clearer

I am not sure how to write up #1758 (comment) as I don't know where it's explained why it is safe.

I don't know of a place where it's written down yet, but have had this conversation on matrix several times with different people. I believe the reason it's safe is that it should not be possible for a file with a Content-Type from the allow-list to trigger XSS, even with Content-Disposition: inline and with arbitrary content.

When I've seem claims that it is not safe, usually these are thinking about a case where a malicious client uploads a XSS-triggering file and then sets the incorrect Content-Type, causing the server to serve Content-Disposition: inline. For example, uploading an SVG containing a script element while setting Content-Type: image/png. This case is, in fact, safe, because the server will also send Content-Type: image/png to the browser and the browser will not interpret the file as an svg.

The thing that is actually unsafe is mixing and matching between the user-provided Content-Type and a sniffed content type. The Content-Disposition value must be chosen based on the same content type used in the Content-Type header. If the server uses a sniffed value for Content-Type in the response but chooses Content-Disposition based on the user-provided content type or vice versa, it allows XSS.

@Benjamin-L
Copy link

This PR doesn't change the Content-Disposition requirements for the federated media endpoints. I assume this is intentional, because the client HS should just treat the value as untrusted anyway and base the disposition on the returned type instead?

@zecakeh
Copy link
Contributor Author

zecakeh commented Aug 23, 2024

This PR doesn't change the Content-Disposition requirements for the federated media endpoints. I assume this is intentional, because the client HS should just treat the value as untrusted anyway and base the disposition on the returned type instead?

The reason that I didn't change it is because the description is vague and says:

[…] using Content-Type and Content-Disposition headers as appropriate;

I took that as "use the same rules for those headers as the CS API", but maybe it needs to be spelled out. In any case the SS endpoint should return the filename in Content-Disposition if there is one, so the client HS can reuse it on the /download endpoint.

@Benjamin-L
Copy link

I'd definitely be in favor of either explicitly stating that the requirements are the same as the CS API or explicitly stating that they are not. If the filename is the only thing that matters the spec should state that so that implementers aren't guessing whether the vaugeness was a mistake.

Signed-off-by: Kévin Commaille <zecakeh@tedomum.fr>
@zecakeh
Copy link
Contributor Author

zecakeh commented Aug 23, 2024

Thanks to the both of you, I whipped-up a paragraph based on your explications.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants