Skip to content
This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

media repo serves utf-8 files with no charset info, leading to incorrect in-browser rendering #7043

Closed
TheStranjer opened this issue Mar 6, 2020 · 4 comments · Fixed by #7044
Labels
A-Media-Repository Uploading, downloading images and video, thumbnailing z-p2 (Deprecated Label)

Comments

@TheStranjer
Copy link
Contributor

Description

Certain text files do not render correctly in-browser.

Example being a character sheet I made:
How It Should Render vs How It Actually Renders

This is not a browser error; it happens in both Chrome and in Firefox.

Steps to reproduce

  • Upload a text file with certain UTF8 characters
  • Click on that file
  • Watch it render incorrectly

Version information

This issue has been identified on matrix.org; the above link on how it actually renders is a link to the matrix.org homeserver.

@richvdh richvdh changed the title Certain text files do not render correctly in-browser media repo serves utf-8 files with no charset info, leading to incorrect in-browser rendering Mar 6, 2020
@richvdh
Copy link
Member

richvdh commented Mar 6, 2020

There's an argument to be made that this is a problem with the client used to upload the file: synapse just preserves the content-type it is given. https://matrix.org/_matrix/media/r0/download/matrix.org/pxTOzBJkLzYjwhiTklQoNVli is a copy of your file, uploaded with a correct content-type.

@TheStranjer
Copy link
Contributor Author

TheStranjer commented Mar 6, 2020

This was uploaded via Riot, which is the most common client, which should not break Synapse when used together. The W3 recommends encoding text files like text/html and text/plain in UTF 8. By fixing this bug, it makes Synapse comply with W3 recommendations better without eliminating the possibility that an obscure client might insist on rendering it in UTF-16 (or whatever else).

The W3 states in bright bold how "very important" it is for administrators to always explicitly state the charset and goes on to recommend defaulting to UTF-8. By the time Synapse is serving text files with no charset specified, the issue is no longer with whichever client sent the file to them, but with Synapse itself. This issue is a concern with Synapse's W3 compliance.

Not only that, but a practical example of Synapse's noncompliance has emerged, hence the issue in the first place. I would have never known this problem existed in Synapse had it functioned properly in spite of it.

@richvdh
Copy link
Member

richvdh commented Mar 6, 2020

This was uploaded via Riot, which is the most common client, which should not break Synapse when used together.

while this is true, it doesn't mean that the issue necessarily needs to be fixed in synapse: it may be more appropriate to fix it in Riot.

Still, I think you're probably right that it makes sense for synapse to assume utf-8 for these content-types if it has no information otherwise.

@richvdh
Copy link
Member

richvdh commented Mar 17, 2020

fixed by #7044

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
A-Media-Repository Uploading, downloading images and video, thumbnailing z-p2 (Deprecated Label)
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants