Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unconditionally re-encode locally-uploaded images to strip metadata #8714

Merged
merged 1 commit into from
Sep 17, 2018

Conversation

ClearlyClaire
Copy link
Contributor

This strips metadata on file upload by re-encoding the files, at the cost of image quality and processing resources.

@ClearlyClaire ClearlyClaire added the security Security issues and fixes, vulnerabilities label Sep 17, 2018
@nightpool
Copy link
Member

nightpool commented Sep 17, 2018 via email

@ClearlyClaire
Copy link
Contributor Author

@nightpool

  1. Yup, that is dead code, but I kept it because 2.

  2. I'm trying to find solutions that don't sacrifice processing time or image quality, but it's not possible out of the box with Paperclip/ImageMagick

@ClearlyClaire
Copy link
Contributor Author

Updated to keep LazyThumbnail as it has its own resizing logic. However, despite the name, it's not lazy anymore.

@bhtooefr
Copy link

What about making it a two stage operation - strip metadata on everything, then encode if necessary?

@ClearlyClaire
Copy link
Contributor Author

@bhtooefr not possible without extensive changes, nor ImageMagick nor Paperclip provide support for stripping metadata without re-encoding (convert input -strip output will always re-encode)

@ClearlyClaire
Copy link
Contributor Author

The only tool I have found to strip image metadata is exiftool (there is also mat2, which uses exiftool for images)… however, this is a perl dependency that is pretty uncomfortable to use…

@elomatreb
Copy link

ImageMagicks in-place component can strip metadata without reencoding, mogrify -strip image.jpg

@ClearlyClaire
Copy link
Contributor Author

@elomatreb mogrify still re-encodes, even if it does so in-place

@elomatreb
Copy link

Mh, I could have sworn it was smart enough to not touch the image if it doesn't need to. Nevermind then.

@kitlith
Copy link

kitlith commented Sep 17, 2018

If you're looking for something that isn't perl, maybe http://www.exiv2.org/ ?

Also, it may be a good idea to keep the color profile even after all other exif data is removed.

@Gargron
Copy link
Member

Gargron commented Sep 17, 2018

Keep in mind that lazy thumbnail is a result of #6515, which fixed all sorts of visual glitches with GIF avatars from remote users. We don't really care about EXIF metadata in remote images because presumably they already stripped it themselves when the image was turned into the dimensions we expect.

Reverting that fix means not only more computational overhead for ImageMagick stuff (which I consider acceptable because we already dealt with that pre-optimization, not the end of the world), but unfortunately it means the return of something like #3199 (I am not sure if all glitches had corresponding bug reports), which itself is an ImageMagick bug as far as I understand.

The fix in question became a problem specifically in #8083, when the lazy thumbnail processor was hooked up to media attachments. EXIF data in avatars and headers is unlikely to contain sensitive information, but media attachments may obviously come straight from a camera. This is my bad, I had not connected the dots between the different patches.

I suggest that we keep the behaviour for remote images (through some kinda option passed to the processor?) but unconditionally re-encode locally uploaded images.

@ClearlyClaire
Copy link
Contributor Author

@kitlith thanks, exiv2 looks pretty much like what I want! I'm slightly concerned about security, though, that's yet another binary we're going to feed user-provided images to (the same can be said about exiftool or imagemagick, though)

@ClearlyClaire
Copy link
Contributor Author

@Gargron yeah, that's one of the things I'm looking at, but so far I don't know how to process remote and local attachments differently

@Gargron
Copy link
Member

Gargron commented Sep 17, 2018

@ThibG The processor can call methods on the model, so maybe something like attachment.instance.respond_to?(:remote?) && attachment.instance.remote?

This strips metadata on file upload by re-encoding the files, at the cost
of possible slight image quality decrease and processing resources.
@ClearlyClaire
Copy link
Contributor Author

@Gargron updated to do just that!

@ClearlyClaire ClearlyClaire changed the title Unconditionally re-encode images to strip metadata Unconditionally re-encode locally-uploaded images to strip metadata Sep 17, 2018
@Gargron Gargron merged commit 9b32898 into mastodon:master Sep 17, 2018
Gargron pushed a commit that referenced this pull request Oct 7, 2018
…8714)

This strips metadata on file upload by re-encoding the files, at the cost
of possible slight image quality decrease and processing resources.
abcang pushed a commit to pixiv/mastodon that referenced this pull request Oct 9, 2018
…astodon#8714)

This strips metadata on file upload by re-encoding the files, at the cost
of possible slight image quality decrease and processing resources.
lindwurm added a commit to akane-blue/mastodon that referenced this pull request Nov 1, 2018
@ClearlyClaire ClearlyClaire deleted the fixes/exif branch March 14, 2019 14:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
security Security issues and fixes, vulnerabilities
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants