Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Broken images when parsing emails #1010

Closed
fernandomm opened this issue Jul 1, 2016 · 4 comments
Closed

Broken images when parsing emails #1010

fernandomm opened this issue Jul 1, 2016 · 4 comments

Comments

@fernandomm
Copy link

I'm using this gem to parse incoming emails and detected that some images were broken. Although other email clients would show those images normally.

By broken, I mean that they can't be opened by any software ( Preview.app, Photoshop etc. ).

After further investigation I detected that the common cause are images using quoted-printable transfer encoding.

Example:

Content-Transfer-Encoding: quoted-printable
Content-Disposition: inline;
    filename="Logo firma.png"
Content-Type: image/png;
    name="Logo firma.png"
Content-Id: <2477BC4D-AE34-4C97-9B0A-****@Home>

This particular email was generated by Apple Mail client.

I also noticed that, if I remove the to_lf call in

::Mail::Utilities.to_lf str.gsub(/(?:=0D=0A|=0D|=0A)\r\n/, "\r\n").unpack("M*").first
, images would show up normally.

I plan to remove to_lf call but i'm thinking: Is there a reason why to_lf is called? Is it trying to fix something else?

@jeremy
Copy link
Collaborator

jeremy commented Jul 1, 2016

Quoted-printable is not binary-safe. It's… complicated 😊

Here's the backstory: #496

We could make a stronger effort at "binary-safe" quoted-printable, but it's tough to make any guarantees, or even round-trip properly (!)

@fernandomm
Copy link
Author

Just giving some feedback.

I've been running this gem without the .to_lf call ( only in decode method ) for almost one month. The software in which it's used process millions of emails per day and so far it didn't caused any issues.

All attachments in quoted printable work and it didn't affected emails with quoted printable content.

@benjamin-thomas
Copy link

I hit this bug. Some PDF attachments were getting corrupted.

If it may be of interest to anyone, I found the following workaround for a quick hack

class AttachmentsDecoder
    def initialize(part)
      @part = part
      @encoding = @part.content_transfer_encoding
    end

    # See original implementation: .gem/ruby/2.3.0/gems/mail-2.6.4/lib/mail/encodings/quoted_printable.rb
    # See related issue: https://github.com/mikel/mail/issues/1010
    def decode
      if @encoding == 'quoted-printable'
        @part.body.raw_source.unpack("M*").first
      else
        @part.decoded
      end
    end
  end
    m = Mail.read(filepath)
    part = m.attachments.find { |att| att.filename == name }

    decoder = AttachmentsDecoder.new(part)
    IO.binwrite('/my/path.pdf', decoder.decode)
  end

Sounds to me like some integration tests are needed, as the extra conversion seems to do more harm than good in some cases.

@jeremy
Copy link
Collaborator

jeremy commented May 22, 2017

Fixed by #1113

@jeremy jeremy closed this as completed May 22, 2017
jeremy added a commit that referenced this issue May 22, 2017
* Omit initial CRLF linefeed conversion since CRLF are required newline
  separators. We shouldn't need to convert bare CR or LF. Update our
  fixture emails to use CRLF throughout. Closes #609. Fixes #408.

* Drop quoted-printable CRLF conversion. This was introduced to
  harmonize with Ruby's \n-based line endings. But this breaks Q-P
  encoding with binary data. It's not *meant* for binary data, but we
  don't yet take adequate measures to use base64 for these cases.
  Reverts #496. Fixes #1010.

Closes #1113
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants