html-only emails allow publishing #690

Closed
teastrainer opened this issue Mar 30, 2023 · 14 comments
Labels
enhancement New feature or request

Comments

@teastrainer

Emails can be used to publish messages via ntfy. But html-only mails are rejected with the following error message "554 5.0.0 Error: transaction failed, blame it on the weather: unsupported content type (in reply to end of DATA command)"

That's obviously for security reasons, which are understandable (potentially active / malicious code hidden in the text).

IIRC, emails consisting of text and html text are processed by ignoring the html part.

But sometimes we cannot change the structure of an email, especially if we want to use emails from home servers, home automation etc. Fritz Box for example can be used for email alerts, but sends html-only mails.

I think there are two ways to solve this problem:

a) Fallback solution (as for emails consisting of both text and HTML text):
Ignore or delete the HTML part, meaning: drop the body text completely. Only the subject would then be left to carry information. This would be completely acceptable to me. The text could be replaced by a notice that it has been removed (to avoid too many questions about why this was done).

b) Remove HTML tags with the help of a regex.
This could be done with a (multiline) search for "(?s)<.*?>", replacing each match with nothing.

That would be a sledgehammer approach, as the text is not preserved completely; in particular, references / links are removed too. But this is quite acceptable to me. I'll attach the source of an email from my Fritz Box before and after processing (in the Geany text editor).
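
For illustration, option b) could look roughly like this in Go. This is only a sketch of the regex idea from this comment, not code from ntfy; the function name `stripTags` and the sample inputs are made up for the example. It also shows two of the side effects: link targets disappear together with the tag, while the contents of `<style>` or `<script>` elements are kept because they are not tags.

```go
package main

import (
	"fmt"
	"regexp"
)

// tagPattern is the expression proposed above: (?s) lets "." match newlines,
// and ".*?" is non-greedy, so each match ends at the nearest ">".
var tagPattern = regexp.MustCompile(`(?s)<.*?>`)

// stripTags (hypothetical helper) deletes everything that looks like a tag.
// It does not parse HTML; it only removes "<...>" spans.
func stripTags(s string) string {
	return tagPattern.ReplaceAllString(s, "")
}

func main() {
	// Tags that span several lines are removed as well.
	fmt.Println(stripTags("<p\nclass=\"x\">Now the light is <b>on</b></p>"))
	// The link target lives inside the tag, so it is lost.
	fmt.Println(stripTags(`<a href="http://example.com/">Push Service</a>`))
	// Element contents are not tags, so the CSS text survives.
	fmt.Println(stripTags("<style>p{color:red}</style>Hello"))
}
```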

1 email source text for test with ntfy.txt
2 email source text for test with ntfy after processing with regex.txt

@teastrainer teastrainer added the enhancement New feature or request label Mar 30, 2023
@binwiederhier
Owner

I researched and dabbled with it a bit, and I even had an (infuriatingly bad) conversation with ChatGPT about it (ha! the new world!), and I have decided that there is no safe and easy way to strip HTML tags using regex or other simple means. ChatGPT gave me a few examples showing how stripping with regex could be dangerous.

Anyway, I looked at bluemonday, and it seems that it doesn't pull in a giant chain of other dependencies, so I think it'll be fine to use it for HTML tag stripping.

I'd be happy to accept PRs and/or may do it myself some day when I am bored.

One important note about potential PRs: I do think we should prefer text/plain emails over text/html+stripping, which will change the parsing logic a little.
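
A minimal sketch of that preference, using only the Go standard library: walk the parts of a multipart message, return text/plain as soon as it is found, and fall back to the first text/html part otherwise. The function name `pickBody` and the sample message are assumptions for this example and do not reflect ntfy's actual parsing code; nested multiparts and the HTML stripping itself are left out.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"mime"
	"mime/multipart"
	"net/mail"
	"strings"
)

// sampleMessage is an invented multipart/alternative email with the HTML
// part listed first, to show that text/plain still wins.
const sampleMessage = "Subject: Light\r\n" +
	"Content-Type: multipart/alternative; boundary=\"B\"\r\n\r\n" +
	"--B\r\nContent-Type: text/html; charset=utf-8\r\n\r\n<p>Now the light is on</p>\r\n" +
	"--B\r\nContent-Type: text/plain; charset=utf-8\r\n\r\nNow the light is on\r\n" +
	"--B--\r\n"

// pickBody (hypothetical helper) returns the media type and body of the
// preferred part: text/plain if present, otherwise the first text/html part.
func pickBody(raw string) (string, string, error) {
	msg, err := mail.ReadMessage(strings.NewReader(raw))
	if err != nil {
		return "", "", err
	}
	mediaType, params, err := mime.ParseMediaType(msg.Header.Get("Content-Type"))
	if err != nil {
		return "", "", err
	}
	if !strings.HasPrefix(mediaType, "multipart/") {
		// Single-part message: the body is returned exactly as sent.
		data, err := io.ReadAll(msg.Body)
		return mediaType, string(data), err
	}
	htmlBody, haveHTML := "", false
	mr := multipart.NewReader(msg.Body, params["boundary"])
	for {
		part, err := mr.NextPart()
		if err == io.EOF {
			break
		}
		if err != nil {
			return "", "", err
		}
		partType, _, err := mime.ParseMediaType(part.Header.Get("Content-Type"))
		if err != nil {
			continue // skip parts without a usable Content-Type
		}
		data, err := io.ReadAll(part)
		if err != nil {
			return "", "", err
		}
		if partType == "text/plain" {
			return partType, string(data), nil
		}
		if partType == "text/html" && !haveHTML {
			htmlBody, haveHTML = string(data), true
		}
	}
	if haveHTML {
		return "text/html", htmlBody, nil
	}
	return "", "", errors.New("no text/plain or text/html part found")
}

func main() {
	mediaType, body, err := pickBody(sampleMessage)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%s: %q\n", mediaType, body)
}
```

A caller would then run the HTML stripping only when the returned media type is text/html.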

@teastrainer
Author

I'm fine with it. What about the first option - just ignore / delete body text? Implementation of html tag stripping could then be done later.
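
As a rough sketch of that first option: keep the subject as the title and replace any non-plain-text body with a short notice. The helper `messageFor` and the notice text are invented for this example; nothing here is taken from ntfy.

```go
package main

import (
	"fmt"
	"mime"
	"strings"
)

// removedNotice is a hypothetical placeholder text, not a string ntfy uses.
const removedNotice = "(HTML body removed)"

// messageFor (hypothetical helper) returns the notification title and message.
// Plain-text bodies pass through; any other content type is replaced by
// removedNotice, so only the subject carries information.
func messageFor(subject, contentType, body string) (string, string) {
	if contentType == "" {
		// A missing Content-Type header means text/plain (RFC 2045).
		return subject, strings.TrimSpace(body)
	}
	mediaType, _, err := mime.ParseMediaType(contentType)
	if err == nil && mediaType == "text/plain" {
		return subject, strings.TrimSpace(body)
	}
	return subject, removedNotice
}

func main() {
	title, message := messageFor("Light on", "text/html; charset=utf-8", "<p>Now the light is on</p>")
	fmt.Println(title)
	fmt.Println(message)
}
```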

@binwiederhier
Owner

I have experimented with bluemonday and a few html emails and the results are absolutely terrible. Even with post processing, the result looks something like this:

	+=
        	            	+                        =20
        	            	+                                                                           =
        	            	+                        =20
        	            	+                                                                           =
        	            	+                        =20
        	            	+                                                                           =
        	            	+                        =20
        	            	+                                                                           =
        	            	+                        =20
        	            	+                                                                           =
        	            	+                        =20
        	            	+                                                                           =
        	            	+                        =20
        	            	+                                                                           =
        	            	+                        =20
        	            	+
        	            	+                       =E2=80=8C =E2=80=8C =E2=80=8C =E2=
        	            	+=80=8C =E2=80=8C =E2=80=8C =E2=80=8C =E2=80=8C =E2=
        	            	+=80=8C =E2=80=8C =E2=80=8C =E2=80=8C =E2=80=8C =E2=
        	            	+=80=8C =E2=80=8C =E2=80=8C =E2=80=8C =E2=80=8C =E2=
        	            	+=80=8C =E2=80=8C =E2=80=8C =E2=80=8C =E2=80=8C =E2=
        	            	+=80=8C =E2=80=8C =E2=80=8C =E2=80=8C =E2=80=8C =E2=
        	            	+=80=8C =E2=80=8C =E2=80=8C =E2=80=8C =E2=80=8C =E2=
        	            	+=80=8C =E2=80=8C =E2=80=8C =E2=80=8C =E2=80=8C =E2=
        	            	+=80=8C =E2=80=8C =E2=80=8C =E2=80=8C =E2=80=8C =E2=
        	            	+=80=8C =E2=80=8C =E2=80=8C =E2=80=8C =E2=80=8C =E2=
        	            	+=80=8C =E2=80=8C =E2=80=8C =E2=80=8C =E2=80=8C =E2=
        	            	+=80=8C =E2=80=8C =E2=80=8C =E2=80=8C =E2=80=8C =E2=
        	            	+=80=8C =E2=80=8C =E2=80=8C =E2=80=8C =E2=80=8C =E2=
        	            	+=80=8C =E2=80=8C =E2=80=8C =E2=80=8C =E2=80=8C =E2=
        	            	+=80=8C =E2=80=8C =E2=80=8C =E2=80=8C =E2=80=8C =E2=
        	            	+=80=8C =E2=80=8C =E2=80=8C =E2=80=8C =E2=80=8C =E2=
        	            	+=80=8C =E2=80=8C =E2=80=8C =E2=80=8C =E2=80=8C =E2=
        	            	+=80=8C =E2=80=8C =E2=80=8C =E2=80=8C =E2=80=8C =E2=
        	            	+=80=8C =E2=80=8C =E2=80=8C =E2=80=8C =E2=80=8C =E2=
        	            	+=80=8C =E2=80=8C =E2=80=8C =E2=80=8C =E2=80=8C =E2=
        	            	+=80=8C =E2=80=8C =E2=80=8C =E2=80=8C =E2=80=8C =E2=
        	            	+=80=8C =E2=80=8C =E2=80=8C =E2=80=8C =E2=80=8C   
        	            	+
        	            	+  =20
        	            	+  =20
        	            	+  =20
        	            	+  =20
        	            	+  =20
        	            	+  =20
        	            	+  =20
        	            	+  =20
        	            	+  =20
        	            	+  =20
        	            	+  =20
        	            	+  =20
        	            	+
        	            	+=09 
        	            	+=09=09=09     
        	            	+=09 
        	            	+=09 
        	            	+=09=09  Dear Philipp,
        	            	+=09=09   
        	            	+=09=09
        	            	+=09=09
        	            	+=09=09We=E2=80=99re reaching out to you because you haven=E2=80=99t finishe=
        	            	+d filing your tax return with Turbotax. You can complete this by following =
        	            	+these instructions:=20
        	            	+     

Not sure if this is better than having nothing at all.

@binwiederhier
Owner

See for yourself: #693.

Your demo email translates to this after my post-processing (ignore the " +" at the beginning of the lines):

        	            	+&lt;=21DOCTYPE html&gt;
        	            	+
        	            	+headertext of table
        	            	+
        	            	+&#34; Very important information about a change in your
        	            	+home automation setup 
        	            	+
        	            	+Now the light is on
        	            	+
        	            	+If you don&#39;t want to recieve this message anymore, stop the push
        	            	+ services in your  FRITZ=21Box =2E 
        	            	+Here you can see the active push services: &#34;System &gt; Push Service&#34;=2E
        	            	+
        	            	+This mail has ben sent by your  FRITZ=21Box  automatically=2E

@teastrainer
Author

teastrainer commented Apr 7, 2023

One of the problems seems to be that (at least in my example) the charset is utf-8 but the body is encoded with "Content-Transfer-Encoding: quoted-printable".

As bluemonday does not seem to support this encoding, it may be necessary to decode the text to plain UTF-8 before processing. E.g. FRITZ=21Box should be converted to FRITZ!Box

I'm also wondering about the new characters that appear when my text is processed. In my example the original text <=21DOCTYPE html> is converted to &lt;=21DOCTYPE html&gt;. So bluemonday replaces the characters < and > with &lt; and &gt;, which looks strange.

Maybe this can help: How to get a quoted printable string in golang
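
Both effects can be undone with the Go standard library. The sketch below decodes quoted-printable with `mime/quotedprintable` and then turns HTML entities back into characters with `html.UnescapeString`. The helper names `decodeQP` and `cleanLine` are made up for this example, and the input lines are taken from the output posted above. Unescaping entities is only reasonable if the result is displayed as plain text and never rendered as HTML; that is an assumption about how the message is shown.

```go
package main

import (
	"fmt"
	"html"
	"io"
	"mime/quotedprintable"
	"strings"
)

// decodeQP (hypothetical helper) decodes a quoted-printable string,
// e.g. "=21" becomes "!" and "=2E" becomes ".".
func decodeQP(s string) (string, error) {
	data, err := io.ReadAll(quotedprintable.NewReader(strings.NewReader(s)))
	return string(data), err
}

// cleanLine (hypothetical helper) decodes quoted-printable first and then
// converts HTML entities such as &lt; or &#34; back into characters.
func cleanLine(s string) (string, error) {
	decoded, err := decodeQP(s)
	if err != nil {
		return "", err
	}
	return html.UnescapeString(decoded), nil
}

func main() {
	lines := []string{
		"&lt;=21DOCTYPE html&gt;",
		"services in your  FRITZ=21Box =2E",
		"&#34;System &gt; Push Service&#34;=2E",
	}
	for _, line := range lines {
		cleaned, err := cleanLine(line)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(cleaned)
	}
}
```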

But I see, converting and sanitizing of html is difficult... As I said before I could live with the complete deletion of body text (if it's html-only).

@binwiederhier
Owner

Quoted-printable is transparently decoded by Go beforehand, so it should never be visible to the reader. See https://pkg.go.dev/mime/multipart#Reader.NextPart:

> As a special case, if the "Content-Transfer-Encoding" header has a value of "quoted-printable", that header is instead hidden and the body is transparently decoded during Read calls.

It is odd that the <=21DOCTYPE html> is not properly stripped out though. Not sure what's happening.
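
One thing worth checking here: the transparent decoding quoted above belongs to `multipart.Reader`, so it only applies to parts of a multipart message. The body of a single-part message read through `net/mail` comes back exactly as it was sent. The sketch below shows the difference with two invented messages; the helper names are hypothetical. Whether this explains the output above is an assumption, as the thread does not confirm it.

```go
package main

import (
	"fmt"
	"io"
	"mime/multipart"
	"net/mail"
	"strings"
)

// readFirstPart (hypothetical helper) reads the first part of a multipart
// body. NextPart decodes quoted-printable parts transparently.
func readFirstPart(body, boundary string) (string, error) {
	part, err := multipart.NewReader(strings.NewReader(body), boundary).NextPart()
	if err != nil {
		return "", err
	}
	data, err := io.ReadAll(part)
	return string(data), err
}

// readSinglePart (hypothetical helper) reads the body of a non-multipart
// message. net/mail does not apply any transfer decoding to it.
func readSinglePart(raw string) (string, error) {
	msg, err := mail.ReadMessage(strings.NewReader(raw))
	if err != nil {
		return "", err
	}
	data, err := io.ReadAll(msg.Body)
	return string(data), err
}

func main() {
	headers := "Content-Type: text/html\r\nContent-Transfer-Encoding: quoted-printable\r\n\r\n"

	// Same headers and body, once as a multipart part, once as a whole message.
	multi, err := readFirstPart("--B\r\n"+headers+"FRITZ=21Box\r\n--B--\r\n", "B")
	fmt.Printf("multipart part: %q (err: %v)\n", multi, err)

	single, err := readSinglePart(headers + "FRITZ=21Box\r\n")
	fmt.Printf("single part:    %q (err: %v)\n", single, err)
}
```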

> But I see, converting and sanitizing of html is difficult... As I said before I could live with the complete deletion of body text (if it's html-only).

That seems like a possibility. I may dabble with it a little more, and if I can't get anything good out, I'll do the title thing.

@teastrainer
Author

Are you sure bluemonday detects the Content-Transfer-Encoding at all? In my example, none of the quoted-printable codes are decoded correctly. Maybe a dedicated "header" (format / tag / declaration) is required and is missing (at least) in my example.

@Robert-litts

Robert-litts commented Apr 13, 2023

Hi! I just wanted to add to this with my use-case for HTML e-mails and Ntfy. I have been working to convert every conceivable device in my home to use Ntfy as my primary notification service. Unfortunately, several items still rely 100% on e-mail notification as their "only" form of notifications, so the SMTP aspect of Ntfy has been a lifesaver (thanks for all the troubleshooting we did over the past few months in Discord & in quickly tackling #610 !)

The latest one I am trying to get working is my Synology NAS, which has e-mail notifications but uses HTML-formatted messages and therefore receives the "554 5.0.0 Error: transaction failed, blame it on the weather: unsupported content type" error. I know this was discussed/closed in #623, but I saw this issue and the WIP PR #693 and wanted to add the trace that I get from Synology when I debugged Ntfy with an incoming e-mail.

Thanks again and please let me know if there is anything else I can gather that might assist with this.

Mar 13 00:33:26 notification ntfy[4540]: DEBUG MAIL FROM: synology@mydomain.me (smtp_hostname=DiskStation, smtp_mail_from=synology@mydomain.me, smtp_remote_addr=192.168.1.28:53882, tag=smtp)
Mar 13 00:33:26 notification ntfy[4540]: DEBUG RCPT TO: synology@mydomain.me (smtp_hostname=DiskStation, smtp_rcpt_to=synology@mydomain.me, smtp_remote_addr=192.168.1.28:53882, tag=smtp)
Mar 13 00:33:26 notification ntfy[4540]: TRACE DATA (smtp_data=Date: Sun, 12 Mar 2023 20:33:26 -0400
Mar 13 00:33:26 notification ntfy[4540]: From: "=?UTF-8?B?Um9iYmll?=" <synology@mydomain.me>
Mar 13 00:33:26 notification ntfy[4540]: To: <synology@mydomain.me>
Mar 13 00:33:26 notification ntfy[4540]: Message-Id: <640e6f562895d.6c9584bcfa491ac9c546b480b32ffc1d@mydomain.me>
Mar 13 00:33:26 notification ntfy[4540]: MIME-Version: 1.0
Mar 13 00:33:26 notification ntfy[4540]: Subject: =?UTF-8?B?W1N5bm9sb2d5IE5BU10gVGVzdCBNZXNzYWdlIGZyb20gTGl0dHNfTkFT?=
Mar 13 00:33:26 notification ntfy[4540]: Content-Type: text/html; charset=utf-8
Mar 13 00:33:26 notification ntfy[4540]: Content-Transfer-Encoding: 8bit
Mar 13 00:33:26 notification ntfy[4540]: 
Mar 13 00:33:26 notification ntfy[4540]: Congratulations! You have successfully set up the email notification on Synology_NAS.<BR>For further system configurations, please visit http://192.168.1.28:5000/, http://172.16.60.5:5000/.<BR>(If you cannot connect to the server, please contact the administrator.)<BR><BR>From Synology_NAS<BR><BR><BR>
Mar 13 00:33:26 notification ntfy[4540]: , smtp_hostname=DiskStation, smtp_remote_addr=192.168.1.28:53882, tag=smtp)
Mar 13 00:33:26 notification ntfy[4540]: DEBUG Incoming mail error (error=unsupported content type, smtp_hostname=DiskStation, smtp_remote_addr=192.168.1.28:53882, tag=smtp)
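
To make the trace easier to read, here is a small standard-library sketch that parses the same headers, decodes the RFC 2047 encoded Subject, and extracts the media type that triggers the "unsupported content type" error. The body is shortened, and the function name `inspect` is made up for this example; it is not how ntfy itself processes the message.

```go
package main

import (
	"fmt"
	"mime"
	"net/mail"
	"strings"
)

// rawMessage reproduces the headers from the trace above with a shortened body.
const rawMessage = "Date: Sun, 12 Mar 2023 20:33:26 -0400\r\n" +
	"From: \"=?UTF-8?B?Um9iYmll?=\" <synology@mydomain.me>\r\n" +
	"To: <synology@mydomain.me>\r\n" +
	"MIME-Version: 1.0\r\n" +
	"Subject: =?UTF-8?B?W1N5bm9sb2d5IE5BU10gVGVzdCBNZXNzYWdlIGZyb20gTGl0dHNfTkFT?=\r\n" +
	"Content-Type: text/html; charset=utf-8\r\n" +
	"Content-Transfer-Encoding: 8bit\r\n" +
	"\r\n" +
	"Congratulations! You have successfully set up the email notification on Synology_NAS.<BR>\r\n"

// inspect (hypothetical helper) returns the decoded subject and the media
// type of a raw message.
func inspect(raw string) (string, string, error) {
	msg, err := mail.ReadMessage(strings.NewReader(raw))
	if err != nil {
		return "", "", err
	}
	// The Subject header is a base64 "encoded word"; WordDecoder decodes it.
	subject, err := new(mime.WordDecoder).DecodeHeader(msg.Header.Get("Subject"))
	if err != nil {
		return "", "", err
	}
	mediaType, _, err := mime.ParseMediaType(msg.Header.Get("Content-Type"))
	if err != nil {
		return "", "", err
	}
	return subject, mediaType, nil
}

func main() {
	subject, mediaType, err := inspect(rawMessage)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("subject:   ", subject)
	fmt.Println("media type:", mediaType)
}
```

The message is a single text/html part with no text/plain alternative, which matches the rejection in the last log line.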

@gedw99

gedw99 commented Apr 18, 2023

I am curious whether you want to use golang templates or some other IDL / AST to produce the email.

I ask because:

  • There are a few around, and they exist because, as you're seeing, it is hellish to produce emails that are HTML-like.
  • ntfy publishes to many device types, so if you want to send a single message to many device types you often need an IDL in which you describe the "message" with some quasi-design aspects. It is then piped into a per-device-type builder that spits out the message for each device archetype.

@liamfleming26

Just checking if there's any movement on this or any workarounds. I was ecstatic when I found out about this project which replaced a bunch of telegram bots serving a similar purpose for me.

I know nothing about Go so wouldn't know where to start. I did however use this project in the past to forward SMTP emails to the desired telegram channel. Unsure if it provides any pointers on the ContentType issues folks are facing.

(KostyaEsmukov/smtp_to_telegram)

@binwiederhier
Owner

I have decided to merge the original PR and add support for HTML-only emails. It comes up enough to merge in support, even though it is very sub-par. Don't expect too much. But at least mail will not be rejected anymore.

See #693

This will be in the next release.

@binwiederhier
Owner

@Robert-litts I used your demo email in a test and it comes out nicely actually: 859a4e4

@Robert-litts

@binwiederhier Awesome & appreciate the effort on this one. I'm looking forward to testing this out. Thanks again.

@teastrainer
Author

This is great news, and I can confirm it works for me, too 🥳 Thank you so much!
