Server: Permit simple HTML in chat messages again #1571

hoffie · 2021-04-27T21:45:15Z

PR #939 disabled HTML usage in chat messages by escaping all user input at the server side for security reasons.

There is demand to keep basic text formatting working for sharing lyrics/chords in a sane way.

This PR re-enables certain safe tags again to enable such use cases.

This works by keeping the existing message escaping, but converting selected safe HTML tags back into their parsable form.

To avoid worsening security again, this PR deliberately

- does not permit CSS or any color changes which could enable user impersonation,
- does not accept any attributes in HTML tags,
- does not accept invalid HTML (e.g. start tags without end tags),
- does not accept nested HTML tags (except for <br>s within <pre>...</pre>),
- does not handle uppercase tags or XML-style <br/>.
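
A minimal sketch of the escape-then-re-enable idea described above, written with standard C++ `std::regex` rather than Qt so that it stands alone. The function names `escapeHtml` and `permitSimpleTags` and the tag list `b`, `i`, `u` are assumptions made for illustration; this is not the PR's actual code, and it does not cover the `<pre>`/`<br>` handling or the nesting rule.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: escape the characters that are significant in HTML.
std::string escapeHtml ( const std::string& in )
{
    std::string out;
    for ( char c : in )
    {
        switch ( c )
        {
        case '&': out += "&amp;"; break;
        case '<': out += "&lt;"; break;
        case '>': out += "&gt;"; break;
        case '"': out += "&quot;"; break;
        default: out += c;
        }
    }
    return out;
}

// Hypothetical helper: turn escaped, attribute-free, properly closed tag
// pairs from a small whitelist back into real tags. Everything else stays
// escaped. The tag list is an assumption for this sketch.
std::string permitSimpleTags ( const std::string& escaped )
{
    static const std::vector<std::string> permitted = { "b", "i", "u" };

    std::string result = escaped;
    for ( const auto& tag : permitted )
    {
        // [^<>]+? is non-greedy and cannot span a literal '<' or '>',
        // i.e. it never matches across a tag that was already converted back.
        const std::regex pattern ( "&lt;" + tag + "&gt;([^<>]+?)&lt;/" + tag + "&gt;" );
        result = std::regex_replace ( result, pattern, "<" + tag + ">$1</" + tag + ">" );
    }
    return result;
}
```

In this sketch, `permitSimpleTags ( escapeHtml ( input ) )` leaves a tag with attributes or a start tag without an end tag escaped, because the pattern only matches the exact escaped form of `<tag>...</tag>`.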

References: #1021 #1524

This PR seems to work, but I'm marking it as draft because the topic hasn't been discussed fully yet.

Feel free to post feedback on the code here. Feedback regarding the best approach should rather be posted to the discussion thread.

cc @atsampson


pljones · 2021-04-28T18:27:18Z

> does not handle uppercase tags or XML-style <br/>

So you want <br></br>? No! ;) Empty tags with <... /> are the way forward... I've trained myself to use them now, you can't go taking them away... :D

Any reason for the lower case requirement?

hoffie · 2021-04-28T19:15:17Z

> does not handle uppercase tags or XML-style <br/>

> So you want <br></br>?

No, <br> (no ending </br>).

> Empty tags with <... /> are the way forward... I've trained myself to use them now, you can't go taking them away... :D

I feel the same, but with XHTML having failed and HTML5 being the accepted standard, I think <br> is the most-accepted form?

> Any reason for the lower case requirement?

Both the <br/> limitation and the lower case limitation are just to keep the implementation simple. It's not a requirement. It can easily be extended to support that, but it'll need some more lines of code. I can do that if we want to move forward with this.

pljones · 2021-04-28T20:50:13Z

I'm not really fussed... I'd actually rather avoid the feature entirely. Text chat should be plain text. 🤷

reneknuvers · 2021-04-29T11:30:25Z

> I'm not really fussed... I'd actually rather avoid the feature entirely. Text chat should be plain text. 🤷

I think having chords in bold or italics is a good thing. Markdown support is over the top, but would be even safer I guess.

HTML export is widely available in different apps. I think it would be good to look at some example output from, for example, Onsong and other commonly used apps to see how tags are written and which tags are essential to the readability of lyrics and lead sheets.

atsampson · 2021-04-29T22:23:45Z

I was thinking about how to do this the other day, and your approach looks pretty reasonable to me. I'd agree that the set of HTML elements it allows through needs to be really minimal, and forcing correct nesting and blocking attributes entirely are both good ideas - the Qt HTML parser really isn't designed for dealing with untrusted HTML.

I'd suggest adding some more explicit comments in your "unsanitisation" code to make it clear exactly what it's doing and why (basically the same stuff you've said in your pull request message) - to avoid someone coming along in the future and adding back features that can't be handled safely. The use of [^<>] in the middle of the regexps in particular is pretty subtle and would benefit from a comment that it's explicitly there to avoid matching something that's already been expanded.

This is very much the kind of code that'd benefit from some unit tests if anyone ever gets around to adding a test framework to Jamulus...

(One minor concern I'd have is that minimal regexp matching is expensive, so it might be possible for someone to submit a chat message that uses a lot more CPU time than you'd expect to sanitise. Hopefully the message length limit will put a brake on this, though. You could do it without the regexp searches by matching tags exactly and keeping track of the open elements on a stack, if needed.)
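
The regexp-free alternative mentioned above (match tags exactly and keep the open elements on a stack) could look roughly like the following single-pass sketch. The name `sanitizeWithStack`, the tag list, and the choice to fall back to fully escaped output on any mismatch are assumptions for illustration. Unlike the PR, this sketch accepts correctly nested tags; rejecting a start tag while the stack is non-empty would forbid nesting.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Append one character to 'out', escaping HTML-significant characters.
static void appendEscaped ( std::string& out, char c )
{
    switch ( c )
    {
    case '&': out += "&amp;"; break;
    case '<': out += "&lt;"; break;
    case '>': out += "&gt;"; break;
    case '"': out += "&quot;"; break;
    default: out += c;
    }
}

// Hypothetical sanitiser: one linear pass over the raw message, no regexps.
std::string sanitizeWithStack ( const std::string& raw )
{
    static const std::vector<std::string> permitted = { "b", "i", "u" };

    // Fallback result: the whole message escaped, used for invalid markup.
    std::string fullyEscaped;
    for ( char c : raw ) appendEscaped ( fullyEscaped, c );

    std::string              out;
    std::vector<std::string> open; // stack of currently open elements
    std::size_t              pos = 0;

    while ( pos < raw.size() )
    {
        bool matched = false;
        if ( raw[pos] == '<' )
        {
            for ( const auto& tag : permitted )
            {
                const std::string start = "<" + tag + ">";
                const std::string end   = "</" + tag + ">";
                if ( raw.compare ( pos, start.size(), start ) == 0 )
                {
                    open.push_back ( tag );
                    out += start;
                    pos += start.size();
                    matched = true;
                    break;
                }
                if ( raw.compare ( pos, end.size(), end ) == 0 )
                {
                    // An end tag must close the most recently opened element.
                    if ( open.empty() || open.back() != tag ) return fullyEscaped;
                    open.pop_back();
                    out += end;
                    pos += end.size();
                    matched = true;
                    break;
                }
            }
        }
        if ( !matched ) appendEscaped ( out, raw[pos++] );
    }

    // Start tags without end tags invalidate the whole message.
    return open.empty() ? out : fullyEscaped;
}
```

Each character is inspected a bounded number of times, so the cost grows linearly with the message length regardless of its content.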

pljones · 2021-04-30T07:28:19Z

> This is very much the kind of code that'd benefit from some unit tests if anyone ever gets around to adding a test framework to Jamulus...

Something I seem to be thinking nearly every day...

softins · 2021-05-01T17:26:29Z

src/server.cpp

+    foreach ( auto tag, qlPermittedChatTagNames )
+    {
+        QRegExp pattern = QRegExp ( "&lt;(" + tag + ")&gt;([^<>]+)&lt;/" + tag + "&gt;" );
+        pattern.setMinimal ( true ); // non-greedy matching


Suggest it might be a good idea also to add the following:

pattern.setCaseSensitivity(Qt::CaseInsensitive); // allow upper or lower case tags

Just in case the user is pasting in HTML that happens to use upper-case tags.
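
For illustration only, the same idea in standard C++: a case-insensitive match that normalises the emitted tag to lower case. This uses `std::regex::icase` as a stand-in for the Qt call suggested above; the helper name `permitTagAnyCase` is hypothetical and the sketch handles a single tag name passed in by the caller.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: convert one escaped tag pair back into a real tag,
// accepting upper- or lower-case tag names in the input. 'tag' is expected
// to be a plain lower-case tag name such as "b".
std::string permitTagAnyCase ( const std::string& escaped, const std::string& tag )
{
    const std::regex pattern ( "&lt;" + tag + "&gt;([^<>]+?)&lt;/" + tag + "&gt;",
                               std::regex::ECMAScript | std::regex::icase );

    // The replacement always emits the lower-case tag name.
    return std::regex_replace ( escaped, pattern, "<" + tag + ">$1</" + tag + ">" );
}
```

With this sketch, `&lt;B&gt;Am&lt;/B&gt;` and `&lt;b&gt;Am&lt;/b&gt;` both become `<b>Am</b>`.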

hoffie · 2021-05-10T20:36:56Z

Thanks for all the technical feedback. It would of course be possible to incorporate it. However, it seems that the demand for actual multi-line, monospace input is higher (#1021) than demand for individual formatting.
In the interest of keeping things simple, I'm therefore closing this PR for now. However, if it turns out that the other approaches are not feasible, we could still follow up with this PR.

hoffie marked this pull request as draft April 27, 2021 21:45

softins reviewed May 1, 2021


hoffie closed this May 10, 2021
