Skip to content
This repository was archived by the owner on Apr 12, 2024. It is now read-only.

fix: text that looks like an html tag but is not causes [$sanitize:badparse] error #8193

Closed
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 4 additions & 0 deletions src/ngSanitize/sanitize.js
Original file line number Diff line number Diff line change
Expand Up @@ -277,6 +277,10 @@ function htmlParser( html, handler ) {
html = html.substring( match[0].length );
match[0].replace( START_TAG_REGEXP, parseStartTag );
chars = false;
} else {
// no ending tag found
if (handler.chars) handler.chars( '<' );
html = html.substring(1);
}
}

Expand Down
19 changes: 18 additions & 1 deletion test/ngSanitize/sanitizeSpec.js
Original file line number Diff line number Diff line change
Expand Up @@ -21,6 +21,7 @@ describe('HTML', function() {

var handler, start, text, comment;
beforeEach(function() {
text = "";
handler = {
start: function(tag, attrs, unary){
start = {
Expand All @@ -35,7 +36,7 @@ describe('HTML', function() {
});
},
chars: function(text_){
text = text_;
text += text_;
},
end:function(tag) {
expect(tag).toEqual(start.tag);
Expand Down Expand Up @@ -81,6 +82,16 @@ describe('HTML', function() {
expect(text).toEqual('text');
});

it('should parse unterminated tags as regular content', function() {
htmlParser('<a text1 text2 <a text1 text2', handler);
expect(text).toEqual('<a text1 text2 <a text1 text2');
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is wrong because it breaks html parsing rules. browsers either ignore this and throw it away or try to autocorrect the html.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, this is correct --- I guess we need a non-alpha character to precede the < (in testing, <ê, <3, <- are all fine). I don't think we can come even close to being as crazy as the actual parsing rules though, that would be way too much code.

  • A sequence of bytes starting with a 0x3C byte (ASCII <), optionally a 0x2F byte (ASCII /), and finally a byte in the range 0x41-0x5A or 0x61-0x7A (an ASCII letter)
    1. Advance the position pointer so that it points at the next 0x09 (ASCII TAB), 0x0A (ASCII LF), 0x0C (ASCII FF), 0x0D (ASCII CR), 0x20 (ASCII space), or 0x3E (ASCII >) byte.
    2. Repeatedly get an attribute until no further attributes can be found, then jump to the step below labeled next byte.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should sanitize convert the < to &lt; in text that looks like unterminated tags?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes (for example, http://jsfiddle.net/fMDyg/)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you take it from here and do the necessary change?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can have a go at it :>

});

it('should accept tag delimiters such as "<" inside real tags', function() {
htmlParser('<p> 10 < 100 </p>', handler);
expect(text).toEqual(' 10 < 100 ');
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shouldn't we convert the < to &lt; in this case?

});

it('should parse newlines in tags', function() {
htmlParser('<\ntag\n attr="value"\n>text<\n/\ntag\n>', handler);
expect(start).toEqual({tag:'tag', attrs:{attr:'value'}, unary:false});
Expand Down Expand Up @@ -195,6 +206,12 @@ describe('HTML', function() {
expectHTML('\na\n').toEqual('&#10;a&#10;');
});

it('should accept tag delimiters such as "<" inside real tags (with nesting)', function() {
//this is an integrated version of the 'should accept tag delimiters such as "<" inside real tags' test
expectHTML('<p> 10 < <span>100</span> </p>')
.toEqual('<p> 10 &lt; <span>100</span> </p>');
});

describe('htmlSanitizerWriter', function() {
/* global htmlSanitizeWriter: false */
if (angular.isUndefined(window.htmlSanitizeWriter)) return;
Expand Down