Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

revert libxml unescaped attrs #1746

Merged
merged 4 commits into from
Mar 29, 2018
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
17 changes: 17 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
@@ -1,5 +1,22 @@
# unreleased

## Security Notes

[MRI] Behavior in libxml2 has been reverted which caused CVE-2018-8048 (loofah gem), CVE-2018-3740 (sanitize gem), and CVE-2018-3741 (rails-html-sanitizer gem). The commit in question is here:

> https://github.com/GNOME/libxml2/commit/960f0e2

and more information is available about this commit and its impact here:

> https://github.com/flavorjones/loofah/issues/144

This release simply reverts the libxml2 commit in question to protect users of Nokogiri's vendored libraries from similar vulnerabilities.

If you're offended by what happened here, I'd kindly ask that you comment on the upstream bug report here:

> https://bugzilla.gnome.org/show_bug.cgi?id=769760


## Dependencies

* [MRI] libxml2 is updated from 2.9.7 to 2.9.8
Expand Down
2 changes: 2 additions & 0 deletions Manifest.txt
Original file line number Diff line number Diff line change
Expand Up @@ -250,6 +250,7 @@ lib/xalan.jar
lib/xercesImpl.jar
lib/xml-apis.jar
lib/xsd/xmlparser/nokogiri.rb
patches/libxml2/0001-Revert-Do-not-URI-escape-in-server-side-includes.patch
patches/sort-patches-by-date
suppressions/README.txt
suppressions/nokogiri_ruby-2.supp
Expand Down Expand Up @@ -301,6 +302,7 @@ test/html/sax/test_parser.rb
test/html/sax/test_parser_context.rb
test/html/sax/test_parser_text.rb
test/html/sax/test_push_parser.rb
test/html/test_attributes.rb
test/html/test_builder.rb
test/html/test_document.rb
test/html/test_document_encoding.rb
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,78 @@
From c5538465c08a8ea248a370bf55bc39cd3385e4af Mon Sep 17 00:00:00 2001
From: Mike Dalessio <mike.dalessio@gmail.com>
Date: Thu, 29 Mar 2018 14:09:00 -0400
Subject: [PATCH] Revert "Do not URI escape in server side includes"

This reverts commit 960f0e275616cadc29671a218d7fb9b69eb35588.
---
HTMLtree.c | 49 +++++++++++--------------------------------------
1 file changed, 11 insertions(+), 38 deletions(-)

diff --git a/HTMLtree.c b/HTMLtree.c
index 2fd0c9c..67160c5 100644
--- a/HTMLtree.c
+++ b/HTMLtree.c
@@ -717,49 +717,22 @@ htmlAttrDumpOutput(xmlOutputBufferPtr buf, xmlDocPtr doc, xmlAttrPtr cur,
(!xmlStrcasecmp(cur->name, BAD_CAST "src")) ||
((!xmlStrcasecmp(cur->name, BAD_CAST "name")) &&
(!xmlStrcasecmp(cur->parent->name, BAD_CAST "a"))))) {
+ xmlChar *escaped;
xmlChar *tmp = value;
- /* xmlURIEscapeStr() escapes '"' so it can be safely used. */
- xmlBufCCat(buf->buffer, "\"");

while (IS_BLANK_CH(*tmp)) tmp++;

- /* URI Escape everything, except server side includes. */
- for ( ; ; ) {
- xmlChar *escaped;
- xmlChar endChar;
- xmlChar *end = NULL;
- xmlChar *start = (xmlChar *)xmlStrstr(tmp, BAD_CAST "<!--");
- if (start != NULL) {
- end = (xmlChar *)xmlStrstr(tmp, BAD_CAST "-->");
- if (end != NULL) {
- *start = '\0';
- }
- }
-
- /* Escape the whole string, or until start (set to '\0'). */
- escaped = xmlURIEscapeStr(tmp, BAD_CAST"@/:=?;#%&,+");
- if (escaped != NULL) {
- xmlBufCat(buf->buffer, escaped);
- xmlFree(escaped);
- } else {
- xmlBufCat(buf->buffer, tmp);
- }
-
- if (end == NULL) { /* Everything has been written. */
- break;
- }
-
- /* Do not escape anything within server side includes. */
- *start = '<'; /* Restore the first character of "<!--". */
- end += 3; /* strlen("-->") */
- endChar = *end;
- *end = '\0';
- xmlBufCat(buf->buffer, start);
- *end = endChar;
- tmp = end;
+ /*
+ * the < and > have already been escaped at the entity level
+ * And doing so here breaks server side includes
+ */
+ escaped = xmlURIEscapeStr(tmp, BAD_CAST"@/:=?;#%&,+<>");
+ if (escaped != NULL) {
+ xmlBufWriteQuotedString(buf->buffer, escaped);
+ xmlFree(escaped);
+ } else {
+ xmlBufWriteQuotedString(buf->buffer, value);
}
-
- xmlBufCCat(buf->buffer, "\"");
} else {
xmlBufWriteQuotedString(buf->buffer, value);
}
--
2.9.5

85 changes: 85 additions & 0 deletions test/html/test_attributes.rb
Original file line number Diff line number Diff line change
@@ -0,0 +1,85 @@
require "helper"

module Nokogiri
module HTML
class TestAttr < Nokogiri::TestCase
unless Nokogiri::VersionInfo.instance.libxml2? && Nokogiri::VersionInfo.instance.libxml2_using_system?
#
# libxml2 >= 2.9.2 fails to escape comments within some attributes. It
# wants to ensure these comments can be treated as "server-side includes",
# but as a result fails to ensure that serialization is well-formed,
# resulting in an opportunity for XSS injection of code into a final
# re-parsed document (presumably in a browser).
#
# the offending commit is:
#
# https://github.com/GNOME/libxml2/commit/960f0e2
#
# we'll test this by parsing the HTML, serializing it, then
# re-parsing it to ensure there isn't any ambiguity in the output
# that might allow code injection into a browser consuming
# "sanitized" output.
#
# complaints have been made upstream about this behavior, notably at
#
# https://bugzilla.gnome.org/show_bug.cgi?id=769760
#
# and multiple CVEs have been declared and fixed in downstream
# libraries as a result, a list is being kept up to date here:
#
# https://github.com/flavorjones/loofah/issues/144
#
[
#
# these tags and attributes are determined by the code at:
#
# https://git.gnome.org/browse/libxml2/tree/HTMLtree.c?h=v2.9.2#n714
#
{tag: "a", attr: "href"},
{tag: "div", attr: "href"},
{tag: "a", attr: "action"},
{tag: "div", attr: "action"},
{tag: "a", attr: "src"},
{tag: "div", attr: "src"},
{tag: "a", attr: "name"},
#
# note that div+name is _not_ affected by the libxml2 issue.
# but we test it anyway to ensure our logic isn't modifying
# attributes that don't need modifying.
#
{tag: "div", attr: "name", unescaped: true},
].each do |config|

define_method "test_uri_escaping_of_#{config[:attr]}_attr_in_#{config[:tag]}_tag" do
html = %{<#{config[:tag]} #{config[:attr]}='examp<!--" unsafeattr=unsafevalue()>-->le.com'>test</#{config[:tag]}>}

reparsed = HTML.fragment(HTML.fragment(html).to_html)
attributes = reparsed.at_css(config[:tag]).attribute_nodes

assert_equal [config[:attr]], attributes.collect(&:name)
if Nokogiri::VersionInfo.instance.libxml2?
if config[:unescaped]
#
# this attribute was emitted wrapped in single-quotes, so a double quote is A-OK.
# assert that this attribute's serialization is unaffected.
#
assert_equal %{examp<!--" unsafeattr=unsafevalue()>-->le.com}, attributes.first.value
else
#
# let's match the behavior in libxml < 2.9.2.
# test that this attribute's serialization is well-formed and sanitized.
#
assert_equal %{examp<!--%22%20unsafeattr=unsafevalue()>-->le.com}, attributes.first.value
end
else
#
# yay for consistency in javaland. move along, nothing to see here.
#
assert_equal %{examp<!--%22 unsafeattr=unsafevalue()>-->le.com}, attributes.first.value
end
end
end
end
end
end
end