Commit

Merge pull request #68 from gjtorikian/err
Stop operating handlers on deleted elements
gjtorikian authored Jul 29, 2024
2 parents 1f0f69c + 9641567 commit 6048adf
Showing 5 changed files with 177 additions and 19 deletions.
36 changes: 18 additions & 18 deletions Cargo.lock

Some generated files are not rendered by default.

4 changes: 4 additions & 0 deletions ext/selma/src/rewriter.rs
@@ -515,6 +515,10 @@ impl SelmaRewriter {
             }));
         }
 
+        if element.removed() {
+            return Ok(());
+        }
+
         let rb_element = SelmaHTMLElement::new(element, ancestors);
         let rb_result =
             rb_handler.funcall::<_, _, Value>(Self::SELMA_HANDLE_ELEMENT, (rb_element,));
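
Reviewer note: the guard above returns early before the Ruby handler is invoked, so a handler is never handed an element that has already been removed. The following is a minimal pure-Ruby sketch of that behavior, not Selma's implementation. `FakeElement`, `FakeRemover`, `FakeRecorder`, and `dispatch` are hypothetical names introduced only for illustration, and the sketch assumes handlers run in the order they are listed.

```ruby
# Hypothetical stand-in for an element; not part of Selma.
class FakeElement
  attr_reader :tag_name

  def initialize(tag_name)
    @tag_name = tag_name
    @removed = false
  end

  def remove
    @removed = true
  end

  def removed?
    @removed
  end
end

# Removes every <pre> element, loosely modeled on TagRemover in the test.
class FakeRemover
  def handle_element(element)
    element.remove if element.tag_name == "pre"
  end
end

# Records the tag name of every element it is handed.
class FakeRecorder
  attr_reader :seen

  def initialize
    @seen = []
  end

  def handle_element(element)
    @seen << element.tag_name
  end
end

# Runs each handler in order; once an element is removed,
# the remaining handlers are skipped for that element.
def dispatch(element, handlers)
  handlers.each do |handler|
    next if element.removed? # plays the role of the early `return Ok(())`
    handler.handle_element(element)
  end
end

recorder = FakeRecorder.new
handlers = [FakeRemover.new, recorder]
%w[p pre div].each { |tag| dispatch(FakeElement.new(tag), handlers) }
```

In this sketch `recorder.seen` ends up as `["p", "div"]`: the `pre` element is removed by the first handler, so the second handler never receives it.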
2 changes: 1 addition & 1 deletion lib/selma/version.rb
@@ -1,5 +1,5 @@
 # frozen_string_literal: true
 
 module Selma
-  VERSION = "0.4.3"
+  VERSION = "0.4.4"
 end
99 changes: 99 additions & 0 deletions test/fixtures/deleting_content.html

Large diffs are not rendered by default.

55 changes: 55 additions & 0 deletions test/selma_maliciousness_test.rb
@@ -223,4 +223,59 @@ def test_rewriter_does_not_halt_on_malformed_html
 
     Selma::Rewriter.new(sanitizer: sanitizer, handlers: [ContentExtractor.new]).rewrite(html)
   end
+
+  class TagRemover
+    SELECTOR = Selma::Selector.new(match_element: "*")
+
+    def selector
+      SELECTOR
+    end
+
+    UNNECESSARY_TAGS = [
+      "pre",
+    ]
+
+    CONTENT_TO_KEEP = [
+      "html",
+      "body",
+    ]
+
+    def handle_element(element)
+      if UNNECESSARY_TAGS.include?(element.tag_name)
+        element.remove
+      elsif CONTENT_TO_KEEP.include?(element.tag_name)
+        element.remove_and_keep_content
+      end
+    end
+  end
+
+  class ContentBreaker
+    SELECTOR = Selma::Selector.new(match_element: "*")
+
+    def selector
+      SELECTOR
+    end
+
+    def handle_element(element)
+      if Selma::Sanitizer::Config::DEFAULT[:whitespace_elements].include?(element.tag_name) && !element.removed?
+        element.append("\n", as: :text)
+      end
+      element.remove_and_keep_content
+    end
+  end
+
+  def test_deleted_content_does_not_segfault
+    html = load_fixture("deleting_content.html")
+
+    sanitizer_config = Selma::Sanitizer::Config::RELAXED.dup.merge({
+      allow_comments: false,
+      allow_doctype: false,
+    })
+    sanitizer = Selma::Sanitizer.new(sanitizer_config)
+
+    selma = Selma::Rewriter.new(sanitizer: sanitizer, handlers: [TagRemover.new, ContentBreaker.new])
+    10.times do
+      selma.rewrite(html)
+    end
+  end
 end
