danmactough/node-resanitize

Resanitize - Regular expression-based HTML sanitizer and ad remover, geared toward RSS feed descriptions

This Node.js module provides functions for removing unsafe markup and ads from HTML. I use it for the <description> element of RSS feeds.

Installation

npm install resanitize

Usage

    var resanitize = require('resanitize')
      , html = '<div style="border: 400px solid pink;">Headline</div>'
      ;

    resanitize(html); // => '<div>Headline</div>'

Notes

This module's opinion of "sanitized" might not meet your security requirements. The mere fact that it uses regular expressions should make this disclaimer unnecessary, but just to be clear: if you intend to display arbitrary user input that includes HTML, you're going to want something more robust.

As of v0.3.0, we've added node-validator's XSS filter. It's certainly an improvement, but you should still be careful. Any concerns about XSS attacks should be directed to node-validator's issue tracker.

Note that the stripUnsafeTags method repeats its stripping pass a fixed, arbitrarily chosen number of times (2) to try to strip maliciously nested HTML tags. If the string still appears to contain any unsafe tags after the maximum number of iterations is reached, it is deemed unsafe and set to an empty string. If this seems unexpected and/or is causing any problems, please raise an issue.
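To make the iteration limit concrete, here is a minimal standalone sketch of the strip-then-reject idea. It is not the module's actual code: the function name `stripWithLimit`, the constant `MAX_PASSES`, and the script-only pattern are all assumptions made for illustration, and the real stripUnsafeTags may cover different tags and behave differently in its details.

```javascript
// Hypothetical sketch only; this is NOT resanitize's implementation.
var MAX_PASSES = 2; // assumed to mirror the limit of 2 described above

// Illustrative pattern: matches opening or closing <script> tags only.
var UNSAFE_TAG = /<\/?script\b[^>]*>/gi;

function stripWithLimit(html) {
  var result = String(html);

  // Removing a tag can splice a new one together from its surroundings
  // (e.g. '<scr<script>ipt>' becomes '<script>'), so strip more than once.
  for (var pass = 0; pass < MAX_PASSES; pass++) {
    result = result.replace(UNSAFE_TAG, '');
  }

  // If an unsafe tag still survives, treat the whole string as unsafe.
  return result.search(UNSAFE_TAG) === -1 ? result : '';
}

// One level of nesting is cleaned up within the two passes.
console.log(stripWithLimit('<scr<script>ipt>alert(1)</scr</script>ipt>'));

// Two levels of nesting outlast the limit, so the result is ''.
console.log(stripWithLimit('<scr<scr<script>ipt>ipt>x'));
```

In this sketch, input that still looks unsafe after the last pass is discarded entirely rather than partially cleaned. That matches the behavior described in the note and explains why a heavily nested description can come back empty.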
