Releases: rgrove/sanitize
v5.2.0
Changes
-
The term "whitelist" has been replaced with "allowlist" throughout Sanitize's source and documentation.
While the etymology of "whitelist" may not be explicitly racist in origin or intent, there are inherent racial connotations in the implication that white is good and black (as in "blacklist") is not.
This is a change I should have made long ago, and I apologize for not making it sooner.
-
In transformer input, the :is_whitelisted and :node_whitelist keys are now deprecated. New :is_allowlisted and :node_allowlist keys have been added. The old keys will continue to work in order to avoid breaking existing code, but they are no longer documented and may be removed in a future semver major release.
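For transformer authors, one way to handle the renamed keys during a migration is to read the new key and fall back to the old one. The sketch below is only illustrative: allowlisted? is a hypothetical helper, not part of Sanitize, and the env hashes are hand-built stand-ins for the input Sanitize passes to a transformer, so the snippet runs without the gem.

```ruby
# Hypothetical helper (not part of Sanitize): reads the new :is_allowlisted
# key and falls back to the deprecated :is_whitelisted key when it is absent.
def allowlisted?(env)
  env.fetch(:is_allowlisted) { env[:is_whitelisted] } ? true : false
end

# Hand-built stand-ins for transformer input (assumed shape, for illustration).
new_style_env = { node_name: "b", is_allowlisted: true }
old_style_env = { node_name: "b", is_whitelisted: true }

puts allowlisted?(new_style_env)
puts allowlisted?(old_style_env)
```

Because the lookup uses fetch, an explicit false under the new key takes precedence over a stale value under the old key.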
v5.1.0
Features
- Added a :parser_options config hash, which makes it possible to pass custom parsing options to Nokogumbo. @austin-wang - #194
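As a rough illustration, a config that uses :parser_options might look like the hash below. The option names max_errors and max_tree_depth are assumed from Nokogumbo 2.x's parsing options and should be checked against the Nokogumbo version you have installed. The Sanitize call is left as a comment so the snippet runs without the gem.

```ruby
# Assumed Nokogumbo 2.x parsing options; verify against your installed version.
custom_config = {
  elements: %w[b em],
  parser_options: {
    max_errors: 10,      # assumption: cap on the number of reported parse errors
    max_tree_depth: 100  # assumption: cap on how deeply elements may nest
  }
}

# With the sanitize gem loaded, the config would be passed like any other:
#   Sanitize.fragment("<b>hi</b>", custom_config)
puts custom_config[:parser_options].inspect
```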
Bug Fixes
- Non-characters and non-whitespace control characters are now stripped from HTML input before parsing to comply with the HTML Standard's preprocessing guidelines. Prior to this, Sanitize adhered to older W3C guidelines that have since been withdrawn. #179
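The idea behind this preprocessing step can be sketched in plain Ruby. This is not Sanitize's actual implementation: unsuitable_code_point? and strip_unsuitable are hypothetical names, and the exact set of code points is an assumption based on the HTML Standard's definitions of control and noncharacter code points, with ASCII whitespace (tab, LF, FF, CR, space) left alone.

```ruby
# Sketch only. Returns true for code points treated as unsuitable here:
# control characters other than ASCII whitespace, plus Unicode noncharacters.
def unsuitable_code_point?(cp)
  return true if (0x00..0x08).cover?(cp) || cp == 0x0B
  return true if (0x0E..0x1F).cover?(cp) || (0x7F..0x9F).cover?(cp)
  return true if (0xFDD0..0xFDEF).cover?(cp)

  # U+FFFE and U+FFFF, plus their counterparts at the end of every other plane.
  (cp & 0xFFFE) == 0xFFFE
end

# Drops every unsuitable character and keeps everything else unchanged.
def strip_unsuitable(html)
  html.each_char.reject { |ch| unsuitable_code_point?(ch.ord) }.join
end

puts strip_unsuitable("a\u0000b\u000Bc\tD\uFFFEe").inspect
```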
v5.0.0
For most users, upgrading from 4.x shouldn't require any changes. However, the minimum required Ruby version has changed, and Sanitize 5.x's HTML output may differ in some small ways from 4.x's output. If this matters to you, please review the changes below carefully.
Potentially Breaking Changes
-
Ruby 2.3.0 is now the oldest officially supported Ruby version. Sanitize may work in older 2.x Rubies, but they aren't actively tested. Sanitize definitely no longer works in Ruby 1.9.x.
-
Upgraded to Nokogumbo 2.x, which fixes various bugs and adds standard-compliant HTML serialization. @stevecheckoway - #189
-
Children of the following elements are now removed by default when these elements are removed, rather than being preserved and escaped:
iframe
noembed
noframes
noscript
script
style
-
Children of whitelisted iframe elements are now always removed. In modern HTML, iframe elements should never have children. In HTML 4 and earlier, iframe elements were allowed to contain fallback content for legacy browsers, but it's been almost two decades since that was useful.
-
Fixed a bug that caused :remove_contents to behave as if it were set to true when it was actually an Array.
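The distinction that the :remove_contents fix restores can be sketched as follows. This is not Sanitize's implementation; remove_contents_for? is a hypothetical helper showing the intended behavior, in which true applies to every removed element and an Array applies only to the elements it names.

```ruby
# Hypothetical helper (not part of Sanitize): decides whether the contents of
# a removed element should also be removed, given a :remove_contents setting.
def remove_contents_for?(setting, element_name)
  case setting
  when true  then true                            # applies to every removed element
  when Array then setting.include?(element_name)  # applies only to listed elements
  else false
  end
end

puts remove_contents_for?(true, "div")
puts remove_contents_for?(%w[script style], "div")
puts remove_contents_for?(%w[script style], "script")
```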
v2.1.1
-
CVE-2018-3740: Fixed an HTML injection vulnerability that could allow XSS (backported from Sanitize 4.6.3). @dometto - #188
When Sanitize <= 2.1.0 is used in combination with libxml2 >= 2.9.2, a specially crafted HTML fragment can cause libxml2 to generate improperly escaped output, allowing non-whitelisted attributes to be used on whitelisted elements.
Sanitize now performs additional escaping on affected attributes to prevent this.
Many thanks to the Shopify Application Security Team for responsibly reporting this issue.
v4.6.6
- Improved performance and memory usage by optimizing Sanitize#transform_node! @stanhu - #183
v4.6.5
- Improved performance slightly by tweaking the order of built-in transformers. @rafbm - #180
4.6.4 (2018-03-20)
- Fixed: A change introduced in 4.6.2 broke certain transformers that relied on being able to mutate the name of an HTML node. That change has been reverted and a test has been added to cover this case. @zetter - #177
4.6.3 (2018-03-19)
-
CVE-2018-3740: Fixed an HTML injection vulnerability that could allow XSS.
When Sanitize <= 4.6.2 is used in combination with libxml2 >= 2.9.2, a specially crafted HTML fragment can cause libxml2 to generate improperly escaped output, allowing non-whitelisted attributes to be used on whitelisted elements.
Sanitize now performs additional escaping on affected attributes to prevent this.
Many thanks to the Shopify Application Security Team for responsibly reporting this issue.
4.6.2 (2018-03-19)
- Reduced string allocations to optimize memory usage. @janklimo - #175
4.6.1 (2018-03-15)
- Added support for frozen string literals in Ruby 2.4+. @flavorjones - #174