From bd334473846a9e823a623b473040eb2e57d11b9d Mon Sep 17 00:00:00 2001 From: Daniel Vogelheim Date: Tue, 13 Aug 2024 18:42:11 +0200 Subject: [PATCH 1/6] Draft new config extension. --- explainer.md | 38 ++++ index.bs | 584 +++++++++++++++++++++------------------------------ 2 files changed, 273 insertions(+), 349 deletions(-) diff --git a/explainer.md b/explainer.md index c306521..8ead847 100644 --- a/explainer.md +++ b/explainer.md @@ -363,6 +363,44 @@ element.setHTML("XXXXXX", {sanitizer: config_comments}); //
XXXXXX
``` +### Modifying Existing Configurations + +The `Sanitizer` object offers multiple methods to easily modify or tailor +an existing configuration. The query methods (`get()` and `getUnsafe()`) can +be used to retrieve a dictionary representation of a Sanitizer, +for introspection, or for use with the Sanitizer constructor to create a new +Sanitizer. Additionally, there are methods that directly manipulate the filter +functionality of the Sanitizer. + +The following methods are offered on the Sanitizer object: + +- `allow(x, options)` + - `options` is an optional dictionary argument. + Supported keys are: `"attributes":` and `"removeAttributes":.` +- `removeElement(x)` +- `replaceWithChildren(x)` +- `allowAttribute(x)` +- `removeAttribute(x)` + +These correspond 1:1 to the keys in the configuration dictionary. + +Adding an element or attribute to any of the allow- or deny-lists will also +remove that element or attribute from the other lists for its type. E.g., +calling `allow(x)` will also remove `x` from the removeElements and +replaceWithChildrenElements lists. + +Any name can be given as either a string, or a dictionary with name or +namespace, just as with the configuration dictionary. + +```js +const s = new Sanitizer({ elements: ["div", "p", "b"] }); +s.element("span"); +s.removeElement("b"); +s.get(); // { elements: ["div", "p", "span"], removeElements: ["b"] } + // Really, all these entries will be dictionaries with name and + // namespace entries. +``` + ### Configuration Errors The configuration allows expressing redundant or even contradictory options. diff --git a/index.bs b/index.bs index ae81ef6..08e53ca 100644 --- a/index.bs +++ b/index.bs @@ -42,6 +42,18 @@ text: parse HTML from a string; type: dfn; url: https://html.spec.whatwg.org/#pa } } + + # Introduction # {#intro} @@ -181,10 +193,9 @@ The parseHTMLUnsafe(|html|, |options|) method s Note: Since |document| does not have a browsing context, scripting is disabled. 1. Set |document|'s [=allow declarative shadow roots=] to true. 1. [=Parse HTML from a string=] given |document| and |compliantHTML|. -1. Let |config| be the result of calling [=get a sanitizer config from options=] - with |options| and false. -1. If |config| is not [=list/empty=], - then call [=sanitize=] on |document|'s [=tree/root|root node=] with |config|. +1. Let |sanitizer| be the result of calling [=get a sanitizer instance from options=] + with |options|. +1. Call [=sanitize=] on |document|'s [=tree/root|root node=] with |sanitizer|. 1. Return |document|. @@ -198,9 +209,9 @@ The parseHTML(|html|, |options|) method steps a Note: Since |document| does not have a browsing context, scripting is disabled. 1. Set |document|'s [=allow declarative shadow roots=] to true. 1. [=Parse HTML from a string=] given |document| and |html|. -1. Let |config| be the result of calling [=get a sanitizer config from options=] - with |options| and true. -1. Call [=sanitize=] on |document|'s [=tree/root|root node=] with |config|. +1. Let |sanitizer| be the result of calling [=get a sanitizer instance from options=] + with |options|. +1. Call [=sanitize=] on |document|'s [=tree/root|root node=] with |sanitizer|. 1. Return |document|. @@ -223,7 +234,7 @@ lifetime, and can then be used whenever needed. This allows implementations to pre-process configurations. The configuration object is also query-able and can return -[=SanitizerConfig/canonical=] configuration dictionaries, +configuration dictionaries, in both safe and unsafe variants. This allows a page to query and predict what effect a given configuration will have, or to build a new configuration based on an existing one. @@ -232,8 +243,20 @@ to build a new configuration based on an existing one. [Exposed=(Window,Worker)] interface Sanitizer { constructor(optional SanitizerConfig config = {}); + + // Query configurations: SanitizerConfig get(); SanitizerConfig getUnsafe(); + + // Modifying a Sanitizer: + undefined element(SanitizerElementNamespaceWithAttributes element); + undefined removeElement(SanitizerElement element); + undefined replaceWithChildren(SanitizerElement element); + undefined allowAttribute(SanitizerAttribute attribute); + undefined removeAttribute(SanitizerAttribute attribute); + undefined setComment(boolean allow); + undefined setDataAttributes(boolean allow); + undefined setOtherMarkup(boolean allow); }; @@ -241,23 +264,102 @@ interface Sanitizer { The constructor(|config|) method steps are: -1. Store |config| in [=this=]'s [=internal slot=]. +1. [=Set a config|Set=] |config| on [=this=]. + +Issue: This abandons all error handling, because setting a config will + just overwrite contradictory entries. Do we want this?
The get() method steps are: -1. Return the result of [=canonicalize a configuration=] with the value of - [=this=]'s [=internal slot=] and true. +1. Return the result of calling [=safeify=] on the result of + [=Sanitizer/getUnsafe=].
The getUnsafe() method steps are: -1. Return the result of [=canonicalize a configuration=] with the value of - [=this=]'s [=internal slot=] and false. +1. Return the value of [=this=]'s [=internal slot=]. + +
+ +
+The element(|element|) method steps are: + +1. Let |name| be the result of [=canonicalize a sanitizer name=] |element| with [=HTML namespace=] as the default namespace. +1. [=list/Append=] |name| to [=this=]'s [=internal slot=]'s {{SanitizerConfig/elements}} list. +1. [=list/Remove=] |name| from [=this=]'s [=internal slot=]'s {{SanitizerConfig/removeElements}}. +1. [=list/Remove=] |name| from [=this=]'s [=internal slot=]'s + {{SanitizerConfig/replaceWithChildrenElements}}. + +ISSUE: This does not handle per-element attribute allow/remove lists. +
+ +
+The removeElement(|element|) method steps are: + +1. Let |name| be the result of [=canonicalize a sanitizer name=] |element| with [=HTML namespace=] as the default namespace. +1. [=list/Append=] |name| from [=this=]'s [=internal slot=]'s {{SanitizerConfig/removeElements}}. +1. [=list/Remove=] |name| to [=this=]'s [=internal slot=]'s {{SanitizerConfig/elements}} list. +1. [=list/Remove=] |name| from [=this=]'s [=internal slot=]'s + {{SanitizerConfig/replaceWithChildrenElements}}. + +
+ + +
+The replaceWithChildren(|element|) method steps are: + +1. Let |name| be the result of [=canonicalize a sanitizer name=] |element| with [=HTML namespace=] as the default namespace. +1. [=list/Append=] |name| from [=this=]'s [=internal slot=]'s + {{SanitizerConfig/replaceWithChildrenElements}}. +1. [=list/Remove=] |name| from [=this=]'s [=internal slot=]'s {{SanitizerConfig/removeElements}}. +1. [=list/Remove=] |name| to [=this=]'s [=internal slot=]'s {{SanitizerConfig/elements}} list. + +
+ +
+The allowAttribute(|attribute|) method steps are: + +1. Let |name| be the result of [=canonicalize a sanitizer name=] |attribute| with the `null` as the default namespace. +1. [=list/Append=] |name| from [=this=]'s [=internal slot=]'s + {{SanitizerConfig/attributes}}. +1. [=list/Remove=] |name| from [=this=]'s [=internal slot=]'s {{SanitizerConfig/removeAttributes}}. + +
+ + +
+The removeAttribute(|attribute|) method steps are: + +1. Let |name| be the result of [=canonicalize a sanitizer name=] |attribute| with the `null` as the default namespace. +1. [=list/Append=] |name| from [=this=]'s [=internal slot=]'s {{SanitizerConfig/removeAttributes}}. +1. [=list/Remove=] |name| from [=this=]'s [=internal slot=]'s + {{SanitizerConfig/attributes}}. + +
+ +
+The setComment(|allow|) method steps are: + +1. Set [=this=]'s [=internal slot=]'s {{SanitizerConfig/comments}} to |allow|. + +
+ +
+The setDataAttributes(|allow|) method steps are: + +1. Set [=this=]'s [=internal slot=]'s {{SanitizerConfig/dataAttributes}} to |allow|. + +
+ +
+The setOtherMarkup(|allow|) method steps are: + +1. Set [=this=]'s [=internal slot=]'s {{SanitizerConfig/otherMarkup}} to |allow|.
@@ -294,6 +396,7 @@ dictionary SanitizerConfig { boolean comments; boolean dataAttributes; + boolean otherMarkup; }; @@ -308,40 +411,53 @@ To set and filter HTML, given an {{Element}} or {{DocumentFragment}} 1. If |safe| and |contextElement|'s [=Element/local name=] is "`script`" and |contextElement|'s [=Element/namespace=] is the [=HTML namespace=] or the [=SVG namespace=], then return. -1. Let |config| be the result of calling [=get a sanitizer config from options=] - with |options| and |safe|. +1. Let |sanitizer| be the result of calling [=get a sanitizer instance from options=] + with |options|. 1. Let |newChildren| be the result of the HTML [=fragment parsing algorithm steps=] given |contextElement|, |html|, and true. 1. Let |fragment| be a new {{DocumentFragment}} whose [=node document=] is |contextElement|'s [=node document=]. 1. [=list/iterate|For each=] |node| in |newChildren|, [=list/append=] |node| to |fragment|. -1. If |config| is not [=list/empty=], then run [=sanitize=] on |fragment| using |config|. +1. Run [=sanitize=] on |fragment| using |sanitizer| and |safe|. 1. [=Replace all=] with |fragment| within |target|.
-To get a sanitizer config from options for -an options dictionary |options| and a boolean |safe|, do: +To get a sanitizer instance from options for +an options dictionary |options|, do: 1. Assert: |options| is a [=dictionary=]. -1. If |options|["`sanitizer`"] doesn't [=map/exist=], then return undefined. +1. If |options|["`sanitizer`"] doesn't [=map/exist=], + then return new {{Sanitizer}}(). 1. Assert: |options|["`sanitizer`"] is either a {{Sanitizer}} instance or a [=dictionary=]. 1. If |options|["`sanitizer`"] is a {{Sanitizer}} instance: - 1. Then let |config| be the value of |options|["`sanitizer`"]'s [=internal slot=]. - 1. Otherwise let |config| be the value of |options|["`sanitizer`"]. -1. Return the result of calling [=canonicalize a configuration=] on - |config| and |safe|. + Then return |options|["`sanitizer`"]. +1. Assert: |options|["`sanitizer`"] is a [=dictionary=]. +1. Return new {{Sanitizer}}(|options|["`sanitizer`"]).
## Sanitization Algorithms ## {#sanitization} -
+
For the main sanitize operation, using a {{ParentNode}} |node|, a -[=SanitizerConfig/canonical=] {{SanitizerConfig}} |config|, run these steps: +{{Sanitizer}} |sanitizer| and a [=boolean=] |safe|, run these steps: + +1. Let |config| be the value of |sanitizer|'s [=internal slot=]. +1. If |safe|, let |config| be the result of calling [=safeify=] on |config|. +1. Call [=sanitize core=] on |node|, |config|, and |safe| (as value for + handling javascript navigation urls). + +
+ +
+The sanitize core operation, +using a {{ParentNode}} |node|, a {{SanitizerConfig}} |config|, and a +[=boolean=] |handle javascript navigation urls|, iterates over the DOM tree +beginning with |node|, and may recurse to handle some special cases (e.g. +template contents). It consistes of these steps: -1. [=Assert=]: |config| is [=SanitizerConfig/canonical=]. 1. Let |current| be |node|. 1. [=list/iterate|For each=] |child| in |current|'s [=tree/children=]: 1. [=Assert=]: |child| [=implements=] {{Text}}, {{Comment}}, or {{Element}}. @@ -358,324 +474,87 @@ For the main sanitize operation, using a {{ParentNode}} |node|, a 1. else: 1. Let |elementName| be a {{SanitizerElementNamespace}} with |child|'s [=Element/local name=] and [=Element/namespace=]. - 1. If |config|["{{SanitizerConfig/elements}}"] exists and - |config|["{{SanitizerConfig/elements}}"] does not [=SanitizerConfig/contain=] - [|elementName|]: + 1. If |config|["{{SanitizerConfig/removeElements}}"] [=SanitizerConfig/contains=] |elementName|, or if |config|["{{SanitizerConfig/elements}}"] does not [=SanitizerConfig/contain=] |elementName| and |config|["{{SanitizerConfig/otherMarkup}}"] is false: 1. [=/remove=] |child|. - 1. else if |config|["{{SanitizerConfig/removeElements}}"] exists and - |config|["{{SanitizerConfig/removeElements}}"] [=SanitizerConfig/contains=] - [|elementName|]: - 1. [=/remove=] |child|. - 1. If |config|["{{SanitizerConfig/replaceWithChildrenElements}}"] exists and |config|["{{SanitizerConfig/replaceWithChildrenElements}}"] [=SanitizerConfig/contains=] |elementName|: - 1. Call [=sanitize=] on |child| with |config|. + 1. If |config|["{{SanitizerConfig/replaceWithChildrenElements}}"] [=SanitizerConfig/contains=] |elementName|: + 1. Call [=sanitize core=] on |child| with |config| and + |handle javascript navigation urls|. 1. Call [=replace all=] with |child|'s [=tree/children=] within |child|. 1. If |elementName| [=equals=] «[ "`name`" → "`template`", "`namespace`" → [=HTML namespace=] ]» - 1. Then call [=sanitize=] on |child|'s [=template contents=] with |config|. + 1. Then call [=sanitize core=] on |child|'s [=template contents=] with + |config| and |handle javascript navigation urls|. 1. If |child| is a [=shadow host=]: - 1. Then call [=sanitize=] on |child|'s [=Element/shadow root=] with |config|. - 1. [=list/iterate|For each=] |attr| in |current|'s [=Element/attribute list=]: + 1. Then call [=sanitize core=] on |child|'s [=Element/shadow root=] with + |config| and |handle javascript navigation urls|. + 1. [=list/iterate|For each=] |attr| in |child|'s [=Element/attribute list=]: 1. Let |attrName| be a {{SanitizerAttributeNamespace}} with |attr|'s [=Attr/local name=] and [=Attr/namespace=]. - 1. If |config|["{{SanitizerConfig/attributes}}"] exists and - |config|["{{SanitizerConfig/attributes}}"] does not [=SanitizerConfig/contain=] - |attrName|: - 1. If "data-" is a [=code unit prefix=] of [=Attr/local name=] and - if [=Attr/namespace=] is `null` and - if |config|["{{SanitizerConfig/dataAttributes}}"] exists and is false: - 1. Remove |attr| from |child|. - 1. else if |config|["{{SanitizerConfig/removeAttributes}}"] exists and - |config|["{{SanitizerConfig/removeAttributes}}"] [=SanitizerConfig/contains=] - |attrName|: - 1. Remove |attr| from |child|. - 1. If |config|["{{SanitizerConfig/elements}}"][|elementName|] exists, - and if - |config|["{{SanitizerConfig/elements}}"][|elementName|]["{{SanitizerElementNamespaceWithAttributes/attributes}}"] - exists, and if - |config|["{{SanitizerConfig/elements}}"][|elementName|]["{{SanitizerElementNamespaceWithAttributes/attributes}}"] - does not [=SanitizerConfig/contain=] |attrName|: + 1. If |config|["{{SanitizerConfig/removeAttributes}}"] + [=SanitizerConfig/contains=] |attrName|: 1. Remove |attr| from |child|. - 1. If |config|["{{SanitizerConfig/elements}}"][|elementName|] exists, - and if - |config|["{{SanitizerConfig/elements}}"][|elementName|]["{{SanitizerElementNamespaceWithAttributes/removeAttributes}}"] - exists, and if - |config|["{{SanitizerConfig/elements}}"][|elementName|]["{{SanitizerElementNamespaceWithAttributes/removeAttributes}}"] - [=SanitizerConfig/contains=] |attrName|: + 1. If |config|["{{SanitizerConfig/elements}}"]["{{SanitizerElementNamespaceWithAttributes/removeAttributes}}"] + [=SanitizerConfig/contains=] |attrName|: 1. Remove |attr| from |child|. - 1. If «[|elementName|, |attrName|]» matches an entry in the + + 1. If all of the following are false, then remove |attr| from |child|. + - |config|["{{SanitizerConfig/attributes}}"] + [=SanitizerConfig/contains=] |attrName| + - |config|["{{SanitizerConfig/elements}}"]["{{SanitizerElementNamespaceWithAttributes/attributes}}"] + [=SanitizerConfig/contains=] |attrName| + - "data-" is a [=code unit prefix=] of [=Attr/local name=] and + [=Attr/namespace=] is `null` and + |config|["{{SanitizerConfig/dataAttributes}}"] is true + - |config|["{{SanitizerConfig/otherMarkup}}"] + 1. If |handle javascript navigation urls|and «[|elementName|, |attrName|]» matches an entry in the [=navigating URL attributes list=], and if |attr|'s [=protocol=] is "`javascript:`": 1. Then remove |attr| from |child|. - 1. Call [=sanitize=] on |child|'s [=Element/shadow root=] with |config|. - 1. else: - 1. [=/remove=] |child|.
## Configuration Processing ## {#configuration-processing}
-A |config| is valid if all these conditions are met: - -1. |config| is a [=dictionary=] -1. |config|'s [=map/keys|key set=] does not [=list/contain=] both - "{{SanitizerConfig/elements}}" and "{{SanitizerConfig/removeElements}}" -1. |config|'s [=map/keys|key set=] does not [=list/contain=] both - "{{SanitizerConfig/removeAttributes}}" and "{{SanitizerConfig/attributes}}". -1. [=list/iterate|For any=] |key| of «[ - "{{SanitizerConfig/elements}}", - "{{SanitizerConfig/removeElements}}", - "{{SanitizerConfig/replaceWithChildrenElements}}", - "{{SanitizerConfig/attributes}}", - "{{SanitizerConfig/removeAttributes}}" - ]» where |config|[|key|] [=map/exists=]: - 1. |config|[|key|] is [=SanitizerNameList/valid=]. -1. If |config|["{{SanitizerConfig/elements}}"] exists, then - [=list/iterate|for any=] |element| in |config|[|key|] that is a [=dictionary=]: - 1. |element| does not [=list/contain=] both - "{{SanitizerElementNamespaceWithAttributes/attributes}}" and - "{{SanitizerElementNamespaceWithAttributes/removeAttributes}}". - 1. If either |element|["{{SanitizerElementNamespaceWithAttributes/attributes}}"] - or |element|["{{SanitizerElementNamespaceWithAttributes/removeAttributes}}"] - [=map/exists=], then it is [=SanitizerNameList/valid=]. - 1. Let |tmp| be a [=dictionary=], and for any |key| «[ - "{{SanitizerConfig/elements}}", - "{{SanitizerConfig/removeElements}}", - "{{SanitizerConfig/replaceWithChildrenElements}}", - "{{SanitizerConfig/attributes}}", - "{{SanitizerConfig/removeAttributes}}" - ]» |tmp|[|key|] is set to the result of [=canonicalize a sanitizer - element list=] called on |config|[|key|], and [=HTML namespace=] as default - namespace for the element lists, and `null` as default namespace for the - attributes lists. - - Note: The intent here is to assert about list elements, but without regard - to whether the string shortcut syntax or the explicit dictionary - syntax is used. For example, having "img" in `elements` and - `{ name: "img" }` in `removeElements`. An implementation might well - do this without explicitly canonicalizing the lists at this point. - - 1. Given theses canonicalized name lists, all of the following conditions hold: - - 1. The [=set/intersection=] between - |tmp|["{{SanitizerConfig/elements}}"] and - |tmp|["{{SanitizerConfig/removeElements}}"] - is [=set/empty=]. - 1. The [=set/intersection=] between - |tmp|["{{SanitizerConfig/removeElements}}"] - |tmp|["{{SanitizerConfig/replaceWithChildrenElements}}"] - is [=set/empty=]. - 1. The [=set/intersection=] between - |tmp|["{{SanitizerConfig/replaceWithChildrenElements}}"] and - |tmp|["{{SanitizerConfig/elements}}"] - is [=set/empty=]. - 1. The [=set/intersection=] between - |tmp|["{{SanitizerConfig/attributes}}"] and - |tmp|["{{SanitizerConfig/removeAttributes}}"] - is [=set/empty=]. - - 1. Let |tmpattrs| be |tmp|["{{SanitizerConfig/attributes}}"] if it exists, - and otherwise [=built-in default config=]["{{SanitizerConfig/attributes}}"]. - 1. [=list/iterate|For any=] |item| in |tmp|["{{SanitizerConfig/elements}}"]: - 1. If either |item|["{{SanitizerElementNamespaceWithAttributes/attributes}}"] - or |item|["{{SanitizerElementNamespaceWithAttributes/removeAttributes}}"] - exists: - 1. Then the [=set/difference=] between it and |tmpattrs| is [=set/empty=]. - -
- -
-A |list| of names is valid if all these -conditions are met: - -1. |list| is a [=/list=]. -1. [=list/iterate|For all=] of its members |name|: - 1. |name| is a {{string}} or a [=dictionary=]. - 1. If |name| is a [=dictionary=]: - 1. |name|["{{SanitizerElementNamespace/name}}"] [=map/exists=] and is a {{string}}. - -
- -
-A |config| is canonical if all these conditions are met: - -1. |config| is [=SanitizerConfig/valid=]. -1. |config|'s [=map/keys|key set=] is a [=set/subset=] of - «[ - "{{SanitizerConfig/elements}}", - "{{SanitizerConfig/removeElements}}", - "{{SanitizerConfig/replaceWithChildrenElements}}", - "{{SanitizerConfig/attributes}}", - "{{SanitizerConfig/removeAttributes}}", - "{{SanitizerConfig/comments}}", - "{{SanitizerConfig/dataAttributes}}" - ]» -1. |config|'s [=map/keys|key set=] [=list/contains=] either: - 1. both "{{SanitizerConfig/elements}}" and "{{SanitizerConfig/attributes}}", - but neither of - "{{SanitizerConfig/removeElements}}" or "{{SanitizerConfig/removeAttributes}}". - 1. or both - "{{SanitizerConfig/removeElements}}" and "{{SanitizerConfig/removeAttributes}}", - but neither of - "{{SanitizerConfig/elements}}" or "{{SanitizerConfig/attributes}}". -1. For any |key| of «[ - "{{SanitizerConfig/replaceWithChildrenElements}}", - "{{SanitizerConfig/removeElements}}", - "{{SanitizerConfig/attributes}}", - "{{SanitizerConfig/removeAttributes}}" - ]» where |config|[|key|] [=map/exists=]: - 1. |config|[|key|] is [=SanitizerNameList/canonical=]. -1. If |config|["{{SanitizerConfig/elements}}"] [=map/exists=]: - 1. |config|["{{SanitizerConfig/elements}}"] is [=SanitizerNameWithAttributesList/canonical=]. -1. For any |key| of «[ - "{{SanitizerConfig/comments}}", - "{{SanitizerConfig/dataAttributes}}" - ]»: - 1. if |config|[|key|] [=map/exists=], |config|[|key|] is a {{boolean}}. - -
- -
-A |list| of names is canonical if all these -conditions are met: - -1. |list|[|key|] is a [=/list=]. -1. [=list/iterate|For all=] of its |list|[|key|]'s members |name|: - 1. |name| is a [=dictionary=]. - 1. |name|'s [=map/keys|key set=] [=set/equals=] «[ - "{{SanitizerElementNamespace/name}}", "{{SanitizerElementNamespace/namespace}}" - ]» - 1. |name|'s [=map/values=] are [=string=]s. - -
- -
-A |list| of names is canonical -if all these conditions are met: - -1. |list|[|key|] is a [=/list=]. -1. [=list/iterate|For all=] of its |list|[|key|]'s members |name|: - 1. |name| is a [=dictionary=]. - 1. |name|'s [=map/keys|key set=] [=set/equals=] one of: - 1. «[ - "{{SanitizerElementNamespace/name}}", - "{{SanitizerElementNamespace/namespace}}" - ]» - 1. «[ - "{{SanitizerElementNamespace/name}}", - "{{SanitizerElementNamespace/namespace}}", - "{{SanitizerElementNamespaceWithAttributes/attributes}}" - ]» - 1. «[ - "{{SanitizerElementNamespace/name}}", - "{{SanitizerElementNamespace/namespace}}", - "{{SanitizerElementNamespaceWithAttributes/removeAttributes}}" - ]» - 1. |name|["{{SanitizerElementNamespace/name}}"] and - |name|["{{SanitizerElementNamespace/namespace}}"] are [=string=]s. - 1. |name|["{{SanitizerElementNamespaceWithAttributes/attributes}}"] and - |name|["{{SanitizerElementNamespaceWithAttributes/removeAttributes}}"] - are [=SanitizerNameList/canonical=] if they [=map/exist=]. - -
- - -
-To canonicalize a configuration |config| with a [=boolean=] |safe|: - -Note: The initial set of [=assert=]s assert properties of the built-in - constants, like the [=built-in default config|defaults=] and - the lists of known [=known elements|elements=] and - [=known attributes|attributes=]. - -1. [=Assert=]: [=built-in default config=] is [=SanitizerConfig/canonical=]. -1. [=Assert=]: [=built-in default config=]["elements"] is a [=subset=] of [=known elements=]. -1. [=Assert=]: [=built-in default config=]["attributes"] is a [=subset=] of [=known attributes=]. -1. [=Assert=]: «[ - "elements" → [=known elements=], - "attributes" → [=known attributes=], - ]» is [=SanitizerConfig/canonical=]. -1. If |config| is [=list/empty=] and not |safe|, then return «[]» -1. If |config| is not [=SanitizerConfig/valid=], then [=throw=] a {{TypeError}}. -1. Let |result| be a new [=dictionary=]. -1. For each |key| of «[ - "{{SanitizerConfig/elements}}", - "{{SanitizerConfig/removeElements}}", - "{{SanitizerConfig/replaceWithChildrenElements}}" ]»: - 1. If |config|[|key|] exists, set |result|[|key|] to the result of running - [=canonicalize a sanitizer element list=] on |config|[|key|] with - [=HTML namespace=] as the default namespace. -1. For each |key| of «[ - "{{SanitizerConfig/attributes}}", - "{{SanitizerConfig/removeAttributes}}" ]»: - 1. If |config|[|key|] exists, set |result|[|key|] to the result of running - [=canonicalize a sanitizer element list=] on |config|[|key|] with `null` as - the default namespace. -1. Set |result|["{{SanitizerConfig/comments}}"] to - |config|["{{SanitizerConfig/comments}}"]. -1. Let |default| be the result of [=canonicalizing a configuration=] for the - [=built-in default config=]. -1. If |safe|: - 1. If |config|["{{SanitizerConfig/elements}}"] [=map/exists=]: - 1. Let |elementBlockList| be the [=set/difference=] between - [=known elements=] |default|["{{SanitizerConfig/elements}}"]. - - Note: The "natural" way to enforce the default element list would be - to intersect with it. But that would also eliminate any unknown - (i.e., non-HTML supplied element, like <foo>). So we - construct this helper to be able to use it to subtract any "unsafe" - elements. - 1. Set |result|["{{SanitizerConfig/elements}}"] to the - [=set/difference=] of |result|["{{SanitizerConfig/elements}}"] and - |elementBlockList|. - 1. If |config|["{{SanitizerConfig/removeElements}}"] [=map/exists=]: - 1. Set |result|["{{SanitizerConfig/elements}}"] to the - [=set/difference=] of |default|["{{SanitizerConfig/elements}}"] - and |result|["{{SanitizerConfig/removeElements}}"]. - 1. [=set/Remove=] "{{SanitizerConfig/removeElements}}" from |result|. - 1. If neither |config|["{{SanitizerConfig/elements}}"] nor - |config|["{{SanitizerConfig/removeElements}}"] [=map/exist=]: - 1. Set |result|["{{SanitizerConfig/elements}}"] to - |default|["{{SanitizerConfig/elements}}"]. - 1. If |config|["{{SanitizerConfig/attributes}}"] [=map/exists=]: - 1. Let |attributeBlockList| be the [=set/difference=] between - [=known attributes=] and |default|["{{SanitizerConfig/attributes}}"]; - 1. Set |result|["{{SanitizerConfig/attributes}}"] to the - [=set/difference=] of |result|["{{SanitizerConfig/attributes}}"] and - |attributeBlockList|. - 1. If |config|["{{SanitizerConfig/removeAttributes}}"] [=map/exists=]: - 1. Set |result|["{{SanitizerConfig/attributes}}"] to the - [=set/difference=] of |default|["{{SanitizerConfig/attributes}}"] - and |result|["{{SanitizerConfig/removeAttributes}}"]. - 1. [=set/Remove=] "{{SanitizerConfig/removeAttributes}}" from |result|. - 1. If neither |config|["{{SanitizerConfig/attributes}}"] nor - |config|["{{SanitizerConfig/removeAttributes}}"] [=map/exist=]: - 1. Set |result|["{{SanitizerConfig/attributes}}"] to - |default|["{{SanitizerConfig/attributes}}"]. -1. Else (if not |safe|): - 1. If neither |config|["{{SanitizerConfig/elements}}"] nor - |config|["{{SanitizerConfig/removeElements}}"] [=map/exist=]: - 1. Set |result|["{{SanitizerConfig/elements}}"] to - |default|["{{SanitizerConfig/elements}}"]. - 1. If neither |config|["{{SanitizerConfig/attributes}}"] nor - |config|["{{SanitizerConfig/removeAttributes}}"] [=map/exist=]: - 1. Set |result|["{{SanitizerConfig/attributes}}"] to - |default|["{{SanitizerConfig/attributes}}"]. -1. [=Assert=]: |result| is [=SanitizerConfig/valid=]. -1. [=Assert=]: |result| is [=SanitizerConfig/canonical=]. +To safeify a |config|, do this: + +1. [=Assert=]: The [=built-in safe baseline config=] has + {{SanitizerConfig/removeElements}} and {{SanitizerConfig/removeAttributes}} + keys set, but not {{SanitizerConfig/elements}}, + {{SanitizerConfig/replaceWithChildrenElements}}, or + {{SanitizerConfig/attributes}}. +1. Let |result| be a copy of |config|. +1. [=list/For each=] |elem| in + [=built-in safe baseline config=][{{SanitizerConfig/removeElements}}]: + 1. Call |result|.removeElement(|elem|) +1. [=list/For each=] |attr| in + [=built-in safe baseline config=][{{SanitizerConfig/removeAttributes}}]: + 1. Call |result|.removeAttributes(|attr|) 1. Return |result|.
-In order to canonicalize a sanitizer element list |list|, with a -default namespace |defaultNamespace|, run the following steps: - -1. Let |result| be a new [=ordered set=]. -2. [=list/iterate|For each=] |name| in |list|, call - [=canonicalize a sanitizer name=] on |name| with |defaultNamespace| and - [=set/append=] to |result|. -3. Return |result|. +To set a config |config| on a {{Sanitizer}} |sanitizer|, do this: + +1. [=Assert=]: |config| is a [=dictionary=]. +1. [=list/iterate|For each=] |item| of |config|[{{SanitizerConfig/elements}}] do: + 1. Call |sanitizer|.element(|item|). +1. [=list/iterate|For each=] |item| of |config|[{{SanitizerConfig/removeElements}}] do: + 1. Call |sanitizer|.removeElement(|item|). +1. [=list/iterate|For each=] |item| of |config|[{{SanitizerConfig/replaceWithChildrenElements}}] do: + 1. Call |sanitizer|.replaceWithChildren(|item|). +1. [=list/iterate|For each=] |item| of |config|[{{SanitizerConfig/attributes}}] do: + 1. Call |sanitizer|.attribute(|item|). +1. [=list/iterate|For each=] |item| of |config|[{{SanitizerConfig/removeAttributes}}] do: + 1. Call |sanitizer|.removeAttributes(|item|). +1. Call |sanitizer|.comments(|config|[{{SanitizerConfig/comments}}]). +1. Call |sanitizer|.dataAttributes(|config|[{{SanitizerConfig/dataAttributes}}]). +1. Call |sanitizer|.otherMarkup(|config|[{{SanitizerConfig/otherMarkup}}]). + +Note: Previous versions of this spec had elaborate definitions of how to + canonicalize a config. This has now effectively been moved into the method + definitions.
@@ -688,7 +567,7 @@ namespace |defaultNamespace|, run the following steps: 1. [=Assert=]: |name| is a [=dictionary=] and |name|["name"] [=map/exists=]. 1. Return «[
"`name`" → |name|["name"],
- "`namespace`" → |name|["namespace"] if it [=map/exists=], otherwise |defaultNamespace|
+ "`namespace`" → ( |name|["namespace"] if it [=map/exists=], otherwise |defaultNamespace| )
]».
@@ -730,38 +609,45 @@ regard to order: ## Defaults ## {#sanitization-defaults} -Note: The defaults should follow a certain form, which is checked for at the - beginning of [=canonicalize a configuration=]. +There are four builtins: + +* The [=built-in safe default config=], +* the [=built-in unsafe default config=], +* the [=built-in safe baseline config=], and +* the [=navigating URL attributes list=]. -The built-in default config is as follows: +The built-in safe default config is the same as the [=built-in safe baseline config=]. + +ISSUE: Determine if this actually holds. + + +The built-in unsafe default config is meant to allow anything. +It is as follows: ``` { - elements: [....], - attributes: [....], + allow: [], + removeElements: [], + attributes: [], + removeAttributes: [], comments: true, + otherMarkup: true, } ``` -The known elements are as follows: -``` -[ - { name: "div", namespace: "http://www.w3.org/1999/xhtml" }, - ... -] -``` - -The known attributes are as follows: +The built-in safe baseline config is meant to block only +script-content, and nothing else. It is as follows: ``` -[ - { name: "class", namespace: null }, - ... -] +{ + removeElements: [ + { name: "script", namespace: "http://www.w3.org/1999/xhtml" }, + { name: "script", namespace: "http://www.w3.org/2000/svg" } + ], + removeAttributes: [....], + comments: true, + otherMarkup: true +} ``` -Note: The [=known elements=] and [=known attributes=] should be derived from the - HTML5 specification, rather than being explicitly listed here. Currently, - there are no mechanics to do so. -
The navigating URL attributes list, for which "`javascript:`" navigations are unsafe, are as follows: @@ -769,27 +655,27 @@ navigations are unsafe, are as follows: «[
[ - { "`name`" → "`a`", "`namespace`" → "[=HTML namespace=]" }, + { "`name`" → "`a`", "`namespace`" → [=HTML namespace=] }, { "`name`" → "`href`", "`namespace`" → `null` } ],
[ - { "`name`" → "`area`", "`namespace`" → "[=HTML namespace=]" }, + { "`name`" → "`area`", "`namespace`" → [=HTML namespace=] }, { "`name`" → "`href`", "`namespace`" → `null` } ],
[ - { "`name`" → "`form`", "`namespace`" → "[=HTML namespace=]" }, + { "`name`" → "`form`", "`namespace`" → [=HTML namespace=] }, { "`name`" → "`action`", "`namespace`" → `null` } ],
[ - { "`name`" → "`input`", "`namespace`" → "[=HTML namespace=]" }, + { "`name`" → "`input`", "`namespace`" → [=HTML namespace=] }, { "`name`" → "`formaction`", "`namespace`" → `null` } ],
[ - { "`name`" → "`button`", "`namespace`" → "[=HTML namespace=]" }, + { "`name`" → "`button`", "`namespace`" → [=HTML namespace=] }, { "`name`" → "`formaction`", "`namespace`" → `null` } ],
From d86d83b57384d785a211d37380d92d043beeba7e Mon Sep 17 00:00:00 2001 From: Daniel Vogelheim Date: Fri, 13 Sep 2024 17:07:44 +0200 Subject: [PATCH 2/6] Review feedback --- index.bs | 159 ++++++++++++++++++++++++++++++++----------------------- 1 file changed, 93 insertions(+), 66 deletions(-) diff --git a/index.bs b/index.bs index 08e53ca..b31739f 100644 --- a/index.bs +++ b/index.bs @@ -195,7 +195,7 @@ The parseHTMLUnsafe(|html|, |options|) method s 1. [=Parse HTML from a string=] given |document| and |compliantHTML|. 1. Let |sanitizer| be the result of calling [=get a sanitizer instance from options=] with |options|. -1. Call [=sanitize=] on |document|'s [=tree/root|root node=] with |sanitizer|. +1. Call [=sanitize=] on |document|'s [=tree/root|root node=] with |sanitizer| and false. 1. Return |document|.
@@ -211,7 +211,7 @@ The parseHTML(|html|, |options|) method steps a 1. [=Parse HTML from a string=] given |document| and |html|. 1. Let |sanitizer| be the result of calling [=get a sanitizer instance from options=] with |options|. -1. Call [=sanitize=] on |document|'s [=tree/root|root node=] with |sanitizer|. +1. Call [=sanitize=] on |document|'s [=tree/root|root node=] with |sanitizer| and true. 1. Return |document|. @@ -228,38 +228,45 @@ dictionary SetHTMLOptions { The {{Sanitizer}} configuration object encapsulates a filter configuration. -The same config can be used with both safe or unsafe methods. The intent is +The same config can be used with both safe or unsafe methods, where the safe +methods perform an implicit {{removeUnsafe}} operation. The intent is that one (or a few) configurations will be built-up early on in a page's lifetime, and can then be used whenever needed. This allows implementations to pre-process configurations. -The configuration object is also query-able and can return -configuration dictionaries, -in both safe and unsafe variants. This allows a -page to query and predict what effect a given configuration will have, or -to build a new configuration based on an existing one. +The configuration object can be queried to return a configuration dictionary. +It can also be modified directly.
 [Exposed=(Window,Worker)]
 interface Sanitizer {
   constructor(optional SanitizerConfig config = {});
 
-  // Query configurations:
+  // Query configuration:
   SanitizerConfig get();
-  SanitizerConfig getUnsafe();
 
-  // Modifying a Sanitizer:
-  undefined element(SanitizerElementNamespaceWithAttributes element);
+  // Modify a Sanitizer's lists and fields:
+  undefined allowElement(SanitizerElementNamespaceWithAttributes element);
   undefined removeElement(SanitizerElement element);
-  undefined replaceWithChildren(SanitizerElement element);
+  undefined replaceWithChildrenElement(SanitizerElement element);
   undefined allowAttribute(SanitizerAttribute attribute);
   undefined removeAttribute(SanitizerAttribute attribute);
   undefined setComment(boolean allow);
   undefined setDataAttributes(boolean allow);
   undefined setOtherMarkup(boolean allow);
+
+  // Remove markup that executes script. May modify multiple lists:
+  undefined removeUnsafe();
 };
 
+ISSUE(238): Final naming TBD. + +ISSUE(240): "other markup" TBD. + +ISSUE: Should these be setter methods -- particularly the setXXX(boolean) -- + or setters or properties or somesuch? +
The constructor(|config|) method steps are: @@ -274,50 +281,42 @@ Issue: This abandons all error handling, because setting a config will
The get() method steps are: -1. Return the result of calling [=safeify=] on the result of - [=Sanitizer/getUnsafe=]. - -
- -
-The getUnsafe() method steps are: - 1. Return the value of [=this=]'s [=internal slot=].
-The element(|element|) method steps are: +The allowElement(|element|) method steps are: 1. Let |name| be the result of [=canonicalize a sanitizer name=] |element| with [=HTML namespace=] as the default namespace. -1. [=list/Append=] |name| to [=this=]'s [=internal slot=]'s {{SanitizerConfig/elements}} list. +1. [=SanitizerConfig/Add=] |name| to [=this=]'s [=internal slot=]'s {{SanitizerConfig/elements}}. +1. [=Merge attribute lists=] of |element| to [=this=]'s [=internal slot=]'s {{SanitizerConfig/elements}}. 1. [=list/Remove=] |name| from [=this=]'s [=internal slot=]'s {{SanitizerConfig/removeElements}}. -1. [=list/Remove=] |name| from [=this=]'s [=internal slot=]'s +1. [=list/Remove=] |name| from [=this=]'s [=internal slot=]'s {{SanitizerConfig/replaceWithChildrenElements}}. -ISSUE: This does not handle per-element attribute allow/remove lists.
The removeElement(|element|) method steps are: 1. Let |name| be the result of [=canonicalize a sanitizer name=] |element| with [=HTML namespace=] as the default namespace. -1. [=list/Append=] |name| from [=this=]'s [=internal slot=]'s {{SanitizerConfig/removeElements}}. -1. [=list/Remove=] |name| to [=this=]'s [=internal slot=]'s {{SanitizerConfig/elements}} list. -1. [=list/Remove=] |name| from [=this=]'s [=internal slot=]'s +1. [=SanitizerConfig/Add=] |name| to [=this=]'s [=internal slot=]'s {{SanitizerConfig/removeElements}}. +1. [=list/Remove=] |name| from [=this=]'s [=internal slot=]'s {{SanitizerConfig/elements}} list. +1. [=list/Remove=] |name| from [=this=]'s [=internal slot=]'s {{SanitizerConfig/replaceWithChildrenElements}}.
-The replaceWithChildren(|element|) method steps are: +The replaceWithChildrenElement(|element|) method steps are: 1. Let |name| be the result of [=canonicalize a sanitizer name=] |element| with [=HTML namespace=] as the default namespace. -1. [=list/Append=] |name| from [=this=]'s [=internal slot=]'s +1. [=SanitizerConfig/Add=] |name| to [=this=]'s [=internal slot=]'s {{SanitizerConfig/replaceWithChildrenElements}}. 1. [=list/Remove=] |name| from [=this=]'s [=internal slot=]'s {{SanitizerConfig/removeElements}}. -1. [=list/Remove=] |name| to [=this=]'s [=internal slot=]'s {{SanitizerConfig/elements}} list. +1. [=list/Remove=] |name| from [=this=]'s [=internal slot=]'s {{SanitizerConfig/elements}} list.
@@ -325,19 +324,18 @@ The replaceWithChildren(|element|) method step The allowAttribute(|attribute|) method steps are: 1. Let |name| be the result of [=canonicalize a sanitizer name=] |attribute| with the `null` as the default namespace. -1. [=list/Append=] |name| from [=this=]'s [=internal slot=]'s +1. [=SanitizerConfig/Add=] |name| to [=this=]'s [=internal slot=]'s {{SanitizerConfig/attributes}}. 1. [=list/Remove=] |name| from [=this=]'s [=internal slot=]'s {{SanitizerConfig/removeAttributes}}. -
The removeAttribute(|attribute|) method steps are: 1. Let |name| be the result of [=canonicalize a sanitizer name=] |attribute| with the `null` as the default namespace. -1. [=list/Append=] |name| from [=this=]'s [=internal slot=]'s {{SanitizerConfig/removeAttributes}}. -1. [=list/Remove=] |name| from [=this=]'s [=internal slot=]'s +1. [=SanitizerConfig/Add=] |name| to [=this=]'s [=internal slot=]'s {{SanitizerConfig/removeAttributes}}. +1. [=list/Remove=] |name| from [=this=]'s [=internal slot=]'s {{SanitizerConfig/attributes}}.
@@ -363,6 +361,14 @@ The setOtherMarkup(|allow|) method steps are: +
+The removeUnsafe() method steps are: + +1. Update [=this=]'s [=internal slot=] with the result of calling [=remove unsafe=] + on [=this=]'s [=internal slot=]. + +
+ ## The Configuration Dictionary ## {#config}
@@ -426,14 +432,14 @@ To set and filter HTML, given an {{Element}} or {{DocumentFragment}}
 To get a sanitizer instance from options for
 an options dictionary |options|, do:
 
-1. Assert: |options| is a [=dictionary=].
+1. [=Assert=]: |options| is a [=dictionary=].
 1. If |options|["`sanitizer`"] doesn't [=map/exist=],
    then return new {{Sanitizer}}().
-1. Assert: |options|["`sanitizer`"] is either a {{Sanitizer}} instance
+1. [=Assert=]: |options|["`sanitizer`"] is either a {{Sanitizer}} instance
    or a [=dictionary=].
 1. If |options|["`sanitizer`"] is a {{Sanitizer}} instance:
    Then return  |options|["`sanitizer`"].
-1. Assert: |options|["`sanitizer`"] is a [=dictionary=].
+1. [=Assert=]: |options|["`sanitizer`"] is a [=dictionary=].
 1. Return new {{Sanitizer}}(|options|["`sanitizer`"]).
 
 
@@ -445,7 +451,7 @@ For the main sanitize operation, using a {{ParentNode}} |node|, a
 {{Sanitizer}} |sanitizer| and a [=boolean=] |safe|, run these steps:
 
 1. Let |config| be the value of |sanitizer|'s [=internal slot=].
-1. If |safe|, let |config| be the result of calling [=safeify=] on |config|.
+1. If |safe|, let |config| be the result of calling [=remove unsafe=] on |config|.
 1. Call [=sanitize core=] on |node|, |config|, and |safe| (as value for
     handling javascript navigation urls).
 
@@ -455,7 +461,7 @@ For the main sanitize operation, using a {{ParentNode}} |node|, a
 The sanitize core operation,
 using a {{ParentNode}} |node|, a {{SanitizerConfig}} |config|, and a
 [=boolean=] |handle javascript navigation urls|, iterates over the DOM tree
-beginning with |node|, and may recurse to handle some special cases (e.g. 
+beginning with |node|, and may recurse to handle some special cases (e.g.
 template contents). It consistes of these steps:
 
 1. Let |current| be |node|.
@@ -506,7 +512,7 @@ template contents). It consistes of these steps:
             [=Attr/namespace=] is `null` and
             |config|["{{SanitizerConfig/dataAttributes}}"] is true
          - |config|["{{SanitizerConfig/otherMarkup}}"]
-      1. If |handle javascript navigation urls|and «[|elementName|, |attrName|]» matches an entry in the
+      1. If |handle javascript navigation urls| and «[|elementName|, |attrName|]» matches an entry in the
          [=navigating URL attributes list=], and if |attr|'s [=protocol=] is
          "`javascript:`":
          1. Then remove |attr| from |child|.
@@ -516,7 +522,14 @@ template contents). It consistes of these steps:
 ## Configuration Processing ## {#configuration-processing}
 
 
-To safeify a |config|, do this: + +Note: While this algorithm is called [=remove unsafe=], we use + the term "unsafe" strictly in the sense + of this spec, to denote content that will + execute JavaScript when inserted into the document. In other words, this + method will remove oportunities for XSS. + +To remove unsafe from a |config|, do this: 1. [=Assert=]: The [=built-in safe baseline config=] has {{SanitizerConfig/removeElements}} and {{SanitizerConfig/removeAttributes}} @@ -526,10 +539,10 @@ To safeify a |config|, do this: 1. Let |result| be a copy of |config|. 1. [=list/For each=] |elem| in [=built-in safe baseline config=][{{SanitizerConfig/removeElements}}]: - 1. Call |result|.removeElement(|elem|) + 1. Call |result|.{{Sanitizer/removeElement()|removeElement}}(|elem|) 1. [=list/For each=] |attr| in [=built-in safe baseline config=][{{SanitizerConfig/removeAttributes}}]: - 1. Call |result|.removeAttributes(|attr|) + 1. Call |result|.{{Sanitizer/removeAttribute()|removeAttribute}}(|attr|) 1. Return |result|.
@@ -539,18 +552,18 @@ To set a config |config| on a {{Sanitizer}} |sanitize 1. [=Assert=]: |config| is a [=dictionary=]. 1. [=list/iterate|For each=] |item| of |config|[{{SanitizerConfig/elements}}] do: - 1. Call |sanitizer|.element(|item|). + 1. Call |sanitizer|.{{Sanitizer/allowElement()|allowElement}}(|item|). 1. [=list/iterate|For each=] |item| of |config|[{{SanitizerConfig/removeElements}}] do: - 1. Call |sanitizer|.removeElement(|item|). + 1. Call |sanitizer|.{{Sanitizer/removeElement()|removeElement}}(|item|). 1. [=list/iterate|For each=] |item| of |config|[{{SanitizerConfig/replaceWithChildrenElements}}] do: - 1. Call |sanitizer|.replaceWithChildren(|item|). + 1. Call |sanitizer|.{{Sanitizer/replaceWithChildrenElement()|replaceWithChildrenElement}}(|item|). 1. [=list/iterate|For each=] |item| of |config|[{{SanitizerConfig/attributes}}] do: - 1. Call |sanitizer|.attribute(|item|). + 1. Call |sanitizer|.{{Sanitizer/allowAttribute()|allowAttribute}}(|item|). 1. [=list/iterate|For each=] |item| of |config|[{{SanitizerConfig/removeAttributes}}] do: - 1. Call |sanitizer|.removeAttributes(|item|). -1. Call |sanitizer|.comments(|config|[{{SanitizerConfig/comments}}]). -1. Call |sanitizer|.dataAttributes(|config|[{{SanitizerConfig/dataAttributes}}]). -1. Call |sanitizer|.otherMarkup(|config|[{{SanitizerConfig/otherMarkup}}]). + 1. Call |sanitizer|.{{Sanitizer/removeAttribute()|removeAttribute}}(|item|). +1. Call |sanitizer|.{{Sanitizer/setComment()|setComment}}(|config|[{{SanitizerConfig/comments}}]). +1. Call |sanitizer|.{{Sanitizer/setDataAttributes()|setDataAttributes}}(|config|[{{SanitizerConfig/dataAttributes}}]). +1. Call |sanitizer|.{{Sanitizer/setOtherMarkup()|setOtherMarkup}}(|config|[{{SanitizerConfig/otherMarkup}}]). Note: Previous versions of this spec had elaborate definitions of how to canonicalize a config. This has now effectively been moved into the method @@ -572,13 +585,40 @@ namespace |defaultNamespace|, run the following steps: -## Supporting Algorithms ## {#alg-support} +
+To add a |name| to a |list|, where |name| is +[=canonicalize a sanitizer name|canonicalized=] and |list| is an [=ordered map=]: + +1. If |list| [=SanitizerConfig/contains=] |name|, then return. +1. [=list/Append=] |name| to |list|. + +
+To merge attribute lists of an |element| to a +|list|, run these steps: + +1. [=Assert=]: |element| is an {{SanitizerElementNamespaceWithAttributes}}. +1. [=Assert=]: |list| [=SanitizerConfig/contains=] |element|. +1. [=list/iterate|For each=] |attr| in + |element|[{{SanitizerElementNamespaceWithAttributes/attributes}}], + [=SanitizerConfig/add=] |attr| to + |list|[|element|][{{SanitizerElementNamespaceWithAttributes/attributes}}]. +1. [=list/iterate|For each=] |attr| in + |element|[{{SanitizerElementNamespaceWithAttributes/removeAttributes}}], + [=SanitizerConfig/add=] |attr| to + |list|[|element|][{{SanitizerElementNamespaceWithAttributes/removeAttributes}}]. + +
+ +## Supporting Algorithms ## {#alg-support} + For the [=canonicalize a sanitizer name|canonicalized=] {{SanitizerElementNamespace|element}} and {{SanitizerAttributeNamespace|attribute name}} lists used in this spec, list membership is based on matching both "`name`" and "`namespace`" entries: + +
A Sanitizer name |list| contains an |item| if there exists an |entry| of |list| that is an [=ordered map=], and where |item|["name"] [=equals=] |entry|["name"] and @@ -586,19 +626,6 @@ if there exists an |entry| of |list| that is an [=ordered map=], and where
-
-Set difference (or set subtraction) is a clone of a set A, but with all members -removed that occur in a set B: -To compute the difference of two [=ordered sets=] |A| and |B|: - -1. Let |set| be a new [=ordered set=]. -1. [=list/iterate|For each=] |item| of |A|: - 1. If |B| does not [=set/contain=] |item|, then [=set/append=] |item| - to |set|. -1. Return |set|. - -
-
Equality for [=ordered sets=] is equality of its members, but without regard to order: From 764d394ba117f590308c4cbf665ab7ce2c2f1cca Mon Sep 17 00:00:00 2001 From: Daniel Vogelheim Date: Wed, 18 Sep 2024 16:06:22 +0200 Subject: [PATCH 3/6] Remove otherMarkup --- index.bs | 25 ++++++++++--------------- 1 file changed, 10 insertions(+), 15 deletions(-) diff --git a/index.bs b/index.bs index b31739f..975381e 100644 --- a/index.bs +++ b/index.bs @@ -253,7 +253,6 @@ interface Sanitizer { undefined removeAttribute(SanitizerAttribute attribute); undefined setComment(boolean allow); undefined setDataAttributes(boolean allow); - undefined setOtherMarkup(boolean allow); // Remove markup that executes script. May modify multiple lists: undefined removeUnsafe(); @@ -264,9 +263,16 @@ ISSUE(238): Final naming TBD. ISSUE(240): "other markup" TBD. +ISSUE: Can a missing dict value and a dict entry with an empty sequence be + treated differently? + ISSUE: Should these be setter methods -- particularly the setXXX(boolean) -- or setters or properties or somesuch? +ISSUE: Should the modifier methods return a reference to [=this=], so that you + can 'chain' methods? + (e.g. `sanitizer.allowElement("a").allowElement("span")`). +
The constructor(|config|) method steps are: @@ -354,13 +360,6 @@ The setDataAttributes(|allow|) method steps ar
-
-The setOtherMarkup(|allow|) method steps are: - -1. Set [=this=]'s [=internal slot=]'s {{SanitizerConfig/otherMarkup}} to |allow|. - -
-
The removeUnsafe() method steps are: @@ -402,10 +401,10 @@ dictionary SanitizerConfig { boolean comments; boolean dataAttributes; - boolean otherMarkup; };
+ISSUE: Sould members be required, or have declared defaults? # Algorithms # {#algorithms} @@ -480,7 +479,7 @@ template contents). It consistes of these steps: 1. else: 1. Let |elementName| be a {{SanitizerElementNamespace}} with |child|'s [=Element/local name=] and [=Element/namespace=]. - 1. If |config|["{{SanitizerConfig/removeElements}}"] [=SanitizerConfig/contains=] |elementName|, or if |config|["{{SanitizerConfig/elements}}"] does not [=SanitizerConfig/contain=] |elementName| and |config|["{{SanitizerConfig/otherMarkup}}"] is false: + 1. If |config|["{{SanitizerConfig/removeElements}}"] [=SanitizerConfig/contains=] |elementName|, or if |config|["{{SanitizerConfig/elements}}"] is not [=list/empty=] and does not [=SanitizerConfig/contain=] |elementName|: 1. [=/remove=] |child|. 1. If |config|["{{SanitizerConfig/replaceWithChildrenElements}}"] [=SanitizerConfig/contains=] |elementName|: 1. Call [=sanitize core=] on |child| with |config| and @@ -504,14 +503,13 @@ template contents). It consistes of these steps: 1. Remove |attr| from |child|. 1. If all of the following are false, then remove |attr| from |child|. - - |config|["{{SanitizerConfig/attributes}}"] + - |config|["{{SanitizerConfig/attributes}}"] [=list/exists=] and [=SanitizerConfig/contains=] |attrName| - |config|["{{SanitizerConfig/elements}}"]["{{SanitizerElementNamespaceWithAttributes/attributes}}"] [=SanitizerConfig/contains=] |attrName| - "data-" is a [=code unit prefix=] of [=Attr/local name=] and [=Attr/namespace=] is `null` and |config|["{{SanitizerConfig/dataAttributes}}"] is true - - |config|["{{SanitizerConfig/otherMarkup}}"] 1. If |handle javascript navigation urls| and «[|elementName|, |attrName|]» matches an entry in the [=navigating URL attributes list=], and if |attr|'s [=protocol=] is "`javascript:`": @@ -563,7 +561,6 @@ To set a config |config| on a {{Sanitizer}} |sanitize 1. Call |sanitizer|.{{Sanitizer/removeAttribute()|removeAttribute}}(|item|). 1. Call |sanitizer|.{{Sanitizer/setComment()|setComment}}(|config|[{{SanitizerConfig/comments}}]). 1. Call |sanitizer|.{{Sanitizer/setDataAttributes()|setDataAttributes}}(|config|[{{SanitizerConfig/dataAttributes}}]). -1. Call |sanitizer|.{{Sanitizer/setOtherMarkup()|setOtherMarkup}}(|config|[{{SanitizerConfig/otherMarkup}}]). Note: Previous versions of this spec had elaborate definitions of how to canonicalize a config. This has now effectively been moved into the method @@ -657,7 +654,6 @@ It is as follows: attributes: [], removeAttributes: [], comments: true, - otherMarkup: true, } ``` @@ -671,7 +667,6 @@ script-content, and nothing else. It is as follows: ], removeAttributes: [....], comments: true, - otherMarkup: true } ``` From c5c79ab5dfc191a67e47d00d3f671839c687c27c Mon Sep 17 00:00:00 2001 From: Daniel Vogelheim Date: Thu, 19 Sep 2024 15:24:11 +0200 Subject: [PATCH 4/6] Review feedback. --- index.bs | 42 ++++++++++++++++++++++-------------------- 1 file changed, 22 insertions(+), 20 deletions(-) diff --git a/index.bs b/index.bs index 975381e..5cbf4ca 100644 --- a/index.bs +++ b/index.bs @@ -246,7 +246,7 @@ interface Sanitizer { SanitizerConfig get(); // Modify a Sanitizer's lists and fields: - undefined allowElement(SanitizerElementNamespaceWithAttributes element); + undefined allowElement(SanitizerElementWithAttributes element); undefined removeElement(SanitizerElement element); undefined replaceWithChildrenElement(SanitizerElement element); undefined allowAttribute(SanitizerAttribute attribute); @@ -295,12 +295,31 @@ The get() method steps are: The allowElement(|element|) method steps are: 1. Let |name| be the result of [=canonicalize a sanitizer name=] |element| with [=HTML namespace=] as the default namespace. -1. [=SanitizerConfig/Add=] |name| to [=this=]'s [=internal slot=]'s {{SanitizerConfig/elements}}. -1. [=Merge attribute lists=] of |element| to [=this=]'s [=internal slot=]'s {{SanitizerConfig/elements}}. +1. [=list/Remove=] |name| from [=this=]'s [=internal slot=]'s {{SanitizerConfig/elements}}. +1. [=list/Append=] |name| to [=this=]'s [=internal slot=]'s {{SanitizerConfig/elements}}. +1. [=list/iterate|For each=] |attr| in + |element|[{{SanitizerElementNamespaceWithAttributes/attributes}}], + [=SanitizerConfig/add=] |attr| to [=this=]'s [=internal slot=]'s + {{SanitizerConfig/elements}}[|name|][{{SanitizerElementNamespaceWithAttributes/attributes}}]. +1. [=list/iterate|For each=] |attr| in + |element|[{{SanitizerElementNamespaceWithAttributes/removeAttributes}}], + [=SanitizerConfig/add=] |attr| to [=this=]'s [=internal slot=]'s + {{SanitizerConfig/elements}}[|name|][{{SanitizerElementNamespaceWithAttributes/removeAttributes}}]. 1. [=list/Remove=] |name| from [=this=]'s [=internal slot=]'s {{SanitizerConfig/removeElements}}. 1. [=list/Remove=] |name| from [=this=]'s [=internal slot=]'s {{SanitizerConfig/replaceWithChildrenElements}}. +NOTE: Handling of [=allowElement=] is a little more complicated than the other + methods, because the element allow list can have per-element allow- and + remove-attribute lists. We first remove the given element from the list + before then adding it, which has the effect of re-setting (rather than + merging or elsehow modifying) the per-element list to whatever is passed + in. In other words, the per-element allow- and remove-lists can only be + set as a whole. + + + +
@@ -591,23 +610,6 @@ To add a |name| to a |list|, where |name| is
-
-To merge attribute lists of an |element| to a -|list|, run these steps: - -1. [=Assert=]: |element| is an {{SanitizerElementNamespaceWithAttributes}}. -1. [=Assert=]: |list| [=SanitizerConfig/contains=] |element|. -1. [=list/iterate|For each=] |attr| in - |element|[{{SanitizerElementNamespaceWithAttributes/attributes}}], - [=SanitizerConfig/add=] |attr| to - |list|[|element|][{{SanitizerElementNamespaceWithAttributes/attributes}}]. -1. [=list/iterate|For each=] |attr| in - |element|[{{SanitizerElementNamespaceWithAttributes/removeAttributes}}], - [=SanitizerConfig/add=] |attr| to - |list|[|element|][{{SanitizerElementNamespaceWithAttributes/removeAttributes}}]. - -
- ## Supporting Algorithms ## {#alg-support} For the [=canonicalize a sanitizer name|canonicalized=] From 9533b47efb617fbe1292e9240a98220613dbba1b Mon Sep 17 00:00:00 2001 From: Daniel Vogelheim Date: Thu, 19 Sep 2024 17:02:41 +0200 Subject: [PATCH 5/6] More feedback. Revive error checking for the constructor. --- index.bs | 37 ++++++++++++++++++++++++++++++------- 1 file changed, 30 insertions(+), 7 deletions(-) diff --git a/index.bs b/index.bs index 5cbf4ca..3832a3c 100644 --- a/index.bs +++ b/index.bs @@ -263,9 +263,6 @@ ISSUE(238): Final naming TBD. ISSUE(240): "other markup" TBD. -ISSUE: Can a missing dict value and a dict entry with an empty sequence be - treated differently? - ISSUE: Should these be setter methods -- particularly the setXXX(boolean) -- or setters or properties or somesuch? @@ -277,10 +274,8 @@ ISSUE: Should the modifier methods return a reference to [=this=], so that you The constructor(|config|) method steps are: -1. [=Set a config|Set=] |config| on [=this=]. - -Issue: This abandons all error handling, because setting a config will - just overwrite contradictory entries. Do we want this? +1. Let |valid| be the return value of [=set a config|Set=] |config| on [=this=]. +1. If not |valid|, throw a {{TypeError}1. If not |valid|, throw a {{TypeError}}} @@ -580,11 +575,39 @@ To set a config |config| on a {{Sanitizer}} |sanitize 1. Call |sanitizer|.{{Sanitizer/removeAttribute()|removeAttribute}}(|item|). 1. Call |sanitizer|.{{Sanitizer/setComment()|setComment}}(|config|[{{SanitizerConfig/comments}}]). 1. Call |sanitizer|.{{Sanitizer/setDataAttributes()|setDataAttributes}}(|config|[{{SanitizerConfig/dataAttributes}}]). +1. Return whether all of the following are true: + - [=list/size=] of |config|[{{SanitizerConfig/elements}}] equals + [=list/size=] of [=this=]'s [=internal slot=]'s {{SanitizerConfig/elements}}. + - [=list/size=] of |config|[{{SanitizerConfig/removeElements}}] equals + [=list/size=] of [=this=]'s [=internal slot=]'s {{SanitizerConfig/removeElements}}. + - [=list/size=] of |config|[{{SanitizerConfig/replaceWithChildrenElements}}] equals + [=list/size=] of [=this=]'s [=internal slot=]'s {{SanitizerConfig/replaceWithChildrenElements}}. + - [=list/size=] of |config|[{{SanitizerConfig/attributes}}] equals + [=list/size=] of [=this=]'s [=internal slot=]'s {{SanitizerConfig/attributes}}. + - [=list/size=] of |config|[{{SanitizerConfig/removeAttributes}}] equals + [=list/size=] of [=this=]'s [=internal slot=]'s {{SanitizerConfig/removeAttributes}}. + - Either |config|[{{SanitizerConfig/elements}}] or + |config|[{{SanitizerConfig/removeElements}}] [=map/exist=], + or neither, but not both. + - Either |config|[{{SanitizerConfig/attributes}}] or + |config|[{{SanitizerConfig/removeAttributes}}] [=map/exist=], + or neither, but not both. Note: Previous versions of this spec had elaborate definitions of how to canonicalize a config. This has now effectively been moved into the method definitions. +Note: This operation is defined in terms of the manipulation methods on the + {{Sanitizer}}. Those methods remove matching entries from other lists. + The size equality steps in the last step would then catch this. + For example: + `{ allow: ["div", "div"] }` would create a Sanitizer with one element in + the allow list. The final test would then return false, which would cause + the caller to throw an exception. + +Issue: This is still missing error checks for the per-element attribute lists + and syntax errors. +
From 45d201f3bcdc156bb62778aeecdfd271c834e837 Mon Sep 17 00:00:00 2001 From: Daniel Vogelheim Date: Thu, 17 Oct 2024 14:03:12 +0200 Subject: [PATCH 6/6] Extend explainer --- explainer.md | 19 +++++++++++++++++++ 1 file changed, 19 insertions(+) diff --git a/explainer.md b/explainer.md index 8ead847..d6b1c8c 100644 --- a/explainer.md +++ b/explainer.md @@ -381,6 +381,8 @@ The following methods are offered on the Sanitizer object: - `replaceWithChildren(x)` - `allowAttribute(x)` - `removeAttribute(x)` +- `comments(bool)` +- `dataAttributes(bool)` These correspond 1:1 to the keys in the configuration dictionary. @@ -401,6 +403,23 @@ s.get(); // { elements: ["div", "p", "span"], removeElements: ["b"] } // namespace entries. ``` +If one wishes to modify the element-dependent attributes, then `allow` is +the way to do this, with a dictionary as argument. This allows `"attributes"` +and `"removeAttributes"` keys, like the configuration dictionary. These +element-dependent attributes are set, meaning they overwrite any previously +set values, rather than some sort of merger operation. + +```js +const s = new Sanitizer(); +s.element({name: "div", attributes: ["id", "class"]}); +s.element({name: "div", attributes: ["style"]}); +// s now allows
, but will drop the id= from
+``` + +Since the configuration is mutable, passing around a pre-configured Sanitizer +can be used to let other callers modify its configuration. The "safe" methods +(`setHTML` and `parseHTML`) will still guarantee XSS safety. + ### Configuration Errors The configuration allows expressing redundant or even contradictory options.