First pass at an explainer.

w3c · Sep 15, 2017 · 4e64299 · vrana · Sep 28, 2017 · koto
commit 4e64299
Showing 1 changed file with 225 additions and 0 deletions.
diff --git a/README.md b/README.md
@@ -0,0 +1,225 @@
+# Explainer: Trusted Types for DOM Manipulation
+
+## The Problem
+
+As described in Christoph Kern's "[Securing the Tangled Web](https://research.google.com/pubs/pub42934.html),"
+Google has been fairly successful at combating DOM-based XSS attacks by relying on a set of
+[typed objects](https://github.com/google/safe-html-types/blob/master/doc/index.md) instead of
+strings to represent HTML snippets, URLs, etc. Compile-time analysis ensures that only these
+types can be assigned to various DOM APIs that can be used as DOM-based XSS sinks (`el.innerHTML`,
+`location.href`, and so on). These types do not mitigate XSS in themselves, but instead aim for a
+state where security reviewers don't need to deeply understand and review each and every usage of
+a given sink, but can instead focus their efforts on the code that generates the typed objects. As
+long as these "trusted" types are always generated by safe templating libraries, sanitizers,
+constants, and so on, developers can have a high degree of confidence that the risk of DOM-based
+XSS remains low.
+
+Google's internal implementation has a number of bells and whistles (and makes a number of
+assumptions about requirements) that probably aren't suitable for the world at large. It would be
+interesting to explore how we might extract a more general version of this concept from those
+internal tools in order to bring this kind of functionality to the web.
+For example, different applications might have different opinions about what makes a particular
+HTML snippet "safe", but regardless of the definition, it seems clear that the browser is
+well-positioned to enforce type constraints dynamically at runtime. That would be a substantial
+improvement over the tight link between the type system and the compiler.
+
+## A Possible Approach
+
+While we could jam all sorts of sanitization functionality into such a system, it seems reasonable
+to start small until we know how existing templating systems and sanitizers will layer any
+primitives we introduce into their existing systems. The following three-pronged approach seems
+compelling as a first step:
+
+1.  Introduce a number of types that correspond to the XSS sinks we wish to protect. For example,
+    we could define a `TrustedHTML` object that would automatically escape interesting characters,
+    making it suitable for injection via `innerHTML`.
+
+    These types should be pretty minimal in nature, making them polyfillable in browsers that don't
+    support them natively.
+
+2.  Enumerate all the XSS sinks we wish to protect, and overload each of them with a variant that
+    accepts a safe type. For example, `Element.innerHTML`'s setter could accept `(DOMString or TrustedHTML)`,
+    and we could overload `document.write(DOMString)` with `document.write(TrustedHTML)`.
+
+    As above, this mechanism should be polyfillable; the polyfilled types could define stringifiers
+    which would enable them to be automatically cast into strings when called on existing setters.
+
+3.  Introduce a mechanism for disabling the raw string version of each of the sinks identified
+    above. For example, something like a theoretical `Content-Security-Policy: require-trusted-types`
+    header could cause the `innerHTML` setter to throw a `TypeError` if a raw string was passed in.
+
+    This is a little more difficult to polyfill, but should be possible for many (all?) setters and
+    methods that aren't marked as [`[Unforgeable]`](https://heycam.github.io/webidl/#Unforgeable).
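
To make steps 2 and 3 concrete, here is a rough sketch of how a polyfill might guard a string-accepting setter. Everything in it is an assumption for illustration: `FakeElement` stands in for a real DOM element so the snippet runs outside a browser, `guardSetter` and the `requireTrustedTypes` switch are hypothetical names, and the `TrustedHTML` class is a bare stand-in for the type described in step 1.

```javascript
// Bare stand-in for a trusted type. toString() plays the role of the
// WebIDL `stringifier`, so legacy setters can still coerce it to a string.
class TrustedHTML {
  constructor(html) { this.html = String(html); }
  static unsafelyCreate(html) { return new TrustedHTML(html); }
  toString() { return this.html; }
}

// Stand-in for a DOM element (assumption: a real polyfill would patch
// the browser's own prototypes instead).
class FakeElement {
  constructor() { this.markup = ""; }
  get innerHTML() { return this.markup; }
  set innerHTML(value) { this.markup = String(value); } // legacy behaviour: coerce to string
}

// Hypothetical switch standing in for the `require-trusted-types` flag.
let requireTrustedTypes = false;

// Wrap an existing accessor so raw strings are rejected once the flag is on.
function guardSetter(proto, name, TrustedType) {
  const original = Object.getOwnPropertyDescriptor(proto, name);
  Object.defineProperty(proto, name, {
    configurable: original.configurable,
    enumerable: original.enumerable,
    get: original.get,
    set(value) {
      if (requireTrustedTypes && !(value instanceof TrustedType)) {
        throw new TypeError(`${name} requires a ${TrustedType.name}`);
      }
      original.set.call(this, value);
    },
  });
}

guardSetter(FakeElement.prototype, "innerHTML", TrustedHTML);

const el = new FakeElement();
el.innerHTML = "<b>raw</b>"; // accepted while the flag is off

requireTrustedTypes = true;
el.innerHTML = TrustedHTML.unsafelyCreate("<i>typed</i>"); // typed values still pass

let rejected = false;
try {
  el.innerHTML = "<b>raw</b>"; // raw strings are now refused
} catch (e) {
  rejected = e instanceof TypeError;
}
```

The wrapper only works because the accessor is configurable, which is why `[Unforgeable]` members are called out above as the hard case.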
+
+### Trusted Types
+
+*   **TrustedHTML**: This type would be used to represent a trusted snippet that could be passed
+    into an HTML context.
+
+    ```
+    interface TrustedHTML {
+      static TrustedHTML escape(DOMString html);
+      static TrustedHTML unsafelyCreate(DOMString html);
+
+      stringifier;
+    };
+    ```
+
+    *   The static `escape` method would produce a `TrustedHTML` object that neutered the string
+        provided by entity-encoding all instances of `&`, `<`, `>`, `"`, and `'`.
+
+    *   The static `unsafelyCreate` method would produce a `TrustedHTML` object that accepted the
+        provided string as-is.
+
+*   **TrustedURL**: This type would be used to represent a trusted URL that could be used to load
+    resources or navigate a frame.
+
+    ```
+    interface TrustedURL {
+      static TrustedURL sanitize(DOMString url);
+      static TrustedURL unsafelyCreate(DOMString url);
+
+      stringifier;
+    };
+    ```
+
+    *   The static `sanitize` method would produce a `TrustedURL` object by resolving the
+        given string against the document's base URL, ensuring that the result was a valid URL
+        with an `http` or `https` scheme (blocking things like `javascript:` or external
+        protocol handlers). Strings that didn't make the cut would be replaced with `about:invalid`.
+
+    *   The static `unsafelyCreate` method would produce a `TrustedURL` object that accepted the
+        provided string as-is, producing a URL by resolving the given string against the document's
+        base URL.
+
+*   **TrustedTODO**: TODO(koto@)
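
One way the two types above could be polyfilled is sketched below. This is not a definitive implementation: the explicit `base` parameter is an assumption standing in for the document's base URL, `&#39;` is just one possible encoding of `'`, and the public constructors are a shortcut that a real implementation would likely hide.

```javascript
// Replacements for the five characters that `escape` is described as encoding.
const ENTITIES = { "&": "&amp;", "<": "&lt;", ">": "&gt;", '"': "&quot;", "'": "&#39;" };

class TrustedHTML {
  #html;
  constructor(html) { this.#html = html; }

  // Neuter the input by entity-encoding &, <, >, " and '.
  static escape(html) {
    return new TrustedHTML(String(html).replace(/[&<>"']/g, (ch) => ENTITIES[ch]));
  }

  // Accept the input as-is; callers take responsibility for its safety.
  static unsafelyCreate(html) {
    return new TrustedHTML(String(html));
  }

  // Plays the role of the WebIDL `stringifier`.
  toString() { return this.#html; }
}

class TrustedURL {
  #url;
  constructor(url) { this.#url = url; }

  // Resolve against `base`, then keep only valid http(s) URLs.
  static sanitize(url, base) {
    try {
      const parsed = new URL(url, base);
      if (parsed.protocol === "http:" || parsed.protocol === "https:") {
        return new TrustedURL(parsed.href);
      }
    } catch (e) {
      // Unparseable input falls through to about:invalid.
    }
    return new TrustedURL("about:invalid");
  }

  // Resolve against `base` but skip the scheme check.
  static unsafelyCreate(url, base) {
    return new TrustedURL(new URL(url, base).href);
  }

  toString() { return this.#url; }
}

const escaped = TrustedHTML.escape('<img src=x onerror="alert(1)">');
const relative = TrustedURL.sanitize("/path?q=1", "https://example.com/app/");
const scripted = TrustedURL.sanitize("javascript:alert(1)", "https://example.com/");
```

Here `String(escaped)` contains no live markup, `relative` resolves to an `https` URL on `example.com`, and `scripted` collapses to `about:invalid`.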
+
+### DOM Sinks
+
+*   **HTML Contexts**: Given something like `typedef (DOMString or TrustedHTML) HTMLString`, we'd
+    poke at a number of methods and attribute setters to accept the new type:
+
+    ```
+    partial interface Element {
+        attribute HTMLString innerHTML;
+        attribute HTMLString outerHTML;
+        void insertAdjacentHTML(DOMString position, HTMLString text);
+    };
+    ```
+
+    ```
+    partial interface Document {
+        void write(HTMLString text);
+        void writeln(HTMLString text);
+    };
+    ```
+
+    ```
+    partial interface DOMParser {
+        Document parseFromString(HTMLString str, SupportedType type);
+    };
+    ```
+
+    ```
+    partial interface Range {
+        DocumentFragment createContextualFragment(HTMLString fragment);
+    };
+    ```
+
+    ```
+    partial interface HTMLIFrameElement {
+        attribute HTMLString srcdoc;
+    };
+    ```
+
+*   **URL Contexts**: Given something like `typedef (USVString or TrustedURL) URLString`, we'd poke
+    at a number of methods and attribute setters to accept the new type:
+
+    ```
+    partial interface Location {
+        stringifier attribute URLString href;
+        void assign(URLString url);
+        void replace(URLString url);
+
+        // (These aren't `URLString`, but they should be something)
+        attribute DOMString pathname;
+        attribute DOMString search;
+    };
+    ```
+
+    ```
+    // A few element types go here: `HTMLBaseElement`, `HTMLLinkElement`, and
+    // `HTMLHyperlinkElementUtils`, from a quick skim through HTML.
+    partial interface HTMLXXXElement : HTMLElement {
+        attribute URLString href;
+    };
+    ```
+
+    ```
+    // A few element types go here. `HTMLSourceElement`, `HTMLImageElement`,
+    // `HTMLIFrameElement`, `HTMLEmbedElement`, `HTMLTrackElement`,
+    // `HTMLMediaElement`, `HTMLInputElement`, `HTMLScriptElement`, `HTMLFrameElement`
+    // from a quick skim through HTML.
+    //
+    // The same applies to their SVG variants.
+    partial interface HTMLXXXElement : HTMLElement {
+        attribute URLString src;
+        attribute URLString srcset; // Only `HTMLSourceElement` and `HTMLImageElement`
+    };
+    ```
+
+    ```
+    partial interface HTMLObjectElement : HTMLElement {
+        attribute URLString data;
+        attribute URLString codebase;
+    };
+    ```
+
+    ```
+    partial interface Document {
+        attribute URLString location;
+    };
+    ```
+
+    ```
+    partial interface Window {
+        attribute URLString location;
+        void open(URLString location);
+    };
+    ```
+
+    ```
+    partial interface WorkerGlobalScope {
+        void importScripts(URLString... urls);
+    };
+    ```
+
+*   **JavaScript Contexts**: Replace `DOMString` in the following with something
+    reasonable.
+
+    ```
+    partial interface Window {
+        void eval(DOMString code);
+        void setTimeout(DOMString code, long timeout);
+        void setInterval(DOMString code, long timeout);
+    };
+    ```
+
+    ```
+    partial interface HTMLScriptElement : HTMLElement {
+        attribute DOMString innerText;
+        attribute DOMString text;
+        attribute DOMString textContent;
+    };
+    ```
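
Method sinks in the lists above differ from attribute setters in that the typed value can sit at any argument position (the second argument of `insertAdjacentHTML`, for example). A table-driven wrapper is one possible way to cover them. The sketch below rests on several assumptions: the `Fake*` classes stand in for the real DOM interfaces so it runs outside a browser, `METHOD_SINKS` and `guardMethods` are hypothetical names, the trusted types are reduced to bare wrappers, and enforcement is always on rather than controlled by a flag.

```javascript
// Bare stand-ins for the trusted types.
class TrustedHTML {
  constructor(value) { this.value = String(value); }
  toString() { return this.value; }
}
class TrustedURL {
  constructor(value) { this.value = String(value); }
  toString() { return this.value; }
}

// Stand-ins for the DOM interfaces that own the sinks.
class FakeDocument {
  constructor() { this.output = []; }
  write(text) { this.output.push(String(text)); }
}
class FakeElement {
  constructor() { this.inserted = []; }
  insertAdjacentHTML(position, text) { this.inserted.push([position, String(text)]); }
}
class FakeLocation {
  constructor() { this.current = "about:blank"; }
  assign(url) { this.current = String(url); }
}

// Each entry names a method sink, the argument to check, and the required type.
const METHOD_SINKS = [
  { proto: FakeDocument.prototype, name: "write", argIndex: 0, type: TrustedHTML },
  { proto: FakeElement.prototype, name: "insertAdjacentHTML", argIndex: 1, type: TrustedHTML },
  { proto: FakeLocation.prototype, name: "assign", argIndex: 0, type: TrustedURL },
];

// Replace every listed method with a wrapper that type-checks one argument.
function guardMethods(sinks) {
  for (const { proto, name, argIndex, type } of sinks) {
    const original = proto[name];
    proto[name] = function (...args) {
      if (!(args[argIndex] instanceof type)) {
        throw new TypeError(`${name}: argument ${argIndex} must be a ${type.name}`);
      }
      return original.apply(this, args);
    };
  }
}

guardMethods(METHOD_SINKS);

const doc = new FakeDocument();
doc.write(new TrustedHTML("<p>ok</p>")); // passes the check

const div = new FakeElement();
div.insertAdjacentHTML("beforeend", new TrustedHTML("<span>ok</span>")); // position stays a plain string
```

Keeping the sink list as data would make it easier to audit which sinks are covered, and to grow it as more sinks are documented.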
+
+## Open Questions
+
+1.  Sebastian doesn't like `Content-Security-Policy`, so maybe we should spell the flag in #3 above
+    differently. He proposed `Disable-Unsafe-APIs: True`.
+
+2.  Artur and Koto suggest that we'll need something more granular than the global flag, however
+    we spell it, in order to deal with piecemeal migrations.
+
+3.  Define more types.
+
+4.  Document more sinks.