From 18dcc3ccad482e4d82ff51c11562dfcb8c085213 Mon Sep 17 00:00:00 2001
From: Andreu Botella <abb@randomunok.com>
Date: Wed, 13 Jan 2021 21:10:10 +0100
Subject: [PATCH 1/6] Add a final newline normalization for form payloads

Entries added to a form's entry list through the "append an entry"
algorithm have their newlines normalized, but entries can also be added
to an entry list through other means. "Append an entry" itself cannot
be changed, because its results are observable through the `FormData`
object and the `formdata` event, so this change adds a final newline
normalization just before the form payload is serialized.

This change also makes the input to the
`application/x-www-form-urlencoded` and `text/plain` serializers a list
of name-value pairs whose values are always strings, never `File`
objects. This simplifies the serializer algorithms.

Closes #6247. Closes whatwg/url#562.
---
 source | 113 ++++++++++++++++++++++++++++++++++++++++++++++++---------
 1 file changed, 95 insertions(+), 18 deletions(-)
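
Reviewer note, not part of the spec text: the newline normalization and
the reworked text/plain encoding algorithm described in this patch can
be sketched in Python roughly as follows. The function names
normalize_newlines and encode_text_plain are hypothetical and chosen
only for illustration; this is a sketch of the prose, not a reference
implementation.

```python
from __future__ import annotations

import re

# CRLF is tried first, so an existing CRLF pair is replaced by itself and
# only a lone CR or a lone LF is actually changed.
_NEWLINE = re.compile(r"\r\n|\r|\n")


def normalize_newlines(text: str) -> str:
    """Replace every lone CR and every lone LF with a CRLF pair."""
    return _NEWLINE.sub("\r\n", text)


def encode_text_plain(pairs: list[tuple[str, str]]) -> str:
    """Serialize name-value pairs as name=value lines, each ending in CRLF.

    The values are assumed to already be strings, which is what the
    conversion to a list of name-value pairs is meant to guarantee.
    """
    result = ""
    for name, value in pairs:
        result += name + "=" + value + "\r\n"
    return result
```

Applying normalize_newlines to its own output leaves the string
unchanged, which is why normalizing an entry a second time at
serialization is harmless.
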
diff --git a/source b/source
index accbd2780ff..05d55b0dcd2 100644
--- a/source
+++ b/source
@@ -56069,9 +56069,12 @@ fur
     <dl>
      <dt><dfn data-x="submit-mutate-action">Mutate action URL</dfn>
      <dd>
+      <p>Let <var>pairs</var> be the result of <span data-x="convert to a list of name-value
+      pairs">converting to a list of name-value pairs</span> with <var>entry list</var>.</p>
+
       <p>Let <var>query</var> be the result of running the
-      <span><code>application/x-www-form-urlencoded</code> serializer</span> with <var>entry
-      list</var> and <var>encoding</var>.</p>
+      <span><code>application/x-www-form-urlencoded</code> serializer</span> with <var>pairs</var>
+      and <var>encoding</var>.</p>
 
       <p>Set <var>parsed action</var>'s <span data-x="concept-url-query">query</span>
       component to <var>query</var>.</p>
@@ -56087,9 +56090,12 @@ fur
        <dt><code data-x="attr-fs-enctype-urlencoded">application/x-www-form-urlencoded</code></dt>
 
        <dd>
+        <p>Let <var>pairs</var> be the result of <span data-x="convert to a list of name-value
+        pairs">converting to a list of name-value pairs</span> with <var>entry list</var>.</p>
+
         <p>Let <var>body</var> be the result of running the
-        <span><code>application/x-www-form-urlencoded</code> serializer</span> with <var>entry
-        list</var> and <var>encoding</var>.</p>
+        <span><code>application/x-www-form-urlencoded</code> serializer</span> with <var>pairs</var>
+        and <var>encoding</var>.</p>
 
         <p>Set <var>body</var> to the result of <span data-x="UTF-8 encode">encoding</span>
         <var>body</var>.</p>
@@ -56100,6 +56106,24 @@ fur
        <dt><code data-x="attr-fs-enctype-formdata">multipart/form-data</code></dt>
 
        <dd>
+        <p>For each <span data-x="formdata-entry">entry</span> in <var>entry list</var>:</p>
+
+        <ol>
+         <li><p>Replace every occurrence of U+000D (CR) not followed by U+000A (LF), and every
+         occurrence of U+000A (LF) not preceded by U+000D (CR), in the entry's name, by a string
+         consisting of a U+000D (CR) and U+000A (LF).</p></li>
+         <li><p>If the entry's value is not a <code>File</code> object, replace every occurrence of
+         U+000D (CR) not followed by U+000A (LF), and every occurrence of U+000A (LF) not preceded
+         by U+000D (CR), in the entry's value, by a string consisting of a U+000D (CR) and U+000A
+         (LF).</p></li>
+        </ol>
+
+        <p class="note">These newline conversions in this algorithm are necessary because not all
+        names and string values in entry lists reaching this point need have been previously
+        normalized when <span data-x="append an entry">appending the entry</span>. That
+        normalization is idempotent, so implementations are allowed to keep track of which names and
+        values have been previously normalized in order to skip them in this algorithm.</p>
+
         <p>Let <var>body</var> be the result of running the <span><code
         data-x="">multipart/form-data</code> encoding algorithm</span> with <var>entry list</var>
         and <var>encoding</var>.</p>
@@ -56114,8 +56138,11 @@ fur
        <dt><code data-x="attr-fs-enctype-text">text/plain</code></dt>
 
        <dd>
+        <p>Let <var>pairs</var> be the result of <span data-x="convert to a list of name-value
+        pairs">converting to a list of name-value pairs</span> with <var>entry list</var>.</p>
+
         <p>Let <var>body</var> be the result of running the <span><code data-x="">text/plain</code>
-        encoding algorithm</span> with <var>entry list</var>.</p>
+        encoding algorithm</span> with <var>pairs</var>.</p>
 
         <p>Set <var>body</var> to the result of <span data-x="encode">encoding</span>
         <var>body</var> using <var>encoding</var>.</p>
@@ -56141,9 +56168,12 @@ fur
 
      <dt><dfn data-x="submit-mailto-headers">Mail with headers</dfn>
      <dd>
+      <p>Let <var>pairs</var> be the result of <span data-x="convert to a list of name-value
+      pairs">converting to a list of name-value pairs</span> with <var>entry list</var>.</p>
+
       <p>Let <var>headers</var> be the result of running the
-      <span><code>application/x-www-form-urlencoded</code> serializer</span> with <var>entry
-      list</var> and <var>encoding</var>.</p>
+      <span><code>application/x-www-form-urlencoded</code> serializer</span> with <var>pairs</var>
+      and <var>encoding</var>.</p>
 
       <p>Replace occurrences of U+002B PLUS SIGN characters (+) in <var>headers</var> with
       the string "<code data-x="">%20</code>".</p>
@@ -56156,6 +56186,9 @@ fur
 
      <dt><dfn data-x="submit-mailto-body">Mail as body</dfn>
      <dd>
+      <p>Let <var>pairs</var> be the result of <span data-x="convert to a list of name-value
+      pairs">converting to a list of name-value pairs</span> with <var>entry list</var>.</p>
+
       <p>Switch on <var>enctype</var>:
 
       <dl class="switch">
@@ -56163,7 +56196,7 @@ fur
 
        <dd>
         <p>Let <var>body</var> be the result of running the <span><code data-x="">text/plain</code>
-        encoding algorithm</span> with <var>entry list</var>.</p>
+        encoding algorithm</span> with <var>pairs</var>.</p>
 
         <p>Set <var>body</var> to the result of running <span>UTF-8 percent-encode</span> on
         <var>body</var> using the <span>default encode set</span>. <ref spec=URL></p>
@@ -56172,8 +56205,8 @@ fur
        <dt>Otherwise</dt>
 
        <dd><p>Let <var>body</var> be the result of running the
-       <span><code>application/x-www-form-urlencoded</code> serializer</span> with <var>entry
-       list</var> and <var>encoding</var>.</p></dd>
+       <span><code>application/x-www-form-urlencoded</code> serializer</span> with <var>pairs</var>
+       and <var>encoding</var>.</p></dd>
       </dl>
 
       <p>If <var>parsed action</var>'s <span data-x="concept-url-query">query</span> is null, then
@@ -56514,6 +56547,53 @@ fur
 
   </div>
 
+  <div w-nodev>
+
+  <h5>Converting an entry list to a list of name-value pairs</h5>
+
+  <p>The <code>application/x-www-form-urlencoded</code> and <code data-x="text/plain encoding
+  algorithm">text/plain</code> encoding algorithms take a list of name-value pairs, where the values
+  must be strings, rather than an entry list where the value can be a <code>File</code>. The
+  following algorithm performs the conversion.</p>
+
+  <p>To <dfn>convert to a list of name-value pairs</dfn> an entry list <var>entry list</var>, run
+  these steps:</p>
+
+  <ol>
+   <li><p>Let <var>list</var> be an empty <span>list</span> of name-value pairs.</p></li>
+
+   <li>
+    <p>For each <span data-x="formdata-entry">entry</span> in <var>entry list</var>:</p>
+
+    <ol>
+     <li><p>Let <var>name</var> be the entry's name, with every occurrence of U+000D (CR) not
+     followed by U+000A (LF), and every occurrence of U+000A (LF) not preceded by U+000D (CR),
+     replaced by a string consisting of U+000D (CR) and U+000A (LF).</p></li>
+
+     <li><p>If the entry's value is a <code>File</code>, then let <var>value</var> be the entry's
+     value's <code data-x="dom-file-name">name</code>. Otherwise, let <var>value</var> be the
+     entry's value, with every occurrence of U+000D (CR) not followed by U+000A (LF), and every
+     occurrence of U+000A (LF) not preceded by U+000D (CR), replaced by a string consisting of
+     U+000D (CR) and U+000A (LF).</p></li>
+
+     <li><p><span data-x="list append">Append</span> to <var>list</var> a new name-value pair whose
+     name is <var>name</var> and whose value is <var>value</var>.</p></li>
+    </ol>
+   </li>
+
+   <li><p>Return <var>list</var>.</p></li>
+  </ol>
+
+  <p class="note">The newline conversions in this algorithm are necessary because not all names and
+  string values reaching the <code>application/x-www-form-urlencoded</code> or <code
+  data-x="text/plain encoding algorithm">text/plain</code> serializers need have been previously
+  normalized when <span data-x="append an entry">appending the entry</span>, and in fact no
+  filenames have. That normalization is idempotent, so implementations are allowed to keep track of
+  which names and string values have been previously normalized in order to skip them in this
+  algorithm.</p>
+
+  </div>
+
 
   <h5>URL-encoded form data</h5>
 
@@ -56576,24 +56656,21 @@ fur
 
   <div w-nodev>
 
-  <p>The <dfn><code data-x="">text/plain</code> encoding algorithm</dfn>, given an <var>entry
-  list</var>, is as follows:</p>
+  <p>The <dfn><code data-x="">text/plain</code> encoding algorithm</dfn>, given a list of name-value
+  pairs <var>pairs</var>, is as follows:</p>
 
   <ol>
    <li><p>Let <var>result</var> be the empty string.</p></li>
 
    <li>
-    <p>For each <span data-x="formdata-entry">entry</span> in <var>entry list</var>:</p>
+    <p>For each <var>pair</var> in <var>pairs</var>:</p>
 
     <ol>
-     <li><p>If the entry's value is a <code>File</code> object, then set its value to the
-     <code>File</code> object's <code data-x="dom-file-name">name</code>.</p></li>
-
-     <li><p>Append the entry's name to <var>result</var>.</p></li>
+     <li><p>Append <var>pair</var>'s name to <var>result</var>.</p></li>
 
      <li><p>Append a single U+003D EQUALS SIGN character (=) to <var>result</var>.</p></li>
 
-     <li><p>Append the entry's value to <var>result</var>.</p></li>
+     <li><p>Append <var>pair</var>'s value to <var>result</var>.</p></li>
 
      <li><p>Append a U+000D CARRIAGE RETURN (CR) U+000A LINE FEED (LF) character pair to <var>result</var>.</p></li>
     </ol>

From 98abf83ea563e1e9299957ea32b1bd07b9114ec2 Mon Sep 17 00:00:00 2001
From: Andreu Botella <abb@randomunok.com>
Date: Mon, 18 Jan 2021 13:16:54 +0100
Subject: [PATCH 2/6] Move the multipart/form-data normalizations into the
 multipart/form-data encoding algorithm

---
 source | 38 ++++++++++++++++++++------------------
 1 file changed, 20 insertions(+), 18 deletions(-)
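
Reviewer note, not part of the spec text: a minimal Python sketch of the
normalization step this patch moves to the start of the
multipart/form-data encoding algorithm. FileStub is a hypothetical
stand-in for a `File` object, and normalize_for_multipart is an
illustrative name. The spec text mutates the entries in place, whereas
this sketch returns a new list for simplicity.

```python
from __future__ import annotations

import re
from dataclasses import dataclass


@dataclass
class FileStub:
    """Hypothetical stand-in for a File object: a filename plus its bytes."""
    name: str
    data: bytes


def normalize_newlines(text: str) -> str:
    """Replace every lone CR and every lone LF with a CRLF pair."""
    return re.sub(r"\r\n|\r|\n", "\r\n", text)


def normalize_for_multipart(entries):
    """Normalize entry names and string values; leave File values alone."""
    normalized = []
    for name, value in entries:
        name = normalize_newlines(name)
        if not isinstance(value, FileStub):
            # Only string values are normalized in this step.
            value = normalize_newlines(value)
        normalized.append((name, value))
    return normalized
```

In this step a File value passes through unchanged, including its
filename and contents; only entry names and string values are rewritten.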

diff --git a/source b/source
index 71905cf6f72..083ba50eaec 100644
--- a/source
+++ b/source
@@ -56098,24 +56098,6 @@ fur
        <dt><code data-x="attr-fs-enctype-formdata">multipart/form-data</code></dt>
 
        <dd>
-        <p>For each <span data-x="formdata-entry">entry</span> in <var>entry list</var>:</p>
-
-        <ol>
-         <li><p>Replace every occurrence of U+000D (CR) not followed by U+000A (LF), and every
-         occurrence of U+000A (LF) not preceded by U+000D (CR), in the entry's name, by a string
-         consisting of a U+000D (CR) and U+000A (LF).</p></li>
-         <li><p>If the entry's value is not a <code>File</code> object, replace every occurrence of
-         U+000D (CR) not followed by U+000A (LF), and every occurrence of U+000A (LF) not preceded
-         by U+000D (CR), in the entry's value, by a string consisting of a U+000D (CR) and U+000A
-         (LF).</p></li>
-        </ol>
-
-        <p class="note">These newline conversions in this algorithm are necessary because not all
-        names and string values in entry lists reaching this point need have been previously
-        normalized when <span data-x="append an entry">appending the entry</span>. That
-        normalization is idempotent, so implementations are allowed to keep track of which names and
-        values have been previously normalized in order to skip them in this algorithm.</p>
-
         <p>Let <var>body</var> be the result of running the <span><code
         data-x="">multipart/form-data</code> encoding algorithm</span> with <var>entry list</var>
         and <var>encoding</var>.</p>
@@ -56598,6 +56580,26 @@ fur
   list</var> and <var>encoding</var>, is as follows:</p>
 
   <ol>
+   <li>
+    <p>For each <span data-x="formdata-entry">entry</span> in <var>entry list</var>:</p>
+
+    <ol>
+     <li><p>Replace every occurrence of U+000D (CR) not followed by U+000A (LF), and every
+     occurrence of U+000A (LF) not preceded by U+000D (CR), in the entry's name, by a string
+     consisting of a U+000D (CR) and U+000A (LF).</p></li>
+     <li><p>If the entry's value is not a <code>File</code> object, replace every occurrence of
+     U+000D (CR) not followed by U+000A (LF), and every occurrence of U+000A (LF) not preceded
+     by U+000D (CR), in the entry's value, by a string consisting of a U+000D (CR) and U+000A
+     (LF).</p></li>
+    </ol>
+
+    <p class="note">These newline conversions in this algorithm are necessary because not all
+    names and string values in entry lists reaching this point need have been previously
+    normalized when <span data-x="append an entry">appending the entry</span>. That
+    normalization is idempotent, so implementations are allowed to keep track of which names and
+    values have been previously normalized in order to skip them in this algorithm.</p>
+   </li>
+
    <li>
     <p>Return the byte sequence resulting from encoding the <var>entry list</var> using the rules
     described by RFC 7578, <cite>Returning Values from Forms: <code

From fdcfcb068d72ee482aee21c6ca01b0f9a7736778 Mon Sep 17 00:00:00 2001
From: Andreu Botella <abb@randomunok.com>
Date: Mon, 18 Jan 2021 13:21:45 +0100
Subject: [PATCH 3/6] Reword the notes about double normalization.

---
 source | 22 +++++++++++-----------
 1 file changed, 11 insertions(+), 11 deletions(-)

diff --git a/source b/source
index 083ba50eaec..19c97b69ea2 100644
--- a/source
+++ b/source
@@ -56549,12 +56549,12 @@ fur
    <li><p>Return <var>list</var>.</p></li>
   </ol>
 
-  <p class="note">The newline conversions in this algorithm are necessary because not all names and
-  string values reaching the <code>application/x-www-form-urlencoded</code> or <code
-  data-x="text/plain encoding algorithm">text/plain</code> serializers need have been previously
-  normalized when <span data-x="append an entry">appending the entry</span>, and in fact no
-  filenames have. That normalization is idempotent, so implementations are allowed to keep track of
-  which names and string values have been previously normalized in order to skip them in this
+  <p class="note">The newline conversions in this algorithm are necessary because some of the names
+  and string values reaching the <code>application/x-www-form-urlencoded</code> or <code
+  data-x="text/plain encoding algorithm">text/plain</code> serializers might not have been
+  previously normalized in the <span data-x="append an entry">appending the entry</span> algorithm,
+  and in fact no filenames have. That normalization is idempotent, so implementations are allowed to
+  keep track of which entries have been previously normalized in order to skip them in this
   algorithm.</p>
 
   </div>
@@ -56593,11 +56593,11 @@ fur
      (LF).</p></li>
     </ol>
 
-    <p class="note">These newline conversions in this algorithm are necessary because not all
-    names and string values in entry lists reaching this point need have been previously
-    normalized when <span data-x="append an entry">appending the entry</span>. That
-    normalization is idempotent, so implementations are allowed to keep track of which names and
-    values have been previously normalized in order to skip them in this algorithm.</p>
+    <p class="note">These newline conversions in this algorithm are necessary because some of the
+    names and string values in entry lists reaching this encoding algorithm might not have been
+    previously normalized in the <span data-x="append an entry">appending the entry</span>
+    algorithm. That normalization is idempotent, so implementations are allowed to keep track of
+    which entries have been previously normalized in order to skip them in this algorithm.</p>
    </li>
 
    <li>

From 99a4be8b02612e5d8497504331b50daf4badf560 Mon Sep 17 00:00:00 2001
From: Anne van Kesteren <annevk@annevk.nl>
Date: Mon, 26 Apr 2021 17:31:31 +0200
Subject: [PATCH 4/6] nits

---
 source | 27 ++++++++++++++-------------
 1 file changed, 14 insertions(+), 13 deletions(-)

diff --git a/source b/source
index 19c97b69ea2..a0aec6a7945 100644
--- a/source
+++ b/source
@@ -56528,18 +56528,18 @@ fur
    <li><p>Let <var>list</var> be an empty <span>list</span> of name-value pairs.</p></li>
 
    <li>
-    <p>For each <span data-x="formdata-entry">entry</span> in <var>entry list</var>:</p>
+    <p><span data-x="list iterate">For each</span> <var>entry</var> of <var>entry list</var>:</p>
 
     <ol>
-     <li><p>Let <var>name</var> be the entry's name, with every occurrence of U+000D (CR) not
+     <li><p>Let <var>name</var> be <var>entry</var>'s name, with every occurrence of U+000D (CR) not
      followed by U+000A (LF), and every occurrence of U+000A (LF) not preceded by U+000D (CR),
      replaced by a string consisting of U+000D (CR) and U+000A (LF).</p></li>
 
-     <li><p>If the entry's value is a <code>File</code>, then let <var>value</var> be the entry's
-     value's <code data-x="dom-file-name">name</code>. Otherwise, let <var>value</var> be the
-     entry's value, with every occurrence of U+000D (CR) not followed by U+000A (LF), and every
-     occurrence of U+000A (LF) not preceded by U+000D (CR), replaced by a string consisting of
-     U+000D (CR) and U+000A (LF).</p></li>
+     <li><p>If <var>entry</var>'s value is a <code>File</code> object, then let <var>value</var> be
+     <var>entry</var>'s value's <code data-x="dom-file-name">name</code>. Otherwise, let
+     <var>value</var> be <var>entry</var>'s value, with every occurrence of U+000D (CR) not followed
+     by U+000A (LF), and every occurrence of U+000A (LF) not preceded by U+000D (CR), replaced by a
+     string consisting of U+000D (CR) and U+000A (LF).</p></li>
 
      <li><p><span data-x="list append">Append</span> to <var>list</var> a new name-value pair whose
      name is <var>name</var> and whose value is <var>value</var>.</p></li>
@@ -56581,16 +56581,17 @@ fur
 
   <ol>
    <li>
-    <p>For each <span data-x="formdata-entry">entry</span> in <var>entry list</var>:</p>
+    <p><span data-x="list iterate">For each</span> <var>entry</var> of <var>entry list</var>:</p>
 
     <ol>
      <li><p>Replace every occurrence of U+000D (CR) not followed by U+000A (LF), and every
-     occurrence of U+000A (LF) not preceded by U+000D (CR), in the entry's name, by a string
+     occurrence of U+000A (LF) not preceded by U+000D (CR), in <var>entry</var>'s name, by a string
      consisting of a U+000D (CR) and U+000A (LF).</p></li>
-     <li><p>If the entry's value is not a <code>File</code> object, replace every occurrence of
-     U+000D (CR) not followed by U+000A (LF), and every occurrence of U+000A (LF) not preceded
-     by U+000D (CR), in the entry's value, by a string consisting of a U+000D (CR) and U+000A
-     (LF).</p></li>
+
+     <li><p>If <var>entry</var>'s value is not a <code>File</code> object, then replace every
+     occurrence of U+000D (CR) not followed by U+000A (LF), and every occurrence of U+000A (LF) not
+     preceded by U+000D (CR), in <var>entry</var>'s value, by a string consisting of a U+000D (CR)
+     and U+000A (LF).</p></li>
     </ol>
 
     <p class="note">These newline conversions in this algorithm are necessary because some of the

From b96d053957e97957e504d04eeb40719863799710 Mon Sep 17 00:00:00 2001
From: Andreu Botella <abb@randomunok.com>
Date: Mon, 3 May 2021 10:26:58 +0200
Subject: [PATCH 5/6] Delete notes about double normalization.

See https://github.com/whatwg/html/pull/6624#pullrequestreview-648240331
---
 source | 14 --------------
 1 file changed, 14 deletions(-)

diff --git a/source b/source
index a0aec6a7945..5e4a988e057 100644
--- a/source
+++ b/source
@@ -56549,14 +56549,6 @@ fur
    <li><p>Return <var>list</var>.</p></li>
   </ol>
 
-  <p class="note">The newline conversions in this algorithm are necessary because some of the names
-  and string values reaching the <code>application/x-www-form-urlencoded</code> or <code
-  data-x="text/plain encoding algorithm">text/plain</code> serializers might not have been
-  previously normalized in the <span data-x="append an entry">appending the entry</span> algorithm,
-  and in fact no filenames have. That normalization is idempotent, so implementations are allowed to
-  keep track of which entries have been previously normalized in order to skip them in this
-  algorithm.</p>
-
   </div>
 
 
@@ -56593,12 +56585,6 @@ fur
      preceded by U+000D (CR), in <var>entry</var>'s value, by a string consisting of a U+000D (CR)
      and U+000A (LF).</p></li>
     </ol>
-
-    <p class="note">These newline conversions in this algorithm are necessary because some of the
-    names and string values in entry lists reaching this encoding algorithm might not have been
-    previously normalized in the <span data-x="append an entry">appending the entry</span>
-    algorithm. That normalization is idempotent, so implementations are allowed to keep track of
-    which entries have been previously normalized in order to skip them in this algorithm.</p>
    </li>
 
    <li>

From 33c99316dd5cd6bba91ee4102ae1b0df01ccebd9 Mon Sep 17 00:00:00 2001
From: Andreu Botella <abb@randomunok.com>
Date: Fri, 7 May 2021 09:29:38 +0200
Subject: [PATCH 6/6] Don't forget to normalize filenames in urlencoded and
 text/plain.

---
 source | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)
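
Reviewer note, not part of the spec text: with this patch applied, the
"convert to a list of name-value pairs" steps can be sketched in Python
roughly as below. FileStub is a hypothetical stand-in for a `File`
object and convert_to_pairs is an illustrative name. The point of the
change is that the value is normalized after the `File` branch, so
filenames are normalized too.

```python
from __future__ import annotations

import re
from dataclasses import dataclass


@dataclass
class FileStub:
    """Hypothetical stand-in for a File object: a filename plus its bytes."""
    name: str
    data: bytes


def normalize_newlines(text: str) -> str:
    """Replace every lone CR and every lone LF with a CRLF pair."""
    return re.sub(r"\r\n|\r|\n", "\r\n", text)


def convert_to_pairs(entries):
    """Turn an entry list into (name, value) pairs whose values are strings."""
    pairs = []
    for name, value in entries:
        name = normalize_newlines(name)
        if isinstance(value, FileStub):
            # A File contributes only its filename to these serializers.
            value = value.name
        # Normalization now applies to filenames as well as string values.
        value = normalize_newlines(value)
        pairs.append((name, value))
    return pairs
```

Before this patch, the equivalent of the final normalize_newlines call
sat only in the non-File branch, so a filename containing a lone CR or
LF reached the serializers unnormalized.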

diff --git a/source b/source
index 5e4a988e057..08d96b29a74 100644
--- a/source
+++ b/source
@@ -56537,9 +56537,11 @@ fur
 
      <li><p>If <var>entry</var>'s value is a <code>File</code> object, then let <var>value</var> be
      <var>entry</var>'s value's <code data-x="dom-file-name">name</code>. Otherwise, let
-     <var>value</var> be <var>entry</var>'s value, with every occurrence of U+000D (CR) not followed
-     by U+000A (LF), and every occurrence of U+000A (LF) not preceded by U+000D (CR), replaced by a
-     string consisting of U+000D (CR) and U+000A (LF).</p></li>
+     <var>value</var> be <var>entry</var>'s value.</p></li>
+
+     <li><p>Replace every occurrence of U+000D (CR) not followed by U+000A (LF), and every occurrence
+     of U+000A (LF) not preceded by U+000D (CR), in <var>value</var>, by a string consisting of
+     U+000D (CR) and U+000A (LF).</p></li>
 
      <li><p><span data-x="list append">Append</span> to <var>list</var> a new name-value pair whose
      name is <var>name</var> and whose value is <var>value</var>.</p></li>