Add a final newline normalization for form payloads #6287
When entries are added to a form's entry list through the "append an entry" algorithm, their newlines are normalized, but entries can be added to an entry list through other means. This change adds a final newline normalization before serializing the form payload, since "append an entry" cannot be changed because its results are observable through the `FormData` object or through the `formdata` event. This change additionally changes the input passed to the `application/x-www-form-urlencoded` and `text/plain` serializers to be a list of name-value pairs, where the values are strings rather than `File` objects. This simplifies the serializer algorithms. Closes whatwg#6247. Closes whatwg/url#562.
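As a rough illustration of the two steps described above (a final newline normalization, and handing the serializers plain name-value string pairs), here is a minimal sketch. The function names, the `[name, value]` array shape for entries, and the use of a file's name as its string value are assumptions made for the example, not the spec text itself.

```javascript
// Hypothetical helpers; a sketch, not the spec algorithms verbatim.

// Replace every CRLF pair, lone CR, or lone LF with a single CRLF.
function normalizeNewlines(text) {
  return text.replace(/\r\n|\r|\n/g, "\r\n");
}

// Turn an entry list into name-value pairs whose values are always strings.
// Assumption: an entry is a [name, value] array, and a file value is any
// object with a string `name` property (standing in for a File).
function convertToNameValuePairs(entryList) {
  return entryList.map(([name, value]) => {
    const stringValue = typeof value === "string" ? value : value.name;
    return [normalizeNewlines(name), normalizeNewlines(stringValue)];
  });
}

const pairs = convertToNameValuePairs([
  ["comment", "line one\nline two"],
  ["upload", { name: "notes.txt" }],
]);
```

Because the conversion happens only at serialization time, the entry list itself (and anything observing it) is left untouched in this sketch.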
This looks solid editorially, but I haven't been following a lot of the discussion. Tests, which help generate a list of ways browsers deviate from the tests, seem like a good next step. I'll also note that we would probably benefit from factoring out a "normalize newlines to CRLF" operation, either in HTML (if we consider that sort of thing kinda legacy) or Infra (if we anticipate it being used more widely). I'm pretty sure that beyond just this section, there are other parts of HTML that would use it. But that could be done as a followup.
I've now linked to web-platform-tests/wpt#26740, which used to contain the tests for whatwg/url#562 before it grew in scope. Now I've added a lot of tests that should cover this change.
So I was thinking that we should check for the submission type when constructing an entry list and flatten things there inline. I guess that is not possible because of the `formdata` event? It would be a lot nicer if we can find an algorithm that does not have as much duplication.
cc @whatwg/forms
You could do the normalization right after step 7 in constructing the entry list so it wouldn't affect the

    form.addEventListener("formdata", evt => {
      evt.formData.append("a", "b\nc");
    });
    const formData = new FormData(form);
    assert_equals(formData.get("a"), "b\nc"); // in all browsers (except for WebKit, which doesn't support the formdata event)

You could do the normalization at some point after constructing the entry list in the form submission algorithm, but I chose to do it in the different behaviors in step 23 because enctype isn't the only piece of data that determines the serialization type. But we could test for enctype == |
I just realized that |
If only Safari does it that's not necessarily what we want to do though, right? It makes some sense to do newline normalization for user input (i.e., through |
So, for implementer feedback, this is the current state of things across browsers, and the way this PR changes it. See #6247 for an explanation of the terms "String/
*. Firefox turns newlines in |
@mfreed7 @whatwg/forms @cdumez @tkent-google thoughts on the summary above? I still like the idea of trying not to touch |
I just realized that this might not be simpler spec-wise for the cases of The distinction between String/ But I opened whatwg/url#562 because (for String/ I suspect implementing this in Chromium would have the same issue about keeping track of which entries were normalized. *. Although we might also want to change what's observable from the |
Thanks, as in #6247, for the great summary of the state of things. I'm in favor of the proposal as written, which is effectively to normalize everything except for filenames on @annevk, about your suggestion to leave |
After whatwg/html#6287, no callers are left which invoke the `application/x-www-form-urlencoded` serializer with file values.
(Apologies for the earlier reply, I had not properly paged this back in.) I am mainly bothered by the inelegance of normalizing twice. Also, if you use |
You can observe the results of constructing the entry list with In any case, this PR can be merged now until we decide whether to change the behavior of Edit: Marking Firefox as interested in this PR, since I'll be implementing it. |
The WPT PR (web-platform-tests/wpt#26740) is now approved, and I asked @annevk to merge it, believing this PR to have some level of consensus. I now realize that's not as clear as I thought. Let me know if I should revert it in the meantime. In any case, I think this change is ready for another look.
I pushed some nits that I hope are acceptable. There are further things that could be cleaned up, but they affect a large part of the existing text as well and are probably best tackled separately (e.g., referencing the members of an entry explicitly).
Let's figure out how to land this with #6624 and then land this next week, now that there's implementer buy-in.
I've removed the various notes about double normalization so it's easier to land this PR and #6624 together. See #6624 (review)
I hadn't realized that the "convert to a list of name-value pairs" algorithm as I'd written it leaves filenames unnormalized in |
Complements whatwg#6287 and whatwg#6624. Fixes whatwg#6647. See also whatwg#6662 for further cleanup on the textarea data model.
As of #6287 newlines are normalized when form data is serialized. This removes the (mostly redundant) normalization in constructing the entry list. Tests: web-platform-tests/wpt#28798. Follow-up: #6697. Fixes #6469.
After whatwg/html#6287 no callers are left which invoke the application/x-www-form-urlencoded serializer with file values. Closes #562.
Chrome used to perform newline normalization in form payloads early in the form submission algorithm (specifically, during entry list construction), along with an additional normalization at the point of serializing the form data into `application/x-www-form-urlencoded` and `text/plain` form payloads. The early normalization was spec-compliant; the late one wasn't, but was required because servers expected it.

This change implements some recent changes in the HTML spec that have standardized this newline normalization behavior across browsers, removing the early normalization entirely and replacing it with a late normalization at the point of serializing, even for the `multipart/form-data` enctype.

This change does not affect the submitted form payload for normal user-triggered form submissions. It only changes how form entry lists are inspected from JavaScript using the `FormData` API and the `formdata` event, and how form entries added from JavaScript are serialized (by fetching with a `FormData` body, or by modifying a form submission through form-associated custom elements or through the `formdata` event). As such, this change is implemented behind a default-on `LateFormNewlineNormalization` feature flag.

This change implements the following HTML spec PRs:
whatwg/html#6287
whatwg/html#6624
whatwg/html#6697

See also: https://blog.whatwg.org/newline-normalizations-in-form-submission
Intent to Ship: https://groups.google.com/a/chromium.org/g/blink-dev/c/XULXQrbFznw

Fixed: 1167095
Change-Id: I0a28369aefe052c413a427733ef33d70988a13c1
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/3226394
Reviewed-by: Kent Tamura <tkent@chromium.org>
Reviewed-by: Mason Freed <masonf@chromium.org>
Commit-Queue: Kent Tamura <tkent@chromium.org>
Cr-Commit-Position: refs/heads/main@{#936197}
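To make the early-versus-late distinction in the commit message above concrete, here is a toy model (not Chromium code). It assumes entries are plain `[name, value]` string arrays and uses `URLSearchParams` as a stand-in for the `application/x-www-form-urlencoded` serializer; `earlyModel`, `lateModel`, and `toCRLF` are hypothetical names invented for the sketch.

```javascript
// Hypothetical helper: replace CRLF, lone CR, or lone LF with CRLF.
const toCRLF = (s) => s.replace(/\r\n|\r|\n/g, "\r\n");

// Early model: normalize while the entry list is being constructed,
// so scripts inspecting the entry list see the normalized values.
function earlyModel(rawEntries) {
  const entryList = rawEntries.map(([n, v]) => [toCRLF(n), toCRLF(v)]);
  const payload = new URLSearchParams(entryList).toString();
  return { observed: entryList, payload };
}

// Late model: keep the entry list as-is and normalize only when serializing.
function lateModel(rawEntries) {
  const entryList = rawEntries.map(([n, v]) => [n, v]);
  const pairs = entryList.map(([n, v]) => [toCRLF(n), toCRLF(v)]);
  const payload = new URLSearchParams(pairs).toString();
  return { observed: entryList, payload };
}

const early = earlyModel([["a", "b\nc"]]);
const late = lateModel([["a", "b\nc"]]);
```

In this model both payloads are identical, while the value a script would observe in the entry list differs: CRLF in the early model, the original LF in the late one. That mirrors the claim that only script-visible behavior changes, not the payload of an ordinary submission.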
`FormData` object as the body of a `Request` or `Response`, since it doesn't support FACEs or the `formdata` event). (See WHATWG Working Mode: Changes for more details.)