Skip to content

Conversation

@annevk
Copy link
Member

@annevk annevk commented Jan 24, 2026

Fixes #895.

Copy link

@ChALkeR ChALkeR left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, %1B(J%5C%1B(B matches what I'm getting there
And the diff \ -> %5C matches what I expect as \ should be escaped

Thanks!

@ChALkeR
Copy link

ChALkeR commented Jan 24, 2026

Side note: having this as a test in WPT could be helpful to ensure that impls didn't use the wrong test
There are some tests for iso-2022-jp percent-encode, but not with %5C

Upd: ah, there is a full-range test for form/url, it just doesn't cover userinfo, but that's likely unreachable

Also give them a proper type.

Fixes #895.
@annevk annevk force-pushed the annevk/userinfo-backslash branch from 5c50135 to e52054e Compare January 24, 2026 11:19
@annevk annevk changed the title Correct example that did not encode backslash in userinfo Document invariants of percent-encode sets better Jan 24, 2026
@annevk
Copy link
Member Author

annevk commented Jan 24, 2026

That's a very good point. Which is probably also why I got the example wrong in the first place. I revised this PR with a more drastic change.

@annevk annevk requested review from ChALkeR, domenic and rmisev and removed request for ChALkeR January 24, 2026 11:23
url.bs Outdated
<td><a for=string>Percent-encode after encoding</a> with <a>Shift_JIS</a>, <var>input</var>, and
the <a><code>application/x-www-form-urlencoded</code> percent-encode set</a>
<td>"<code>1+1 ≡ 2%20‽</code>"
<td>"<code>1+1+%81%DF+2%20%26%238253%3B</code>"
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this seems wrong

Copy link

@ChALkeR ChALkeR Jan 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

application/x-www-form-urlencoded percent-encode set includes component percent-encode set which includes U+002B (+) and U+0025 (%)

  1. The first + must be escaped (but not the second and third as they come from spaces)
  2. The original % is also escaped.

I get 1%2B1+%81%DF+2%2520%26%238253%3B in my impl

Copy link

@ChALkeR ChALkeR Jan 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Chrome also says 1%2B1+%81%DF+2%2520%26%238253%3B (pls recheck me)

image

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, thanks for being on top of it! I double checked in Safari modulo the Shift_JIS aspect now.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This thread is resolved / can be closed, I just don't have the button to label it as such.

U+005B ([) to U+005D (]), inclusive, and U+007C (|).
<p>The <dfn oldids=userinfo-encode-set>userinfo percent-encode set</dfn> is a
<a>percent-encode set</a> consisting of the <a>path percent-encode set</a> and U+002F (/),
U+003A (:), U+003B (;), U+003D (=), U+0040 (@), U+005B ([) to U+005D (]), inclusive, and U+007C (|).
Copy link

@ChALkeR ChALkeR Jan 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: this could have been explicit "5B, 5C, 5D" instead of "5B to 5D, inclusive"
that would make missing 5C less likely

<p>The <dfn export>component percent-encode set</dfn> is the <a>userinfo percent-encode set</a> and
U+0024 ($) to U+0026 (&amp;), inclusive, U+002B (+), and U+002C (,).
<p>The <dfn export>component percent-encode set</dfn> is a <a>percent-encode set</a> consisting of
the <a>userinfo percent-encode set</a> and U+0024 ($) to U+0026 (&amp;), inclusive, U+002B (+), and
Copy link

@ChALkeR ChALkeR Jan 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: same here, could make missing 25 less likely

inclusive, and U+007E (~).
<p>The <dfn><code>application/x-www-form-urlencoded</code> percent-encode set</dfn> is a
<a>percent-encode set</a> consisting of the <a>component percent-encode set</a> and U+0021 (!),
U+0027 (') to U+0029 RIGHT PARENTHESIS, inclusive, and U+007E (~).
Copy link

@ChALkeR ChALkeR Jan 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: same here

Copy link

@ChALkeR ChALkeR left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Examples LGTM, algo changes too

@ChALkeR
Copy link

ChALkeR commented Feb 1, 2026

Side note: I added more cross-tests
Firefox, Chromium, Servo seem to pass, WebKit fails on a number of encodings.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Development

Successfully merging this pull request may close these issues.

Percent encoding examples seem to be wrong

2 participants