Description
=== IGNORE THIS, SEE COMMENTS DIRECTLY AFTERWARD ===
Imagine you have this URL:
$url = "https://myusername:mypassword@example.com/some/page";
As you can see, it's using the following userinfo credentials in the URL:
- username: myusername
- password: mypassword
ProcessWire's URL sanitizer rejects this URL as invalid, whereas PHP's FILTER_VALIDATE_URL correctly accepts it:
echo filter_var($url, FILTER_VALIDATE_URL) ? 'valid' : 'invalid'; // valid (correct)
echo "\n";
echo $sanitizer->url($url) ? 'valid' : 'invalid'; // invalid (incorrect)
The spec for how URLs can be formatted is here:
https://datatracker.ietf.org/doc/html/rfc3986
The part about userinfo is here:
https://datatracker.ietf.org/doc/html/rfc3986#section-3.2.1
Note how it says:
Use of the format "user:password" in the userinfo field is
deprecated. Applications should not render as clear text any data
after the first colon (":") character found within a userinfo
subcomponent unless the data after the colon is the empty string
(indicating no password). Applications may choose to ignore or
reject such data when it is received as part of a reference and
should reject the storage of such data in unencrypted form. The
passing of authentication information in clear text has proven to be
a security risk in almost every case where it has been used.
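To illustrate the RFC's advice about not rendering the password as clear text, here is a minimal sketch of how a URL with userinfo could be masked before being displayed or logged. The function name redactPassword is hypothetical and is not part of ProcessWire or PHP; the regular expression is a simplification that assumes an absolute URL with a "scheme://" prefix.

```php
<?php
// Hypothetical helper: mask everything after the first ":" in the userinfo.
// URLs without a "user:password@" part are returned unchanged.
function redactPassword(string $url): string {
    // scheme://user:password@  (userinfo ends before the first "/", "?" or "#")
    $pattern = '~^([a-z][a-z0-9+.-]*://[^/?#@:]*):[^/?#@]*@~i';
    return preg_replace($pattern, '$1:***@', $url) ?? $url;
}

echo redactPassword("https://myusername:mypassword@example.com/some/page");
// https://myusername:***@example.com/some/page
```

A port number is not mistaken for a password, because the pattern only matches when an "@" follows before the path begins.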
This came up because I have to store a URL with userinfo in it as part of a webhook/callback with a third-party service. I am using a URL field, but it won't save the value because the URL contains userinfo. I could switch to a regular Text field, but I'd prefer not to.
If you decide to support this, please make sure this more advanced URL also works (i.e., with %40 as part of the username or password):
$url = "https://myusername%40mydomain.com:mypassword@example.com/foo-%40-bar";
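For reference, here is a rough sketch of how PHP's built-in parse_url() splits that URL, which may be useful when checking a fix. The helper name describeUserinfo is made up for this example and is not a ProcessWire function. parse_url() leaves percent-encoding untouched, so rawurldecode() is applied to the username and password only.

```php
<?php
// Hypothetical helper: split a URL and decode its userinfo components.
// Returns null when the URL cannot be parsed or has no userinfo.
function describeUserinfo(string $url): ?array {
    $parts = parse_url($url);
    if ($parts === false || !isset($parts['user'])) {
        return null;
    }
    return [
        'host' => $parts['host'] ?? '',
        // %40 in the userinfo decodes to a literal "@"
        'user' => rawurldecode($parts['user']),
        'pass' => isset($parts['pass']) ? rawurldecode($parts['pass']) : null,
        // the path is left encoded, exactly as it appears in the URL
        'path' => $parts['path'] ?? '',
    ];
}

$url = "https://myusername%40mydomain.com:mypassword@example.com/foo-%40-bar";
print_r(describeUserinfo($url));
```

With this URL the host comes back as example.com, the decoded username as myusername@mydomain.com, and the path stays /foo-%40-bar. The encoded %40 is therefore not confused with the "@" that separates the userinfo from the host.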