Attribute escaping #3

Open
weierophinney opened this issue Dec 31, 2019 · 6 comments

Comments

@weierophinney
Member

What makes it necessary to escape so many characters in attributes ([^a-z0-9,\.\-_])?
URLs in HTML look ugly and are larger than they need to be:

<a href="https&#x3A;&#x2F;&#x2F;www.example.com&#x2F;">
<a href="https://www.example.com/">

Originally posted by @autowp at zendframework/zend-escaper#21

@weierophinney
Member Author

"Ugly" is not a problem in security-sensitive contexts. Also, most source viewers already make these attributes easy to read (Firefox does, for example).

As for the size, gzip compression generally deals with it.


Originally posted by @Ocramius at zendframework/zend-escaper#21 (comment)

@weierophinney
Member Author

It is not easy to see where the security improvement is here.

For example, why is "dot" a safe character but "semicolon" is not?

As for the size: on my example Cyrillic page, where escapeHtmlAttr is partially used:
68988 bytes - escaped only quotes and angle brackets
83611 bytes - escaped by escapeHtmlAttr (+20%)

Same with gzip
11116 bytes
11790 bytes (+6%)

Indeed, the size is not crucial.
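
For reference, the overhead can be recomputed from the byte counts quoted above. This is only a small arithmetic sketch; `overheadPercent()` is a hypothetical helper name, not part of any library.

```php
<?php
// Hypothetical helper: relative size increase in percent, rounded to one decimal.
function overheadPercent(int $before, int $after): float
{
    return round(($after - $before) / $before * 100, 1);
}

// Byte counts taken from the comment above.
$plain     = 68988; // only quotes and angle brackets escaped
$escaped   = 83611; // escaped with escapeHtmlAttr
$plainGz   = 11116; // same pages after gzip
$escapedGz = 11790;

$rawOverhead  = overheadPercent($plain, $escaped);
$gzipOverhead = overheadPercent($plainGz, $escapedGz);

printf("uncompressed: +%.1f%%\n", $rawOverhead);
printf("gzip:         +%.1f%%\n", $gzipOverhead);
```

By this calculation the uncompressed overhead comes out at roughly 21% and the gzip overhead at roughly 6%, in line with the rounded figures quoted.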


Originally posted by @autowp at zendframework/zend-escaper#21 (comment)

@weierophinney
Member Author

Are you asking to add more characters to the whitelist, so they don't get encoded?

Maybe you could argue that certain characters like ":" don't need to be escaped, but it's easier to have a very small whitelist of "known good" characters (anything that [^a-z0-9,\.\-_] does not match is left alone) than to try to work out which characters are allowed in each context.


For anyone not familiar with the background... the reason escapeHtmlAttr() encodes more aggressively than escapeHtml() is for non-quoted attributes.

Let's say someone did:

$url = 'https://www.example.com/';
<a href=<?= $escaper->escapeHtmlAttr($url) ?>>

Notice that it does not include quote marks.

This creates the fairly "ugly" output:

<a href=https&#x3A;&#x2F;&#x2F;www.example.com&#x2F;>

What happens if $url was provided by the user (maybe a link to their website), and they set it to:

$url = 'https://www.example.com/ onclick=do_evil_thing';

Without using escapeHtmlAttr(), it would create the perfectly valid:

<a href=https://www.example.com/ onclick=do_evil_thing>

This means they can create an onclick event handler on your website :-)
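
To make the difference concrete, here is a minimal sketch that can be run without the library. `sketchEscapeAttr()` is a hypothetical stand-in, not the real `escapeHtmlAttr()`: it assumes ASCII-only input and simply hex-encodes every character outside the allow-list, whereas the real method also deals with multibyte and control characters.

```php
<?php
// Hypothetical helper, NOT the library's implementation.
// Assumption: $value is ASCII only. Every character outside
// a-z, A-Z, 0-9, comma, dot, hyphen and underscore becomes &#xHH;.
function sketchEscapeAttr(string $value): string
{
    return preg_replace_callback(
        '/[^a-z0-9,\.\-_]/i',
        static function (array $m): string {
            return sprintf('&#x%02X;', ord($m[0]));
        },
        $value
    );
}

$url = 'https://www.example.com/ onclick=do_evil_thing';

// htmlspecialchars() leaves the space and "=" untouched, so in an
// unquoted attribute the browser sees a second attribute.
$weak = '<a href=' . htmlspecialchars($url, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8') . '>';

// The allow-list approach encodes the space and "=", so the whole
// value stays a single attribute value.
$strict = '<a href=' . sketchEscapeAttr($url) . '>';

echo $weak, PHP_EOL;
echo $strict, PHP_EOL;
```

The first line printed still contains a literal ` onclick=`, while the second has the space and `=` encoded as `&#x20;` and `&#x3D;`.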


You could still use escapeHtml() or htmlspecialchars(), but you must make sure your attributes are quoted.

<a href="<?= $escaper->escapeHtml($url) ?>">

So that it creates:

<a href="https://www.example.com/">

Or, if you want to use htmlspecialchars(), don't forget to use it in full:

htmlspecialchars($url, ENT_QUOTES | ENT_SUBSTITUTE, 'utf-8')
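
As a rough illustration of why the quotes matter, the sketch below feeds a hypothetical attacker-supplied value that tries to close the attribute early. The payload and variable names are made up for the example.

```php
<?php
// Hypothetical user input that tries to break out of a quoted attribute.
$url = 'https://www.example.com/" onclick="do_evil_thing';

// ENT_QUOTES encodes both double and single quotes;
// ENT_SUBSTITUTE replaces invalid UTF-8 sequences instead of returning ''.
$safe = htmlspecialchars($url, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');

// The attribute is quoted, and the escaped value cannot terminate the quote.
$html = '<a href="' . $safe . '">link</a>';

echo $html, PHP_EOL;
```

The double quotes in the input come out as `&quot;`, so the browser treats everything up to the closing quote as the href value.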

PS: Have a look at adding a CSP (Content Security Policy), and set it so that it does not allow unsafe-inline for scripts or styles. This will probably require you to make some changes, but it adds a second line of defence against this problem, where any attributes like onclick would be blocked by the browser.
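
A minimal sketch of sending such a policy from PHP might look like this. `buildCsp()` is a hypothetical helper and the directive values are only an example; a real site will probably need more sources than `'self'`.

```php
<?php
// Hypothetical helper: joins directives into a Content-Security-Policy value.
function buildCsp(array $directives): string
{
    $parts = [];
    foreach ($directives as $name => $sources) {
        $parts[] = $name . ' ' . implode(' ', $sources);
    }
    return implode('; ', $parts);
}

// Example policy (an assumption, adjust for your site).
// Note that 'unsafe-inline' is deliberately absent for scripts and styles.
$policy = buildCsp([
    'default-src' => ["'self'"],
    'script-src'  => ["'self'"],
    'style-src'   => ["'self'"],
]);

// Headers must be sent before any output; skipped when run from the CLI.
if (PHP_SAPI !== 'cli' && !headers_sent()) {
    header('Content-Security-Policy: ' . $policy);
}

echo $policy, PHP_EOL;
```

With a policy like this, an injected inline handler such as onclick should be refused by browsers that enforce CSP.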


Originally posted by @craigfrancis at zendframework/zend-escaper#21 (comment)

@weierophinney
Member Author

@craigfrancis
Thanks for your explanation! I think this could improve the documentation.


Originally posted by @froschdesign at zendframework/zend-escaper#21 (comment)

@xelax90

xelax90 commented Jun 17, 2020

We recently ran into issues with some browsers and escaped forward slashes in URLs. What is the security reasoning behind escaping forward slashes in HTML attributes, and would it be possible to add them to the allowed character list?

@froschdesign
Member

@xelax90

What is the security reasoning behind escaping forward slashes in HTML attributes…

For more information, see: "OWASP – Cross Site Scripting Prevention Cheat Sheet"
