XSS content check for 'invalid_protocols' can trigger false-positives #3298

Open
w00fz opened this issue Mar 19, 2021 · 3 comments

@w00fz
Member

w00fz commented Mar 19, 2021

If your content includes legitimate text that happens to contain a protocol name followed by a colon, it gets flagged, which is a false positive. Example:

Pre-publication image data: management and processing

This triggers the invalid_protocols check because data is one of the default protocol values and the title contains "data:".


One solution is to require that the protocol's colon is followed by at least one non-space character. If a space follows the colon, the match should be ignored.
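The proposal can be sketched as follows. This is a minimal, language-neutral illustration in Python, not Grav's actual (PHP) implementation: the protocol list, the two patterns, and the `is_flagged` helper are all assumptions made for the example.

```python
import re

# Hypothetical protocol list for illustration; only "data" is confirmed
# by this issue as a default value.
INVALID_PROTOCOLS = ["javascript", "vbscript", "data"]

_ALTERNATION = "|".join(re.escape(p) for p in INVALID_PROTOCOLS)

# Naive check: any "<protocol>:" anywhere in the text is flagged.
NAIVE = re.compile(rf"(?:{_ALTERNATION}):", re.IGNORECASE)

# Proposed variant: the colon must be followed by a non-whitespace character.
STRICT = re.compile(rf"(?:{_ALTERNATION}):(?=\S)", re.IGNORECASE)


def is_flagged(pattern, text):
    """Return True if the pattern finds a protocol-like match in text."""
    return pattern.search(text) is not None


title = "Pre-publication image data: management and processing"
link = '<a href="data:text/html;base64,PHNjcmlwdD4=">x</a>'

print(is_flagged(NAIVE, title))   # the title is a false positive here
print(is_flagged(STRICT, title))  # the space after "data:" suppresses the match
print(is_flagged(STRICT, link))   # a real data: URL is still caught
```

Under these assumptions the stricter pattern stops flagging the title while still catching a `data:` URL that has no space after the colon.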

@mahagr mahagr transferred this issue from getgrav/grav-plugin-admin Mar 31, 2021
@mahagr mahagr added the bug label Mar 31, 2021
@mahagr
Member

mahagr commented Mar 31, 2021

But a space can follow the colon, for example in javascript: ... URLs. Also, the method removes all whitespace characters for some reason.
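Both objections can be illustrated with a small sketch. It is written in Python for brevity and does not reproduce Grav's code: the `CHECK` pattern and the `strip_whitespace` helper are hypothetical stand-ins for a "colon followed by non-space" check and for the whitespace removal mentioned above.

```python
import re

# Hypothetical check: protocol name, colon, then a non-whitespace character.
CHECK = re.compile(r"(?:javascript|data):(?=\S)", re.IGNORECASE)


def strip_whitespace(text):
    """Remove every whitespace character, as the method is described to do."""
    return re.sub(r"\s+", "", text)


payload = '<a href="javascript: alert(1)">click</a>'
title = "Pre-publication image data: management and processing"

# 1. On the raw text, the space after the colon hides the payload.
payload_missed = CHECK.search(payload) is None

# 2. Once whitespace is stripped, the payload is caught, but the title
#    becomes "...imagedata:management..." and is flagged again.
payload_caught_after_strip = CHECK.search(strip_whitespace(payload)) is not None
title_flagged_after_strip = CHECK.search(strip_whitespace(title)) is not None

print(payload_missed, payload_caught_after_strip, title_flagged_after_strip)
```

So in this sketch the non-space rule either misses a spaced javascript: URL (when run on raw text) or has no effect (when whitespace is stripped first).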

It may be better if we improved the detection to find HTML tags and only look into the attributes...

@mahagr
Member

mahagr commented Mar 31, 2021

Related to #3175.

I think we need to change the regexps so that one regexp just finds the tags and calls a callback method to determine the details. In most cases we only need to find the opening tag, with the exception of some special ones such as code blocks and script/style tags.
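A rough sketch of that idea, in Python rather than Grav's PHP, might look like this. The patterns, the protocol list, and the names `tag_is_suspicious` and `find_suspicious_tags` are assumptions for illustration; the special cases (code blocks, script/style tags) are deliberately left out.

```python
import re

# Hypothetical protocol list for illustration.
INVALID_PROTOCOLS = ("javascript", "vbscript", "data")

# Rough opening-tag matcher: "<name ...>", where quoted attribute values
# may contain ">". This is not a full HTML tokenizer.
OPENING_TAG = re.compile(
    r"""<([a-zA-Z][a-zA-Z0-9-]*)((?:"[^"]*"|'[^']*'|[^'">])*)>"""
)

# Attribute matcher: name="value", name='value', or name=bare.
ATTRIBUTE = re.compile(
    r"""([^\s=/"'>]+)\s*=\s*(?:"([^"]*)"|'([^']*)'|([^\s"'>]+))"""
)


def tag_is_suspicious(name, raw_attributes):
    """Callback: inspect one opening tag's attributes for a protocol."""
    for match in ATTRIBUTE.finditer(raw_attributes):
        value = match.group(2) or match.group(3) or match.group(4) or ""
        # Whitespace is dropped only inside the attribute value.
        compact = re.sub(r"\s+", "", value).lower()
        if any(compact.startswith(p + ":") for p in INVALID_PROTOCOLS):
            return True
    return False


def find_suspicious_tags(content, callback=tag_is_suspicious):
    """Find opening tags and let the callback decide on each one."""
    return [
        m.group(0)
        for m in OPENING_TAG.finditer(content)
        if callback(m.group(1), m.group(2))
    ]


content = (
    "<h1>Pre-publication image data: management and processing</h1>"
    '<a href="javascript: alert(1)">x</a>'
)
print(find_suspicious_tags(content))
```

Because only attribute values are inspected, the heading text is ignored while the spaced javascript: link is reported.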

@rhukster
Member

I did add the whitespace check, but it doesn't fully resolve the problem. It just provides a way to work around a false positive.

Checking inside HTML tags is not trivial as it will basically require the use of a parser to accurately determine the scope of the tags. The other downside is that it will be MUCH slower than the regex alone.
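For comparison, a parser-based check could be sketched as below. It uses Python's standard `html.parser` purely as an illustration; it is not what Grav does, the protocol list and the `ProtocolScanner` and `scan` names are hypothetical, and no performance comparison against the regex approach is attempted here.

```python
from html.parser import HTMLParser

# Hypothetical protocol list for illustration.
INVALID_PROTOCOLS = ("javascript", "vbscript", "data")


class ProtocolScanner(HTMLParser):
    """Collect attributes whose value starts with a listed protocol."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.findings = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value is None for
        # attributes without a value, such as "disabled".
        for name, value in attrs:
            if value is None:
                continue
            compact = "".join(value.split()).lower()
            if any(compact.startswith(p + ":") for p in INVALID_PROTOCOLS):
                self.findings.append((tag, name, value))


def scan(content):
    """Parse content and return (tag, attribute, value) findings."""
    scanner = ProtocolScanner()
    scanner.feed(content)
    scanner.close()
    return scanner.findings


print(scan("<p>Pre-publication image data: management</p>"))
print(scan('<img src="x"><a href="JavaScript: alert(1)">x</a>'))
```

The parser determines tag and attribute scope itself, so plain text such as the example title never reaches the protocol check.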
