Bad regex in filters.txt #8280
Example ? |
There is a regex in filters.txt on line 24380 and it's not clear what the regex filter is doing but it is randomly blocking legit scripts on different websites that I visit.
Reason: easylist/easylist#6476. Please provide ALL the pages you found where legitimate content is blocked. |
Okay, here's one where there should be an ecard widget displaying on the page but it's being blocked arbitrarily by this mess of a regex. There will be many unintended consequences for a regex like this. Here's the site: |
@Yuki2718 first FP |
@timsayshey Sincere apologies for the inconvenience. It was added to beat revolving ad scripts, but I now see it should be adjusted more carefully. |
@timsayshey Hi, can you give us some more examples of breakage? Fixing the regex to avoid the breakage on americanpatriotsunsung will be possible, but I want to ensure the fix doesn't break other pages. The thing is, the ad servers this filter targets change so frequently that chasing them is not trivial. |
@Yuki2718 test this 1:
|
@mapx- Sorry, I have a working solution of |
@mapx- You're right, and apparently the reason is that uBO's regex matching is case-insensitive. I also observed the case insensitivity in easylist/easylist#6537. Pinging @gorhill for a possible bug. |
So, again, did you test my filter? |
Yes, but that in turn misses some ad scripts, e.g. |
It seems yours is good and there is some bug in uBO. I tested yours on regex101 and it is OK; see the internal discussion. |
It's not a bug, that's by design. All matching is meant to be case insensitive. There used to be a |
So what is the purpose of that rather broad filter? If it's meant to address something on |
There are legit scripts (all characters lowercase) and "bad" ones containing lowercase, uppercase and numeric characters (in some cases only lowercase and uppercase). The logger shows the real situation. The filter above is correct against the real URLs, but not when uppercase is first replaced by lowercase (as uBO does). For example, when a bad script name mixes lowercase and uppercase, we have no means to distinguish it from the legit scripts (all lowercase) once case is ignored.
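To illustrate the point above, here is a minimal Python sketch of how a pattern that relies on uppercase letters loses its meaning under case-insensitive matching. The pattern `BAD_SCRIPT`, the helper `is_blocked`, and the file names are hypothetical and are not the filter from filters.txt; `re.IGNORECASE` only approximates what uBO does internally.

```python
import re

# Hypothetical pattern (NOT the actual filter): a script name made of
# letters and digits that contains at least one uppercase letter or digit.
BAD_SCRIPT = r"^(?=.*[A-Z0-9])[A-Za-z0-9]+\.js$"


def is_blocked(name: str, match_case: bool) -> bool:
    """Return True if the hypothetical pattern matches the script name."""
    # Assumption: ignoring case is modelled with re.IGNORECASE.
    flags = 0 if match_case else re.IGNORECASE
    return re.search(BAD_SCRIPT, name, flags) is not None


# Case-sensitive: only the mixed-case name matches.
print(is_blocked("aBc9Xq.js", match_case=True))   # True
print(is_blocked("widget.js", match_case=True))   # False

# Case-insensitive: [A-Z] also matches lowercase letters,
# so the all-lowercase legit script is caught as well.
print(is_blocked("widget.js", match_case=False))  # True
```

With case ignored, the character class `[A-Z]` no longer separates the two groups, which is the false-positive mechanism described in the comment.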
But which sites are using these bad ones? Surely we shouldn't assume they can be present on all sites? |
It's a large category of such crap:
Last case without numeric chars |
It's still a limited set of sites, there are filters with |
Now regarding the case insensitivity issue, if it's something really needed, then this should go into a new issue -- I may choose to support a |
Source: https://issues.adblockplus.org/ticket/7318/. But I see now that it seems they never went ahead with this change, so apparently ABP still supports |
I won't open an issue, at least not for this, since there's no guarantee that the case-sensitive filter won't cause any FPs. I think the mentioned approach of listing all the domains where the filter is useful will be safer. But then we have denyallow, so there is probably no need for a regex. It was worth trying though, as these sites come and go one after another. However, EasyList issue 6537 is worth remembering. TL;DR is that an EL filter |
I decided I will add |
`match-case`
------------

Related issue:
- uBlockOrigin/uAssets#8280 (comment)

The new filter option `match-case` can be used only for regex-based filters. Using `match-case` with any other sort of filters will cause uBO to discard the filter.

`redirect=`
-----------

Related issue:
- uBlockOrigin/uBlock-issues#1366

`redirect=` filters with unresolvable resource token at runtime will be discarded.

Additionally, the implicit priority is now set to 1 (was 0). The idea is to allow custom `redirect=` filters to be used strictly as fallback `redirect=` filters in case another `redirect=` filter is not picked up.

For example, one might create a `redirect=click2load.html:0` filter, to be taken if and only if the blocked resource is not already being redirected by another "official" filter in one of the enabled filter lists.
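The `match-case` rule in the commit message above can be sketched as a small validation step. This is only an illustration in Python, not uBO's implementation: the helpers `is_regex_pattern` and `accept_filter` are hypothetical, and the sketch assumes a regex-based filter is one whose pattern is wrapped in `/.../` and that the options have already been split out.

```python
from typing import Sequence


def is_regex_pattern(pattern: str) -> bool:
    """Assumption: a regex-based filter pattern is delimited by slashes."""
    return len(pattern) > 2 and pattern.startswith("/") and pattern.endswith("/")


def accept_filter(pattern: str, options: Sequence[str]) -> bool:
    """Return False when the filter should be discarded.

    Mirrors the stated rule: `match-case` is valid only on regex-based
    filters; on any other kind of filter the whole filter is dropped.
    """
    if "match-case" in options and not is_regex_pattern(pattern):
        return False
    return True


# Regex-based filter with match-case: kept.
print(accept_filter(r"/[a-z]+[A-Z]\.js/", ["script", "match-case"]))
# Plain hostname filter with match-case: discarded.
print(accept_filter("||example.com^", ["match-case"]))
```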
There is a regex in filters.txt on line 24380 and it's not clear what the regex filter is doing but it is randomly blocking legit scripts on different websites that I visit.
There is no comment explaining what it does or why it is there.
I propose that it be removed.
Please advise.