[DNR] Add support for wildcards in initiatorDomains and excludedInitiatorDomains fields #394
Comments
To avoid syntax ambiguity, could you please describe the desired wildcard syntax? Specifically:
|
Note there is some existing wildcard support for domain matching in DNR implementations:
|
To answer your queries: 1, 2, 3, 4 - From what I have observed, there isn't a significant need for wildcard support outside of TLDs. However, work is under way toward supporting regex, which might prove helpful in more complex scenarios. For instance, uBlock Origin recently implemented support for regex, as shown in this commit: gorhill/uBlock@b1de8d3. |
I've added labels to reflect the positions from the meeting notes (pending to be merged in #397). The
opposed: firefox
|
During the recent W3C call, I was asked to provide more examples. Let's go back to the rule I first presented. I selected a domain (kickass.*) from this rule and ran a search for related issues here: https://github.com/AdguardTeam/AdguardFilters. The search returned 42 issues.
I have selected unique domain names from these issues. There's potential for more to be found.
Next, I searched for rules that use these domains and identified 5 rules.
There are 20 TLDs extracted from the rule.
All unique domain names mentioned in the rule are currently down. There are 38 such domains in total.
The calculation 38 * 20 = 760 illustrates how quickly the number of domains in the initiatorDomains field can grow. And note that this doesn't even include all the popular TLDs.
More examples: six further sets of rules, domains, and issues (collapsed in the original comment).
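To make the 38 * 20 arithmetic concrete, here is a minimal sketch of the expansion a filter author has to do today. The helper name `expandDomains` and the placeholder names and TLDs are made up for illustration; only the counts (38 names, 20 TLDs) come from the comment above.

```typescript
// Hypothetical helper: pair every base name with every TLD to produce the
// explicit list that initiatorDomains needs when no wildcard is available.
function expandDomains(names: string[], tlds: string[]): string[] {
  const out: string[] = [];
  for (const name of names) {
    for (const tld of tlds) {
      out.push(`${name}.${tld}`);
    }
  }
  return out;
}

// Placeholder data, sized like the example above (38 names, 20 TLDs).
const baseNames = Array.from({ length: 38 }, (_, i) => `site${i}`);
const tldList = Array.from({ length: 20 }, (_, i) => `tld${i}`);

const expanded = expandDomains(baseNames, tldList);
console.log(expanded.length); // 760
```

With a TLD wildcard, the same coverage would presumably need only the 38 base entries.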
|
Concerning uBlock Origin, at the moment, I count 269 filters which must be thrown away when converting current filter lists to DNR rules. Of course, those 269 filters represent at least twice the number of MV3 DNR rules that would be created otherwise since the purpose of an entity is to match more than one domain name. [1] "Entity-based" is the name uBO uses to refer to domain name entries for which the TLD is replaced by |
Filter lists are maintained by small numbers of people (mostly volunteers), but the sites most serious about circumventing blockers have long been changing their domains, often just the TLD, rapidly. Although domain wildcards are used far more often in cosmetic or scriptlet filters, there is still a need for wildcards in network filters; for example, |
It still is an open issue in ABP: https://gitlab.com/eyeo/adblockplus/abc/adblockpluscore/-/issues/123. |
More examples:
The main purpose of using |
Considering an example e.g.
See also: tld service for webextensions |
This https://github.com/gorhill/uBlock/wiki/Static-filter-syntax#entity |
@maximtop and others, thanks for the examples. Those are all really helpful. https://gitlab.com/eyeo/adblockplus/abc/adblockpluscore/-/issues/123 (linked above) has some really interesting discussion on the pros/cons of this.
I missed this the first time (and it's an important distinction I think). The ask is not actually for a true wildcard - just the ability to match arbitrary TLDs. That could open the possibility of exploring other syntax for this, where you just list the pre-TLD part of the domain and don't actually use an asterisk character. One thing I'm still not clear about is why creating an entirely new domain is any more friction than changing the TLD. If we implement this, what's to stop websites evading blocking tools by just changing the domain entirely? I definitely still have some concerns around the number of sites intended to be matched vs. the number that would actually be matched with a wildcard. I think this could very easily encourage lazy rule creation which ends up blocking genuine sites that happen to share the domain (but not TLD) of something else. |
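The distinction between a true wildcard and "match any TLD" can be sketched roughly as follows. This is an assumption-laden illustration, not how any browser or blocker implements it: `matchesEntity` is a hypothetical name, and `PUBLIC_SUFFIXES` is a tiny stand-in for the real Public Suffix List.

```typescript
// Tiny stand-in for the Public Suffix List (assumption: a real
// implementation would consult the full list).
const PUBLIC_SUFFIXES = new Set(["com", "org", "net", "to", "co.uk"]);

// Hypothetical matcher: true when `hostname` is `<base>.<public suffix>`
// or a subdomain of it, for an entity written as `<base>.*`.
function matchesEntity(hostname: string, entity: string): boolean {
  if (!entity.endsWith(".*")) return false;
  const base = entity.slice(0, -2).toLowerCase(); // "kickass.*" -> "kickass"
  const baseLabels = base.split(".");
  const labels = hostname.toLowerCase().split(".");
  // Try each position where the base could start; whatever follows it
  // must be exactly one public suffix.
  for (let i = 0; i + baseLabels.length < labels.length; i++) {
    const candidate = labels.slice(i, i + baseLabels.length).join(".");
    const suffix = labels.slice(i + baseLabels.length).join(".");
    if (candidate === base && PUBLIC_SUFFIXES.has(suffix)) return true;
  }
  return false;
}

console.log(matchesEntity("kickass.to", "kickass.*"));          // true
console.log(matchesEntity("www.kickass.co.uk", "kickass.*"));   // true
console.log(matchesEntity("kickass.example.com", "kickass.*")); // false
console.log(matchesEntity("notkickass.com", "kickass.*"));      // false
```

Note that `kickass.org` would match just as `kickass.to` does, whether or not the two sites are related, which is the false-positive scenario raised above.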
Everything content blockers do can be evaded, and yet we are still here. In the end the way we design our content blockers is to make it as easy as possible for filter list maintainers to do their task, to avoid hardship. The wildcard for public suffix has proven useful in reducing hardship after years of usage.
Both AdGuard and uBO have public filter list issue trackers, and I can't remember a case of false positive caused by filters with wildcarded public suffix in their |
As said, the feature was added because many sites, in fact, change only TLDs. For cosmetic/scriptlet filters we have countless examples: |
In any case, we (the authors of the filters) try to make rules that carry a minimum risk of breakage. Not all of these sites change TLDs to bypass ad blocking. Many are blocked for copyright violations or censored in the countries where their target audience is. |
For scriptlet and denyallow filters we had uBlockOrigin/uAssets@47ad111 and uBlockOrigin/uAssets@8f66870 |
Thanks all. Definitely not indicating a decision here, just wanted to mention something that we should decide if we're concerned about as we discuss this more :) |
As stated in the meeting, Chrome and Safari will follow up with the engineering teams to determine the feasibility of implementing this. |
I also missed this originally and I agree that this is an important distinction. The use of I don't currently have any concerns with this request. In abstract I'm a bit concerned about the theoretical possibility of false positives, but this comment by @gorhill addresses those worries.
|
From what I read in the last meeting minutes, this proposal in its current form causes a lot of confusion. May I suggest an alternative approach? Instead of extending UPD: |
The issue was discussed during the WECG in-person meeting. In order to choose the more appropriate way forward it is necessary to figure out what is the current situation with filtering rules.
|
Hi, just two cents from someone who could benefit from a feature potentially implemented in this discussion. I'd like to see My extension allows users to redirect any web page to another using user-defined rules, similar to Redirector. I'm currently trying to adopt the declarativeNetRequest API for this functionality. At present, I use |
This is a common scenario where a domain has multiple variations of the same domain name in different domain zones.
I've counted over 700 rules in AdGuard filters that follow this pattern. Here's an example of such a rule:
Introducing wildcards would reduce the size of DNR rulesets and make life easier for filter developers.
The rule in the ruleset could look like this
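As a rough illustration only (this is not the rule from the issue), the contrast between today's explicit list and the requested syntax might look like the sketch below. The `Rule` and `RuleCondition` types are minimal local stand-ins for the corresponding DNR structures, the domains and `urlFilter` are placeholders, and the `"example.*"` entry is the hypothetical wildcard form under discussion, which current implementations are not assumed to accept.

```typescript
// Minimal local types covering only the DNR rule fields used here.
interface RuleCondition {
  urlFilter?: string;
  initiatorDomains?: string[];
  excludedInitiatorDomains?: string[];
  resourceTypes?: string[];
}

interface Rule {
  id: number;
  priority?: number;
  action: { type: "block" | "allow" };
  condition: RuleCondition;
}

// Today: every TLD variant has to be listed explicitly (placeholder domains).
const explicitRule: Rule = {
  id: 1,
  action: { type: "block" },
  condition: {
    urlFilter: "||ads.example^",
    initiatorDomains: ["example.com", "example.net", "example.org"],
    resourceTypes: ["script"],
  },
};

// Requested: a single entry covering any TLD (hypothetical syntax).
const wildcardRule: Rule = {
  id: 2,
  action: { type: "block" },
  condition: {
    urlFilter: "||ads.example^",
    initiatorDomains: ["example.*"],
    resourceTypes: ["script"],
  },
};

console.log(explicitRule.condition.initiatorDomains?.length); // 3
console.log(wildcardRule.condition.initiatorDomains?.length); // 1
```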