This repository has been archived by the owner on Aug 8, 2023. It is now read-only.

Adguard Home support - opportunity to contribute #223

Closed
adworacz opened this issue Jan 30, 2020 · 18 comments

Comments

@adworacz

I've started running Adguard Home, and since I really appreciate the work you've done building and optimizing this blocklist, I'd like to contribute back by adding output that Adguard Home supports natively.

They use a simplified version of the Adblock syntax, detailed here: https://github.com/AdguardTeam/AdGuardHome/wiki/Hosts-Blocklists#adblock-style

It appears that this repo is intended to hold the scripts that build the blocklist, but the official script hasn't been committed yet.

Would it be possible to see it committed so that I can put a pull request together to start generating AGH compatible blocklists?

I can always do this at home using a quick AWK script, but I figured it would be better to contribute back to the project, if possible.

@scafroglia93

Why Adguard?

It is a Russian-based service located in Cyprus.

I prefer to use NextDNS, a US service located in California that has to comply with the new CCPA privacy regulation.

Russia is very dangerous at the moment

@adworacz
Author

adworacz commented Jan 31, 2020

It's not the Adguard service, it's the Adguard Home open-source project: https://github.com/AdguardTeam/AdGuardHome

Think of it like Pihole.

@notracking
Owner

AdguardHome uses the same syntax as the standard adblock rules. I will create a combined list in this syntax, as that will also make the notracking lists usable in (most) browser-based adblockers.

@adworacz
Author

Awesome to hear! I've been running a script locally to do this, so I'm glad it will have first-class support.

@notracking
Owner

Regarding the scripts, see: notracking/hosts-blocklists-scripts#2

The adblock list is being tested now.

@notracking
Owner

notracking commented Jan 31, 2020

Hostname filters (subdomains) should be translated into:

||ads.goodreads.com^
@@||*.ads.goodreads.com^
||ads.google.com^
@@||*.ads.google.com^

This will make a proper Adblock Plus based list slightly larger.
For domains it will be one line per block: ||domain.com^
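
As a rough illustration of the translation described above, here is a minimal Python sketch. The function names (`domain_rule`, `hostname_rules`, `build_list`) are made up for this example and are not taken from the notracking build scripts; only the rule shapes come from this comment.

```python
# Hypothetical helpers: the names are illustrative, not from the real scripts.

def domain_rule(domain: str) -> list[str]:
    """A domain entry becomes a single block rule."""
    return [f"||{domain}^"]


def hostname_rules(hostname: str) -> list[str]:
    """A hostname entry becomes a block rule plus an exception rule
    that is meant to release the subdomains again."""
    return [f"||{hostname}^", f"@@||*.{hostname}^"]


def build_list(domains, hostnames) -> list[str]:
    """Emit all domain rules first, then all hostname rule pairs."""
    rules = []
    for domain in domains:
        rules.extend(domain_rule(domain))
    for hostname in hostnames:
        rules.extend(hostname_rules(hostname))
    return rules


if __name__ == "__main__":
    for rule in build_list(["domain.com"], ["ads.goodreads.com", "ads.google.com"]):
        print(rule)
```

Each hostname costs two lines instead of one, which is where the extra list size comes from.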

@adworacz
Author

adworacz commented Jan 31, 2020

@@||...

is used to create a whitelist or exception rule.

For hostnames (where one doesn't want to match subdomains), the following syntax works.

|ads.goodreads.com^

This will only match "ads.goodreads.com" exactly. "test.ads.goodreads.com" will not match the rule.

So, if I were to use your example, and just match hostnames (not subdomains), I would just write:

|ads.goodreads.com^
|ads.google.com^
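
To make the difference between the two rule shapes concrete, here is a minimal Python sketch of the matching behaviour as described in this comment. It is a simplified model for bare DNS names only, under the assumption that `||name^` covers the name and its subdomains while `|name^` covers the exact name; it is not AdGuard Home's actual matcher, and `matches` is a hypothetical name.

```python
def matches(rule: str, queried: str) -> bool:
    """Simplified model (an assumption, not AdGuard Home's real code).

    ||name^  matches the name itself and any subdomain of it.
    |name^   matches the name exactly.
    """
    queried = queried.lower().rstrip(".")
    if not rule.endswith("^"):
        raise ValueError(f"unsupported rule: {rule!r}")
    if rule.startswith("||"):
        name = rule[2:-1].lower()
        # Subdomains must end on a label boundary, hence the leading dot.
        return queried == name or queried.endswith("." + name)
    if rule.startswith("|"):
        name = rule[1:-1].lower()
        return queried == name
    raise ValueError(f"unsupported rule: {rule!r}")


if __name__ == "__main__":
    for host in ("ads.goodreads.com", "test.ads.goodreads.com"):
        print(host, matches("|ads.goodreads.com^", host), matches("||ads.goodreads.com^", host))
```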

@notracking
Owner

I cannot find that syntax in the Adblock Plus documentation, and testing in uBlock also shows that the single-pipe variant does not work.

The option with whitelisting is also not desirable. I'm not sure if I can get the same coverage with the Adblock Plus standard.

@adworacz
Author

adworacz commented Feb 1, 2020

The syntax I posted was based on the Adguard Home documentation.

However, I did a bit of research, and found documentation on Adblock Plus' website that suggests the syntax is close to what I posted, albeit different by one character:

|example.com|

So a "pipe sandwich", basically.

@notracking
Owner

notracking commented Feb 1, 2020

That's for address matching only, and the address also includes the protocol etc. For example, with the address http://bla.test.com/ the filter |bla.test.com| will not work. For this syntax the closest legitimate filter would be: |http://bla.test.com/|
There are all sorts of issues with that approach.
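
A minimal Python sketch of why the "pipe sandwich" fails for hostnames, assuming a single pipe simply anchors the filter to the start or end of the full address. The function `anchored_match` is hypothetical and ignores wildcards, `^` separators and `||`; it is not the Adblock Plus implementation.

```python
def anchored_match(rule: str, url: str) -> bool:
    """Match a filter with optional single-pipe anchors against a full URL.

    Sketch only: a leading | pins the filter to the start of the address,
    a trailing | pins it to the end. Other filter syntax is not handled.
    """
    if rule.startswith("||"):
        raise ValueError("domain anchors (||) are outside this sketch")
    start = rule.startswith("|")
    end = len(rule) > 1 and rule.endswith("|")
    body = rule
    if start:
        body = body[1:]
    if end:
        body = body[:-1]
    if start and end:
        return url == body
    if start:
        return url.startswith(body)
    if end:
        return url.endswith(body)
    return body in url


if __name__ == "__main__":
    print(anchored_match("|bla.test.com|", "http://bla.test.com/"))
    print(anchored_match("|http://bla.test.com/|", "http://bla.test.com/"))
    print(anchored_match("|http://bla.test.com/|", "https://bla.test.com/"))
```

Under this model the fully anchored filter only covers one exact address, so the https variant and any path on the same host would each need their own filter.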

So far only the ||test.com^ format can be used for domain name blocking; there is no option available in Adblock Plus for hostname blocking. If hostname blocking is not possible, I will drop support for this format.

@notracking
Owner

notracking commented Feb 2, 2020

The adblockers that do support mixed content filters treat 0.0.0.0 sub.host.com style filters as the Adblock Plus filter ||sub.host.com^. I cannot support this format, as it would affect the whitelisting for the dnsmasq style of separating hostnames from domain blacklists (which is the Notracking standard format).
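
A minimal Python sketch of that interpretation, assuming a plain hosts-file line of the form "address name [name ...]" with optional # comments. The function name `hosts_line_to_rules` is hypothetical; the point is only that every hosts entry ends up as a domain-style rule, so the hostname-only meaning is lost.

```python
def hosts_line_to_rules(line: str) -> list[str]:
    """Turn one hosts-file line into ||name^ rules (illustrative only).

    Everything after a '#' is dropped, the first field is the address,
    and each remaining field is treated as a name to block.
    """
    fields = line.split("#", 1)[0].split()
    return [f"||{name}^" for name in fields[1:]]


if __name__ == "__main__":
    print(hosts_line_to_rules("0.0.0.0 sub.host.com"))
```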

For now I will only add support for tools that separate hostname from (wildcard)domain base filtering (like dnsmasq / dnscrypt-proxy).

I'm not saying that the Adblock Plus standard, or its derivatives, are worse ways of implementing a filtering syntax, but they serve a different purpose as browser-based extensions. Browser-based adblockers have far more context (actual request source etc.) on the requested hosts and are therefore more precise in their use of exception rules ($document, $third-party etc.).

For tooling that provides a DNS service I do find it a bit silly if they choose to implement anything other than a syntax that separates domains and hostnames. Using Adblock Plus syntax for DNS-based filtering is like using a spoon to cut steak...

@notracking notracking added wontfix and removed todo labels Feb 2, 2020
@curiosityseeker

curiosityseeker commented Feb 3, 2020

First of all, thank you very much for your lists which I've been using for quite some time.

However, and please pardon my ignorance, I'm having trouble seeing the problem here.

Using your example from above, ||sub.host.com^ would block:

http://sub.host.com
https://sub.host.com

We agree that this is what we want.

But you're saying that it is not okay that, e.g.,

http(s)://xyz.sub.host.com

is blocked as well. Quite frankly, I have a hard time understanding why it would be a realistic scenario that sub.host.com should be blocked but xyz.sub.host.com should not.

I've had trouble understanding this ever since I started using your lists. Why does it make sense that (using 2 examples from your hostnames.txt), e.g.,

000lp59.wcomhost.com
0utl00kmaintenanc2018.editor.multiscreensite.com

are blocked but

abc.000lp59.wcomhost.com
blabla.0utl00kmaintenanc2018.editor.multiscreensite.com

are not? In other words, I doubt that the distinction between domains.txt and hostnames.txt is based on a realistic scenario. Why is the distinction between domains and sub-domains within one list not sufficient? But perhaps I'm missing something ...

FWIW, the AdGuardHome list compiled by mmotti is exclusively using the ||what.ever^ approach.

@notracking
Owner

You raise a very valid question. Though technically it is still better to separate the hostnames.

I must agree that the likelihood of running into problems using only a domain style list is small, but I do need to run some tests to make sure.

You made me reconsider merging everything over to domain style lists only.
To be continued...

@notracking notracking reopened this Feb 5, 2020
@notracking notracking added todo and removed wontfix labels Feb 5, 2020
@notracking
Owner

AdblockPlus list has been added:
https://github.com/notracking/hosts-blocklists/blob/master/adblock/adblock.txt

Enjoy!

@adworacz
Author

adworacz commented Feb 8, 2020

Thank you! Running in my Adguard Home install, and testing in uBlock Origin right now.

@curiosityseeker

I'm using the list with AdGuard Pro on my iPhone and iPad, so far without any problems!

Thank you very much!

@curiosityseeker

> I'm using the list with AdGuard Pro on my iPhone and iPad, so far without any problems!

Just for info: unfortunately, this no longer works, as the list has grown too large and causes a tunnel crash.

This doesn't affect AdGuard Home, though.

@notracking
Owner

That's a shame, since network filters (unlike regex or wildcard filters) are extremely resource-efficient if implemented correctly. 100k network filters is almost nothing: 5MB of memory using the AdblockPlus libs (see the nBlock source code).

There is still some room to reduce this version of the blocklist as there are some dupes in there because of the change to domain only filtering. I always try to keep it as small as possible, but full coverage will trump that.
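
One way such duplicates could be removed, as a minimal Python sketch: with domain-only filtering, an entry is redundant whenever one of its parent domains is already on the list. The function `prune_covered` and the example names are hypothetical, not the actual notracking build step.

```python
def prune_covered(domains) -> list[str]:
    """Drop every name that has an ancestor domain in the same collection.

    Illustrative only: under domain-style rules, blocking example.com
    already covers ads.example.com, so the latter can be removed.
    """
    names = {d.lower().rstrip(".") for d in domains}
    kept = []
    for name in sorted(names):
        labels = name.split(".")
        # All proper suffixes of the name: a.b.c -> b.c, c
        parents = (".".join(labels[i:]) for i in range(1, len(labels)))
        if not any(parent in names for parent in parents):
            kept.append(name)
    return kept


if __name__ == "__main__":
    sample = ["ads.example.com", "example.com", "x.ads.example.com", "tracker.net"]
    print(prune_covered(sample))
```

The set lookup keeps this roughly linear in the number of labels, so it should remain cheap even for lists with hundreds of thousands of entries.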
