Removing the problematic lists #476

Closed
HellboyPI opened this issue Feb 18, 2023 · 10 comments
Labels
question Further information is requested

Comments

@HellboyPI

Hello! You have recently received a lot of whitelist requests. To make your job easier, would it be better to just remove the problematic source list(s)? They may contain many more false-positive domains.

@HellboyPI HellboyPI added the question Further information is requested label Feb 18, 2023
@dazzah87

Agreed. This looks like it will be several hundred domains by the time the person is done with it. Maybe it is not worth the effort; alternatively, just include the list in Ultimate and let people add domains to their allowlist as they see fit. Up to you, of course.

@hagezi
Owner

hagezi commented Feb 18, 2023

I will look at all the whitelist requests. They were created based on the Ultimate list, so don't panic. ;)

@hagezi
Owner

hagezi commented Feb 18, 2023

See: #362

@hagezi
Owner

hagezi commented Feb 18, 2023

If I remove all the "problematic" lists, not much remains; OISD would then have to be removed as well. A bit absurd. ;)
If someone wants to help with the analysis, that is always welcome. Otherwise it just takes a while; the colleague has not yet arrived at Z ...

@hagezi hagezi pinned this issue Feb 18, 2023
@devipasigner

I can definitely help sometimes.

@HellboyPI
Author

I understand. Here is a radical idea 😃 --> Don't use the OISD, 1Hosts, Notracking, StevenBlack, etc. lists as sources, because they are list aggregators. Cherry-pick the lists they use as sources instead. This would give you more control over your lists and could reduce the false-positive rate.

Or, use only the lite/small versions of their lists as a base (OISD small, 1Hosts Lite, ...) and then cherry-pick the sources they use for the bigger versions.

I believe the AdGuard DNS list could also be used as a base, because AdGuard uses it for their public DNS servers. Millions of people use these DNS servers, so the list must be conservative in order to minimize the percentage of false-positive domains.
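
To decide which sources are worth cherry-picking, one rough approach is to count how many reported false positives each source contributes. The sketch below is only an illustration: the source names, the domains, and the function `false_positives_per_source` are hypothetical, and it assumes every list has already been loaded into a set of plain domain names.

```python
from collections import Counter

def false_positives_per_source(sources, reported):
    """Count how many reported false-positive domains each source list contains.

    sources:  dict mapping a source name to a set of blocked domains
    reported: set of domains reported as false positives
    """
    counts = Counter()
    for name, domains in sources.items():
        counts[name] = len(domains & reported)
    return counts

# Hypothetical data, not taken from any real list.
sources = {
    "aggregator-pro": {"ads.example", "shop.example", "cdn.example"},
    "upstream-a": {"ads.example"},
    "upstream-b": {"tracker.example", "cdn.example"},
}
reported = {"shop.example", "cdn.example"}

counts = false_positives_per_source(sources, reported)
for name, hits in counts.most_common():
    print(f"{name}: {hits} reported false positive(s)")
```

A source that accounts for most of the reports is a candidate for replacement by its upstream lists; a source with few or none can probably stay.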

@devipasigner

devipasigner commented Feb 18, 2023

I’ve always thought about that idea too. He would then have to compare the OISD whitelist against his own and see which of the remaining whitelisted domains should still be used.
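
That comparison is essentially a set difference. A minimal sketch, assuming both whitelists are plain text with one domain per line and `#` comments; the helper names and the sample entries are made up for illustration:

```python
def normalize(lines):
    """Return the set of domains in `lines`, ignoring comments and blank lines."""
    domains = set()
    for line in lines:
        entry = line.split("#", 1)[0].strip().lower()
        if entry:
            domains.add(entry)
    return domains

def missing_from_own(own_lines, other_lines):
    """Domains whitelisted in the other list but not in our own, sorted."""
    return sorted(normalize(other_lines) - normalize(own_lines))

# Hypothetical sample entries.
own = ["example.com", "# my own comment", "cdn.example.net"]
other = ["Example.com", "shop.example.org  # checkout", "", "cdn.example.net"]

candidates = missing_from_own(own, other)
print(candidates)
```

The result is only a list of candidates to review; whether each one belongs in the whitelist still depends on which sources end up in the final list.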

@hagezi
Owner

hagezi commented Feb 18, 2023

It mainly affects the aggressive lists, especially the Ultimate, which was to be expected. Anyone who uses such lists should be aware of that. After all, there is a "sticker" on it ;)

The Ultimate in particular still needs some care. It must also be said that the reported domains are not popular ones ...

@hagezi hagezi unpinned this issue Feb 18, 2023
@hagezi
Owner

hagezi commented Feb 18, 2023

@devipasigner Thanks for the help with the "thumbs up" on the Toplist Issues. ;)

@hagezi
Owner

hagezi commented Feb 18, 2023

Most of them came from 1Hosts Pro, an aggressive source, which was to be expected.
As you can see from my whitelist, I've already sorted out a lot from the Umbrella Toplist during my test against 6000 websites. But of course I didn't focus on sites in the "500000" toplist range. Those are flying around my ears now. So be it; tomorrow it is C's turn ... ;)
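
For anyone who wants to help with the analysis, a check of blocked domains against a toplist could look roughly like this. It is a sketch, not the script actually used for the test above: it assumes the toplist is a headerless CSV with `rank,domain` rows, and the sample data, the function names, and the cutoff value passed as `limit` are placeholders.

```python
import csv
import io

def load_ranks(csv_text):
    """Parse 'rank,domain' rows into a dict mapping domain to rank."""
    ranks = {}
    for row in csv.reader(io.StringIO(csv_text)):
        if len(row) != 2:
            continue  # skip malformed rows
        rank, domain = row
        ranks[domain.strip().lower()] = int(rank)
    return ranks

def blocked_in_top(blocked, ranks, limit):
    """Return (rank, domain) pairs for blocked domains ranked within `limit`."""
    hits = [(ranks[d], d) for d in blocked if d in ranks and ranks[d] <= limit]
    return sorted(hits)

# Hypothetical toplist excerpt and blocklist.
toplist = "1,popular.example\n2,news.example\n500000,niche.example\n"
blocked = {"news.example", "niche.example", "unknown.example"}

ranks = load_ranks(toplist)
hits = blocked_in_top(blocked, ranks, limit=6000)
for rank, domain in hits:
    print(rank, domain)
```

Raising `limit` widens the check to less popular sites, which is where most of the remaining reports seem to come from.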

@hagezi hagezi closed this as completed Feb 20, 2023