Source suggestion #24

Closed
jarelllama opened this issue Apr 14, 2024 · 14 comments
Labels
enhancement New feature or request

Comments

@jarelllama

jarelllama commented Apr 14, 2024

Hi, I'm the maintainer of https://github.com/jarelllama/Scam-Blocklist, a blocklist for newly created scam and phishing domains that are retrieved automatically using the Google Search API, automated NRD (newly registered domain) detection, and other public sources.

It seems to fall under your categories of:

Malicious
Phishing
Fraud
Scam 

As part of my filtering process, a list of parked domains is generated as a byproduct:
https://raw.githubusercontent.com/jarelllama/Scam-Blocklist/main/data/parked_domains.txt
This list seems to fit your categories of

Hate & junk
Useless websites

This parked domains list is capped at 7000 entries and is updated daily (unparked sites are automatically removed too).

@sefinek sefinek added the enhancement New feature or request label Apr 14, 2024
@sefinek
Owner

sefinek commented Apr 14, 2024

Hello,
I checked https://raw.githubusercontent.com/jarelllama/Scam-Blocklist/main/data/parked_domains.txt and noticed that most of these sites display a white screen, a file explorer, or messages such as "Welcome to our website Coming Soon" and "Is this your domain? Get it online with cloud-based Shared Hosting, complete with high-performance servers, scalable plans, and free SSL", along with other details about domain sales by GoDaddy. I found no content related to hate or junk, so I plan to categorize your list under Useless websites. Let me know if you consent to the addition, or share any other suggestions you have.

For other block lists, please provide a specific GitHub link and describe the appropriate categories for the list. I want to mention that I am generally disinclined to add lists that are forks of existing lists.

Thanks (:

@jarelllama
Author

jarelllama commented Apr 14, 2024

Thanks for the swift reply! My bad, I misunderstood the Hate & junk category. I agree that the parked domains list fits the Useless websites category better.

Regarding my main Scam Blocklist, here are the formats it is available in:

Format             Syntax
Adblock Plus       ||scam.com^
Dnsmasq            local=/scam.com/
Unbound            local-zone: "scam.com." always_nxdomain
Wildcard Asterisk  *.scam.com
Wildcard Domains   scam.com
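
For readers comparing the formats, here is a minimal sketch of how a single domain maps to each syntax in the table above. The `FORMATS` mapping and the `render` helper are hypothetical names used only for illustration; they are not taken from the Scam-Blocklist scripts.

```python
# Hypothetical templates, one per format listed in the table above.
FORMATS = {
    "Adblock Plus": "||{d}^",
    "Dnsmasq": "local=/{d}/",
    "Unbound": 'local-zone: "{d}." always_nxdomain',
    "Wildcard Asterisk": "*.{d}",
    "Wildcard Domains": "{d}",
}


def render(domain: str) -> dict:
    """Return the blocklist line for `domain` in every format."""
    return {name: template.format(d=domain) for name, template in FORMATS.items()}


if __name__ == "__main__":
    for name, line in render("scam.com").items():
        print(f"{name}: {line}")
```

Keeping the templates in one mapping means a new output format would only need one extra entry.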

Note that the blocklist does not source from any other existing GitHub blocklist. Instead, I implemented my own sourcing via the Google Search API, scam reporting sites like scamadviser.com, and a malicious NRD detector.

Here is the current list of sources implemented:

Google Search
Regex matching for malicious NRDs
aa419.org
dnstwist matching for malicious NRDs
guntab.com
petscams.com
scam.directory
scamadviser.com
stopgunscams.com

These sources were chosen because I have yet to see any other blocklist implement them.

The domains are retrieved from these sources automatically and daily using the open-source scripts in my repository.

Regarding categories, these few seem to fit the intentions of the blocklist:

Malicious
Phishing
Fraud
Scam

@sefinek
Owner

sefinek commented Apr 16, 2024

Thanks. I will get to it soon. I am currently working on another project of mine. I will get back to you soon (:

@jarelllama
Author

Thanks for the update!

@sefinek
Owner

sefinek commented Apr 16, 2024

Hello again,

Added in commit fc83700.

I will soon add your lists to the list in markdown files and on the sefinek.net website. I want to check them more thoroughly.

You really did a great job, I admire it. It would be great if you could expand the repository, for example by adding domains dedicated to software cracking or piracy in general. There are many possibilities (:

@jarelllama
Author

jarelllama commented Apr 16, 2024

Thanks for the kind words! It means a lot. Do note that the dead domains and parked domains files are capped at the 8000 and 7000 newest domains, respectively.
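
As a rough illustration of what such a cap means for consumers of the files, here is a minimal sketch. It assumes that newer domains are appended to the end of each list, which the thread does not confirm; `cap_newest` and the two constants are hypothetical names, with only the 8000 and 7000 limits taken from the comment above.

```python
DEAD_DOMAINS_CAP = 8000    # limit stated in the comment above
PARKED_DOMAINS_CAP = 7000  # limit stated in the comment above


def cap_newest(domains: list, limit: int) -> list:
    """Keep only the `limit` most recent entries.

    Assumption: the list is ordered oldest first, so the newest
    entries are the ones at the end.
    """
    if limit <= 0:
        return []
    return domains[-limit:]


if __name__ == "__main__":
    parked = [f"example{i}.com" for i in range(7005)]  # made-up data
    capped = cap_newest(parked, PARKED_DOMAINS_CAP)
    print(len(capped))
```

With a cap like this, the oldest entries drop off the list as new ones arrive, so a domain missing from the file is not necessarily unparked or alive again.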

I also noticed you're using NoTracking as a source. It recently got archived, with more info here: notracking/hosts-blocklists#900

Regarding other areas of blocking, I did entertain the idea of blocking NSFW sites using the Google Search API. It was rather easy to retrieve hundreds of sites using just a few common search terms. However, I'm currently on the free tier of the API, which limits me to 100 queries a day (I even created a second Google account so I can use two API keys to get past the rate limit).

Sadly, I would certainly not have enough API queries or personal time to maintain more blocklists.

Thanks again for the support!

jarelllama added a commit to jarelllama/Scam-Blocklist that referenced this issue Apr 16, 2024
@sefinek
Owner

sefinek commented Apr 16, 2024

Regarding the notracking/hosts-blocklists, indeed. Thanks for bringing that up. I will soon remove their lists.

It would be really beneficial to implement larger block lists. I understand that API limitations are a major issue, as well as time, of course... Therefore, if you're interested, I think it would be a good idea to merge your repository with mine. I have no objections to this. I would add you as a collaborator here, and together we could manage the lists. There is strength in collaboration :) I believe this approach would significantly enhance user security for those utilizing the lists. My lists (blocklist.sefinek.net) receive a substantial number of server queries daily. Please check the statistics at the bottom of this page if you're interested.

I look forward to your reply and your decision.

@jarelllama
Author

Thanks for the offer! But I doubt I could be of much help. I see most of your repository is in JavaScript, which I can't contribute to. I've already briefly reviewed your bash scripts and they're all well done.

Having my scam blocklist used as a source seems to be the biggest contribution I can offer right now.

@sefinek
Owner

sefinek commented Apr 17, 2024

Alright, if you need anything, feel free to write.

@jarelllama
Author

Would it be possible to also add my list under phishing? Although most of my sources focus on scam sites, the sources retrieving from the NRD feed tend to be phishing domains (googgle.com, whattsapp.com, and such).
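
To make the typosquatting examples concrete, here is a minimal sketch that flags domains whose first label is within one edit of a known brand name. This is not the detector used by Scam-Blocklist (the thread only mentions regex and dnstwist matching); it swaps in a plain Levenshtein distance for illustration, and `BRANDS`, `edit_distance`, and `lookalike_of` are hypothetical names.

```python
from typing import Optional

# Hypothetical watch list; the two brands come from the examples above.
BRANDS = ["google", "whatsapp"]


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two strings (row-by-row dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def lookalike_of(domain: str, max_distance: int = 1) -> Optional[str]:
    """Return the brand that `domain` imitates, or None.

    Assumption: `domain` is a bare registrable domain such as
    "googgle.com", so the first label is the part to compare.
    """
    label = domain.lower().split(".")[0]
    for brand in BRANDS:
        if label != brand and edit_distance(label, brand) <= max_distance:
            return brand
    return None


if __name__ == "__main__":
    for candidate in ["googgle.com", "whattsapp.com", "google.com", "example.com"]:
        print(candidate, "->", lookalike_of(candidate))
```

An exact match such as google.com is deliberately not flagged, since only near misses are treated as suspicious in this sketch.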

@sefinek
Owner

sefinek commented Apr 17, 2024

I think I might just leave it as it is for now, because I'm going to sleep now <:

@jarelllama
Author

No pressure!

@sefinek
Owner

sefinek commented Apr 22, 2024

Hi again, I just added your lists to the generator (;
https://sefinek.net/blocklist-generator/pihole

And here: e98f475

@jarelllama
Author

Thanks again!
