The `PSLPrivateDomainsProcessor` is a Python script that fetches the Public Suffix List (PSL) and checks the domain status, expiry dates, and `_psl` TXT records of the domains in its private section.
It performs WHOIS checks on these domains and saves the results to CSV files for manual review.
- Python 3.x
- requests
- pandas
- whoisdomain
You can install the required packages using pip:
pip install -r requirements.txt
Ensure that `whois` is installed on your operating system.
sudo apt install whois # Debian/Ubuntu
sudo yum install whois # Fedora/CentOS/Rocky
- `PSLPrivateDomainsProcessor.py`: The main script containing the `PSLPrivateDomainsProcessor` class and functions for DNS and WHOIS checks.
Run the script using Python:
cd private_domains_checker
mkdir data
python PSLPrivateDomainsProcessor.py
- `make_dns_request(domain, record_type)`: Makes DNS requests to both Google and Cloudflare DNS APIs.
- `check_dns_status(domain)`: Checks the DNS status of a domain using Google and Cloudflare DNS APIs.
- `get_whois_data(domain)`: Retrieves WHOIS data for a domain using the whoisdomain package.
- `check_psl_txt_record(domain)`: Checks the `_psl` TXT record for a domain using Google and Cloudflare DNS APIs.
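
The DNS helpers above can be pictured as lookups against the public DNS-over-HTTPS JSON endpoints that Google and Cloudflare provide. The following is a minimal sketch, not the script's actual code: the function names, the status labels, and the choice of the standard library over `requests` (to keep the example dependency-free) are all assumptions made for illustration.

```python
import json
import urllib.parse
import urllib.request

# Public DNS-over-HTTPS JSON endpoints (assumed; the script may use others).
DOH_ENDPOINTS = {
    "google": "https://dns.google/resolve",
    "cloudflare": "https://cloudflare-dns.com/dns-query",
}


def build_doh_url(endpoint, domain, record_type):
    """Build the query URL for a DNS-over-HTTPS JSON lookup."""
    query = urllib.parse.urlencode({"name": domain, "type": record_type})
    return f"{endpoint}?{query}"


def query_doh(endpoint, domain, record_type, timeout=10):
    """Send one lookup and return the decoded JSON response (needs network)."""
    request = urllib.request.Request(
        build_doh_url(endpoint, domain, record_type),
        headers={"Accept": "application/dns-json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


def interpret_status(response):
    """Map the DNS RCODE in a JSON response to a short label (labels assumed)."""
    rcode = response.get("Status")
    if rcode == 0:
        return "ok"
    if rcode == 3:
        return "NXDOMAIN"
    return f"rcode-{rcode}"


def txt_values(response):
    """Return the TXT strings (record type 16) from a JSON response, unquoted."""
    return [
        answer["data"].strip('"')
        for answer in response.get("Answer", [])
        if answer.get("type") == 16
    ]
```

A `_psl` TXT check would then query the name `_psl.<domain>` with record type `TXT` on each endpoint and inspect `txt_values(...)`. What counts as a valid record is decided by the script itself and is not reproduced here.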
- `fetch_psl_data()`: Fetches the PSL data from the specified URL.
- `parse_domain(domain)`: Parses and normalizes a domain.
- `parse_psl_data(psl_data)`: Parses the fetched PSL data and separates ICANN and private domains.
- `process_domains(raw_domains, domains)`: Processes each domain, performing DNS, WHOIS, and `_psl` TXT record checks.
- `save_results()`: Saves all processed domain data to `data/all.csv`.
- `save_invalid_results()`: Saves domains with invalid DNS or expired WHOIS data to `data/nxdomain.csv` and `data/expired.csv`.
- `save_hold_results()`: Saves domains with WHOIS status containing any form of "hold" to `data/hold.csv`.
- `save_missing_psl_txt_results()`: Saves domains with invalid `_psl` TXT records to `data/missing_psl_txt.csv`.
- `save_expiry_less_than_2yrs_results()`: Saves domains with WHOIS expiry date less than 2 years from now to `data/expiry_less_than_2yrs.csv`.
- `run()`: Executes the entire processing pipeline.
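
To illustrate the section-splitting step, here is a rough sketch of how the PSL text could be separated into ICANN and private entries using the `===BEGIN ...===` and `===END ...===` comment markers found in the published list. The function name `split_psl_sections` and the sample text are hypothetical, and the real `parse_psl_data` may work differently.

```python
def split_psl_sections(psl_text):
    """Split PSL text into (icann_entries, private_entries) lists."""
    icann, private = [], []
    current = None  # list currently being filled, or None outside any section
    for raw_line in psl_text.splitlines():
        line = raw_line.strip()
        if "===BEGIN ICANN DOMAINS===" in line:
            current = icann
        elif "===BEGIN PRIVATE DOMAINS===" in line:
            current = private
        elif "===END ICANN DOMAINS===" in line or "===END PRIVATE DOMAINS===" in line:
            current = None
        elif line and not line.startswith("//") and current is not None:
            # A rule ends at the first whitespace on its line.
            current.append(line.split()[0])
    return icann, private


# Tiny hand-written sample in the PSL layout (not real list content).
SAMPLE_PSL = """\
// ===BEGIN ICANN DOMAINS===
// com
com
co.uk

// ===END ICANN DOMAINS===
// ===BEGIN PRIVATE DOMAINS===
// Example Org
*.example.com
sub.example.co.uk
// ===END PRIVATE DOMAINS===
"""

icann_entries, private_entries = split_psl_sections(SAMPLE_PSL)
```

Keeping the ICANN entries separate matters because they are later needed to work out the registrable domain of each private entry.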
The script generates the following CSV files in the `data` directory:
- `all.csv`: Contains all processed domain data.
- `nxdomain.csv`: Contains domains that could not be resolved (`NXDOMAIN`).
- `expired.csv`: Contains domains with expired WHOIS records.
- `hold.csv`: Contains domains with WHOIS status indicating any kind of "hold".
- `missing_psl_txt.csv`: Contains domains with invalid `_psl` TXT records.
- `expiry_less_than_2yrs.csv`: Contains domains with WHOIS expiry date less than 2 years from now.
An example CSV entry:
psl_entry | top_level_domain | dns_status | whois_status | whois_domain_expiry_date | whois_domain_status | psl_txt_status | expiry_check_status |
---|---|---|---|---|---|---|---|
example.com | example.com | ok | ok | 2024-12-31 | "clientTransferProhibited" | "valid" | ok |
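
As a rough illustration of how one row might be routed to the output files listed above, the sketch below applies each filter to a dictionary keyed by the CSV column names. The function name `output_files_for`, the `"NXDOMAIN"` and `"valid"` labels used in the comparisons, and the approximation of two years as 730 days are assumptions; the script's own rules may differ.

```python
from datetime import date, timedelta

# "2 years" approximated as 730 days (an assumption of this sketch).
TWO_YEARS = timedelta(days=2 * 365)


def output_files_for(row, today):
    """Return the CSV file names a row would be written to."""
    files = ["all.csv"]  # every processed domain is recorded here

    # Status label assumed; the script may use a different value.
    if row["dns_status"] == "NXDOMAIN":
        files.append("nxdomain.csv")

    expiry = date.fromisoformat(row["whois_domain_expiry_date"])
    if expiry < today:
        files.append("expired.csv")
    # Read literally, "less than 2 years from now" also covers expired domains.
    if expiry < today + TWO_YEARS:
        files.append("expiry_less_than_2yrs.csv")

    # Matches any form of "hold", such as clientHold or serverHold.
    if "hold" in row["whois_domain_status"].lower():
        files.append("hold.csv")

    if row["psl_txt_status"] != "valid":
        files.append("missing_psl_txt.csv")

    return files
```

Passing `today` explicitly rather than calling `date.today()` inside the function keeps the check reproducible when reviewing old results.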
The script determines the publicly registrable namespace from private domains by using the ICANN section.
Here's how it works:
- ICANN Domains Set: ICANN domains are stored in a set for quick lookup.
- Domain Parsing: For each private domain, the script splits the domain into parts. It then checks if any suffix of these parts exists in the ICANN domains set.
- Normalization: The private domain is normalized to its publicly registrable form using the ICANN domains set.
Examples:
- Input: PSL private domain entry `"*.example.com"`
  - Process:
    - Remove leading `'*.'`: `"example.com"`
    - Check `"com"` against the ICANN domains set: Found
  - Output: `"example.com"`
- Input: PSL private domain entry `"sub.example.co.uk"`
  - Process:
    - Check `"example.co.uk"` against the ICANN domains set: Not found
    - Check `"co.uk"` against the ICANN domains set: Found
  - Output: `"example.co.uk"`
The output is then used for checking WHOIS data.
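
The normalization walk-through above can be sketched in a few lines. This is an illustrative version only: the name `to_registrable_domain` is hypothetical, the ICANN set is a hand-picked sample, and wildcard or exception rules inside the ICANN section are ignored for simplicity.

```python
def to_registrable_domain(psl_entry, icann_suffixes):
    """Reduce a private PSL entry to its publicly registrable domain.

    Returns None when no ICANN suffix matches or when the entry is
    itself an ICANN suffix.
    """
    domain = psl_entry.strip().lower()
    if domain.startswith("*."):
        domain = domain[2:]  # drop the wildcard label
    labels = domain.split(".")

    # Try candidate suffixes from longest to shortest.
    for i in range(len(labels)):
        suffix = ".".join(labels[i:])
        if suffix in icann_suffixes:
            if i == 0:
                return None  # nothing is registered under the suffix itself
            # Keep exactly one label in front of the matching suffix.
            return ".".join(labels[i - 1:])
    return None


# Small sample of ICANN suffixes, stored in a set for quick lookup.
ICANN_SAMPLE = {"com", "uk", "co.uk"}

first = to_registrable_domain("*.example.com", ICANN_SAMPLE)
second = to_registrable_domain("sub.example.co.uk", ICANN_SAMPLE)
```

Scanning from the longest suffix first ensures that `co.uk` wins over `uk`, so `sub.example.co.uk` resolves to `example.co.uk` rather than `co.uk`.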
This tool is licensed under the MIT License.