PSL Private Section Domains WHOIS Checker #2014
Conversation
I know this would be a big ask, but would you be interested in trying to integrate this with the Go validator in https://github.com/publicsuffix/list/tree/master/tools/internal/parser? The idea is that this could eventually be used to automatically add DNS & WHOIS information to PRs with a GitHub Action. However, that requires a little more effort from the parser to determine which sections have changed, so that only problems in the relevant section are displayed. Integration would probably just mean providing a function that takes a URL and returns a summary of the status, like "expires in >2y/has expired/...".
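A rough sketch of the kind of summary function described above, written in Python (like the checker in this PR) rather than Go; the function name, parameter names, and the 2-year threshold are illustrative assumptions, not part of any existing code:

```python
from datetime import datetime, timedelta

def summarize_domain_status(expiration_date, statuses):
    """Condense WHOIS results into a one-line summary suitable for a PR comment.

    `expiration_date` is a datetime (or None if the WHOIS lookup failed) and
    `statuses` is a list of EPP status strings; both names are assumptions
    made for this sketch.
    """
    if any("hold" in s.lower() for s in statuses):
        return "on clientHold/serverHold"
    if expiration_date is None:
        return "expiry unknown"
    remaining = expiration_date - datetime.utcnow()
    if remaining <= timedelta(0):
        return "has expired"
    if remaining >= timedelta(days=2 * 365):
        return "expires in >2y"
    return f"expires in {remaining.days} days (<2y)"
```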
The author of the whois package has declared it unsupported; is there an alternative?
Separate from the dialog here, I noted in an issue in @groundcat 's repo that separating the domain names with [client|server]Hold status into their own file would be beneficial. Domains with either of those statuses almost always got that status for a reason that would make them something that should not be on the PSL. In addition, such a domain would be NXDOMAIN, since those statuses cause the name not to be listed with NS delegation in the TLD zone files.
```python
def check_dns_status(domain):
    def make_request():
        try:
            url = f"https://dns.google/resolve?name={domain}&type=NS"
```
Google is a good source for this, but I do want to mention that this places authority onto Google - they could theoretically intervene in the resolution process for a given record.
Would we perhaps want to use an array of such resolvers that are randomly selected from?
Updated - good point about relying solely on Google for DNS, since it centralizes authority and carries the intervention risk you describe. Any single resolver can also return spurious errors due to transient network issues, so randomly selecting one resolver from a pool is still not error-proof. Instead, the updated implementation queries two resolvers (Google and Cloudflare) and only accepts a result if both return the same status. If the results are inconsistent, or if either query fails, it retries up to 5 times.
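A minimal sketch of that dual-resolver consistency check, assuming the Google and Cloudflare DNS-over-HTTPS JSON endpoints and the `requests` library; the function names and return values here are illustrative, not necessarily the exact code in the PR:

```python
import requests

RESOLVERS = [
    "https://dns.google/resolve",
    "https://cloudflare-dns.com/dns-query",
]

def query_ns_status(resolver_url, domain):
    """Return True if the resolver reports NS records for the domain."""
    resp = requests.get(
        resolver_url,
        params={"name": domain, "type": "NS"},
        headers={"Accept": "application/dns-json"},
        timeout=10,
    )
    resp.raise_for_status()
    data = resp.json()
    # Status 0 == NOERROR; an "Answer" section means the delegation exists.
    return data.get("Status") == 0 and bool(data.get("Answer"))

def check_dns_status(domain, max_attempts=5):
    """Query both resolvers and only trust a result they agree on."""
    for _ in range(max_attempts):
        try:
            results = [query_ns_status(url, domain) for url in RESOLVERS]
        except requests.RequestException:
            continue  # transient network error; try again
        if len(set(results)) == 1:
            return "active" if results[0] else "no NS delegation"
    return "inconsistent"  # resolvers never agreed within the retry budget
```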
Thank you for spotting this issue. I have replaced it with the whoisdomain package, which is recommended by the author of the retired whois package.
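For reference, a minimal sketch of how the whoisdomain package can be used for the expiry lookup; the attribute name used below (`expiration_date`) is an assumption based on the package's python-whois lineage, so check it against the actual code in the PR:

```python
import whoisdomain

def fetch_whois_expiry(domain):
    """Look up a domain with whoisdomain and return its expiry date, if any."""
    try:
        d = whoisdomain.query(domain)
    except Exception:
        return None  # WHOIS lookup failed or the output could not be parsed
    if d is None:
        return None
    # Attribute name assumed from the python-whois API that whoisdomain forks.
    return d.expiration_date
```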
Thanks for the input. I added a new filter that produces a CSV list of domains with any form of hold status, and another filter that produces a CSV of domains expiring within 2 years. The latter might not be very useful at the moment, since a handful of domains expire in less than 2 years and were probably submitted before the requirement was established, so their submitters may not be aware of it - similar to the requirement to keep the _psl TXT records in place at all times.
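As an illustration of what the expiry filter does, a rough sketch over the checker's CSV output; the column names (`domain`, `expiration_date`) and the ISO date format are assumptions about the output files, not a description of their actual schema:

```python
import csv
from datetime import datetime, timedelta

def domains_expiring_within(csv_path, years=2):
    """Yield (domain, expiry) pairs whose expiration falls within `years`."""
    cutoff = datetime.utcnow() + timedelta(days=years * 365)
    with open(csv_path, newline="") as f:
        for row in csv.DictReader(f):
            raw = row.get("expiration_date", "")
            if not raw:
                continue  # skip domains where WHOIS gave no expiry
            expiry = datetime.fromisoformat(raw)
            if expiry <= cutoff:
                yield row["domain"], expiry
```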
My stance is that we want to be strict about the _psl DNS entry but lax about the expiration times, because the latter is often impossible for us to check and often impossible for the requester to get >2y.
This PR is related to #1996. It introduces tools/private_domains_checker, a Python script that fetches data from the PSL and checks the domain status and expiry dates of the private section domains. It performs WHOIS checks on these domains and saves the results into CSV files for manual review. Please feel free to make any edits!
The README file has been updated to reflect the usage instructions.
Example CSV outputs from real PSL data: