Registrant not consistent for a few TLDs #21
What exactly is wrong with `./test2.py -d google.cz`?
test domain: <<<<<<<<<< google.cz >>>>>>>>>>>>>>>>>>>>
name str 'google.cz'
tld str 'cz'
registrar str 'REG-MARKMONITOR'
registrant_country str ''
creation_date datetime.datetime 2000-07-21 15:21:00
expiration_date datetime.datetime 2024-07-22 00:00:00
last_updated datetime.datetime 2018-04-23 20:24:01
dnssec bool False
status str ''
statuses list []
name_servers list ['ns1.google.com', 'ns2.google.com', 'ns3.google.com', 'ns4.google.com']
*registrant str 'MM1171195'*
emails list []
…On Wed, Aug 30, 2023 at 11:36 PM Badreddine Lejmi ***@***.***> wrote:
The registrant is sometimes not the real one. Let's take Google as an
example. Here is the list of their domains:
https://www.google.com/supported_domains
>>> import requests
>>> import whoisdomain
>>> r = requests.get("https://www.google.com/supported_domains")
>>> # each line looks like ".google.com"; drop the leading dot
>>> domains = [domain[1:] for domain in r.text.splitlines()]
>>> registrants = {}  # maps registrant -> list of domains
>>> for domain in domains:
...     try:
...         w = whoisdomain.query(domain)
...     except Exception as e:
...         print(e)
...         continue
...     if hasattr(w, "registrant"):
...         registrants.setdefault(w.registrant, []).append(domain)
Here are the results:
{None: ['google.com.ag',
'google.as',
'google.bg',
'google.com.bo',
'google.cd',
'google.fi',
'google.ge',
'google.gg',
'google.hr',
'google.im',
'google.je',
'google.kg',
'google.kz',
'google.com.ly',
'google.co.ma',
'google.mg',
'google.com.mx',
'google.com.ng',
'google.nu',
'google.ro',
'google.se',
'google.sn',
'google.sm',
'google.td',
'google.com.tw',
'google.co.ug',
'google.ws',
'google.rs'],
'': ['google.ae',
'google.com.ar',
'google.com.au',
'google.bf',
'google.com.br',
'google.de',
'google.dk',
'google.fm',
'google.gl',
'google.co.id',
'google.ie',
'google.co.il',
'google.co.jp',
'google.la',
'google.lt',
'google.lu',
'google.lv',
'google.mu',
'google.nl',
'google.no',
'google.com.om',
'google.pl',
'google.pt',
'google.com.qa',
'google.ru',
'google.tm',
'google.com.ua',
'google.co.uk',
'google.co.za'],
'ADMIN-LEO': ['google.co.ls'],
'CN_10': ['google.co.cr'],
'CON000020360': ['google.co.ve'],
'G830057': ['google.si'],
'GDA-ITFARM': ['google.co.tz'],
'GI7803022-NICAT': ['google.at'],
'GL210-IS': ['google.is'],
'GOOGLE INC': ['google.bj'],
'GOOGLE LLC (SGNIC-ORG1624232)': ['google.com.sg'],
'Google Canada Corporation': ['google.ca'],
'Google Inc.': ['google.co.zm'],
'Google Ireland Holdings Unlimited Company': ['google.fr', 'google.it'],
'Google Korea, LLC': ['google.co.kr'],
'Google LLC': ['google.com',
'google.com',
'google.am',
'google.bi',
'google.by',
'google.ci',
'google.cl',
'google.cm',
'google.com.co',
'google.dm',
'google.ee',
'google.com.gi',
'google.co.in',
'google.co.ke',
'google.com.lb',
'google.me',
'google.mn',
'google.co.mz',
'google.com.na',
'google.com.pe',
'google.com.pr',
'google.rw',
'google.com.sa',
'google.sc',
'google.sh',
'google.com.sl',
'google.so',
'google.st',
'google.tn',
'google.com.tr',
'google.com.vc',
'google.cat'],
'Google LLC (กูเกิล แอลแอลซี)': ['google.co.th'],
'HONG KONG INTERNET HOLDING LIMITED': ['google.com.hk'],
'MM1171195': ['google.cz'],
'Not shown, please visit www.dnsbelgium.be for webbased whois.': ['google.be'],
'Techno Bros. IT Solution Pty. Ltd.': ['google.com.et'],
'UNET-R11': ['google.mk'],
'mmr-170347': ['google.sk'],
'北京谷翔信息技术有限公司': ['google.cn']}
Wrong values:
- 'MM1171195': ['google.cz'],
- 'UNET-R11': ['google.mk'],
- 'mmr-170347': ['google.sk'],
- 'ADMIN-LEO': ['google.co.ls'],
- 'CN_10': ['google.co.cr'],
- 'CON000020360': ['google.co.ve'],
- 'G830057': ['google.si'],
- 'GDA-ITFARM': ['google.co.tz'],
- 'GI7803022-NICAT': ['google.at'],
- 'GL210-IS': ['google.is'],
- '' and None for a few of them
Most of them could be fixed easily by using the Registrant
Organization, as in the examples below:
❯ whois google.cz
% (c) 2006-2021 CZ.NIC, z.s.p.o.
%
% Intended use of supplied data and information
%
% Data contained in the domain name register, as well as information
% supplied through public information services of CZ.NIC association,
% are appointed only for purposes connected with Internet network
% administration and operation, or for the purpose of legal or other
% similar proceedings, in process as regards a matter connected
% particularly with holding and using a concrete domain name.
%
% Full text available at:
% http://www.nic.cz/page/306/intended-use-of-supplied-data-and-information/
%
% See also a search service at http://www.nic.cz/whois/
%
%
% Whoisd Server Version: 3.12.2
% Timestamp: Wed Aug 30 13:06:30 2023
domain: google.cz
registrant: MM1171195
admin-c: MM1171195
nsset: MM1543911
registrar: REG-MARKMONITOR
registered: 21.07.2000 15:21:00
changed: 23.04.2018 20:24:01
expire: 22.07.2024
contact: MM1171195
org: Google LLC
name: Domain Administrator
address: 1600 Amphitheatre Parkway
address: Mountain View
address: 94043
address: CA
address: US
registrar: REG-MARKMONITOR
created: 02.03.2018 18:52:05
changed: 15.05.2018 21:32:00
nsset: MM1543911
nserver: ns2.google.com
nserver: ns4.google.com
nserver: ns3.google.com
nserver: ns1.google.com
tech-c: MM193020
registrar: REG-MARKMONITOR
created: 18.05.2011 23:27:16
contact: MM193020
org: MarkMonitor Inc.
name: Domain Provisioning
address: 2150 S Bonito Way
address: Suite 150
address: Meridian
address: 83642
address: ID
address: US
registrar: REG-MARKMONITOR
created: 03.02.2011 18:24:34
changed: 29.06.2021 23:29:20
or
❯ whois google.sk
Domain: google.sk
Created: 2003-07-24
Valid Until: 2024-07-24
Updated: 2023-06-22
Domain Status: clientTransferProhibited, clientUpdateProhibited, clientDeleteProhibited
Nameserver: ns1.google.com
Nameserver: ns2.google.com
Nameserver: ns3.google.com
Nameserver: ns4.google.com
Domain registrant: mmr-170347
Name: Domain Administrator
Organization: Google Ireland Holdings Unlimited Company
Organization ID: 369511
Phone: +353.14361000
Email: ***@***.***
Street: 70 Sir John Rogerson's Quay
City: Dublin
Postal Code: 2
Country Code: IE
Authorised Registrar: MARK-0292
Created: 2019-06-07
Updated: 2019-06-07
Registrar: MARK-0292
Name: MarkMonitor International Limited
Organization: MarkMonitor International Limited
Organization ID: 4847541
Phone: +1.2083895740
Email: ***@***.***
Street: 12 New Fetter Lane
City: London
Postal Code: EC4A 1JP
Country Code: UK
Created: 2018-06-27
Updated: 2023-08-03
Administrative Contact: mmr-170347
Name: Domain Administrator
Organization: Google Ireland Holdings Unlimited Company
Organization ID: 369511
Phone: +353.14361000
Email: ***@***.***
Street: 70 Sir John Rogerson's Quay
City: Dublin
Postal Code: 2
Country Code: IE
Created: 2019-06-07
Updated: 2019-06-07
Technical Contact: mmr-170347
Name: Domain Administrator
Organization: Google Ireland Holdings Unlimited Company
Organization ID: 369511
Phone: +353.14361000
Email: ***@***.***
Street: 70 Sir John Rogerson's Quay
City: Dublin
Postal Code: 2
Country Code: IE
Created: 2019-06-07
Updated: 2019-06-07
|
The registrant IMHO should be "Google LLC" preferably over "MM1171195" |
That will need a sophisticated regex to find the org string in the
registrar section. Alternatively, you could use the CLI whois to query
directly for the registrar handle once you have it.
It is also a significant change from the original behaviour.
I expect that, due to GDPR rules, some European whois TLD responses may
have only the handle and not the name string.
…On Thu, Aug 31, 2023, 17:13 Badreddine Lejmi ***@***.***> wrote:
The registrant IMHO should be "Google Ireland Holdings Unlimited Company"
preferably over "MM1171195"
|
I can handle this PR. It will not be so hard if I use the same technique as I did for .fr |
I was thinking of adding a group extract: extract a section of data
starting with a `registrar:` line and running until the next empty line, and
then apply further regexes to the extracted section.
But that may be messy in the code, and I would have to add an
"apply regex B after extract A" kind of behaviour.
…On Thu, Aug 31, 2023 at 8:48 PM Badreddine Lejmi ***@***.***> wrote:
I can handle this PR. It will not be so hard if I use the same technique
as I did for .fr
Regarding GDPR, it's the organization's name, not a person's name, so
it does not apply.
And even if it does apply, here the data processor is the user not the
tool in itself.
The only issue a user may have with this tool, and that already exists
actually, is the persistent file storage cache mechanism yet it's unrelated
to this specific issue.
|
As I expected, the solution for .fr is also not stable:
the `type: ORGANIZATION` / `contact:` pair occurs several times, and the order in
which the whois server returns those contacts is not defined, so I now get MarkMonitor.
I expect you need the data from the NIC handle given in `holder-c:`,
GIHU100-FRNIC in this case.
test2.py SIMPLISTIC: False
DEBUG: ### lookup: tldString: fr; dList: ['google', 'fr']
DEBUG: CACHE_STUB None
DEBUG: initializing default cache
DEBUG cache init SimpleCacheBase
init SimpleCacheWithFile
DEBUG: force: False
DEBUG cache get: google.fr
get: no data
DEBUG: timout: 30.0
DEBUG: [Querying whois.nic.fr]
[whois.nic.fr]
%%
%% This is the AFNIC Whois server.
%%
%% complete date format: YYYY-MM-DDThh:mm:ssZ
%%
%% Rights restricted by copyright.
%% See
https://www.afnic.fr/en/domain-names-and-support/everything-there-is-to-know-about-domain-names/find-a-domain-name-or-a-holder-using-whois/
%%
%%
domain: google.fr
status: ACTIVE
eppstatus: serverUpdateProhibited
eppstatus: serverTransferProhibited
eppstatus: serverDeleteProhibited
eppstatus: serverRecoverProhibited
hold: NO
holder-c: GIHU100-FRNIC
admin-c: GIHU101-FRNIC
tech-c: MI3669-FRNIC
registrar: MARKMONITOR Inc.
Expiry Date: 2023-12-30T17:16:48Z
created: 2000-07-26T22:00:00Z
last-update: 2022-12-03T09:40:42.40624Z
source: FRNIC
nserver: ns1.google.com
nserver: ns2.google.com
nserver: ns3.google.com
nserver: ns4.google.com
source: FRNIC
registrar: MARKMONITOR Inc.
address: 2150 S. Bonito Way, Suite 150
address: ID 83642 MERIDIAN
country: US
phone: +1.2083895740
fax-no: +1.2083895771
e-mail: ***@***.***
website: http://www.markmonitor.com
anonymous: No
registered: 2002-01-07T00:00:00Z
source: FRNIC
nic-hdl: MI3669-FRNIC
type: ORGANIZATION
contact: MarkMonitor Inc.
address: 2150 S. Bonito Way, Suite 150
address: 83642 Meridian
country: US
phone: +1.2083895740
fax-no: +1.2083895771
e-mail: ***@***.***
registrar: MARKMONITOR Inc.
changed: 2023-08-25T15:02:54.100903Z
anonymous: NO
obsoleted: NO
eppstatus: associated
eppstatus: active
eligstatus: ok
eligsource: REGISTRAR
eligdate: 2021-10-05T00:00:00Z
reachstatus: ok
reachmedia: email
reachsource: REGISTRAR
reachdate: 2021-10-05T00:00:00Z
source: FRNIC
nic-hdl: GIHU101-FRNIC
type: ORGANIZATION
contact: Google Ireland Holdings Unlimited Company
address: 70 Sir John Rogerson's Quay
address: 2 Dublin
country: IE
phone: +353.14361000
e-mail: ***@***.***
registrar: MARKMONITOR Inc.
anonymous: NO
obsoleted: NO
eppstatus: associated
eppstatus: active
eligstatus: not identified
reachstatus: ok
reachmedia: email
reachsource: REGISTRAR
reachdate: 2018-03-02T00:00:00Z
source: FRNIC
nic-hdl: GIHU100-FRNIC
type: ORGANIZATION
contact: Google Ireland Holdings Unlimited Company
address: Google Ireland Holdings Unlimited Company
address: 70 Sir John Rogerson's Quay
address: 2 Dublin
country: IE
phone: +353.14361000
e-mail: ***@***.***
registrar: MARKMONITOR Inc.
changed: 2022-10-15T05:41:14.918179Z
anonymous: NO
obsoleted: NO
eppstatus: serverUpdateProhibited
eppstatus: associated
eligstatus: not identified
reachstatus: not identified
source: FRNIC
>> WHOIS request date: 2023-09-01T07:39:41.215977Z <<<
DEBUG: cache put: google.fr
DEBUG: Raw: [identical whois.nic.fr response repeated; omitted]
domain_name, [' google.fr']
registrar, ['MARKMONITOR Inc.', 'MARKMONITOR Inc.', 'MARKMONITOR Inc.', 'MARKMONITOR Inc.', 'MARKMONITOR Inc.']
registrant, [' MarkMonitor Inc.', ' Google Ireland Holdings Unlimited Company', ' Google Ireland Holdings Unlimited Company']
registrant_country, [' US', ' US', ' IE', ' IE']
creation_date, [' 2000-07-26T22:00:00Z']
expiration_date, [' 2023-12-30T17:16:48Z']
updated_date, [' 2022-12-03T09:40:42.40624Z']
name_servers, ['ns1.google.com', 'ns2.google.com', 'ns3.google.com', 'ns4.google.com']
status, [' ACTIVE', ' serverUpdateProhibited', ' serverTransferProhibited', ' serverDeleteProhibited', ' serverRecoverProhibited', ' associated', ' active', ' ok', ' ok', ' associated', ' active', ' not identified', ' ok', ' serverUpdateProhibited', ' associated', ' not identified', ' not identified']
emails, ***@***.***', ***@***.***', ' ***@***.***', ***@***.***']
registrant_organization, ['MarkMonitor Inc.', 'Google Ireland Holdings Unlimited Company', 'Google Ireland Holdings Unlimited Company']
DEBUG: Clean: [identical whois.nic.fr response repeated; omitted]
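For illustration, the handle-based lookup suggested at the top of this comment (take the contact record whose `nic-hdl` matches `holder-c`, instead of relying on record order) might look roughly like the sketch below. The function name `holder_contact` and the abbreviated sample text are assumptions made for this example; this is not code from whoisdomain.

```python
from typing import Dict, Optional

# Abbreviated from the google.fr response above; only the lines used here.
SAMPLE_FR = """\
domain: google.fr
holder-c: GIHU100-FRNIC
admin-c: GIHU101-FRNIC
tech-c: MI3669-FRNIC
registrar: MARKMONITOR Inc.
nic-hdl: MI3669-FRNIC
type: ORGANIZATION
contact: MarkMonitor Inc.
country: US
nic-hdl: GIHU101-FRNIC
type: ORGANIZATION
contact: Google Ireland Holdings Unlimited Company
country: IE
nic-hdl: GIHU100-FRNIC
type: ORGANIZATION
contact: Google Ireland Holdings Unlimited Company
country: IE
"""


def holder_contact(whois_text: str) -> Optional[str]:
    """Return the contact name of the record whose nic-hdl equals holder-c.

    Hypothetical helper: it assumes every contact record starts with a
    "nic-hdl:" line and names the contact on a "contact:" line.
    """
    holder: Optional[str] = None
    current_handle: Optional[str] = None
    contacts: Dict[str, str] = {}

    for line in whois_text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # not a "key: value" line
        key, value = key.strip().lower(), value.strip()
        if key == "holder-c":
            holder = value
        elif key == "nic-hdl":
            current_handle = value  # a new contact record begins
        elif key == "contact" and current_handle is not None:
            contacts.setdefault(current_handle, value)

    return contacts.get(holder) if holder else None


print(holder_contact(SAMPLE_FR))
```

Because the lookup is keyed on the handle, the result does not depend on the order in which the server lists the contact records.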
|
It's not stable but that's much better than it was. We can make it even better by comparing the registrant with the registrar and picking the first registrant that's different from the registrar (.lower()) if there is more than one registrant. |
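A minimal sketch of that heuristic, assuming the parser hands back lists of candidate strings like the `registrant` and `registrar` lists in the debug output above. The name `pick_registrant` is made up for this example, and falling back to the first candidate when every candidate matches the registrar is my own assumption:

```python
from typing import List, Optional


def pick_registrant(registrants: List[str], registrars: List[str]) -> Optional[str]:
    """Return the first registrant candidate that is not also a registrar name.

    The comparison is case-insensitive and ignores surrounding whitespace.
    If every candidate matches a registrar, the first candidate is returned.
    """
    registrar_names = {r.strip().lower() for r in registrars}
    candidates = [r.strip() for r in registrants if r.strip()]

    for candidate in candidates:
        if candidate.lower() not in registrar_names:
            return candidate
    return candidates[0] if candidates else None


# Values taken from the google.fr debug output in this thread.
registrants = [
    " MarkMonitor Inc.",
    " Google Ireland Holdings Unlimited Company",
    " Google Ireland Holdings Unlimited Company",
]
registrars = ["MARKMONITOR Inc.", "MARKMONITOR Inc."]
print(pick_registrant(registrants, registrars))
```

This only helps when the registrar's own contact record uses the same name as the `registrar:` line. It would not catch a registrar listed under a different legal name.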
I'm currently thinking about adding a feature like
```
{
    "section": {
        "from_to": [
            r"<from regex>",
            r"<to regex>"
        ],
        "extract": r"<extract regex>"
    }
}
```
to first extract a section (like one contact record group) and then extract
info from that section only, for entries like `whois google.sk`,
possibly with a feature to extract based on info we extracted earlier,
for contact sections that do not themselves say what the contact is
about, like `whois google.fr`.
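To make the from/to idea concrete, here is a rough sketch applied to the `whois google.sk` output quoted earlier in this issue. `extract_from_section` and its parameters are hypothetical names chosen for the example, not the API that ended up in the library, and the sample text is abbreviated.

```python
import re
from typing import List

# Abbreviated from the google.sk response quoted earlier in this issue.
SAMPLE_SK = """\
Domain: google.sk
Domain registrant: mmr-170347
Name: Domain Administrator
Organization: Google Ireland Holdings Unlimited Company
Country Code: IE
Authorised Registrar: MARK-0292
Registrar: MARK-0292
Name: MarkMonitor International Limited
Organization: MarkMonitor International Limited
Country Code: UK
Administrative Contact: mmr-170347
Name: Domain Administrator
Organization: Google Ireland Holdings Unlimited Company
"""


def extract_from_section(text: str, from_re: str, to_re: str, extract_re: str) -> List[str]:
    """Apply extract_re only to the text between from_re and to_re.

    The section starts right after the from_re match and ends just before
    the next to_re match (or at the end of the text if to_re never matches).
    """
    start = re.search(from_re, text, re.MULTILINE)
    if start is None:
        return []
    rest = text[start.end():]
    end = re.search(to_re, rest, re.MULTILINE)
    section = rest[: end.start()] if end else rest
    return [m.strip() for m in re.findall(extract_re, section, re.MULTILINE)]


registrant_org = extract_from_section(
    SAMPLE_SK,
    from_re=r"^Domain registrant:.*$",
    to_re=r"^Registrar:",
    extract_re=r"^Organization:\s*(.+)$",
)
print(registrant_org)
```

Anchoring `to_re` at the start of the line keeps `Authorised Registrar:` inside the registrant section, so only the real `Registrar:` record ends it.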
|
OK, new code is now available in tld_regexpr that converts all existing regex strings to functions. Each function is called by whoisParser.py as func(textString) and should return a [str]. This means it is now easy to add parsing that is more complicated than one single regex.
The road is now open for much more targeted searches, such as limiting a search to a particular section of the whois CLI result. |
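As a rough illustration of that contract (a callable that takes the whois text and returns a list of strings), the sketch below wraps a plain regex in such a function and places it next to a hand-written one for the google.cz case from this issue. Only the `func(textString) -> [str]` shape comes from the comment above. `from_regex`, `cz_registrant_org` and `CZ_RULES` are hypothetical names, and this is not the actual tld_regexpr code.

```python
import re
from typing import Callable, Dict, List

Extractor = Callable[[str], List[str]]

# Abbreviated from the google.cz response quoted earlier in this issue.
SAMPLE_CZ = """\
domain: google.cz
registrant: MM1171195
admin-c: MM1171195
registrar: REG-MARKMONITOR
contact: MM1171195
org: Google LLC
name: Domain Administrator
contact: MM193020
org: MarkMonitor Inc.
name: Domain Provisioning
"""


def from_regex(pattern: str) -> Extractor:
    """Turn a one-group regex string into a func(text) -> [str] extractor."""
    compiled = re.compile(pattern, re.IGNORECASE | re.MULTILINE)

    def extract(text: str) -> List[str]:
        return [m.group(1).strip() for m in compiled.finditer(text)]

    return extract


def cz_registrant_org(text: str) -> List[str]:
    """Resolve the registrant handle to the org of its contact record."""
    handles = from_regex(r"^registrant:\s*(\S+)")(text)
    if not handles:
        return []
    # Take everything after "contact: <handle>" up to the next "contact:" line.
    block = re.search(
        rf"^contact:\s*{re.escape(handles[0])}\s*$(.*?)(?=^contact:|\Z)",
        text,
        re.MULTILINE | re.DOTALL,
    )
    if block is None:
        return []
    return from_regex(r"^org:\s*(.+)$")(block.group(1))


# Simple fields stay plain regexes; harder ones get a dedicated function.
CZ_RULES: Dict[str, Extractor] = {
    "registrar": from_regex(r"^registrar:\s*(.+)$"),
    "registrant": cz_registrant_org,
}

result = {field: func(SAMPLE_CZ) for field, func in CZ_RULES.items()}
print(result)
```

Since both kinds of entry share the same signature, the caller does not need to know whether a field is backed by a single regex or by a multi-step lookup.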
I updated the "sk" TLD to use the new contextual extract. |
Added findFromToAndLookForWithFindFirst, a contextual search based on a previous findFirst. |
how would you want to handle
|
Ok, we could do it progressively :
FYI, I've benchmarked whoisdomain, asyncwhois and whoisit on this registrant name issue and on performance. It appears that whoisdomain is the fastest and a close second in terms of quality. The test has to be reproduced on other machines because of network/caching issues.
|
Added the proper parsing for Google, from your test case #21 (comment). Some actually have no organization or name (google.si, google.co.tz); some have no data or no registrant. A similar test for meta would be nice, if that is possible. |
What are meta words? Could you describe the test case you want in a bit more detail? I could write it. I think that in the long term, an approach similar to JSWhois is the most interesting: https://github.com/jschauma/jswhois/blob/main/src/jswhois.go / https://www.netmeister.org/blog/whois.html i.e.:
|
JSwhois is certainly interesting to look at (item 2 could be derived from work at jswhois)
i finished all registrars i could find for the google test: see
|
meta is all domains owned by facebook see: |
OK, for meta.com it was a bit hard, so I tried with various Facebook domains. It gave me this result (quite good, actually) on an old release, so it should be even better now:
|
Thanks (.za has no whois server; .nz has no registrar in the response). I will see if I can make a minimal RDAP client we could use. I am currently adding public suffix list info (if the library 'tld' is installed on the running platform). |
Shall we close this one for now? We can add new issues for future work on RDAP and on a JSON response with grouping. |