psyray changed the title from "bug: Subdomain import could fail if suffix more than 4 chars" to "bug(ui): Subdomain import could fail if suffix more than 4 chars" on Jun 13, 2024
Is there an existing issue for this?
Current Behavior
During a pentest, I had a VPN connection to an internal network whose AD domain name was like mynetwork.testo.
I retrieved some hostnames from the domain (servers, workstations, ...), about 100 assets.
After configuring DNS resolution to query the internal DC (in /etc/resolv.conf) and adding the domain name above as a target,
I initiated a scan and supplied the hostnames in the textarea field.
The scan launched, but the subdomains were not imported.
After investigation, the problem comes from get_domain_from_subdomain, more precisely from the tldextract function.
rengine-ng/web/reNgine/common_func.py
Lines 427 to 437 in 55c9179
Here are the test sample results.
I try this URL: subdomain.mynetwork.testo
Result is ExtractResult(subdomain='subdomain.mynetwork', domain='testo', suffix='')
❎ Problem
And if I try this URL: subdomain.mynetwork.test
Result is ExtractResult(subdomain='subdomain', domain='mynetwork', suffix='test')
✅ Correct
So tldextract does not correctly extract the TLD.
This is because of the way tldextract works: it relies on a list of known domain suffixes to determine which part of the URL is the domain and which part is the suffix.
When a URL has an unusual or non-standard suffix (like .testo in my example) that is not in this list, tldextract does not recognize it as a suffix. As the first result above shows, it then returns an empty suffix, treats the last label as the domain, and puts everything else in the subdomain.
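To make the mechanism concrete, here is a minimal sketch of suffix-list matching. It is a simplified model written for this issue, not tldextract's actual code: KNOWN_SUFFIXES, Parts, and split_host are hypothetical names, and the tiny suffix set is an assumption chosen to mirror the two results reported above.

```python
from typing import NamedTuple

# Hypothetical, tiny stand-in for the suffix list. The real list is far larger
# and also has wildcard and exception rules that this sketch ignores.
KNOWN_SUFFIXES = {"com", "test", "co.uk"}


class Parts(NamedTuple):
    subdomain: str
    domain: str
    suffix: str


def split_host(host: str) -> Parts:
    labels = host.lower().strip(".").split(".")
    # Try the longest candidate suffix first, always leaving one label for the domain.
    for i in range(1, len(labels)):
        candidate = ".".join(labels[i:])
        if candidate in KNOWN_SUFFIXES:
            return Parts(".".join(labels[: i - 1]), labels[i - 1], candidate)
    # No known suffix: the last label becomes the domain and the suffix stays empty.
    return Parts(".".join(labels[:-1]), labels[-1], "")


print(split_host("subdomain.mynetwork.test"))
print(split_host("subdomain.mynetwork.testo"))
```

With "test" in the set, the first call yields mynetwork as the domain; with "testo" absent, the second call falls through to the empty-suffix branch, which is the shape of the problematic result.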
For this project I have modified the code of custom_func to split the URL on . and achieve my goal, but maybe we could rework this part to get more accurate domain extraction.
Expected Behavior
Subdomains should have been imported as the TLD is the same as the target TLD
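One way to get this behavior without depending on a suffix list is to compare each hostname against the target domain directly. The sketch below is only an illustration under that assumption: belongs_to_target and naive_registered_domain are hypothetical helpers, not functions from reNgine, and the second one shows the plain split-on-dot approach with its known limitation.

```python
def belongs_to_target(hostname: str, target_domain: str) -> bool:
    """Return True if hostname is the target domain or one of its subdomains."""
    host = hostname.lower().strip(".")
    target = target_domain.lower().strip(".")
    # Require a label boundary so "evilmynetwork.testo" does not match "mynetwork.testo".
    return host == target or host.endswith("." + target)


def naive_registered_domain(hostname: str) -> str:
    """Keep the last two labels. Wrong for multi-label suffixes such as co.uk."""
    labels = hostname.lower().strip(".").split(".")
    return ".".join(labels[-2:])


print(belongs_to_target("subdomain.mynetwork.testo", "mynetwork.testo"))
print(naive_registered_domain("subdomain.mynetwork.testo"))
```

The target-aware check works for any internal suffix because it never needs to know where the suffix starts; the two-label split is simpler but should probably only serve as a fallback when the suffix is not recognized.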
Steps To Reproduce
Environment
Anything else?
No response