👋 Hi @psyray,
Issues are only for reporting bugs or requesting features. Please read the documentation before raising an issue: https://rengine.wiki
For very limited support, questions, and discussions, please join reNgine Discord channel: https://discord.gg/azv6fzhNCE
Please include all the requested and relevant information when opening a bug report. Improper reports will be closed without any response.
Is there an existing issue for this?
Current Behavior
During a pentest, I had a VPN connection to an internal network whose AD domain name was something like mynetwork.testo.
I retrieved some hostnames from the domain (servers, workstations, ...), about 100 assets.
After configuring DNS resolution to query the internal DC (in /etc/resolv.conf) and adding the domain name above as a target,
I initiated a scan and supplied the hostnames in the textarea field.
The scan ran, but the subdomains were not imported.
After investigation, the problem comes from get_domain_from_subdomain, more precisely from the tldextract function.
rengine/web/reNgine/common_func.py
Lines 427 to 437 in fd5a5e5
Here are the results of my test samples.
I try this URL: subdomain.mynetwork.testo
Result is ExtractResult(subdomain='subdomain.mynetwork', domain='testo', suffix='')
❎ Problem
And if I try this URL: subdomain.mynetwork.test
Result is ExtractResult(subdomain='subdomain', domain='mynetwork', suffix='test')
✅ Correct
So tldextract does not correctly extract the TLD.
This is because of the way tldextract works: it relies on a list of known domain suffixes to determine which part of a URL is the domain and which part is the suffix.
When a URL has an unusual or non-standard suffix (like .testo in my example) that is not in this list, tldextract does not recognize it as a suffix. As a result, it treats the last label as the domain and everything before it as the subdomain.
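To make the mechanism concrete, here is a minimal toy model of suffix-list matching in plain Python. It is not tldextract's actual implementation: the function name extract, the Parts tuple, and the tiny KNOWN_SUFFIXES set are assumptions made for illustration only. It reproduces the two results shown above and shows what changes once the internal suffix is known.

```python
from typing import NamedTuple


class Parts(NamedTuple):
    subdomain: str
    domain: str
    suffix: str


# Hypothetical, tiny stand-in for a real suffix list.
KNOWN_SUFFIXES = frozenset({"com", "test", "co.uk"})


def extract(hostname: str, known_suffixes=KNOWN_SUFFIXES) -> Parts:
    """Split a hostname using the longest matching known suffix."""
    labels = hostname.lower().strip(".").split(".")

    # Try the longest candidate suffix first, then shorter ones.
    suffix_start = len(labels)  # default: no suffix recognized
    for i in range(len(labels)):
        if ".".join(labels[i:]) in known_suffixes:
            suffix_start = i
            break

    suffix = ".".join(labels[suffix_start:])
    # The label just before the suffix is taken as the domain.
    domain = labels[suffix_start - 1] if suffix_start > 0 else ""
    subdomain = ".".join(labels[: max(suffix_start - 1, 0)])
    return Parts(subdomain, domain, suffix)


# Unknown suffix: the last label is mistaken for the domain.
print(extract("subdomain.mynetwork.testo"))
# Known suffix: the split is the expected one.
print(extract("subdomain.mynetwork.test"))
# Declaring the internal suffix fixes the first case.
print(extract("subdomain.mynetwork.testo", KNOWN_SUFFIXES | {"testo"}))
```

In this model the only difference between the failing and the working case is whether the last label appears in the suffix set, which matches the behavior reported above.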
For this project I have modified the code of custom_func to split the URL on "." and achieve my goal, but maybe we could rework this part to get more accurate domain extraction.
Expected Behavior
Subdomains should have been imported, since their TLD is the same as the target's TLD.
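One way to get this behavior without depending on a suffix list is to compare each supplied hostname against the target domain directly. The sketch below is only an illustration under that assumption: belongs_to_target is a hypothetical helper, not a function that exists in reNgine, and it is not the modification I actually made.

```python
def belongs_to_target(hostname: str, target_domain: str) -> bool:
    """Return True if hostname is the target domain or one of its subdomains."""
    # Normalize case and drop a trailing dot (fully qualified form).
    host = hostname.lower().rstrip(".")
    target = target_domain.lower().rstrip(".")
    # Match on a label boundary so "evilmynetwork.testo" is not accepted.
    return host == target or host.endswith("." + target)


hostnames = [
    "subdomain.mynetwork.testo",
    "dc01.corp.mynetwork.testo",
    "evilmynetwork.testo",
    "subdomain.other.testo",
]
imported = [h for h in hostnames if belongs_to_target(h, "mynetwork.testo")]
print(imported)
```

Because the comparison is anchored on the target that was already entered for the scan, it works for any internal suffix, known or not.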
Steps To Reproduce
Environment
Anything else?
No response