bug: Subdomain import could fail if suffix more than 4 chars #1128

psyray · 2023-12-11T23:56:33Z

Is there an existing issue for this?

I have searched the existing issues

Current Behavior

During a pentest, I was connected over VPN to an internal network whose AD domain name looked like mynetwork.testo.

I retrieved some hostnames from the domain (servers, workstations, ...), about 100 assets.

After configuring DNS resolution to query the internal DC (in /etc/resolv.conf) and adding the domain name above as a target, I initiated a scan and supplied the hostnames in the textarea field.

The scan ran, but the subdomains were not imported.

After investigation, the problem comes from get_domain_from_subdomain, more precisely from its call to tldextract.extract.

rengine/web/reNgine/common_func.py

Lines 427 to 437 in fd5a5e5

    
def get_domain_from_subdomain(subdomain):
	"""Get domain from subdomain.

	Args:
		subdomain (str): Subdomain name.

	Returns:
		str: Domain name.
	"""
	ext = tldextract.extract(subdomain)
	return '.'.join(ext[1:3])

Here are the results for two sample hostnames.

If I try this hostname: subdomain.mynetwork.testo
the result is
ExtractResult(subdomain='subdomain.mynetwork', domain='testo', suffix='')
❎ Problem

And if I try this hostname: subdomain.mynetwork.test
the result is
ExtractResult(subdomain='subdomain', domain='mynetwork', suffix='test')
✅ Correct

So tldextract does not extract the TLD correctly in the first case.
This is a consequence of how tldextract works: it relies on a list of known domain suffixes to decide which part of a hostname is the domain and which part is the suffix.

When a hostname has an unusual or non-standard suffix (like .testo in my example) that is not in that list, tldextract does not recognize it as a suffix. It returns an empty suffix and treats the last label as the domain, so get_domain_from_subdomain joins 'testo' with '' and returns 'testo.' instead of 'mynetwork.testo'.
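To make the mechanism concrete, here is a minimal standard-library sketch of suffix-list-based splitting. It is a toy model, not tldextract's actual implementation: KNOWN_SUFFIXES, toy_extract and toy_get_domain are hypothetical names, and the tiny suffix set (including "test", chosen to mirror the output reported above) is an assumption standing in for the real list.

```python
# Toy model of suffix-list-based splitting; NOT tldextract's real implementation.
# KNOWN_SUFFIXES is a hypothetical, tiny stand-in for the real suffix list.
KNOWN_SUFFIXES = {"com", "co.uk", "test"}


def toy_extract(hostname):
    """Split hostname into (subdomain, domain, suffix) using KNOWN_SUFFIXES."""
    labels = hostname.lower().strip(".").split(".")
    suffix_len = 0
    # Scan from the longest candidate to the shortest, so the first hit
    # is the longest run of trailing labels that is a known suffix.
    for i in range(len(labels)):
        if ".".join(labels[i:]) in KNOWN_SUFFIXES:
            suffix_len = len(labels) - i
            break
    rest = labels[: len(labels) - suffix_len]
    suffix = ".".join(labels[len(labels) - suffix_len:]) if suffix_len else ""
    # With no known suffix, the last label ends up as the "domain".
    domain = rest[-1] if rest else ""
    subdomain = ".".join(rest[:-1])
    return subdomain, domain, suffix


def toy_get_domain(hostname):
    """Mirror the '.'.join(ext[1:3]) step from get_domain_from_subdomain."""
    _, domain, suffix = toy_extract(hostname)
    return ".".join((domain, suffix))


print(toy_extract("subdomain.mynetwork.test"))
print(toy_extract("subdomain.mynetwork.testo"))
print(toy_get_domain("subdomain.mynetwork.testo"))
```

With an unknown suffix, the joined result ends in a trailing dot and no longer matches the target domain, which is consistent with the import being skipped.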

For this project I modified the code in common_func to split the hostname on . and achieve my goal, but maybe we could rework this part to get a more accurate domain extraction

def get_domain_from_subdomain(subdomain):
	"""Get domain from subdomain.

	Args:
		subdomain (str): Subdomain name.

	Returns:
		str: Domain name.
	"""
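	domain, suffix = extract_domain_and_suffix(subdomain)
	# A single-label name has no domain/suffix pair; avoid joining None values.
	if domain is None:
		return None
	return '.'.join([domain, suffix])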

def extract_domain_and_suffix(url):
	parts = url.split('.')

	if len(parts) >= 2:
		domain = parts[-2]
		suffix = parts[-1]
		return domain, suffix
	else:
		return None, None
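One limitation of taking the last two labels is multi-part public suffixes: for www.example.co.uk the split above yields co.uk rather than example.co.uk. As one possible direction for a rework, the sketch below matches each hostname against the target domain the user already supplied instead of guessing where the suffix starts. The helper name domain_for_target is hypothetical and not part of reNgine, and it assumes the target domain is available where the import happens.

```python
def domain_for_target(hostname, target_domain):
    """Return target_domain if hostname is it or one of its subdomains, else None.

    Hypothetical helper for illustration; not an existing reNgine function.
    """
    host = hostname.lower().strip(".")
    target = target_domain.lower().strip(".")
    # Require a label boundary so "notmynetwork.testo" does not match.
    if host == target or host.endswith("." + target):
        return target
    return None


print(domain_for_target("srv01.mynetwork.testo", "mynetwork.testo"))
print(domain_for_target("www.example.co.uk", "example.co.uk"))
```

This approach does not depend on any suffix list, so internal or non-standard suffixes behave the same as public ones, but it only works when the target is known in advance.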

Expected Behavior

Subdomains should have been imported, since their domain is the same as the target domain

Steps To Reproduce

Add a target domain with an exotic suffix (for example mynetwork.testo)
Initiate a scan and provide a list of subdomains of that target to import
The subdomains are not imported

Environment

- reNgine: 2.0.2
- OS: Ubuntu 22.04
- Python: 3.10
- Docker Engine: 
- Docker Compose: 
- Browser: FF 120

Anything else?

No response


github-actions · 2023-12-11T23:56:44Z

👋 Hi @psyray,
Issues are only for reporting bugs and feature requests. Please read the documentation before raising an issue: https://rengine.wiki
For very limited support, questions, and discussions, please join reNgine Discord channel: https://discord.gg/azv6fzhNCE
Please include all the requested and relevant information when opening a bug report. Improper reports will be closed without any response.


psyray added the bug Something isn't working label Dec 11, 2023

psyray self-assigned this Feb 21, 2024

psyray added the top-priority label Feb 21, 2024

yogeshojha removed the top-priority label Jul 30, 2024

yogeshojha assigned yogeshojha and unassigned psyray Jul 30, 2024

yogeshojha added the release/2.1.2 label Jul 30, 2024

yogeshojha linked a pull request Jul 30, 2024 that will close this issue

(bug) Fix subdomain import for subdomains with suffix more than 4 chars Fixes #1128 #1340

Merged

yogeshojha added a commit that referenced this issue Jul 30, 2024

Merge pull request #1340 from yogeshojha/1128-bug-subdomain-import-co…

ff7a64d

…uld-fail-if-suffix-more-than-4-chars (bug) Fix subdomain import for subdomains with suffix more than 4 chars Fixes #1128

yogeshojha closed this as completed in #1340 Jul 30, 2024
