Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bug: Subdomain import could fail if suffix more than 4 chars #1128

Closed
1 task done
psyray opened this issue Dec 11, 2023 · 1 comment · Fixed by #1340
Closed
1 task done

bug: Subdomain import could fail if suffix more than 4 chars #1128

psyray opened this issue Dec 11, 2023 · 1 comment · Fixed by #1340
Assignees
Labels
bug Something isn't working release/2.1.2

Comments

@psyray
Copy link
Contributor

psyray commented Dec 11, 2023

Is there an existing issue for this?

  • I have searched the existing issues

Current Behavior

During a pentest, I have a vpn connection to internal network, and AD domain name was like mynetwork.testo.

I have retrieved some hostnames from the domain : servers, workstations ... ~ 100 assets

After configuring DNS resolving to query internal DC (in /etc/resolv.conf), and adding the domain name above as a target,

I initiate a scan and supply hostnames in the textarea field.
image

I've launched the scan but subdomains was not imported.

After investigation it comes from the get_domain_from_subdomain, more precisely from tldextract function.

def get_domain_from_subdomain(subdomain):
"""Get domain from subdomain.
Args:
subdomain (str): Subdomain name.
Returns:
str: Domain name.
"""
ext = tldextract.extract(subdomain)
return '.'.join(ext[1:3])

Here are the test sample result

I try this URL: subdomain.mynetwork.testo
Result is
ExtractResult(subdomain='subdomain.mynetwork', domain='testo', suffix='')
❎ Problem

And if I try this URL: subdomain.mynetwork.test
Result is
ExtractResult(subdomain='subdomain', domain='mynetwork', suffix='test')
✅ Correct

So tldextract does not correctly extract the TLD.
It's because the way tldextract works
tldextract relies on a list of known domain suffixes to determine which part of your URL is the domain and which part is the suffix.

When you use a URL with an unusual or non-standard suffix (like .testo in my example), tldextract may not recognize it as a valid suffix if it is not present in its list. As a result, it may misinterpret parts of the URL.

For this project I have modified the code of custom_func to split url on . and achieve my goal, but maybe we could refound this part to have a more accurate domain extraction

def get_domain_from_subdomain(subdomain):
	"""Get domain from subdomain.

	Args:
		subdomain (str): Subdomain name.

	Returns:
		str: Domain name.
	"""
	domain, suffix = extract_domain_and_suffix(subdomain)
	return '.'.join([domain, suffix])

def extract_domain_and_suffix(url):
	parts = url.split('.')

	if len(parts) >= 2:
		domain = parts[-2]
		suffix = parts[-1]
		return domain, suffix
	else:
		return None, None

Expected Behavior

Subdomains should have been imported as the TLD is the same as the target TLD

Steps To Reproduce

  1. Add a target TLD with an exotic suffix
  2. Initiate a scan and provide a list of subdomain to import with valid TLD
  3. Subdomains not imported

Environment

- reNgine: 2.0.2
- OS: Ubuntu 22.04
- Python: 3.10
- Docker Engine: 
- Docker Compose: 
- Browser: FF 120

Anything else?

No response

@psyray psyray added the bug Something isn't working label Dec 11, 2023
Copy link
Contributor

👋 Hi @psyray,
Issues is only for reporting a bug/feature request. Please read documentation before raising an issue https://rengine.wiki
For very limited support, questions, and discussions, please join reNgine Discord channel: https://discord.gg/azv6fzhNCE
Please include all the requested and relevant information when opening a bug report. Improper reports will be closed without any response.

@psyray psyray self-assigned this Feb 21, 2024
@yogeshojha yogeshojha assigned yogeshojha and unassigned psyray Jul 30, 2024
yogeshojha added a commit that referenced this issue Jul 30, 2024
…uld-fail-if-suffix-more-than-4-chars

(bug) Fix subdomain import for subdomains with suffix more than 4 chars Fixes #1128
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working release/2.1.2
Projects
None yet
2 participants