Commit 4b7ecb6

CaedenPH, github-actions, tianyizheng02, and pre-commit-ci[bot] authored
Create is valid email address algorithm (#8907)
* feat(strings): Create is valid email address
* updating DIRECTORY.md
* feat(strings): Create is_valid_email_address algorithm
* chore(is_valid_email_address): Implement changes from code review
* Update strings/is_valid_email_address.py
  Co-authored-by: Tianyi Zheng <tianyizheng02@gmail.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
  for more information, see https://pre-commit.ci
* chore(is_valid_email_address): Fix ruff error
* Update strings/is_valid_email_address.py
  Co-authored-by: Tianyi Zheng <tianyizheng02@gmail.com>

---------

Co-authored-by: github-actions <${GITHUB_ACTOR}@users.noreply.github.com>
Co-authored-by: Tianyi Zheng <tianyizheng02@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
1 parent c290dd6 commit 4b7ecb6

2 files changed, 118 additions, 0 deletions

DIRECTORY.md

+1
@@ -1171,6 +1171,7 @@
  * [Is Pangram](strings/is_pangram.py)
  * [Is Spain National Id](strings/is_spain_national_id.py)
  * [Is Srilankan Phone Number](strings/is_srilankan_phone_number.py)
+ * [Is Valid Email Address](strings/is_valid_email_address.py)
  * [Jaro Winkler](strings/jaro_winkler.py)
  * [Join](strings/join.py)
  * [Knuth Morris Pratt](strings/knuth_morris_pratt.py)

strings/is_valid_email_address.py

+117
@@ -0,0 +1,117 @@
"""
Implements an algorithm that checks whether an email address is valid.

Reference: https://en.wikipedia.org/wiki/Email_address
"""

import string

email_tests: tuple[tuple[str, bool], ...] = (
    ("simple@example.com", True),
    ("very.common@example.com", True),
    ("disposable.style.email.with+symbol@example.com", True),
    ("other-email-with-hyphen@and.subdomains.example.com", True),
    ("fully-qualified-domain@example.com", True),
    ("user.name+tag+sorting@example.com", True),
    ("x@example.com", True),
    ("example-indeed@strange-example.com", True),
    ("test/test@test.com", True),
    (
        "123456789012345678901234567890123456789012345678901234567890123@example.com",
        True,
    ),
    ("admin@mailserver1", True),
    ("example@s.example", True),
    ("Abc.example.com", False),
    ("A@b@c@example.com", False),
    ("abc@example..com", False),
    ("a(c)d,e:f;g<h>i[j\\k]l@example.com", False),
    (
        "12345678901234567890123456789012345678901234567890123456789012345@example.com",
        False,
    ),
    ("i.like.underscores@but_its_not_allowed_in_this_part", False),
    ("", False),
)

# The maximum number of octets that the local part and the domain part can have.
# Every character accepted below is ASCII, so one character is one octet.
MAX_LOCAL_PART_OCTETS = 64
MAX_DOMAIN_OCTETS = 255


def is_valid_email_address(email: str) -> bool:
    """
    Returns True if the passed email address is valid.

    The local-part of the email precedes the single @ symbol and identifies
    the mailbox, for example "john.smith".
    The domain follows the @ symbol and is stricter than the local-part.

    Global email checks:
     1. There can only be one @ symbol in the email address. Technically if the
        @ symbol is quoted in the local-part, then it is valid, however this
        implementation ignores "" for now.
        (See https://en.wikipedia.org/wiki/Email_address#:~:text=If%20quoted,)
     2. The local-part and the domain are limited to a certain number of octets.
        Every character this function accepts is ASCII, which UTF-8 encodes in
        a single byte, so the length of the string equals its octet count.
    Checks for the local-part:
     3. The local-part may contain: upper and lowercase latin letters, digits 0 to 9,
        and the printable characters !#$%&'*+-/=?^_`{|}~
     4. The local-part may also contain a "." in any place that is not the first or
        last character, and may not have more than one "." consecutively.

    Checks for the domain:
     5. The domain may contain: upper and lowercase latin letters and digits 0 to 9
     6. The domain may contain a hyphen "-", provided that it is not the first or
        last character
     7. The domain may also contain a "." in any place that is not the first or
        last character, and may not have more than one "." consecutively.

    >>> for email, valid in email_tests:
    ...     assert is_valid_email_address(email) == valid
    """

    # (1.) Make sure that there is only one @ symbol in the email address
    if email.count("@") != 1:
        return False

    local_part, domain = email.split("@")
    # (2.) Check octet length of the local part and domain
    if len(local_part) > MAX_LOCAL_PART_OCTETS or len(domain) > MAX_DOMAIN_OCTETS:
        return False

    # (3.) Validate the characters in the local-part
    if any(
        char not in string.ascii_letters + string.digits + ".!#$%&'*+-/=?^_`{|}~"
        for char in local_part
    ):
        return False

    # (4.) Validate the placement of "." characters in the local-part
    if local_part.startswith(".") or local_part.endswith(".") or ".." in local_part:
        return False

    # (5.) Validate the characters in the domain
    if any(char not in string.ascii_letters + string.digits + ".-" for char in domain):
        return False

    # (6.) Validate the placement of "-" characters
    if domain.startswith("-") or domain.endswith("-"):
        return False

    # (7.) Validate the placement of "." characters
    if domain.startswith(".") or domain.endswith(".") or ".." in domain:
        return False
    return True


if __name__ == "__main__":
    import doctest

    doctest.testmod()

    for email, valid in email_tests:
        is_valid = is_valid_email_address(email)
        assert is_valid == valid, f"{email} is {is_valid}"
        print(f"Email address {email} is {'not ' if not is_valid else ''}valid")
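
Check 2 in the docstring compares the string length of each part against an octet limit. The two counts agree only for ASCII input. The following is a minimal sketch of the difference, not part of this commit: the helper name `octet_length` is hypothetical, and UTF-8 is an assumed encoding.

```python
def octet_length(text: str) -> int:
    """Return the number of octets in text when encoded as UTF-8 (assumed encoding)."""
    return len(text.encode("utf-8"))


if __name__ == "__main__":
    # ASCII characters occupy one octet each, so both counts agree.
    ascii_local_part = "john.smith"
    print(len(ascii_local_part), octet_length(ascii_local_part))

    # A non-ASCII character such as U+00E9 occupies two octets in UTF-8,
    # so the octet count exceeds the character count.
    non_ascii_local_part = "jos\u00e9"
    print(len(non_ascii_local_part), octet_length(non_ascii_local_part))
```

Because the character checks in the function reject anything outside ASCII, comparing `len()` with the limits gives the same verdict as counting octets for every address the function accepts.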

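
Check 6 looks only at the first and last character of the whole domain. A stricter variant could apply the same hyphen rule to every dot-separated label. The sketch below assumes that stricter reading; the function name `labels_have_valid_hyphens` is hypothetical and does not appear in this commit.

```python
def labels_have_valid_hyphens(domain: str) -> bool:
    """Return True if no dot-separated label of domain starts or ends with a hyphen."""
    return not any(
        label.startswith("-") or label.endswith("-") for label in domain.split(".")
    )


if __name__ == "__main__":
    # Hypothetical sample domains, chosen only to exercise the helper.
    for sample in ("strange-example.com", "-example.com", "and.-sub.example.com"):
        print(sample, labels_have_valid_hyphens(sample))
```

This helper only covers hyphen placement. Empty labels such as the one in "example..com" are still left to the "." placement check.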