Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CVE-2023-27043] gh-102988: Reject malformed addresses in email.parseaddr() #111116

Merged
merged 14 commits into from
Dec 15, 2023

Conversation

vstinner
Copy link
Member

@vstinner vstinner commented Oct 20, 2023

Detect email address parsing errors and return empty tuple to indicate the parsing error (old API). Add an optional 'strict' parameter to getaddresses() and parseaddr() functions. Patch by Thomas Dwyer.


📚 Documentation preview 📚: https://cpython-previews--111116.org.readthedocs.build/

@vstinner
Copy link
Member Author

@gpshead @serhiy-storchaka @bitdancer @warsaw: Would you mind to review this security fix?

See issue gh-102988 for the context.

This PR is a copy of PR #108250 but I added strict=True parameter, so it's possible to get the old behavior. I added tests on both modes, strict=True and strict=False.

@vstinner
Copy link
Member Author

This PR is a copy of PR #108250 but I added strict=True parameter

My colleague Lumir Balhar @frenzymadness ran an impact check of PR #108250 on Fedora: in short, there is no impact, the test suite of all Python packages (in Fedora) pass with the change. While there were some build errors, they were unrelated to the email issue. For details, see https://copr.fedorainfracloud.org/coprs/lbalhar/email-CVE/builds/ COPR which as more than 4300 builds.

Now with an additional strict parameter, if there is any impacted project, at least there is a way to "opt out".

@vstinner
Copy link
Member Author

@tdwyer: Would you mind to review my change, to see if I preserved your work correctly? (code and tests)

@vstinner vstinner added type-security A security issue needs backport to 3.8 needs backport to 3.9 only security fixes needs backport to 3.10 only security fixes needs backport to 3.11 only security fixes needs backport to 3.12 bug and security fixes labels Oct 27, 2023
@vstinner
Copy link
Member Author

I think that we should backport the change to all branches accepting security fixes. Problem: the change refer to version numbers, which as .. versionchanged:: 3.13. I suppose that if the change is backported, we should compute the next version of each branch, so backport manually.

@vstinner
Copy link
Member Author

@ambv @SethMichaelLarson: Would you mind to review this PR?

@ambv ambv changed the title gh-102988: email parseaddr() now rejects malformed address gh-102988: Reject malformed addresses in email.parseaddr() Oct 27, 2023
@ambv
Copy link
Contributor

ambv commented Oct 27, 2023

Why is this a separate PR from #108250?

@@ -165,7 +165,7 @@ email
encountered instead of potentially inaccurate values. Add optional *strict*
parameter to these two functions: use ``strict=False`` to get the old
behavior, accept malformed inputs.
(Contributed by Thomas Dwyer for :gh:`102988` to ameliorate CVE-2023-27043
(Contributed by Thomas Dwyer for :gh:`102988` to improve the CVE-2023-27043
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TIL a new word.

@@ -42,6 +42,8 @@

specialsre = re.compile(r'[][\\()<>@,:;".]')
escapesre = re.compile(r'[\\"]')
realname_comma_re = re.compile(r'"[^"]*,[^"]*"')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
realname_comma_re = re.compile(r'"[^"]*,[^"]*"')
realname_comma_re = re.compile(r'"[^",]*+,[^"]*+"')

It is faster. But I am not sure that the use of such regex is correct.

def _pre_parse_validation(email_header_fields):
accepted_values = []
for v in email_header_fields:
s = v.replace('\\(', '').replace('\\)', '')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But what if that backslash was already escaped with a backslash? For example \\) or \\\\).

Lib/email/utils.py Outdated Show resolved Hide resolved
Lib/email/utils.py Outdated Show resolved Hide resolved
@vstinner
Copy link
Member Author

Why is this a separate PR from #108250?

I'm not the author of the other PR. I copied the other PR and added strict parameter.

@ambv
Copy link
Contributor

ambv commented Oct 27, 2023

I'm not the author of this PR and I was able to make commits to it.

@vstinner
Copy link
Member Author

I'm not the author of this PR and I was able to make commits to it.

I don't feel comfortable to make significant change of a PR without asking the author. I prefer to create a separated PR and ask for review.

@vstinner
Copy link
Member Author

vstinner commented Oct 30, 2023

Is this behavior a bug or a feature? I don't know how ; is supposed to behave.

$ python
Python 3.11.6 (main, Oct  3 2023, 00:00:00) [GCC 13.2.1 20230728 (Red Hat 13.2.1-1)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from email.utils import getaddresses
>>> from pprint import pprint
>>> pprint(getaddresses('<bob@example.org>; <alice@example.org>'))
[('', ''),
 ('', 'b'),
 ('', 'o'),
 ('', 'b'),
 ('', ''),
 ('', 'e'),
 ('', 'x'),
 ('', 'a'),
 ('', 'm'),
 ('', 'p'),
 ('', 'l'),
 ('', 'e'),
 ('', '.'),
 ('', 'o'),
 ('', 'r'),
 ('', 'g'),
 ('', ''),
 ('', ''),
 ('', ''),
 ('', ''),
 ('', 'a'),
 ('', 'l'),
 ('', 'i'),
 ('', 'c'),
 ('', 'e'),
 ('', ''),
 ('', 'e'),
 ('', 'x'),
 ('', 'a'),
 ('', 'm'),
 ('', 'p'),
 ('', 'l'),
 ('', 'e'),
 ('', '.'),
 ('', 'o'),
 ('', 'r'),
 ('', 'g'),
 ('', '')]

@vstinner
Copy link
Member Author

Is this behavior a bug or a feature? I don't know how ; is supposed to behave.

Oh. getaddresses() expects a sequence, not a string :-)

@vstinner
Copy link
Member Author

Except of parsedate_tz() function, Lib/email/_parseaddr.py file didn't evolve much it was added in 2002 by commit 030ddf7. The file was created from Lib/rfc822.py which was added in 1992 (commit 01ca336).

The latest major change was done in... 1997 with commit be7c45e

New address parser by Ben Escoto replaces Sjoerd Mullender's parseaddr()

The latest minor change was done in 2019 to fix CVE-2019-16056: commit 8cb65d1 of issue #78336.

@vstinner
Copy link
Member Author

What if the input is '"Jane Doe" jane@example.net, "John Doe" john@example.net'?

Oh, realname_comma_re replaces "Jane Doe" <jane@example.net>, "John Doe" <john@example.net> with "Jane Doe <john@example.net> which is invalid...

@vstinner
Copy link
Member Author

@vstinner vstinner marked this pull request as draft October 30, 2023 14:20
@miss-islington-app
Copy link

Thanks @vstinner for the PR 🌮🎉.. I'm working now to backport this PR to: 3.9.
🐍🍒⛏🤖

@miss-islington-app
Copy link

Thanks @vstinner for the PR 🌮🎉.. I'm working now to backport this PR to: 3.8.
🐍🍒⛏🤖

@miss-islington-app
Copy link

Sorry, @vstinner, I could not cleanly backport this to 3.12 due to a conflict.
Please backport using cherry_picker on command line.

cherry_picker 4a153a1d3b18803a684cd1bcc2cdf3ede3dbae19 3.12

@miss-islington-app
Copy link

Sorry, @vstinner, I could not cleanly backport this to 3.10 due to a conflict.
Please backport using cherry_picker on command line.

cherry_picker 4a153a1d3b18803a684cd1bcc2cdf3ede3dbae19 3.10

@miss-islington-app
Copy link

Sorry, @vstinner, I could not cleanly backport this to 3.11 due to a conflict.
Please backport using cherry_picker on command line.

cherry_picker 4a153a1d3b18803a684cd1bcc2cdf3ede3dbae19 3.11

@miss-islington-app
Copy link

Sorry, @vstinner, I could not cleanly backport this to 3.9 due to a conflict.
Please backport using cherry_picker on command line.

cherry_picker 4a153a1d3b18803a684cd1bcc2cdf3ede3dbae19 3.9

@miss-islington-app
Copy link

Sorry, @vstinner, I could not cleanly backport this to 3.8 due to a conflict.
Please backport using cherry_picker on command line.

cherry_picker 4a153a1d3b18803a684cd1bcc2cdf3ede3dbae19 3.8

encukou pushed a commit to encukou/cpython that referenced this pull request Sep 6, 2024
…n email.parseaddr() (pythonGH-111116)

Detect email address parsing errors and return empty tuple to
indicate the parsing error (old API). Add an optional 'strict'
parameter to getaddresses() and parseaddr() functions. Patch by
Thomas Dwyer.

Co-Authored-By: Thomas Dwyer <github@tomd.tel>
(cherry picked from commit 4a153a1)
encukou added a commit to encukou/cpython that referenced this pull request Sep 6, 2024
…n email.parseaddr() (pythonGH-111116)

Detect email address parsing errors and return empty tuple to
indicate the parsing error (old API). Add an optional 'strict'
parameter to getaddresses() and parseaddr() functions. Patch by
Thomas Dwyer.

(cherry picked from commit 4a153a1)

Co-authored-by: Victor Stinner <vstinner@python.org>
Co-Authored-By: Thomas Dwyer <github@tomd.tel>
encukou pushed a commit to encukou/cpython that referenced this pull request Sep 6, 2024
…n email.parseaddr() (pythonGH-111116)

Detect email address parsing errors and return empty tuple to
indicate the parsing error (old API). Add an optional 'strict'
parameter to getaddresses() and parseaddr() functions. Patch by
Thomas Dwyer.

Co-Authored-By: Thomas Dwyer <github@tomd.tel>
(cherry picked from commit 4a153a1)
@bedevere-app
Copy link

bedevere-app bot commented Sep 6, 2024

GH-123766 is a backport of this pull request to the 3.12 branch.

@bedevere-app bedevere-app bot removed the needs backport to 3.12 bug and security fixes label Sep 6, 2024
encukou pushed a commit to encukou/cpython that referenced this pull request Sep 6, 2024
…n email.parseaddr() (pythonGH-111116)

Detect email address parsing errors and return empty tuple to
indicate the parsing error (old API). Add an optional 'strict'
parameter to getaddresses() and parseaddr() functions. Patch by
Thomas Dwyer.

(cherry picked from commit 4a153a1)

Co-authored-by: Victor Stinner <vstinner@python.org>
Co-Authored-By: Thomas Dwyer <github@tomd.tel>
@bedevere-app
Copy link

bedevere-app bot commented Sep 6, 2024

GH-123767 is a backport of this pull request to the 3.11 branch.

@bedevere-app bedevere-app bot removed the needs backport to 3.11 only security fixes label Sep 6, 2024
encukou pushed a commit to encukou/cpython that referenced this pull request Sep 6, 2024
…n email.parseaddr() (pythonGH-111116)

Detect email address parsing errors and return empty tuple to
indicate the parsing error (old API). Add an optional 'strict'
parameter to getaddresses() and parseaddr() functions. Patch by
Thomas Dwyer.

(cherry picked from commit 4a153a1)

Co-authored-by: Victor Stinner <vstinner@python.org>
Co-Authored-By: Thomas Dwyer <github@tomd.tel>
@bedevere-app
Copy link

bedevere-app bot commented Sep 6, 2024

GH-123768 is a backport of this pull request to the 3.10 branch.

@bedevere-app bedevere-app bot removed the needs backport to 3.10 only security fixes label Sep 6, 2024
ambv pushed a commit that referenced this pull request Sep 6, 2024
…l.parseaddr() (GH-111116) (#123766)

Detect email address parsing errors and return empty tuple to
indicate the parsing error (old API). Add an optional 'strict'
parameter to getaddresses() and parseaddr() functions. Patch by
Thomas Dwyer.

Co-Authored-By: Thomas Dwyer <github@tomd.tel>
(cherry picked from commit 4a153a1)

Co-authored-by: Victor Stinner <vstinner@python.org>
ambv pushed a commit that referenced this pull request Sep 6, 2024
…l.parseaddr() (GH-111116) (#123767)

Detect email address parsing errors and return empty tuple to
indicate the parsing error (old API). Add an optional 'strict'
parameter to getaddresses() and parseaddr() functions. Patch by
Thomas Dwyer.

(cherry picked from commit 4a153a1)

Co-authored-by: Victor Stinner <vstinner@python.org>
Co-authored-by: Thomas Dwyer <github@tomd.tel>
encukou pushed a commit to encukou/cpython that referenced this pull request Sep 6, 2024
… email.parseaddr() (pythonGH-111116)

Detect email address parsing errors and return empty tuple to
indicate the parsing error (old API). Add an optional 'strict'
parameter to getaddresses() and parseaddr() functions. Patch by
Thomas Dwyer.

(cherry picked from commit 4a153a1)

Co-authored-by: Victor Stinner <vstinner@python.org>
Co-Authored-By: Thomas Dwyer <github@tomd.tel>
@bedevere-app
Copy link

bedevere-app bot commented Sep 6, 2024

GH-123769 is a backport of this pull request to the 3.9 branch.

@bedevere-app bedevere-app bot removed the needs backport to 3.9 only security fixes label Sep 6, 2024
encukou pushed a commit to encukou/cpython that referenced this pull request Sep 6, 2024
… email.parseaddr() (pythonGH-111116)

Detect email address parsing errors and return empty tuple to
indicate the parsing error (old API). Add an optional 'strict'
parameter to getaddresses() and parseaddr() functions. Patch by
Thomas Dwyer.

(cherry picked from commit 4a153a1)

Co-authored-by: Victor Stinner <vstinner@python.org>
Co-Authored-By: Thomas Dwyer <github@tomd.tel>
@bedevere-app
Copy link

bedevere-app bot commented Sep 6, 2024

GH-123770 is a backport of this pull request to the 3.8 branch.

ambv pushed a commit that referenced this pull request Sep 6, 2024
….parseaddr() (GH-111116) (#123769)

Detect email address parsing errors and return empty tuple to
indicate the parsing error (old API). Add an optional 'strict'
parameter to getaddresses() and parseaddr() functions. Patch by
Thomas Dwyer.

(cherry picked from commit 4a153a1)

Co-authored-by: Victor Stinner <vstinner@python.org>
Co-Authored-By: Thomas Dwyer <github@tomd.tel>
ambv pushed a commit that referenced this pull request Sep 6, 2024
….parseaddr() (GH-111116) (#123770)

Detect email address parsing errors and return empty tuple to
indicate the parsing error (old API). Add an optional 'strict'
parameter to getaddresses() and parseaddr() functions. Patch by
Thomas Dwyer.

(cherry picked from commit 4a153a1)

Co-authored-by: Victor Stinner <vstinner@python.org>
Co-Authored-By: Thomas Dwyer <github@tomd.tel>
ambv pushed a commit that referenced this pull request Sep 6, 2024
…l.parseaddr() (GH-111116) (#123768)

Detect email address parsing errors and return empty tuple to
indicate the parsing error (old API). Add an optional 'strict'
parameter to getaddresses() and parseaddr() functions. Patch by
Thomas Dwyer.

(cherry picked from commit 4a153a1)

Co-authored-by: Victor Stinner <vstinner@python.org>
Co-Authored-By: Thomas Dwyer <github@tomd.tel>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type-bug An unexpected behavior, bug, or error type-security A security issue
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants