Some edge cases in email.utils.parsedate_to_datetime seem to differ from RFC2822 spec
#126845
Bug report
Bug description:
While tinkering around with `email.utils.parsedate_to_datetime`, I found some behavior that may be worth adjusting.

1. low-number years aren't handled according to spec:
expected: either year 1, or a parsing failure. Neither the new nor the old format interprets 4-digit years this way.
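
A minimal sketch for reproducing this. The input string is an illustrative assumption rather than the exact value originally tested, and `try_parse` is a hypothetical helper that only makes the outcome easy to inspect.

```python
from email.utils import parsedate_to_datetime


def try_parse(value):
    """Return the parsed datetime, or the ValueError raised while parsing."""
    try:
        return parsedate_to_datetime(value)
    except ValueError as exc:
        return exc


# Illustrative input: a 4-digit year written with leading zeros.
low_year = try_parse("Sat, 15 Aug 0001 23:12:09 +0500")

# Per the expectation above, this should be year 1 or a ValueError.
print(repr(low_year))
```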
2. offset minutes larger than 59 don't lead to parsing failure
expected: parse failure. Instead, the "90 minutes" component is parsed without issue (0590 being treated as equal to 0630). The spec is not explicit about this, although it does say "A date-time specification MUST be semantically valid". Note that a "90" value as the minute in the time component does give the appropriate parsing failure.
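
The arithmetic behind "0590 being equal to 0630" can be sketched as follows. `offset_minutes` is a hypothetical helper that mimics a conversion without range-checking the minutes. It is not the library's code, and the date string is an illustrative assumption.

```python
from email.utils import parsedate_to_datetime


def offset_minutes(zone):
    """Convert a '+HHMM' / '-HHMM' zone to minutes without range-checking MM."""
    sign = -1 if zone.startswith("-") else 1
    hours, minutes = int(zone[1:3]), int(zone[3:5])
    return sign * (hours * 60 + minutes)


# Without a range check, 5 h + 90 min is indistinguishable from 6 h + 30 min.
print(offset_minutes("+0590"), offset_minutes("+0630"))

try:
    # Illustrative input with a 90-minute offset component.
    print(repr(parsedate_to_datetime("Sat, 15 Aug 2020 23:12:09 +0590")))
except ValueError as exc:
    print("rejected:", exc)
```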
Note: `datetime.fromisoformat()` has the same behavior. Also in this case, I can't determine whether ISO8601 explicitly disallows it. RFC3339 is clear on disallowing this.

3. Invalid day-of-week doesn't lead to parsing failure
expected: parsing failure
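
As a rough sketch of what a stricter check could look like: `day_of_week_matches` and `DAY_NAMES` are hypothetical names, the inputs are illustrative, and the helper assumes the only comma in the value is the one after the day name.

```python
from email.utils import parsedate_to_datetime

# RFC 2822 day names, indexed to match datetime.weekday() (Monday == 0).
DAY_NAMES = ("Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun")


def day_of_week_matches(value):
    """Hypothetical strict check: is the leading day name the right one?"""
    name, sep, rest = value.partition(",")
    if not sep:
        return True  # no day-of-week given; it is optional
    # Parse only the part after the day name, then compare weekdays.
    parsed = parsedate_to_datetime(rest.strip())
    return name.strip().casefold() == DAY_NAMES[parsed.weekday()].casefold()


print(day_of_week_matches("Sat, 15 Aug 2020 23:12:09 +0500"))
print(day_of_week_matches("Mon, 15 Aug 2020 23:12:09 +0500"))
print(day_of_week_matches("Foo, 15 Aug 2020 23:12:09 +0500"))
```

A strict parser could raise `ValueError` whenever such a check returns `False`.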
4. Non-ASCII digits don't lead to parsing failure
If I'm reading the RFC correctly, only ASCII characters are valid.
expected: parsing failure
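
One plausible reason such input gets through is that Python's `int()` accepts any Unicode decimal digit. The sketch below uses an illustrative input with Arabic-Indic digits. `is_ascii_only` is a hypothetical pre-check, not something the library provides.

```python
from email.utils import parsedate_to_datetime

# "\u0661\u0665" is 15 written with Arabic-Indic digits (illustrative input).
value = "Sat, \u0661\u0665 Aug 2020 23:12:09 +0500"

# int() accepts any Unicode decimal digit, not only ASCII 0-9.
print(int("\u0661\u0665"))  # 15


def is_ascii_only(text):
    """Hypothetical pre-check a strict parser could apply before parsing."""
    return text.isascii()


print(is_ascii_only(value))  # False

try:
    print(repr(parsedate_to_datetime(value)))
except ValueError as exc:
    print("rejected:", exc)
```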
5. Handling of the `-0000` case may be inconsistent with the drive to eliminate the practice of "naive UTC" datetimes.

Lately, the `datetime` module appears to discourage the usage of naive datetimes to mean UTC, as evidenced by the deprecation of `utcnow()` and other methods.

However, `parsedate_to_datetime` will return a naive datetime in the `-0000` case.

expected: `tzinfo=UTC`
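
A small sketch for comparing the three zone variants side by side. The date strings are illustrative assumptions, and the printed `tzinfo` values are left for the reader to inspect on their own Python version.

```python
from email.utils import parsedate_to_datetime

# Illustrative inputs that differ only in the zone field.
plus_zero = parsedate_to_datetime("Sat, 15 Aug 2020 23:12:09 +0000")
minus_zero = parsedate_to_datetime("Sat, 15 Aug 2020 23:12:09 -0000")
no_zone = parsedate_to_datetime("Sat, 15 Aug 2020 23:12:09")

print("+0000:", plus_zero.tzinfo)
# The documentation describes a naive datetime for -0000.
print("-0000:", minus_zero.tzinfo)
print("none: ", no_zone.tzinfo)
```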
The spec says:
The spec again is a bit fuzzy, but my reading here is that `-0000` means "UTC, with no offset known". In contrast, `+0000` means "UTC offset known to be 0". My impression would be that only omission of the offset should result in a naive datetime. What do you think?

CPython versions tested on:
3.13
Operating systems tested on:
macOS
edit: typo