Ignore unencoded special characters #130
Hahaha
Well, that's easy... it's because I'm following the RFC 😄 I wrote a parser to handle as much of the RFC as possible. That means whitespace has to act as a delimiter, which is why RFC 2047 specifically prohibits whitespace in the 'encoded-word' part.
I'm not specifically prohibiting whitespace; it's just that the 'delimiting' of a header's components happens before RFC 2047 decoding in most cases (an exception had to be made for 'message-id' so that decoding happens first, because of #109). Doing things this way lets me fully support 'valid' headers, with their comments, weird nested comments, quoted parts, escaped characters, address groups, RFC 2047, RFC 2231, and whatever other weird things get thrown at it. I don't know specifically which of those php-mime-mail-parser does or doesn't support, and it probably doesn't matter: the quirkier bits of the standards are so rarely encountered that it makes little difference for that project. My goals are different, which is why I don't use that project as a gauge myself. It's also why thinking up random scenarios and testing them might not be useful: some standards need to be followed (you still put =?utf-8 in your test... what if the header had =&utf-8 instead? Point being, both are equally invalid) 😝
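To make the "delimit first, decode second" order concrete, here is a minimal Python sketch. It is not this library's PHP code; the names `ENCODED_WORD`, `decode_token` and `decode_header_value` are made up for illustration, and the sketch ignores comments, quoting, RFC 2231 and the rule that whitespace between adjacent encoded-words is dropped. It only shows why an encoded-word containing raw whitespace is never recognized once the header has been split on whitespace.

```python
import base64
import quopri
import re

# RFC 2047 encoded-word: =?charset?encoding?encoded-text?=
# Only printable ASCII without '?' or whitespace is accepted in each part.
ENCODED_WORD = re.compile(r"^=\?([!->@-~]+)\?([bBqQ])\?([!->@-~]*)\?=$")


def decode_token(token: str) -> str:
    """Decode one whitespace-delimited token if it is a well-formed encoded-word."""
    match = ENCODED_WORD.match(token)
    if match is None:
        return token  # not a valid encoded-word: leave it untouched
    charset, encoding, text = match.groups()
    if encoding.lower() == "b":
        raw = base64.b64decode(text)
    else:
        # header=True applies the RFC 2047 "Q" rule that '_' means a space
        raw = quopri.decodestring(text.encode("ascii"), header=True)
    return raw.decode(charset)


def decode_header_value(value: str) -> str:
    # Step 1: delimit on whitespace. Step 2: decode each token on its own.
    # Simplification: tokens are re-joined with a single space.
    return " ".join(decode_token(token) for token in value.split())


valid = decode_header_value("=?utf-8?Q?f=C3=B6=C3=B6_b=C3=A4r?=")
invalid = decode_header_value("=?utf-8?Q?f=C3=B6=C3=B6 b=C3=A4r?=")
```

With the space encoded as `_`, the whole token matches and decodes. With a raw space, the value is split into two tokens, neither of which is a complete encoded-word, so both come back undecoded.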
Here's the more relevant part of RFC 2047:
... and Still wondering why Anyway, you can set this to "wontfix" and close it :-)
Yeah, `quoted_printable_decode` isn't a header-specific function (it could just as well be applied to the body of a MIME part with Content-Transfer-Encoding set to quoted-printable).
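One place where the difference between body-style quoted-printable and the header "Q" encoding shows up is the underscore: RFC 2047 lets `_` stand for a space inside an encoded-word, while in a quoted-printable body it is a literal underscore. A small Python illustration using the standard `quopri` module (again, only a sketch, not what either PHP library does internally):

```python
import quopri

# The encoded-text part of =?utf-8?Q?f=C3=B6=C3=B6_b=C3=A4r?=
encoded = b"f=C3=B6=C3=B6_b=C3=A4r"

# Body-style quoted-printable: '_' stays a literal underscore.
body_style = quopri.decodestring(encoded).decode("utf-8")

# Header-style "Q" decoding: '_' is turned into a space.
header_style = quopri.decodestring(encoded, header=True).decode("utf-8")

print(body_style)
print(header_style)
```

So a generic quoted-printable decoder can look right on simple subjects and still give a different result from a header-aware decoder.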
Continuing my homework ;-)
If I have this in the email (notice the unencoded tab and `ä`):
...
`$message->getHeader('subject')->getValue()` just returns the undecoded string:
However, php-mime-mail-parser returns `föö bär`, since it just throws the string into `quoted_printable_decode()`.
I don't know if leaving some characters unencoded is legal or not; I didn't look in the RFCs.
But my question is: Why are you doing more work (i.e. somehow "validating" the string), instead of just throwing it into `quoted_printable_decode()` and taking whatever it returns? Where is this happening in your code (I couldn't find it)?