Feature/improve email extract #373
Does this work with I18n domain names? For example (from https://shkspr.mobi/blog/2016/09/why-cant-you-send-email-to-a-chinese-address/) |
It does not appear to. Nor does it appear to work with any non-latin characters anywhere in the email address. E.g:
This behaviour is actually permitted, though: RFC 5322 specifies only Latin (ASCII) characters in email addresses. Non-Latin characters are usually converted into something that looks like:
Behind the scenes of whatever client you are using. Despite such characters not being allowed in some |
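The conversion described in this comment is Punycode, which IDNA applies to the domain part of an address. A minimal sketch using Python's built-in `idna` codec (IDNA 2003) is below. The domain `bücher.example` is an illustrative value that does not come from this thread, and the helper name `to_ascii_domain` is hypothetical.

```python
def to_ascii_domain(address: str) -> str:
    """Return the address with its domain part converted to ASCII (Punycode)."""
    # Split on the last "@" so that only the domain part is encoded.
    local, _, domain = address.rpartition("@")
    # The stdlib "idna" codec encodes each label, adding the "xn--" prefix
    # to labels that contain non-ASCII characters.
    ascii_domain = domain.encode("idna").decode("ascii")
    return f"{local}@{ascii_domain}"


# Illustrative input, not taken from the pull request.
print(to_ascii_domain("info@bücher.example"))  # info@xn--bcher-kva.example
```

Only the domain is converted here. A non-ASCII local part (the text before the `@`) has no Punycode form and would need an SMTPUTF8-capable mail system instead.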
Good point all, I didn't consider those cases. Let's leave this pull request open and I'll make some additional commits. |
So, I've learnt this week that email addresses are more complicated than I originally thought.
|
Excellent, thanks very much for this. I haven't updated the built-in email regex in the 'Regular Expression' operation, as I feel that version should be a bit more quick and dirty. It's useful to have a fairly accurate one for 'Extract email addresses' though. |
To resolve #365
Used email regex from https://www.regular-expressions.info/email.html
Added some basic tests.
The following email addresses are valid:
email@example.com
firstname.lastname@example.com
email@subdomain.example.com
firstname+lastname@example.com
1234567890@example.com
email@example-one.com
_______@example.com
email@example.name
email@example.museum
email@example.co.jp
firstname-lastname@example.com
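As a rough check of the list above, here is a sketch in Python. The pattern is a simple one in the style of those discussed on regular-expressions.info. The exact pattern committed in this pull request is not shown in this thread, so `EMAIL_RE` and `extract_emails` are illustrative names and not the operation's actual code.

```python
import re
from typing import List

# Assumption: a simple pattern in the style of regular-expressions.info,
# not necessarily the one merged in this pull request.
EMAIL_RE = re.compile(
    r"\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,}\b",
    re.IGNORECASE,
)


def extract_emails(text: str) -> List[str]:
    """Return every substring of `text` that matches EMAIL_RE, in order."""
    return EMAIL_RE.findall(text)


# The addresses listed as valid in the pull request description.
VALID = [
    "email@example.com",
    "firstname.lastname@example.com",
    "email@subdomain.example.com",
    "firstname+lastname@example.com",
    "1234567890@example.com",
    "email@example-one.com",
    "_______@example.com",
    "email@example.name",
    "email@example.museum",
    "email@example.co.jp",
    "firstname-lastname@example.com",
]

found = extract_emails(" ".join(VALID))
missed = [address for address in VALID if address not in found]
print(missed)  # []
```

The character classes only cover ASCII letters, digits and a few symbols, so addresses written in non-Latin scripts are not extracted. That matches the behaviour reported earlier in the conversation.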