email-regex-safe

Regular expression matching for email addresses. Maintained, configurable, more accurate, and browser-friendly alternative to email-regex. Works in Node v14+ and browsers. Maintained for Spam Scanner and Forward Email.

Foreword

We previously used email-regex in our work on Spam Scanner and Forward Email. However, that package has too many issues and false positives.

This package aims to more closely resemble the real-world intended usage of an email regular expression, and also lets you configure several Options. If this package helped you, please check out Forward Email and explore our source code on GitHub, which shows how we use this package.

It does not perform strict email validation; instead, it finds complete matches that resemble an email address. We recommend using validator.isEmail for validation (e.g. validator.isEmail(match)).

Install

NOTE: By default, this package attempts to load re2 (an optional peer dependency used to prevent regular expression denial of service attacks and more). To use this behavior, you must have re2 installed via npm install re2; otherwise the package falls back to normal RegExp instances. As of v4.0.0, you can force this package to not even attempt to load re2 (e.g. it's in your node_modules but you don't want to use it) by passing re2: false as an option.

npm:

npm install email-regex-safe
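
The optional re2 behavior described in the note above follows a common "try to load, otherwise fall back" pattern. The sketch below illustrates that pattern only; it is not the package's actual loader, and the helper name loadRegexEngine and the sample pattern are assumptions made for this example.

```javascript
// Hypothetical helper for illustration; not the package's actual loader.
// Returns the RE2 constructor when it can be loaded, otherwise the built-in RegExp.
function loadRegexEngine(useRe2 = true) {
  // Mirrors the idea of `re2: false`: do not even attempt to load re2.
  if (!useRe2) return RegExp;
  try {
    const RE2 = require('re2');
    return typeof RE2 === 'function' ? RE2 : RegExp;
  } catch {
    // re2 is not installed (or cannot be loaded here), so fall back to RegExp.
    return RegExp;
  }
}

// With useRe2 = false this is always the built-in RegExp.
const Engine = loadRegexEngine(false);
// Simplified stand-in pattern, far looser than the one this package builds.
const regex = new Engine('[a-z]+@[a-z]+\\.com', 'g');
console.log('foo@bar.com and baz@qux.com'.match(regex));
// [ 'foo@bar.com', 'baz@qux.com' ]
```

In practice you would not write this yourself; passing re2: false to emailRegexSafe() is the documented way to skip re2.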

Usage

Node

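const emailRegexSafe = require('email-regex-safe');

const str = 'some long string with foo@bar.com in it';
// String#match returns null when nothing matches, so fall back to an empty array
const matches = str.match(emailRegexSafe()) || [];

for (const match of matches) {
  console.log('match', match);
}

// exact: true checks whether the whole string is an email address
console.log(emailRegexSafe({ exact: true }).test('hello@example.com'));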

Browser

Since RE2 is not made for the browser, it will not be used. If there were to be any regex vulnerabilities, they would only crash the user's browser tab, and not your server (as they would on the Node.js side without the use of RE2).

VanillaJS

This is the solution for you if you're just using <script> tags everywhere!

<script src="https://unpkg.com/email-regex-safe"></script>
<script type="text/javascript">
  (function() {
    var str = 'some long string with foo@bar.com in it';
    // String#match returns null when nothing matches, so fall back to an empty array
    var matches = str.match(emailRegexSafe()) || [];

    for (var i = 0; i < matches.length; i++) {
      console.log('match', matches[i]);
    }

    console.log(emailRegexSafe({ exact: true }).test('hello@example.com'));
  })();
</script>

Bundler

Assuming you are using browserify, webpack, rollup, or another bundler, you can simply follow Node usage above.

Options

| Property | Type | Default Value | Description |
| --- | --- | --- | --- |
| re2 | Boolean | true | Attempt to load re2 and use it instead of RegExp when creating regular expression instances. If you pass re2: false, the package will not attempt to load re2 at all. |
| exact | Boolean | false | Only match an exact String. Useful with regex.test(str) to check whether a String is an email address. This defaults to false because the most common use case is extracting emails from text rather than checking strict validity; we feel this more closely resembles real-world intended usage of this package. |
| strict | Boolean | false | If true, allow any TLD of at least 2 valid characters. If false, match the TLD against the list of valid TLDs from tlds. |
| gmail | Boolean | true | Whether or not to abide by Gmail's rules for email usernames (see Gmail's Create a username article for more insight). Because RE2 supports neither negative lookahead nor negative lookbehind, it is up to you to filter out a few invalid matches when using gmail: true: those that end with a "." (period) or "+" (plus symbol), or that contain two or more consecutive periods ("..") anywhere in the username portion. We recommend using str.match(emailRegexSafe()) to get an Array of all matches, stripping any trailing period(s) and/or plus symbol(s), discarding matches with repeated periods, and keeping those that pass validator.isEmail. |
| utf8 | Boolean | true | Whether or not to allow UTF-8 characters in email usernames. Only applies when the gmail option is false. |
| localhost | Boolean | true | Allow localhost in the hostname portion. See test/test.js for more insight into the localhost test and how it may return an unwanted value. A pull request would be considered to resolve the "pic.jp" vs. "pic.jpg" issue. |
| ipv4 | Boolean | true | Match against IPv4 addresses. |
| ipv6 | Boolean | false | Match against IPv6 addresses. Defaults to false, since IPv6 is not really supported anywhere for email addresses and is not even included in validator.isEmail's logic. |
| tlds | Array | tlds | Match against a specific list of TLDs, or the default list provided by tlds. |
| returnString | Boolean | false | Return the regular expression as a String instead of a RegExp (useful for custom logic, as we did with Spam Scanner). |
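
The post-filtering recommended for gmail: true can be sketched as below. This is one possible reading of that advice, not code from the package: the function name filterGmailMatches is hypothetical, the validator is passed in as a parameter (in real use it would be validator.isEmail), and the sketch assumes the trailing "." or "+" sits at the end of the username portion.

```javascript
// Hypothetical post-filter for matches produced with `gmail: true`.
// `matches` is the result of str.match(...) (may be null);
// `isEmail` is any validator function, e.g. validator.isEmail.
function filterGmailMatches(matches, isEmail) {
  const results = [];
  for (const match of matches || []) {
    const at = match.lastIndexOf('@');
    if (at < 1) continue; // no username portion at all

    // Strip trailing period(s) and plus symbol(s) from the username portion.
    const username = match.slice(0, at).replace(/[.+]+$/, '');
    const domain = match.slice(at + 1);

    // Discard empty usernames and usernames with consecutive periods.
    if (username === '' || username.includes('..')) continue;

    const candidate = `${username}@${domain}`;
    if (isEmail(candidate)) results.push(candidate);
  }
  return results;
}
```

Assuming both packages are installed, a call would look like filterGmailMatches(str.match(emailRegexSafe()), validator.isEmail). Passing the validator as an argument keeps the helper independent of how validator is loaded.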

How to validate an email address

To validate the email addresses you find, use the validator.isEmail method. It further enforces the email RFC specification limits of 64 characters for the username/local part of the email address, 254 for the domain/hostname portion, and 255 in total, including the "@" (at symbol).

Limitations

This limitation applies only if you are using re2. Because RE2 does not support negative lookbehinds, we could not merge the logic from this pull request. That logic would have matched example.jp only when the input really was example.jp; as it stands, if you pass example.jpeg, the package extracts example.jp from it (since .jp is a TLD). An alternative solution may exist, and we welcome community contributions regarding this issue.
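
One possible workaround, offered as an untested idea and not as part of this package, is to inspect the character that follows each match and drop matches that are immediately followed by a letter or digit. In the sketch below the function name matchesNotFollowedByWordChar is hypothetical, the regex is passed in and must be global (in real use it would come from emailRegexSafe()), and only ASCII letters and digits are treated as continuing the word.

```javascript
// Hypothetical post-filter: keep only matches that are not immediately
// followed by an ASCII letter or digit in the source string.
function matchesNotFollowedByWordChar(str, regex) {
  if (!regex.global) {
    throw new TypeError('A global regular expression is required');
  }
  const kept = [];
  regex.lastIndex = 0; // start scanning from the beginning
  let m;
  while ((m = regex.exec(str)) !== null) {
    if (m[0] === '') {
      regex.lastIndex += 1; // guard against zero-length matches looping forever
      continue;
    }
    const next = str[m.index + m[0].length];
    // Keep the match at end of input or before a non-alphanumeric character.
    if (next === undefined || !/[a-z0-9]/i.test(next)) kept.push(m[0]);
  }
  return kept;
}

// Simplified stand-in pattern, much looser than the one this package builds.
const standIn = /[a-z]+@[a-z]+\.(?:jp|com)/g;
console.log(
  matchesNotFollowedByWordChar('a@example.jpeg and b@example.jp, c@example.com.', standIn)
);
// [ 'b@example.jp', 'c@example.com' ]
```

Note that this check also rejects legitimate addresses that happen to be directly followed by a letter or digit, so whether it suits your input is a judgment call.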

Contributors

| Name | Website |
| --- | --- |
| Forward Email | https://forwardemail.net |

License

MIT © Forward Email