regex-repo

coverage: 100% Unit tests

A collection of useful regular expressions. Refer to the regex reference below for a list of the provided REs. Supports both CJS and ESM packages.

Installation

npm i @liquid-labs/regex-repo

pnpm and yarn also work.

Usage

import { emailRe } from '@liquid-labs/regex-repo' // ES6
// const { emailRe } = require('@liquid-labs/regex-repo') // cjs

const verified = emailRe.test(userInput)

Regex reference

Each regular expression listed below is paired with an embeddable string named xxxReString; e.g., rgbRe is paired with rgbReString (special cases are noted). Each Re matches only strings that are of the given type and nothing else; i.e., the RE begins with '^' and ends with '$'. The xxxReString can be used for partial matches, matchAll calls, and as part of larger expressions. E.g., to find all unique CSS RGB colors used in a style sheet, you might do something like:

import { rgbReString } from '@liquid-labs/regex-repo'

// matchAll returns an iterator, so convert it to an array before filtering
const allColors = Array.from(
  cssContent.matchAll(new RegExp(`[ :](${rgbReString})[; }]`, 'g')),
  (match) => match[1] // extract the capture group
)
  .filter((v, i, arr) => i === arr.indexOf(v)) // filter non-unique items
  .sort()
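
The relationship between an anchored Re and its ReString can be illustrated with a small sketch. The `hexString` pattern below is a simplified, hypothetical stand-in, not one of the library's exported strings; only the anchoring idea is taken from the description above.

```javascript
// Hypothetical stand-in for an exported xxxReString (NOT the library's pattern).
const hexString = '#(?:[0-9a-fA-F]{6}|[0-9a-fA-F]{3})'

// The paired Re is assumed to be the same string wrapped in '^' and '$'.
const hexRe = new RegExp(`^${hexString}$`)

// The anchored Re only accepts a string that is a color and nothing else.
const exact = hexRe.test('#ff0000') // true
const embedded = hexRe.test('color: #ff0000;') // false

// The unanchored string can be embedded in a larger expression.
const declRe = new RegExp(`color:\\s*(${hexString})`)
const found = 'a { color: #0af; }'.match(declRe)?.[1] // '#0af'
```

With the real package, the stand-in would be replaced by the imported string, e.g. rgbReString.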

API generated with dmd-readme-api.

  • Constants:
    • AWS
    • Contacts
      • emailRe: Match most valid emails.
      • usPhoneRe: Matches US phone numbers with optional country code and area code.
      • zipCodeRe: Matches 5 or 9 digit US zip codes.
    • CSS
      • cssColor3Re: Matches CSS3 hex, rgb, rgba, hsl, and predefined colors.
      • cssColorRe: Matches CSS4 hex, rgb, rgba, hsl, and predefined colors.
      • cssPreColors1Re: Matches CSS1 predefined color names.
      • cssPreColors2Re: Matches CSS2 predefined color names.
      • cssPreColors3Re: Matches CSS3 predefined color names.
      • cssPreColorsRe: Matches CSS4 predefined color names.
      • hexColorAlphaRe: Matches hex specified RGBA colors with an alpha channel.
      • hexColorNoAlphaRe: Matches hex specified RGB colors with no alpha channel.
      • hsl3Re: Matches CSS3 hsl(...) and hsla(...) deg and percent notation.
      • hslRe: Matches CSS4 hsl(...) and hsla(...) deg, grad, rad, turn and percent notation.
      • rgbaFuncRe: Matches CSS3 rgba(...) using 0...255 and percent (integer) notation.
      • rgbFuncRe: Matches CSS1 rgb(...) using 0...255 and percent (integer) notation.
      • rgbRe: Matches CSS4 rgb(...) and rgba(...) functions using 0...255 and percent (float) notation.
    • CSS numbers
      • zeroTo100FloatPercentRe: Matches a 0 to 100% float as used in CSS color specifications.
      • zeroTo100PercentRe: Matches a 0 to 100% integer as used in CSS color specifications.
      • zeroTo1FloatRe: Matches a 0 to 1 float as used in CSS color specifications.
      • zeroTo255FloatRe: Matches a 0 to 255 float as used in CSS color specifications.
      • zeroTo255Re: Matches a 0 to 255 integer as used in CSS color specifications.
      • zeroTo360FloatRe: Matches a 0 to 360 float as used in CSS color specifications.
      • zeroTo360Re: Matches a 0 to 360 integer as used in CSS color specifications.
    • Date time
      • intlDateRe: Matches an international style 'YYYY/MM/DD' string.
      • iso8601DateRe: Matches an ISO 8601 date time like '20240101T1212Z'.
      • iso8601DateReString: An RE ready string that matches an ISO 8601 date time.
      • iso8601DateTimeRe: Matches an ISO 8601 string requiring both date and time components.
      • iso8601DayRe: Matches the day designation portion of an ISO 8601 date+time.
      • iso8601DayReString: An RE ready string that matches the day designation portion of an ISO 8601 date+time.
      • iso8601TimeRe: Matches the time designation portion of an ISO 8601 date+time.
      • militaryTimeRe: Matches military time style 'HHMM' string.
      • rfc2822DateRe: Matches an RFC 2822 style date like 'Mon, 6 Jan 1992 12:12 UTC'.
      • rfc2822DayRe: Matches the day designation portion of an RFC 2822 date+time.
      • rfc2822TimeRe: Matches the time designation portion of an RFC 2822 date+time.
      • timeRe: Matches a twelve hour time designation, requires AM or PM designation.
      • timezoneRe: Matches a general timezone designation; compliant with RFC 2822 timezone portion.
      • twentyFourHourTimeRe: Matches a twenty-four hour time designation. Allows optional leading 0 in hour.
      • usDateRe: Matches a US style 'MM/DD/YYYY' string.
    • Domain names
      • domainLabelRe: Matches a non-tld domain label.
      • fqDomainNameRe: Matches fully qualified domain name (one or more subdomains + TLD).
      • localhostRe: Matches any representation of localhost; the special name, IPV4 loopbacks, or IPV6 loopbacks.
      • tldNameRe: Matches a Top Level Domain (TLD).
    • Identifiers
      • einRe: Matches a valid EIN number.
      • ssnRe: Matches a valid SSN.
      • uuidRe: Matches a UUID.
    • Javascript
    • Network
      • ipAddressRe: Matches a string in IP address format.
      • ipHostRe: Matches a valid, non-localhost IP address.
      • ipV6Re: Matches a string in IPV6 format.
      • ipVFutureRe: Matches potential future IP protocols.
    • NPM
    • Numbers
    • semver
      • semver2RangeRe: Matches a semantic versioning range specification.
      • semver2Re: Matches a semantic version string according to the Semantic Versioning 2.0.0 specification.
    • testCaptureGroups: Tests that a regular expression correctly extracts capture groups from input strings.
    • URL
      • commonUrlRe: Matches any of the "common" web URL types: 'mailto', 'http/https', 'ftp', and 'file'.
      • fileUrlRe: Matches a valid 'file' URL.
      • ftpUrlRe: Matches a valid 'ftp' URL.
      • httpUrlRe: Matches a valid 'http/https' URL.
      • mailtoUrlRe: Matches a valid 'mailto:' URL.
      • urlRe: Matches a valid, generic URL.

awsS3BucketNameRe source code AWS index | global index

Matches (most) valid S3 bucket names. Does not enforce the 63 character limit. Because it checks for invalid S3 bucket names, awsS3BucketNameReString embeds '^' and '$' and so cannot be used for partial matches.

awsS3BucketNameReString source code AWS index | global index

An RE ready string that matches (most) valid S3 bucket names. When using this partial, you should verify the results do not match invalidS3PartialsReString. Because of the way the RE is constructed, this is one case where the partial string is not the same as that used to construct the awsS3BucketNameRe RE.
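
The two-step check described above (match candidates with the partial, then reject excluded names) might look like the following sketch. Both patterns are simplified, hypothetical stand-ins written for illustration; in real code they would be replaced by the library's exported strings, whose exact contents are not reproduced here.

```javascript
// Hypothetical stand-ins (NOT the library's actual strings).
// Candidate: lowercase letters, digits, dots and hyphens, 3+ characters,
// starting and ending with a letter or digit.
const bucketNameString = '[a-z0-9][a-z0-9.-]+[a-z0-9]'
// Exclusions: two reserved forms, used here only as examples.
const invalidNameString = '(?:xn--.*|.*-s3alias)'

function findBucketNames (text) {
  const candidateRe = new RegExp(`s3://(${bucketNameString})`, 'g')
  const invalidRe = new RegExp(`^${invalidNameString}$`)
  return Array.from(text.matchAll(candidateRe), (m) => m[1])
    .filter((name) => !invalidRe.test(name)) // second pass: drop excluded names
}

const names = findBucketNames(
  's3://my-data/a.txt s3://xn--bad/b.txt s3://logs-s3alias/c'
)
// names contains only 'my-data'
```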

awsS3TaBucketNameRe source code AWS index | global index

Matches (most) S3 Transfer Acceleration compatible bucket names. Note awsS3TaBucketNameReString cannot be used for partial matches.

awsS3TaBucketNameReString source code AWS index | global index

An RE ready string that matches (most) valid S3 Transfer Acceleration compatible bucket names. When using this partial, you should verify the results do not match invalidS3TaBucketNameReString. Because of the way the RE is constructed, this is one case where the partial string is not the same as that used to construct the awsS3TaBucketNameRe RE.

invalidS3TaBucketNameReString source code AWS index | global index

An RE ready string that matches excluded S3 Transfer Acceleration compatible bucket names that would be matched by awsS3TaBucketNameReString. Because of the way the RE is constructed, this is one case where the partial string is not the same as that used to construct the invalidS3TaBucketNameRe RE.

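emailRe source code Contacts index | global index

Matches most valid emails. Provides matching groups 1 (user name) and 2 (domain). When using the partial string to create a Re, you must use the 'u' flag.

usPhoneRe source code Contacts index | global index

Matches US phone numbers with optional country code and area code.

zipCodeRe source code Contacts index | global index

Matches 5 or 9 digit US zip codes.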

cssColor3Re source code CSS index | global index

Matches CSS3 hex, rgb, rgba, hsl, and predefined colors.

cssColorRe source code CSS index | global index

Matches CSS4 hex, rgb, rgba, hsl, and predefined colors.

cssPreColors1Re source code CSS index | global index

Matches CSS1 predefined color names.

cssPreColors2Re source code CSS index | global index

Matches CSS2 predefined color names.

cssPreColors3Re source code CSS index | global index

Matches CSS3 predefined color names.

cssPreColorsRe source code CSS index | global index

Matches CSS4 predefined color names.

hexColorAlphaRe source code CSS index | global index

Matches hex specified RGBA colors with an alpha channel.

hexColorNoAlphaRe source code CSS index | global index

Matches hex specified RGB colors with no alpha channel.

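hsl3Re source code CSS index | global index

Matches CSS3 hsl(...) and hsla(...) deg and percent notation.

hslRe source code CSS index | global index

Matches CSS4 hsl(...) and hsla(...) deg, grad, rad, turn and percent notation.

rgbaFuncRe source code CSS index | global index

Matches CSS3 rgba(...) using 0...255 and percent (integer) notation.

rgbFuncRe source code CSS index | global index

Matches CSS1 rgb(...) using 0...255 and percent (integer) notation.

rgbRe source code CSS index | global index

Matches CSS4 rgb(...) and rgba(...) functions using 0...255 and percent (float) notation.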

zeroTo100FloatPercentRe source code CSS numbers index | global index

Matches a 0 to 100% float as used in CSS color specifications.

zeroTo100PercentRe source code CSS numbers index | global index

Matches a 0 to 100% integer as used in CSS color specifications.

zeroTo1FloatRe source code CSS numbers index | global index

Matches a 0 to 1 float as used in CSS color specifications.

zeroTo255FloatRe source code CSS numbers index | global index

Matches a 0 to 255 float as used in CSS color specifications.

zeroTo255Re source code CSS numbers index | global index

Matches a 0 to 255 integer as used in CSS color specifications.

zeroTo360FloatRe source code CSS numbers index | global index

Matches a 0 to 360 float as used in CSS color specifications.

zeroTo360Re source code CSS numbers index | global index

Matches a 0 to 360 integer as used in CSS color specifications.

intlDateRe source code Date time index | global index

Matches an international style 'YYYY/MM/DD' string. Accepts separators '.', '/', '-'. Will accept 1 or 2 digits for month and day and 1-4 digits for the year. Also accepts a + or - before the year. Provides capture groups:

  • Group 1: BCE/CE indicator
  • Group 2: year
  • Group 3: month
  • Group 4: day

iso8601DateRe source code Date time index | global index

Matches an ISO 8601 date time like '20240101T1212Z'. Provides matching groups:

  • Group 1: year
  • Group 3: month
  • Group 4: day of month
  • Group 5: week of year
  • Group 6: day of week date
  • Group 7: ordinal or Julian date
  • Group 8: special end of day time
  • Group 10: hour
  • Group 11: decimal fraction of hour
  • Group 13: minute
  • Group 14: decimal fraction of minute
  • Group 15: seconds
  • Group 16: decimal fraction of a second
  • Group 17: timezone designation

(Groups 2, 9, and 12 are internal back references.)

iso8601DateReString source code Date time index | global index

An RE ready string that matches an ISO 8601 date time. Provides matching groups:

  • Group 1: year
  • Group 3: month
  • Group 4: day of month
  • Group 5: week of year
  • Group 6: day of week date
  • Group 7: ordinal or Julian date
  • Group 8: special end of day time
  • Group 10: hour
  • Group 11: decimal fraction of hour
  • Group 13: minute
  • Group 14: decimal fraction of minute
  • Group 15: seconds
  • Group 16: decimal fraction of a second
  • Group 17: timezone designation

iso8601DateTimeRe source code Date time index | global index

Matches an ISO 8601 string requiring both date and time components. See iso8601DateRe for matching groups.

iso8601DayRe source code Date time index | global index

Matches the day designation portion of an ISO 8601 date+time. Provides matching groups:

  • Group 1: year
  • Group 3: month
  • Group 4: day of month
  • Group 5: week of year
  • Group 6: day of week date
  • Group 7: ordinal or Julian date

iso8601DayReString source code Date time index | global index

An RE ready string that matches the day designation portion of an ISO 8601 date+time. Provides matching groups:

  • Group 1: year
  • Group 3: month
  • Group 4: day of month
  • Group 5: week of year
  • Group 6: day of week date
  • Group 7: ordinal or Julian date

iso8601TimeRe source code Date time index | global index

Matches the time designation portion of an ISO 8601 date+time. Provides matching groups:

  • Group 1: special end of day time
  • Group 3: hours
  • Group 4: fraction of hour
  • Group 6: minutes
  • Group 7: fraction of minute
  • Group 8: seconds
  • Group 9: fraction of seconds
  • Group 10: timezone

(Groups 2 and 5 are internal backreferences for separator consistency)
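
The gaps in the group numbering come from capture groups that exist only to be back-referenced. A minimal illustration of the technique, using a deliberately simplified pattern that is an assumption and not the library's actual expression:

```javascript
// Simplified illustration (NOT the library's pattern): group 2 captures the
// optional ':' separator and '\2' requires the same separator again, so
// 'HH:MM:SS' and 'HHMMSS' match but mixed forms do not.
const simpleTimeRe = /^(\d{2})(:?)(\d{2})\2(\d{2})$/

const extended = simpleTimeRe.exec('12:30:45')
const basic = simpleTimeRe.exec('123045')
const mixed = simpleTimeRe.exec('12:3045') // null

// Group 2 holds the separator, so the useful values are in groups 1, 3 and 4.
const [hours, minutes, seconds] = [extended[1], extended[3], extended[4]]
```

This is why the group lists in this reference skip some numbers: the skipped groups hold separators rather than date or time values.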

militaryTimeRe source code Date time index | global index

Matches military time style 'HHMM' string. Provides capture groups:

  • Group 1: special 2400 time
  • Group 2: hour
  • Group 3: minutes

rfc2822DateRe source code Date time index | global index

Matches an RFC 2822 style date like 'Mon, 6 Jan 1992 12:12 UTC'. Provides matching groups:

  • Group 1: day of week
  • Group 2: day of month
  • Group 3: month name
  • Group 4: year
  • Group 5: hour
  • Group 6: min
  • Group 7: second
  • Group 8: time zone

rfc2822DayRe source code Date time index | global index

Matches the day designation portion of an RFC 2822 date+time. Provides matching groups:

  • Group 1: day of week name
  • Group 2: day of month
  • Group 3: month name
  • Group 4: year

rfc2822TimeRe source code Date time index | global index

Matches the time designation portion of an RFC 2822 date+time. Provides matching groups:

  • Group 1: hour
  • Group 2: minutes
  • Group 3: seconds
  • Group 4: timezone

timeRe source code Date time index | global index

Matches a twelve hour time designation, requires AM or PM designation. Allows optional leading 0 in hour. Provides matching groups:

  • Group 1: hour
  • Group 2: minutes
  • Group 3: seconds, without decimal fractions
  • Group 4: decimal fraction seconds
  • Group 5: AM/PM indicator

timezoneRe source code Date time index | global index

Matches a general timezone designation; compliant with RFC 2822 timezone portion. Provides matching groups:

  • Group 1: timezone

twentyFourHourTimeRe source code Date time index | global index

Matches a twenty-four hour time designation. Allows optional leading 0 in hour. Provides matching groups:

  • Group 1: special 24:00 designation with optional seconds
  • Group 2: hour
  • Group 3: minutes
  • Group 4: seconds, without decimal fractions
  • Group 5: decimal fraction seconds

usDateRe source code Date time index | global index

Matches a US style 'MM/DD/YYYY' string. Accepts separators '.', '/', '-'. Will accept 1 or 2 digits for month and day and 1-4 digits for the year. Also accepts a + or - before the year. Provides capture groups:

  • Group 1: month
  • Group 2: day of month
  • Group 3: BCE/CE indicator
  • Group 4: year

domainLabelRe source code Domain names index | global index

Matches a non-tld domain label. Enforces the 63 byte domain label limit for non-international (all ASCII) labels. See domain name rules. When using the partial string to create a Re, you must use the 'u' or 'v' flag.
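
One plausible reason for the flag requirement is that the strings rely on Unicode-aware syntax, which JavaScript only interprets in 'u' or 'v' mode. The sketch below uses a hypothetical stand-in string, not the library's own partial string, to show how the same source behaves with and without the flag:

```javascript
// Hypothetical stand-in (NOT the library's string): one or more Unicode letters.
const lettersString = '\\p{L}+'

const withFlag = new RegExp(`^${lettersString}$`, 'u')
const withoutFlag = new RegExp(`^${lettersString}$`)

// With 'u', '\p{L}' means "any letter", including non-ASCII letters.
const unicodeAware = withFlag.test('m\u00fcnchen') // true

// Without 'u', the same source is read as the literal characters 'p{L}'.
const legacy = withoutFlag.test('m\u00fcnchen') // false
const literal = withoutFlag.test('p{L}') // true
```

Omitting the flag does not raise an error here; the expression silently matches something else, which is why the requirement is worth checking when composing larger expressions.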

fqDomainNameRe source code Domain names index | global index

Matches fully qualified domain name (one or more subdomains + TLD). Partially enforces the 255 byte FQ domain name limit, but this is only valid for non-international (all ASCII) domain names because we can only count characters. When using the partial string to create a Re, you must use the 'u' or 'v' flag.

localhostRe source code Domain names index | global index

Matches any representation of localhost; the special name, IPV4 loopbacks, or IPV6 loopbacks.

tldNameRe source code Domain names index | global index

Matches a Top Level Domain (TLD). See domain name rules. When using the partial string to create a Re, you must use the 'u' or 'v' flag.

einRe source code Identifiers index | global index

Matches a valid EIN number.

ssnRe source code Identifiers index | global index

Matches a valid SSN. Provides 3 matching groups, 1 (area number), 2 (group number), and 3 (serial number).

uuidRe source code Identifiers index | global index

Matches a UUID.

jsReservedWordRe source code Javascript index | global index

Matches a JS reserved word.

Matches a valid JS variable name.

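ipAddressRe source code Network index | global index

Matches a string in IP address format. Use 'ipHostRe' to match actually valid IP addresses.

ipHostRe source code Network index | global index

Matches a valid, non-localhost IP address.

ipV6Re source code Network index | global index

Matches a string in IPV6 format.

ipVFutureRe source code Network index | global index

Matches potential future IP protocols.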

npmPackageNameRe source code NPM index | global index

Matches an NPM package name. Provides matching groups 1 (org name, if any) and 2 (package basename).

npmPackageSpecRe source code NPM index | global index

Matches an NPM package specification. Note, because any string that cannot be confused with a semver is, in theory, a valid tag, this could be any string.

npmPackageTagRe source code NPM index | global index

Matches an NPM package tag. A tag can, in theory, be anything that cannot be confused with a semver range. Due to the requirements of RE construction, the RE string ends up being useless for partial matches so is NOT exported.

Matches a float in either plain or scientific format.

Matches an integer.

Matches a plain (non-scientific notation) float.

scientificFloatRe source code Numbers index | global index

Matches a scientific notation float.

semver2RangeRe source code semver index | global index

Matches a semantic versioning range specification. Allows for optional 'v' prefix (equivalent to '='), and otherwise follows the original spec's BNF grammar. This means that an 'and' space between versions must be a single space and also requires exactly one space around hyphenated ranges.

semver2Re source code semver index | global index

Matches a semantic version string according to the Semantic Versioning 2.0.0 specification. Provides matching groups:

  • Group 1: major version
  • Group 2: minor version
  • Group 3: patch version
  • Group 4: pre-release version (if present)
  • Group 5: build metadata (if present)
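
As an illustration of the group layout, the sketch below uses the regular expression suggested on the Semantic Versioning site, which produces the same five groups. It is an assumption that the library's semver2Re behaves equivalently; in real code the import from '@liquid-labs/regex-repo' would be used instead. `parseSemver` is a hypothetical helper.

```javascript
// Pattern suggested by the Semantic Versioning 2.0.0 site; used here as a
// stand-in because the library's own source is not reproduced in this document.
const semverStandInRe = /^(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)(?:-((?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*)(?:\.(?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*))*))?(?:\+([0-9a-zA-Z-]+(?:\.[0-9a-zA-Z-]+)*))?$/

// Hypothetical helper: maps capture groups 1-5 to named fields.
function parseSemver (version) {
  const match = semverStandInRe.exec(version)
  if (match === null) return null
  const [, major, minor, patch, preRelease, build] = match
  return { major, minor, patch, preRelease, build }
}

const full = parseSemver('1.2.3-beta.1+build.5')
const plain = parseSemver('1.2.3') // preRelease and build are undefined
const invalid = parseSemver('1.02.3') // null: leading zeros are not allowed
```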

testCaptureGroups source code global index

Tests that a regular expression correctly extracts capture groups from input strings.

This function supports two modes:

  1. Sequential mode: Tests capture groups 1, 2, 3, ... in order (when groupNumbers is omitted)
  2. Numbered mode: Tests specific capture group numbers (when groupNumbers is provided)

The numbered mode is useful for regexes with internal non-capturing groups, backreferences, or alternations that create sparse or non-sequential capture group numbering.

Param              Type                                   Description
re                 RegExp                                 The regular expression to test
inputs             Array.<string>                         Array of input strings to test against the regex
expectedMatches    Array.<Array.<(string|undefined)>>
[groupNumbers]     Array.<number>                         Optional array of capture group numbers to test. If omitted, tests groups 1, 2, 3, ... sequentially.
desc               string                                 Description of the test (shown in test output)

Examples:

// Sequential mode (tests groups 1, 2, 3)
testCaptureGroups(
  /(\d{4})-(\d{2})-(\d{2})/,
  ['2024-01-15'],
  [['2024', '01', '15']],
  'date capture groups'
)
// Numbered mode (tests specific group numbers: 1, 2, 3, 4)
testCaptureGroups(
  /^(https?):\/\/((?:\w+\.)*\w+)(:\d+)?(\/.*)?$/,
  ['https://example.com:8080/path'],
  [['https', 'example.com', ':8080', '/path']],
  [1, 2, 3, 4], // explicitly specify which groups to test
  'URL capture groups'
)
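
To make the two modes concrete, here is a minimal sketch of the comparison logic. `checkCaptureGroups` is a hypothetical helper written for this illustration; it returns a boolean rather than registering test cases, and it is not the library's implementation of testCaptureGroups.

```javascript
// Hypothetical helper (NOT the library's testCaptureGroups).
function checkCaptureGroups (re, inputs, expectedMatches, groupNumbers) {
  return inputs.every((input, i) => {
    const match = re.exec(input)
    if (match === null) return false
    return expectedMatches[i].every((expected, j) => {
      // Sequential mode: the j-th expected value is compared with group j + 1.
      // Numbered mode: it is compared with group groupNumbers[j].
      const group = groupNumbers === undefined ? j + 1 : groupNumbers[j]
      return match[group] === expected
    })
  })
}

const sequentialOk = checkCaptureGroups(
  /(\d{4})-(\d{2})-(\d{2})/,
  ['2024-01-15'],
  [['2024', '01', '15']]
)

// Group 2 only holds the separator, so numbered mode skips it.
const numberedOk = checkCaptureGroups(
  /^(\d{2})(:?)(\d{2})\2(\d{2})$/,
  ['12:30:45'],
  [['12', '30', '45']],
  [1, 3, 4]
)
```

Numbered mode pays off in the second call: without the explicit group list, the expected values would be compared against groups 1, 2, and 3, and the separator in group 2 would cause a mismatch.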

commonUrlRe source code URL index | global index

Matches any of the "common" web URL types: 'mailto', 'http/https', 'ftp', and 'file'. You must use either the 'u' or 'v' flag when using the Re string.

fileUrlRe source code URL index | global index

Matches a valid 'file' URL. Provides capture groups:

  • Group 1: host
  • Group 2: path

You must use either the 'u' or 'v' flag when using the Re string.

ftpUrlRe source code URL index | global index

Matches a valid 'ftp' URL. Provides capture groups:

  • Group 1: username
  • Group 2: user password
  • Group 3: host or IP
  • Group 4: port
  • Group 5: path

You must use either the 'u' or 'v' flag when using the Re string.

httpUrlRe source code URL index | global index

Matches a valid 'http/https' URL. Provides capture groups:

  • Group 1: protocol
  • Group 2: username
  • Group 3: user password
  • Group 4: host or IP
  • Group 5: port
  • Group 6: path
  • Group 7: query string
  • Group 8: fragment

You must use either the 'u' or 'v' flag when using the Re string.

mailtoUrlRe source code URL index | global index

Matches a valid 'mailto:' URL. Provides a single capture group:

  • Group 1: email address

You must use either the 'u' or 'v' flag when using the Re string.

urlRe source code URL index | global index

Matches a valid, generic URL. Provides capture groups:

  • Group 1: scheme
  • Group 2: server/authority
  • Group 3: path
  • Group 4: query part
  • Group 5: intra-page link/fragment

Note, a URL always has a scheme and, at a minimum, a server/authority or a path, and may have both. The query and fragment components are always optional. For general usage, you might want to use the more specific Res for specific protocols or the commonUrlRe.

Domain name rules

Unfortunately, there isn't clear consensus on what is allowed in a subdomain vs a top level domain (TLD); referred to collectively as 'domain labels'. So, here are the rules we follow:

  • Domain labels may be no more than 63 bytes in length.
  • Labels are composed of alpha-numeric characters (a-z, 0-9, and any non-ASCII Unicode character) and hyphens ('-'), except:
    • the label may not begin or end with a hyphen,
    • the label may not consist of a single digit,
    • the label must not have consecutive hyphens in the 3rd and 4th positions; e.g., 'xy--z' is invalid. [1]
  • TLDs must be at least two bytes (two ASCII characters or a single Unicode character) and may not be composed only of digits.
  • A fully qualified domain is limited to 255 bytes in total.
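
The label rules above can be expressed as a short validation sketch. `isValidDomainLabel` is a hypothetical helper, not part of the library. It assumes the label has already been lower-cased and that bytes are counted in UTF-8, and it leaves the TLD and 255-byte total limits aside.

```javascript
// Hypothetical helper (NOT part of @liquid-labs/regex-repo).
// Assumes the label is already lower-cased.
function isValidDomainLabel (label) {
  // The limit is in bytes; measuring the UTF-8 encoding is an assumption.
  const byteLength = new TextEncoder().encode(label).length
  if (byteLength === 0 || byteLength > 63) return false
  // Only a-z, 0-9, '-', and non-ASCII characters are allowed.
  if (!/^(?:[a-z0-9-]|[^\x00-\x7F])+$/u.test(label)) return false
  // No leading or trailing hyphen.
  if (label.startsWith('-') || label.endsWith('-')) return false
  // Not a single digit.
  if (/^[0-9]$/.test(label)) return false
  // No '--' in the 3rd and 4th positions (counted in characters).
  if (Array.from(label).slice(2, 4).join('') === '--') return false
  return true
}

const okLabel = isValidDomainLabel('example') // true
const badHyphens = isValidDomainLabel('xy--z') // false
const singleDigit = isValidDomainLabel('7') // false
```

A function like this is easier to read than a single expression, but it cannot be embedded in a larger pattern, which is presumably why the library encodes the rules as regular expressions.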

Footnotes

  1. The DNS protocol only allows a-z, 0-9, and '-' in domain labels. International domains are encoded as special 'xn--' domains. E.g., 'कॉम' is encoded as 'xn--11b4c3d'. This is why hyphens in the third and fourth position are restricted. So, while 'xn--11b4c3d' is a valid domain, you can't register such domains directly. You would register the international domain and it's translated to an 'xn--' domain in the background.
