feat: support for lookarounds and non-capture groups #64

Merged
merged 7 commits into from
Feb 27, 2024
31 changes: 19 additions & 12 deletions README.md
Expand Up @@ -104,23 +104,30 @@ See [Regex Builder API doc](./docs/API.md#builder) for more info.

### Regex Constructs

| Construct | Regex Syntax | Notes |
| ------------------- | ------------ | ------------------------------- |
| `capture(...)` | `(...)` | Create a capture group |
| `choiceOf(x, y, z)` | `x\|y\|z` | Match one of provided sequences |
| Construct | Regex Syntax | Notes |
| ------------------------- | ------------ | ------------------------------------------- |
| `choiceOf(x, y, z)` | `x\|y\|z` | Match one of provided sequences |
| `capture(...)` | `(...)` | Create a capture group |
| `lookahead(...)` | `(?=...)` | Match subsequent text without consuming it |
| `negativeLookahead(...)` | `(?!...)` | Reject subsequent text without consuming it |
| `lookbehind(...)` | `(?<=...)` | Match preceding text without consuming it |
| `negativeLookbehind(...)` | `(?<!...)` | Reject preceding text without consuming it |

See [Regex Constructs API doc](./docs/API.md#constructs) for more info.

> [!NOTE]
> TS Regex Builder does not have a construct for non-capturing groups. Such groups are implicitly added when required.
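
To illustrate why such a group is needed around a repeated sequence, here is a minimal sketch using plain `RegExp` literals rather than the builder itself. The variable names and the `^`/`$` anchors are illustrative assumptions, not part of the library.

```ts
// With a non-capturing group, the quantifier applies to the whole sequence "abc".
const groupedSequence = /^(?:abc)*$/;
// Without a group, `*` binds only to the final "c".
const ungroupedSequence = /^abc*$/;

console.log(groupedSequence.test("abcabc")); // true
console.log(ungroupedSequence.test("abcabc")); // false
console.log(ungroupedSequence.test("abccc")); // true
console.log(groupedSequence.test("abccc")); // false
```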

### Quantifiers

| Quantifier | Regex Syntax | Description |
| ----------------------------------------------- | ------------ | -------------------------------------------------------------- |
| `zeroOrMore(x)` | `x*` | Zero or more occurrences of a pattern |
| `oneOrMore(x)` | `x+` | One or more occurrences of a pattern |
| `optional(x)` | `x?` | Zero or one occurrence of a pattern |
| `repeat(x, n)` | `x{n}` | Pattern repeats an exact number of times |
| `repeat(x, { min: n, })` | `x{n,}` | Pattern repeats at least the given number of times |
| `repeat(x, { min: n1, max: n2 })` | `x{n1,n2}` | Pattern repeats between n1 and n2 times |
| Quantifier | Regex Syntax | Description |
| -------------------------------- | ------------ | ------------------------------------------------- |
| `zeroOrMore(x)` | `x*` | Zero or more occurrences of a pattern |
| `oneOrMore(x)` | `x+` | One or more occurrences of a pattern |
| `optional(x)` | `x?` | Zero or one occurrence of a pattern |
| `repeat(x, n)` | `x{n}` | Pattern repeats an exact number of times |
| `repeat(x, { min: n, })` | `x{n,}` | Pattern repeats at least the given number of times |
| `repeat(x, { min: n1, max: n2 })` | `x{n1,n2}` | Pattern repeats between n1 and n2 times |

See [Quantifiers API doc](./docs/API.md#quantifiers) for more info.
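
As a rough illustration of the regex syntax listed for the `repeat` rows, the sketch below uses native `RegExp` literals with `a` standing in for the pattern `x`. The anchors and variable names are assumptions added for the example; the builder calls appear only in comments.

```ts
// Native equivalents of the `repeat` rows, anchored so the whole string must match.
const exactlyThree = /^a{3}$/; // corresponds to repeat("a", 3)
const atLeastTwo = /^a{2,}$/; // corresponds to repeat("a", { min: 2 })
const twoToFour = /^a{2,4}$/; // corresponds to repeat("a", { min: 2, max: 4 })

console.log(exactlyThree.test("aaa"), exactlyThree.test("aaaa")); // true false
console.log(atLeastTwo.test("a"), atLeastTwo.test("aaaaa")); // false true
console.log(twoToFour.test("aaa"), twoToFour.test("aaaaa")); // true false
```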

65 changes: 58 additions & 7 deletions docs/API.md
Expand Up @@ -46,6 +46,20 @@ It optionally accepts a list of regex flags:

These functions and objects represent available regex constructs.

### `choiceOf()`

```ts
function choiceOf(
...alternatives: RegexSequence[]
): ChoiceOf {
```

Regex syntax: `a|b|c`.

The `choiceOf` (disjunction) construct matches one out of several possible sequences. It functions similarly to a logical OR operator in programming. It can match simple string options as well as complex patterns.

Example: `choiceOf("color", "colour")` matches either `color` or `colour` pattern.

### `capture()`

```ts
Expand All @@ -58,19 +72,56 @@ Regex syntax: `(...)`.

Captures, also known as capturing groups, extract and store parts of the matched string for later use.

### `choiceOf()`
> [!NOTE]
> TS Regex Builder does not have a construct for non-capturing groups. Such groups are implicitly added when required. E.g., `zeroOrMore(["abc"])` is encoded as `(?:abc)*`.

### `lookahead()`

```ts
function choiceOf(
...alternatives: RegexSequence[]
): ChoiceOf {
function lookahead(
sequence: RegexSequence
): Lookahead
```

Regex syntax: `a|b|c`.
Regex syntax: `(?=...)`.

The `choiceOf` (disjunction) construct matches one out of several possible sequences. It functions similarly to a logical OR operator in programming. It can match simple string options as well as complex patterns.
Allows for conditional matching by checking for subsequent patterns in regexes without consuming them.

Example: `choiceOf("color", "colour")` matches either `color` or `colour` pattern.
### `negativeLookahead()`

```ts
function negativeLookahead(
sequence: RegexSequence
): NegativeLookahead
```

Regex syntax: `(?!...)`.

Allows for matches to be rejected if a specified subsequent pattern is present, without consuming any characters.
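
The difference between the two lookahead forms can be sketched with native `RegExp` literals that use the documented syntax `(?=...)` and `(?!...)`. The `foo`/`bar` patterns and variable names are made up for illustration; the builder is not used here.

```ts
// Lookahead: match "foo" only when it is followed by "bar"; "bar" is not consumed.
const fooBeforeBar = /foo(?=bar)/;
// Negative lookahead: match "foo" only when it is NOT followed by "bar".
const fooNotBeforeBar = /foo(?!bar)/;

const lookaheadMatch = "foobar".match(fooBeforeBar);
console.log(lookaheadMatch?.[0]); // "foo" (the lookahead text is not part of the match)
console.log(fooBeforeBar.test("foobaz")); // false
console.log(fooNotBeforeBar.test("foobaz")); // true
console.log(fooNotBeforeBar.test("foobar")); // false
```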

### `lookbehind()`

```ts
function lookbehind(
sequence: RegexSequence
): Lookbehind
```

Regex syntax: `(?<=...)`.

Allows for conditional matching by checking for preceding patterns in regexes without consuming them.

### `negativeLookbehind()`

```ts
function negativeLookbehind(
sequence: RegexSequence
): NegativeLookbehind
```

Regex syntax: `(?<!...)`.

Allows for matches to be rejected if a specified preceding pattern is present, without consuming any characters.
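
A comparable sketch for the lookbehind forms, again with native `RegExp` literals using the documented syntax `(?<=...)` and `(?<!...)`. The patterns and names are hypothetical, and the `\b` in the negative case is an extra added here so the match cannot start in the middle of a number.

```ts
// Lookbehind: match digits only when they are preceded by "$"; the "$" is not consumed.
const digitsAfterDollar = /(?<=\$)\d+/;
// Negative lookbehind: match a whole number only when it is NOT preceded by "$".
// `\b` (an addition for this example) prevents matching from inside "42".
const digitsNotAfterDollar = /(?<!\$)\b\d+/;

const lookbehindMatch = "$42".match(digitsAfterDollar);
console.log(lookbehindMatch?.[0]); // "42"
console.log(digitsAfterDollar.test("€42")); // false
console.log(digitsNotAfterDollar.test("€42")); // true
console.log(digitsNotAfterDollar.test("$42")); // false
```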

## Quantifiers

70 changes: 67 additions & 3 deletions docs/Examples.md
Expand Up @@ -47,7 +47,7 @@ Encoded regex: `/^#?(?:[a-f\d]{6}|[a-f\d]{3})$/i`.

See tests: [example-hex-color.ts](../src/__tests__/example-hex-color.ts).

## Simple URL validation
## URL validation

This regex validates (in a simplified way) whether a given string is a URL.

Expand Down Expand Up @@ -75,7 +75,9 @@ const isValid = regex.test("https://hello.github.com");

Encoded regex: `/^(?:(?:http|https):\/\/)?(?:(?:[a-z\d]|[a-z\d][a-z\d-]*[a-z\d])\.)+[a-z][a-z\d]+$/`.

See tests: [example-url.ts](../src/__tests__/example-url.ts).
See tests: [example-url-simple.ts](../src/__tests__/example-url-simple.ts).

For more advanced URL validation, see [example-url-advanced.ts](../src/__tests__/example-url-advanced.ts).

## Email address validation

Expand Down Expand Up @@ -109,7 +111,6 @@ See tests: [example-email.ts](../src/__tests__/example-email.ts).

This regex validates if a given string is a valid JavaScript number.


```ts
const sign = anyOf('+-');
const exponent = [anyOf('eE'), optional(sign), oneOrMore(digit)];
Expand Down Expand Up @@ -185,3 +186,66 @@ const isValid = regex.test("192.168.0.1");
Encoded regex: `/^(?:(?:[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])\.){3}(?:[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])$/`.

See tests: [example-regexp.ts](../src/__tests__/example-regexp.ts).

## Simple password validation

This regex corresponds to the following password policy:
- at least one uppercase letter
- at least one lowercase letter
- at least one digit
- at least one special character
- at least 8 characters long

```ts
const atLeastOneUppercase = lookahead([zeroOrMore(any), /[A-Z]/]);
const atLeastOneLowercase = lookahead([zeroOrMore(any), /[a-z]/]);
const atLeastOneDigit = lookahead([zeroOrMore(any), /[0-9]/]);
const atLeastOneSpecialChar = lookahead([zeroOrMore(any), /[^A-Za-z0-9\s]/]);
const atLeastEightChars = /.{8,}/;

// Match
const validPassword = buildRegExp([
startOfString,
atLeastOneUppercase,
atLeastOneLowercase,
atLeastOneDigit,
atLeastOneSpecialChar,
atLeastEightChars,
endOfString
]);

const isValid = validPassword.test("Aa$123456");
```

Encoded regex: `/^(?=.*[A-Z])(?=.*[a-z])(?=.*[0-9])(?=.*[^A-Za-z0-9\s])(?:.{8,})$/`.

See tests: [example-password.ts](../src/__tests__/example-password.ts).
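
As a quick check of the policy, the sketch below applies the encoded regex shown above as a native `RegExp` literal to a few sample strings taken from the test file. The variable names are illustrative; the builder is not involved.

```ts
// Encoded regex copied from the text above: four lookaheads, then a length check.
const passwordPattern = /^(?=.*[A-Z])(?=.*[a-z])(?=.*[0-9])(?=.*[^A-Za-z0-9\s])(?:.{8,})$/;

const passwordSamples = ["Aa$123456", "#passworD666", "Aa%1234", "#password"];
const passwordResults = passwordSamples.map((sample) => passwordPattern.test(sample));

console.log(passwordResults); // [ true, true, false, false ]
// "Aa%1234" fails the length check (7 characters);
// "#password" has no uppercase letter and no digit.
```

Because each lookahead starts from the same position and consumes nothing, the individual requirements can be checked in any order before `.{8,}` consumes the string.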

## Match currency values

```ts
const currencySymbol = '$€£¥R₿';
const decimalSeparator = '.';

const firstThousandsClause = repeat(digit, { min: 1, max: 3 });
const thousandsSeparator = ',';
const thousands = repeat(digit, 3);
const thousandsClause = [optional(thousandsSeparator), thousands];
const cents = repeat(digit, 2);
const isCurrency = lookbehind(anyOf(currencySymbol));

const currencyRegex = buildRegExp([
isCurrency,
optional(whitespace),
firstThousandsClause,
zeroOrMore(thousandsClause),
optional([decimalSeparator, cents]),
endOfString,
]);

const isValid = currencyRegex.test("£1,000");
```

Encoded regex: `/(?<=[$€£¥R₿])\s?\d{1,3}(?:,?\d{3})*(?:\.\d{2})?$/`.

See tests: [example-currency.ts](../src/__tests__/example-currency.ts).
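
The role of the lookbehind can be seen by running the encoded regex above as a native `RegExp` literal: the currency symbol must be present, but it is not part of the matched text. This is a sketch with illustrative variable names, assuming a runtime that supports lookbehind (ES2018 or later).

```ts
// Encoded regex copied from the text above.
const currencyPattern = /(?<=[$€£¥R₿])\s?\d{1,3}(?:,?\d{3})*(?:\.\d{2})?$/;

const currencyMatch = "£1,000".match(currencyPattern);
console.log(currencyMatch?.[0]); // "1,000" (the symbol is checked but not consumed)
console.log(currencyPattern.test("$10.50")); // true
console.log(currencyPattern.test("$10.5")); // false (cents need exactly two digits)
console.log(currencyPattern.test("10$")); // false (the symbol must precede the amount)
```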
41 changes: 41 additions & 0 deletions src/__tests__/example-currency.ts
@@ -0,0 +1,41 @@
import { buildRegExp } from '../builders';
import { anyOf, digit, endOfString, optional, repeat, whitespace, zeroOrMore } from '../index';
import { lookbehind } from '../constructs/lookbehind';

const currencySymbol = '$€£¥R₿';
const decimalSeparator = '.';

const firstThousandsClause = repeat(digit, { min: 1, max: 3 });
const thousandsSeparator = ',';
const thousands = repeat(digit, 3);
const thousandsClause = [optional(thousandsSeparator), thousands];
const cents = repeat(digit, 2);
const isCurrency = lookbehind(anyOf(currencySymbol));

test('example: extracting currency values', () => {
const currencyRegex = buildRegExp([
isCurrency,
optional(whitespace),
firstThousandsClause,
zeroOrMore(thousandsClause),
optional([decimalSeparator, cents]),
endOfString,
]);

expect(currencyRegex).toMatchString('$10');
expect(currencyRegex).toMatchString('$ 10');
expect(currencyRegex).not.toMatchString('$ 10.');
expect(currencyRegex).toMatchString('$ 10');
expect(currencyRegex).not.toMatchString('$10.5');
expect(currencyRegex).toMatchString('$10.50');
expect(currencyRegex).not.toMatchString('$10.501');
expect(currencyRegex).toMatchString('€100');
expect(currencyRegex).toMatchString('£1,000');
expect(currencyRegex).toMatchString('$ 100000000000000000');
expect(currencyRegex).toMatchString('€ 10000');
expect(currencyRegex).toMatchString('₿ 100,000');
expect(currencyRegex).not.toMatchString('10$');
expect(currencyRegex).not.toMatchString('£A000');

expect(currencyRegex).toEqualRegex(/(?<=[$€£¥R₿])\s?\d{1,3}(?:,?\d{3})*(?:\.\d{2})?$/);
});
21 changes: 21 additions & 0 deletions src/__tests__/example-filename.ts
@@ -0,0 +1,21 @@
import { buildRegExp, choiceOf, endOfString, negativeLookbehind, oneOrMore } from '../index';

const isRejectedFileExtension = negativeLookbehind(choiceOf('js', 'css', 'html'));

test('example: filename validator', () => {
const filenameRegex = buildRegExp([
oneOrMore(/[A-Za-z0-9_]/),
isRejectedFileExtension,
endOfString,
]);

expect(filenameRegex).toMatchString('index.ts');
expect(filenameRegex).toMatchString('index.tsx');
expect(filenameRegex).toMatchString('ind/ex.ts');
expect(filenameRegex).not.toMatchString('index.js');
expect(filenameRegex).not.toMatchString('index.html');
expect(filenameRegex).not.toMatchString('index.css');
expect(filenameRegex).not.toMatchString('./index.js');
expect(filenameRegex).not.toMatchString('./index.html');
expect(filenameRegex).not.toMatchString('./index.css');
});
43 changes: 43 additions & 0 deletions src/__tests__/example-password.ts
@@ -0,0 +1,43 @@
import { any, buildRegExp, endOfString, lookahead, startOfString, zeroOrMore } from '../index';

//^(?=.*[A-Z])(?=.*[a-z])(?=.*\d)(?=.*[^A-Za-z0-9\s]).{8,}$

//
// The password policy is as follows:
// - At least one uppercase letter
// - At least one lowercase letter
// - At least one digit
// - At least one special character
// - At least 8 characters long

const atLeastOneUppercase = lookahead([zeroOrMore(any), /[A-Z]/]);
const atLeastOneLowercase = lookahead([zeroOrMore(any), /[a-z]/]);
const atLeastOneDigit = lookahead([zeroOrMore(any), /[0-9]/]);
const atLeastOneSpecialChar = lookahead([zeroOrMore(any), /[^A-Za-z0-9\s]/]);
const atLeastEightChars = /.{8,}/;

test('Example: Validating passwords', () => {
const validPassword = buildRegExp([
startOfString,
atLeastOneUppercase,
atLeastOneLowercase,
atLeastOneDigit,
atLeastOneSpecialChar,
atLeastEightChars,
endOfString,
]);

expect(validPassword).toMatchString('Aaaaa$aaaaaaa1');
expect(validPassword).not.toMatchString('aaaaaaaaaaa');
expect(validPassword).toMatchString('9aaa#aaaaA');
expect(validPassword).not.toMatchString('Aa');
expect(validPassword).toMatchString('Aa$123456');
expect(validPassword).not.toMatchString('Abba');
expect(validPassword).not.toMatchString('#password');
expect(validPassword).toMatchString('#passworD666');
expect(validPassword).not.toMatchString('Aa%1234');

expect(validPassword).toEqualRegex(
/^(?=.*[A-Z])(?=.*[a-z])(?=.*[0-9])(?=.*[^A-Za-z0-9\s])(?:.{8,})$/,
);
});