`brzozowski-ts`

An experiment in implementing static checks of regular expressions in Typescript using Brzozowski derivatives.

This is also an experiment in what an API for what a RegEx-validated string type might look like. There has already been discussion of the use cases and potential API in the Typescript RegEx-validated string types issue.

Use case

This allows you to make assertions about string constants:

import type { Compile, Exec, Recognize } from "brzozowski-ts/src";

type HexStrRE = Compile<"(?<hex>[0-9A-Fa-f]{5,})">;
type Match = Exec<HexStrRE, "abc123">;
const captures: Match["captures"] = ["abc123"];
const groups: Match["groups"] = { hex: "abc123" };

type HexStr<S extends string> = Recognize<HexStrRE, S>;
type NominalHex = string & { readonly isHex: unique symbol };

const castSpell = <S extends string>(hex: HexStr<S> | NominalHex) => hex;

const spell = castSpell("00dead00" as const); // ok!
const spellCheck: typeof spell = "00dead00"; // ok!

// @ts-expect-error
castSpell("xyz");

let dynamicHex: string = "a5df0";
castSpell(dynamicHex as NominalHex); // ok!

Limitations

RecognizePattern<RE, Str> uses a limited and naively-implemented regular expression language:
- no lookaround:
  - no positive or negative lookahead: (?=no) (?!nope)
  - no positive or negative lookbehind: (?<=no) (?<!nope)
- matches always start from the start of the string: /^always/
- string recognition is implemented as a series of potentially-nested commands rather than state transitions within a finite automaton.
- no flags:
  - no case-insensitive matching: /NOPE/i
  - no multiline mode: nope$
Using these types likely slows down builds

Usage

This is pre-alpha software: the API will change without warning, the implementation is brittle and incomplete, and none of this code has been optimized for memory usage or speed.

If you're brave, you can:

pnpm add -D "git+https://github.com/skalt/brzozowski-ts.git"

and import the types like

import type { Compile, Exec } from "brzozowski-ts/src";

Design

Compile-time parsing follows this general algorithm:

Given a constant string type S and a regular expression string type R
take the derivative of R with respect the start of S to produce a shorter regular expression r and a shorter string s
recur using s and r
when r is empty, the entire regular expression has been matched.
if s is empty and r is not, the expression has not been matched

Prior art

These DFA-based pure-type RegEx implementations were an inspiration! brzozowski_ts adds the ability to compile regular expressions, but uses a naive backtracking algorithm based on Brzozowski derivatives rather than DFAs.

Name		Name	Last commit message	Last commit date
Latest commit History 37 Commits
.github		.github
.husky		.husky
.vscode		.vscode
scripts		scripts
src		src
test		test
.editorconfig		.editorconfig
.envrc		.envrc
.gitignore		.gitignore
.prettierrc		.prettierrc
CONTRIBUTING.md		CONTRIBUTING.md
LICENSE		LICENSE
README.md		README.md
package.json		package.json
pnpm-lock.yaml		pnpm-lock.yaml
tsconfig.json		tsconfig.json

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

`brzozowski-ts`

Use case

Limitations

Usage

Design

Prior art

About

Releases

Sponsor this project

Packages

Languages

License

SKalt/brzozowski-ts

Folders and files

Latest commit

History

Repository files navigation

brzozowski-ts

Use case

Limitations

Usage

Design

Prior art

About

Resources

License

Stars

Watchers

Forks

Releases

Sponsor this project

Packages 0

Languages

`brzozowski-ts`

Packages