PairOfCleats

Local-first hybrid indexing and retrieval for source repositories.

PairOfCleats builds deterministic index artifacts for code and prose, then runs mixed sparse + dense retrieval with strict contracts around artifacts, schemas, and cache identity.

Runtime Requirements

Hard requirements:

Node.js >=24.13.0 (.nvmrc is 24.13.0)
npm (standard dependency install, with lifecycle scripts enabled)
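
A preflight check for the Node.js floor can be sketched as below. This is a minimal illustration, not a script shipped by the project: the helper names compareVersions and nodeVersionOk are hypothetical, and only the 24.13.0 minimum comes from the requirement above.

```javascript
// Minimum version taken from the hard requirement above.
const MIN_NODE = "24.13.0";

// Compare two dotted numeric versions; returns -1, 0, or 1.
function compareVersions(a, b) {
  const pa = a.split(".").map(Number);
  const pb = b.split(".").map(Number);
  for (let i = 0; i < Math.max(pa.length, pb.length); i++) {
    const x = pa[i] ?? 0;
    const y = pb[i] ?? 0;
    if (x !== y) return x < y ? -1 : 1;
  }
  return 0;
}

// True when the given (or currently running) Node.js version meets the minimum.
function nodeVersionOk(current = process.versions.node) {
  return compareVersions(current, MIN_NODE) >= 0;
}

if (!nodeVersionOk()) {
  console.error(`Node.js >=${MIN_NODE} required, found ${process.versions.node}`);
}
```

Comparing numeric components rather than raw strings matters here: a plain string comparison would rank "24.2.0" above "24.13.0".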

Important install requirement:

Source-checkout installs are expected to include dev dependencies so required patches can be applied.
Production-only installs (for example, npm ci --omit=dev) can fail in this repo when required patches/*.patch files are present.

Optional capabilities:

Python 3 (for Python-related tooling/tests and optional AST paths)
sqlite-vec extension (faster ANN path when available)
LMDB / LanceDB / HNSW backends (selected by policy and capability)
PDF/DOCX extraction dependencies (capability-gated document extraction flows)

What It Provides

CLI: pairofcleats <command>
HTTP API: pairofcleats service api
Indexer service worker: pairofcleats service indexer
MCP server mode via tooling scripts (npm run mcp-server)

Primary CLI surface:

setup
bootstrap
index build
index watch
index validate
search
lmdb build

Quick Start

Install:

npm install

Guided setup (recommended):

pairofcleats setup

Non-interactive bootstrap:

pairofcleats bootstrap

Build and validate:

pairofcleats index build --mode all --quality balanced
pairofcleats index validate

Search:

pairofcleats search -- "where is query cache invalidated?" --mode code
pairofcleats search -- "release matrix and packaging" --mode prose --explain --json

Start API server:

pairofcleats service api

Mental Model

PairOfCleats is a two-plane system:

Build plane: deterministic artifact production
Retrieval plane: query planning, candidate generation, scoring, and output shaping

Core data model:

Repo identity -> cache root -> build root -> per-mode index roots
Modes: code, prose, extracted-prose, records
Contract-first artifacts with manifest-first loading
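
One way to read "manifest-first loading" is that the loader trusts only what the manifest lists and refuses pieces that do not match it. The sketch below illustrates that idea under assumptions: the manifest shape ({ pieces: [{ name, sha256 }] }) and the loadFromManifest helper are hypothetical and are not taken from the project's schemas.

```javascript
const { createHash } = require("node:crypto");

// Load only the pieces named in the manifest and reject any piece whose
// bytes do not match the recorded checksum.
// Assumed (hypothetical) manifest shape: { pieces: [{ name, sha256 }] }.
function loadFromManifest(manifest, readPiece) {
  const loaded = new Map();
  for (const { name, sha256 } of manifest.pieces) {
    const bytes = readPiece(name);
    const actual = createHash("sha256").update(bytes).digest("hex");
    if (actual !== sha256) {
      throw new Error(`checksum mismatch for ${name}`);
    }
    loaded.set(name, bytes);
  }
  return loaded;
}
```

Files that exist on disk but are absent from the manifest are never read, which is what keeps loading tied to the manifest rather than to directory contents.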

High-level flow:

Repo files
  -> discovery + mode classification
  -> chunking + metadata + postings + relations
  -> artifact pieces + manifest + build_state
  -> optional sqlite/ann materialization
  -> builds/current.json promotion

Query
  -> parse + plan + intent
  -> candidate prefilter
  -> sparse rank (BM25 / sqlite-fts)
  -> dense rank (ann providers)
  -> fusion + boosts + explain
  -> stable output (human or json)

Build Pipeline (Technical)

Runtime envelope:

config resolution + policy normalization
concurrency and capability resolution

Discovery and classification:

ignore rules + file caps
deterministic mode assignment
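
Deterministic mode assignment can be illustrated with a pure function over sorted paths. This is a sketch under stated assumptions: the extension table below is hypothetical and the project's actual classification rules may differ.

```javascript
// Hypothetical extension-to-mode table; the project's real rules may differ.
const MODE_BY_EXT = new Map([
  [".js", "code"], [".ts", "code"], [".py", "code"],
  [".md", "prose"], [".txt", "prose"],
  [".pdf", "extracted-prose"], [".docx", "extracted-prose"],
]);

// Return the mode for one path, or null when the file is not indexed.
function classify(filePath) {
  const dot = filePath.lastIndexOf(".");
  const slash = filePath.lastIndexOf("/");
  const ext = dot > slash ? filePath.slice(dot).toLowerCase() : "";
  return MODE_BY_EXT.get(ext) ?? null;
}

// Sort paths first so the result does not depend on directory walk order.
function assignModes(paths) {
  const out = [];
  for (const p of [...paths].sort()) {
    const mode = classify(p);
    if (mode) out.push({ path: p, mode });
  }
  return out;
}
```

Because classify depends only on the path and assignModes sorts its input, two runs over the same file set yield the same assignments in the same order.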

Foreground indexing:

chunk extraction and metadata
sparse artifacts (postings/chargrams/filter index)
per-mode artifact writing with manifest entries

Background enrichment:

tree-sitter/lint/risk/embeddings (policy-gated)
optional ANN materialization paths

Promotion:

validation gate
builds/current.json update only after successful build
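
The promotion rule above can be sketched as a write-then-rename behind a validation gate. This is an illustration only: the promoteBuild helper and the { buildId } payload shape are assumptions, not the project's actual current.json schema.

```javascript
const fs = require("node:fs");
const os = require("node:os");
const path = require("node:path");

// Promote a build by atomically replacing builds/current.json.
// The { buildId } payload shape is an assumption for illustration.
function promoteBuild(buildsDir, buildId, validate) {
  // Validation gate: a failed build leaves current.json untouched.
  if (!validate(buildId)) return false;
  const target = path.join(buildsDir, "current.json");
  const tmp = `${target}.${process.pid}.tmp`;
  fs.writeFileSync(tmp, JSON.stringify({ buildId }) + "\n");
  // Rename swaps the pointer in one step, so readers never see a partial file.
  fs.renameSync(tmp, target);
  return true;
}

// Usage with a temporary directory and stub validators.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), "poc-builds-"));
promoteBuild(dir, "build-001", () => true);
promoteBuild(dir, "build-002", () => false); // rejected, pointer unchanged
const current = JSON.parse(fs.readFileSync(path.join(dir, "current.json"), "utf8"));
console.log(current.buildId); // build-001
```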

Retrieval Pipeline (Technical)

Query parse and routing:

query-plan construction
mode-aware tokenization and routing

Candidate generation:

filter index and chargram prefilter for path/file constraints
backend/provider availability checks
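
A chargram prefilter for path constraints can be sketched as follows. This is a generic trigram index, assumed here for illustration; the gram size, index layout, and function names are not taken from the project.

```javascript
// Split a string into overlapping character n-grams (trigrams by default).
function chargrams(text, n = 3) {
  const s = text.toLowerCase();
  const grams = new Set();
  for (let i = 0; i + n <= s.length; i++) grams.add(s.slice(i, i + n));
  return grams;
}

// Build an inverted index: gram -> set of path ids.
function buildChargramIndex(paths, n = 3) {
  const index = new Map();
  paths.forEach((p, id) => {
    for (const g of chargrams(p, n)) {
      if (!index.has(g)) index.set(g, new Set());
      index.get(g).add(id);
    }
  });
  return index;
}

// Keep ids that contain every gram of the needle, then confirm with a real
// substring test, because shared grams alone allow false positives.
function prefilter(paths, index, needle, n = 3) {
  // Start from every id; needles shorter than n produce no grams and skip narrowing.
  let candidates = paths.map((_, id) => id);
  for (const g of chargrams(needle, n)) {
    const ids = index.get(g) ?? new Set();
    candidates = candidates.filter((id) => ids.has(id));
  }
  const lower = needle.toLowerCase();
  return candidates.filter((id) => paths[id].toLowerCase().includes(lower));
}
```

The gram intersection cheaply discards most paths, so the exact substring check only runs on a small candidate set.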

Ranking:

sparse ranking (bm25 or sqlite-fts)
dense ranking (ann providers based on capability/policy)
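
For readers unfamiliar with the sparse side, the textbook Okapi BM25 formula is sketched below. It is not the project's scorer: the k1 and b values are the usual defaults, and documents are assumed to be pre-tokenized arrays.

```javascript
// Okapi BM25 over pre-tokenized documents. k1 and b are common defaults,
// not values taken from this project.
function bm25Scores(docs, queryTerms, k1 = 1.2, b = 0.75) {
  const N = docs.length;
  const avgLen = docs.reduce((sum, d) => sum + d.length, 0) / N;
  // Document frequency for each distinct query term.
  const df = new Map();
  for (const term of new Set(queryTerms)) {
    df.set(term, docs.filter((d) => d.includes(term)).length);
  }
  return docs.map((doc) => {
    let score = 0;
    for (const [term, n] of df) {
      const tf = doc.filter((t) => t === term).length;
      if (tf === 0) continue;
      const idf = Math.log(1 + (N - n + 0.5) / (n + 0.5));
      const lengthNorm = 1 - b + b * (doc.length / avgLen);
      score += (idf * tf * (k1 + 1)) / (tf + k1 * lengthNorm);
    }
    return score;
  });
}
```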

Fusion and output:

RRF or blend policy
deterministic tie-breaking
optional --explain score breakdown and pipeline stats
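
Reciprocal rank fusion with a deterministic tie-break can be sketched as below. The constant k = 60 is the commonly used value and is assumed here, as are the chunk id strings; the project's actual fusion policy and identifiers may differ.

```javascript
// Reciprocal rank fusion: score(id) = sum over lists of 1 / (k + rank),
// with rank starting at 1. k = 60 is a common default, assumed here.
function rrfFuse(rankedLists, k = 60) {
  const scores = new Map();
  for (const list of rankedLists) {
    list.forEach((id, i) => {
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + i + 1));
    });
  }
  return [...scores.entries()]
    .map(([id, score]) => ({ id, score }))
    // Higher score first; equal scores fall back to id order so output is stable.
    .sort((a, b) => b.score - a.score || (a.id < b.id ? -1 : a.id > b.id ? 1 : 0));
}

// Hypothetical chunk ids from a sparse and a dense ranker.
const sparse = ["a.js#1", "b.js#4", "c.md#2"];
const dense = ["b.js#4", "a.js#1", "d.js#9"];
console.log(rrfFuse([sparse, dense]).map((r) => r.id));
```

In this example a.js#1 and b.js#4 receive identical fused scores, as do c.md#2 and d.js#9, so the id comparison is what fixes their order across runs.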

Cache behavior:

query cache keys include retrieval-relevant knobs and index identity
strict manifest-first index loading by default
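
A query cache key that covers both retrieval knobs and index identity can be sketched by hashing a canonical serialization. The field names used below (mode, fusion, topK, buildId) and both helpers are hypothetical; only the idea of including knobs and index identity in the key comes from the text above.

```javascript
const { createHash } = require("node:crypto");

// Serialize with sorted object keys so logically equal inputs hash identically.
function canonicalJson(value) {
  if (Array.isArray(value)) {
    return `[${value.map(canonicalJson).join(",")}]`;
  }
  if (value && typeof value === "object") {
    const keys = Object.keys(value).sort();
    return `{${keys.map((k) => `${JSON.stringify(k)}:${canonicalJson(value[k])}`).join(",")}}`;
  }
  return JSON.stringify(value);
}

// Hash the query text, the retrieval knobs, and the index identity together.
// All field names inside knobs and indexIdentity are hypothetical.
function queryCacheKey({ query, knobs, indexIdentity }) {
  const payload = canonicalJson({ query, knobs, indexIdentity });
  return createHash("sha256").update(payload).digest("hex");
}
```

Folding the index identity into the key means a newly promoted build produces different keys, so stale results are never served without needing an explicit invalidation step.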

Artifact and Cache Layout

Default cache layout is outside the repository:

<cacheRoot>/repos/<repoId>/builds/<buildId>/index-code
<cacheRoot>/repos/<repoId>/builds/<buildId>/index-prose
<cacheRoot>/repos/<repoId>/builds/<buildId>/index-extracted-prose
<cacheRoot>/repos/<repoId>/builds/<buildId>/index-records
<cacheRoot>/repos/<repoId>/builds/current.json
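
The layout above can be expressed as a few path helpers. The directory names come from the listing; the function names are hypothetical, and forward-slash joining is used only to keep the example output platform-independent.

```javascript
const path = require("node:path");

// Modes listed in the core data model.
const MODES = ["code", "prose", "extracted-prose", "records"];

// <cacheRoot>/repos/<repoId>/builds
function repoBuildsDir(cacheRoot, repoId) {
  return path.posix.join(cacheRoot, "repos", repoId, "builds");
}

// <cacheRoot>/repos/<repoId>/builds/<buildId>/index-<mode>
function indexRoot(cacheRoot, repoId, buildId, mode) {
  if (!MODES.includes(mode)) throw new Error(`unknown mode: ${mode}`);
  return path.posix.join(repoBuildsDir(cacheRoot, repoId), buildId, `index-${mode}`);
}

// <cacheRoot>/repos/<repoId>/builds/current.json
function currentPointer(cacheRoot, repoId) {
  return path.posix.join(repoBuildsDir(cacheRoot, repoId), "current.json");
}

console.log(indexRoot("/cache", "repo-1", "build-1", "extracted-prose"));
// /cache/repos/repo-1/builds/build-1/index-extracted-prose
```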

Set a custom cache root in .pairofcleats.json:

{
  "cache": {
    "root": "C:/absolute/path/to/cache"
  }
}

Query Notes

Core syntax:

"exact phrase"
-term
-"excluded phrase"
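
The three forms above can be recognized with a small tokenizer. This is a sketch, not the project's query parser: the parseQuery helper and its output shape are hypothetical, and treating bare words as ordinary terms is an assumption.

```javascript
// Parse the forms listed above: "exact phrase", -term, -"excluded phrase".
// Bare words are treated as ordinary terms (an assumption).
function parseQuery(input) {
  const out = { terms: [], phrases: [], excludedTerms: [], excludedPhrases: [] };
  // Alternative 1: optionally negated quoted phrase. Alternative 2: optionally negated word.
  const re = /(-?)"([^"]*)"|(-?)(\S+)/g;
  for (const m of input.matchAll(re)) {
    if (m[2] !== undefined) {
      if (m[2] === "") continue; // ignore empty quotes
      (m[1] ? out.excludedPhrases : out.phrases).push(m[2]);
    } else {
      (m[3] ? out.excludedTerms : out.terms).push(m[4]);
    }
  }
  return out;
}

console.log(parseQuery('"query cache" -stale -"old index" planner'));
```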

Mode flags:

--mode code
--mode prose
--mode extracted-prose
--mode records
--mode all

Diagnostics:

--explain for ranking/routing details
--stats for pipeline timing and memory checkpoints
--json for machine-readable output

Testing and CI Lanes

Run a lane:

node tests/run.js --lane ci-lite
node tests/run.js --lane ci
node tests/run.js --lane ci-long
node tests/run.js --lane gate

Run with parallel jobs and timing outputs:

node tests/run.js --lane ci-long --jobs 4 --log-times .testLogs/ci-long-testRunTimes.txt --timings-file .testLogs/ci-long-timings.json

List lanes/tags:

node tests/run.js --list-lanes
node tests/run.js --list-tags

Learn More

Architecture and pipelines:

Contracts and schemas:

SQLite and ANN:

Setup, service, and integrations:

Advanced roadmap features and specs:

Testing and reliability:

License

License not yet specified in this repository.
