An autonomous AI coding agent that builds entire applications from a simple description.
- What Is This?
- What You'll Need
- Understanding Key Concepts
- Step-by-Step Setup
- Getting Claude Access
- Installing the Agent Harness
- Creating Your First App Specification
- Running the Agent
- Understanding the Phases
- All Options Explained
- Understanding the Output
- Troubleshooting
- Glossary
- Architecture Reference
Imagine you could describe an app you want built in plain English, and then an AI builds it for you automatically. That's what this tool does.
You write something like:
"Build me a todo list app where users can add tasks, mark them complete, and delete them."
And the Agent Harness:
- Thinks through what needs to be built (planning)
- Breaks it down into small, manageable features
- Writes the code for each feature, one by one
- Tests its work to confirm each feature behaves as intended
- Keeps going until everything is done
- You don't need to know how to code - Just describe what you want
- It builds complete projects - Not just code snippets, but full applications
- It's autonomous - You can walk away and let it work
- It's resumable - If something stops, it picks up where it left off
- It has safety features - Human checkpoints let you review the work
┌─────────────────────────────────────────────────────────────────┐
│ │
│ YOU AGENT HARNESS │
│ │
│ ┌─────────────┐ ┌─────────────────────────────┐ │
│ │ "Build me │ │ │ │
│ │ a todo │ ────────► │ Phase 1: VIBE PLAN │ │
│ │ app..." │ │ (Explores ideas freely) │ │
│ └─────────────┘ │ │ │
│ │ Phase 2: INITIALIZATION │ │
│ │ (Creates feature list) │ │
│ │ │ │
│ │ Phase 3: CODING │ │
│ │ (Builds each feature) │ │
│ │ │ │
│ ┌─────────────┐ │ Phase 4: VALIDATION │ │
│ │ Complete │ ◄──────── │ (You review the work) │ │
│ │ Application │ │ │ │
│ └─────────────┘ └─────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────┘
Before we start, make sure you have:
| Requirement | Description |
|---|---|
| A computer | Mac, Windows, or Linux |
| Internet connection | For downloading tools and AI access |
| 30-60 minutes | For the initial setup |
| Claude access | Either an API key OR a Claude Max/Pro subscription |
Cost considerations:
- Claude API: Pay per use (roughly $3-15 per application built, depending on size)
- Claude Max: $100/month subscription with generous usage included
If you're completely new to this, here are the terms you'll encounter:
The terminal is a text-based way to control your computer. Instead of clicking on icons, you type commands.
This is what the terminal looks like:
your-computer:~ username$ _
↑
You type here
Why use it? Many developer tools only work through the terminal. Don't worry - we'll show you exactly what to type.
- Git is a tool that tracks changes to your code (like "undo history" for an entire project)
- GitHub is a website that stores your code online (like Google Drive for code)
Why do we need this? The Agent Harness uses Git to save its work and let you see what changed.
Node.js is a program that runs JavaScript code on your computer. The Agent Harness is written in JavaScript/TypeScript, so you need Node.js to run it.
An API key is like a password that lets programs access online services. You'll need one to let the Agent Harness talk to Claude (the AI).
Claude is an AI assistant made by Anthropic. It's the "brain" that writes the code. There are different ways to access Claude:
- Claude API - Direct access, pay per use
- Claude Code - A command-line tool that uses Claude
- Claude Max/Pro - Subscription plans with Claude Code included
Choose your operating system and follow the matching steps. On Mac, start by opening the terminal:
- Press Command (⌘) + Space on your keyboard
- Type Terminal
- Press Enter
A window should appear with a prompt like:
your-name@your-mac ~ %
Tip: You can also find Terminal in Applications → Utilities → Terminal
Homebrew is a tool that makes installing other software easy on Mac.
- Copy and paste this entire command into your terminal:
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
- Press Enter
- You may be asked to enter your Mac password (the one you use to log in)
- Note: When you type your password, nothing will appear on screen - this is normal for security
- Just type it and press Enter
- Wait for it to finish (this can take 5-10 minutes)
How to verify it worked:
brew --version
You should see something like Homebrew 4.x.x
Node.js lets your computer run the Agent Harness.
- Type this command and press Enter:
brew install node
- Wait for it to finish
How to verify it worked:
node --version
You should see something like v20.x.x or v22.x.x
Also check npm (Node's package manager):
npm --version
You should see something like 10.x.x
Git tracks changes to your code.
- Type this command and press Enter:
brew install git
How to verify it worked:
git --version
You should see something like git version 2.x.x
Tell Git who you are (this is attached to the code changes you make):
git config --global user.name "Your Name"
git config --global user.email "your.email@example.com"
Replace Your Name and your.email@example.com with your actual name and email.
The GitHub CLI lets you interact with GitHub from the terminal.
- Type this command and press Enter:
brew install gh
How to verify it worked:
gh --version
You should see something like gh version 2.x.x
- If you don't have a GitHub account, go to github.com and create one (it's free)
- In your terminal, type:
gh auth login
- You'll see a series of questions. Choose these options:
  - What account do you want to log into? → GitHub.com
  - What is your preferred protocol? → HTTPS
  - Authenticate Git with your GitHub credentials? → Yes
  - How would you like to authenticate? → Login with a web browser
- A code will appear. Press Enter, and a browser window will open
- Paste the code in the browser and authorize the application
How to verify it worked:
gh auth status
You should see "Logged in to github.com"
Claude Code is the command-line tool that lets Claude write code for you.
npm install -g @anthropic-ai/claude-code
How to verify it worked:
claude --version
You should see version information.
Now continue to Getting Claude Access
Windows has a built-in terminal, but Windows Terminal is better.
- Open the Microsoft Store (search for it in the Start menu)
- Search for Windows Terminal
- Click Install
- Once installed, open Windows Terminal from the Start menu
- Open your web browser and go to: https://nodejs.org
- Click the big green button that says "LTS" (Long Term Support)
- A file will download (something like node-v20.x.x-x64.msi)
- Double-click the downloaded file to run the installer
- Click Next through all the screens, accepting the defaults
- Click Install
- Click Finish when done
How to verify it worked:
- Open Windows Terminal
- Type:
node --version
You should see something like v20.x.x
Also check npm:
npm --version
- Go to: https://git-scm.com/download/windows
- The download should start automatically
- Run the installer
- Important settings during installation:
- Choosing the default editor: Select "Use Visual Studio Code" or "Use Notepad++" if you have them, otherwise keep the default
- Adjusting your PATH: Select "Git from the command line and also from 3rd-party software"
- For all other options, the defaults are fine - just click Next
- Click Install, then Finish
How to verify it worked:
Close and reopen Windows Terminal, then type:
git --version
Tell Git who you are:
git config --global user.name "Your Name"
git config --global user.email "your.email@example.com"
- Go to: https://cli.github.com
- Click Download for Windows
- Run the installer
- Follow the prompts and click Install
How to verify it worked:
Close and reopen Windows Terminal, then type:
gh --version
- Create a GitHub account at github.com if you don't have one
- In Windows Terminal, type:
gh auth login
- Follow the prompts (same as Mac instructions above)
Install Claude Code:
npm install -g @anthropic-ai/claude-code
How to verify it worked:
claude --version
Now continue to Getting Claude Access
These instructions are for Ubuntu/Debian-based distributions. Commands may vary for other distributions.
Press Ctrl + Alt + T or find Terminal in your applications menu.
Update your system packages:
sudo apt update && sudo apt upgrade -y
Enter your password when prompted.
Install Node.js:
curl -fsSL https://deb.nodesource.com/setup_20.x | sudo -E bash -
sudo apt install -y nodejs
Verify:
node --version
npm --version
Install Git:
sudo apt install -y git
Verify:
git --version
Tell Git who you are:
git config --global user.name "Your Name"
git config --global user.email "your.email@example.com"
Install the GitHub CLI:
curl -fsSL https://cli.github.com/packages/githubcli-archive-keyring.gpg | sudo dd of=/usr/share/keyrings/githubcli-archive-keyring.gpg
echo "deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/githubcli-archive-keyring.gpg] https://cli.github.com/packages stable main" | sudo tee /etc/apt/sources.list.d/github-cli.list > /dev/null
sudo apt update
sudo apt install gh -y
Verify:
gh --version
Log in to GitHub:
gh auth login
Follow the prompts.
Install Claude Code:
npm install -g @anthropic-ai/claude-code
Verify:
claude --version
Now continue to Getting Claude Access
You need to connect Claude (the AI) to the Agent Harness. There are two ways to do this:
If you have a Claude Max ($100/month) or Claude Pro subscription, Claude Code is included.
- In your terminal, type:
claude
- Claude Code will open and ask you to authenticate
- Follow the prompts to log in with your Anthropic account
- Once logged in, type /exit to close Claude Code
That's it! Claude Code will remember your login.
If you want to use the API directly (pay-per-use):
- Go to https://console.anthropic.com
- Click Sign Up and create an account
- Verify your email
- In the Anthropic Console, go to Settings → Billing
- Add a credit card
- Add some credits (start with $10-20 for testing)
- Go to Settings → API Keys
- Click Create Key
- Give it a name like "agent-harness"
- Copy the key immediately - you won't be able to see it again!
The key looks like: sk-ant-api03-xxxxxxxxxxxx...
You need to make this key available to the Agent Harness.
On Mac/Linux:
echo 'export ANTHROPIC_API_KEY=sk-ant-api03-your-key-here' >> ~/.zshrc
source ~/.zshrc
If your shell is bash (the default on most Linux distributions), use ~/.bashrc in place of ~/.zshrc in both commands.
On Windows (PowerShell):
[Environment]::SetEnvironmentVariable("ANTHROPIC_API_KEY", "sk-ant-api03-your-key-here", "User")
Then close and reopen your terminal.
IMPORTANT: Replace sk-ant-api03-your-key-here with your actual API key.
Security note: Never share your API key or commit it to GitHub.
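Before starting a run in API mode, it can help to confirm that Node.js can actually see the variable. The sketch below is illustrative only and is not part of the Agent Harness: `describeApiKey` is a hypothetical helper, and the `sk-ant-` prefix check is an assumption based on the key format shown above. It masks the key so the secret never ends up in your terminal history or logs.

```javascript
// Hypothetical helper (not part of Agent Harness): reports whether an API key
// is present in the given environment object without printing the secret.
function describeApiKey(env) {
  const key = env.ANTHROPIC_API_KEY;
  if (!key) {
    return "ANTHROPIC_API_KEY is not set";
  }
  // Assumption: keys start with "sk-ant-", as in the example format above.
  if (!key.startsWith("sk-ant-")) {
    return "ANTHROPIC_API_KEY is set but does not look like an Anthropic key";
  }
  // Show only the prefix and the last four characters.
  return `ANTHROPIC_API_KEY is set (sk-ant-...${key.slice(-4)})`;
}

console.log(describeApiKey(process.env));
```

If this prints "not set" right after you added the key, close and reopen the terminal so the new value is loaded.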
Now let's get the Agent Harness itself!
This downloads the Agent Harness code from GitHub to your computer.
git clone https://github.com/tripleyak/agent-harness.git
cd agent-harness
Your terminal prompt should now show you're in the agent-harness folder.
This downloads all the code libraries that Agent Harness needs:
npm install
Wait for it to finish. You'll see lots of text scrolling by - this is normal.
This compiles the code so it can run:
npm run build
How to verify it worked:
npm start -- --help
You should see a help message with all the available commands.
An app specification (or "spec") is a text file that describes what you want to build.
You can use any text editor. Here's how to do it from the terminal:
On Mac/Linux:
nano prompts/my_first_app.txt
On Windows:
notepad prompts/my_first_app.txt
Here's an example of a simple app spec:
# Simple Todo App
A command-line todo list application.
## Requirements
- Users can add new tasks with a description
- Users can list all tasks
- Users can mark tasks as complete
- Users can delete tasks
- Tasks are saved to a file so they persist between runs
- Use Node.js with no external dependencies
## Example Usage
$ node todo.js add "Buy groceries"
Added: Buy groceries
$ node todo.js list
1. [ ] Buy groceries
$ node todo.js complete 1
Completed: Buy groceries
$ node todo.js list
1. [x] Buy groceries
To save the file:
- nano: Press Ctrl + X, then Y, then Enter
- notepad: Press Ctrl + S and close the window
- Be specific - Instead of "build a good app", say exactly what features you want
- Include examples - Show how you expect the app to work
- Mention technologies - If you want Node.js, Python, etc., say so
- List constraints - "No external dependencies" or "Must work offline"
- Start small - Your first project should be simple (5-10 features)
Minimal (2 features):
# Hello Greeter
A simple Node.js module that greets users.
## Requirements
- Single file: greeter.js
- One function: greet(name) returns "Hello, {name}!"
- Handle empty name: return "Hello, stranger!"
- No external dependencies
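For a sense of scale, a spec this small could plausibly be satisfied by a single short file. The following is a hand-written sketch of what `greeter.js` might contain, not actual agent output. Treating a missing or whitespace-only name as empty is an assumption that the spec does not state.

```javascript
// greeter.js - hand-written sketch of code that would satisfy the spec above.
function greet(name) {
  // Assumption: undefined, null, and whitespace-only names count as "empty".
  const trimmed = typeof name === "string" ? name.trim() : "";
  if (trimmed === "") {
    return "Hello, stranger!";
  }
  return `Hello, ${trimmed}!`;
}

module.exports = { greet };

console.log(greet("Ada")); // Hello, Ada!
console.log(greet(""));    // Hello, stranger!
```

Spelling out edge cases like the empty name in the spec is what lets the agent turn each requirement into a separate, checkable feature.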
Medium (10 features):
# Password Generator CLI
A command-line password generator tool.
## Requirements
- Generate random passwords of specified length
- Option to include/exclude: uppercase, lowercase, numbers, symbols
- Option to generate multiple passwords at once
- Copy password to clipboard
- Save passwords to encrypted file
- Check password strength
- No external dependencies except for clipboard functionality
Navigate to the agent-harness folder and run:
npm start -- run -s prompts/my_first_app.txt --subprocess
Let's break this down:
- `npm start` - Starts the Agent Harness
- `--` - Separates npm arguments from program arguments
- `run` - The command to run the agent
- `-s prompts/my_first_app.txt` - Path to your app specification
- `--subprocess` - Uses Claude Code for authentication (recommended)
- Project folder created - A new folder appears (default: agent_project/)
- Vibe plan phase - The AI thinks through the project
- Initialization - Features are identified and listed
- Coding begins - Features are built one by one
- Completion - All features are implemented
The terminal will show you what's happening:
🤖 Agent Harness - Autonomous Coding Agent
Project: /path/to/agent_project
Mode: Claude Code subprocess (Max plan)
Model: claude-opus-4-5-20251101
Features: 50
📁 New project detected. Running initialization...
🎨 Phase 1: Vibe Plan (Exploration)
Exploring project ideas and architecture...
🏗 Phase 2: Structured Planning & Setup
📊 Progress:
Total features: 5
Completed: 2 (40%)
In progress: 1
[████████████░░░░░░░░░░░░░░░░░░] 40%
- To stop: Press Ctrl + C
- To resume: Run the same command again - it picks up where it left off
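Resuming works because the project folder itself records how far the run got. The sketch below shows one way that decision could be made. It is a guess at the idea, not the harness's actual logic: `nextAction` is a hypothetical function, and it assumes that a missing `feature_list.json` means a new project and that a feature counts as finished once its status is `passing`.

```javascript
// Hypothetical sketch: decide what a run should do next from the files already
// present in the project folder. The harness's real logic may differ.
function nextAction(fileNames, features) {
  // No feature list yet: treat the folder as a brand-new project.
  if (!fileNames.includes("feature_list.json")) {
    return "initialize";
  }
  // Any feature that is not "passing" still needs a coding session.
  const unfinished = features.filter((f) => f.status !== "passing");
  return unfinished.length > 0 ? "resume-coding" : "done";
}

console.log(nextAction([], []));
// initialize
console.log(nextAction(["feature_list.json"], [{ status: "passing" }, { status: "pending" }]));
// resume-coding
```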
For a full-featured run with human review:
npm start -- run \
-s prompts/my_first_app.txt \
--project ./my_app \
--subprocess \
--enable-rag \
--enable-validation \
--review-every 3 \
--enable-tests
Phase 1: Vibe Plan
What it does: The AI freely explores your idea, thinking about:
- Architecture approaches
- Potential challenges
- Edge cases
- Technology choices
Output: vibe-plan-notes.md - A document with the AI's thoughts
Skip it with: --skip-vibe-plan
Phase 2: Initialization
What it does:
- Reads your app specification
- Creates a detailed feature list
- Sets up the project folder structure
- Initializes git repository
Output: feature_list.json - List of all features to build
Phase 3: Coding
What it does:
- Works through features one by one
- Writes code for each feature
- Commits changes to git
- Tracks progress in feature_list.json
Each session:
- Looks at what's done and what's next
- Picks the next feature
- Writes the code
- Marks it complete
- Repeats until done
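The "pick the next feature" step can be sketched as a small function over the feature list. This is an illustration under stated assumptions, not the harness's code: it assumes `feature_list.json` holds an array of objects with `id` and `status` fields, and the priority order (interrupted work first, then failures, then new features) is a design choice made for this example.

```javascript
// Hypothetical sketch of choosing the next feature to work on.
// Assumption: feature_list.json holds an array of objects with "id" and "status".
function pickNextFeature(features) {
  // Finish interrupted work first, then retry failures, then start new features.
  const order = ["in_progress", "failing", "pending"];
  for (const status of order) {
    const match = features.find((f) => f.status === status);
    if (match) return match;
  }
  return null; // everything is "passing"
}

const features = [
  { id: 1, status: "passing" },
  { id: 2, status: "pending" },
  { id: 3, status: "in_progress" },
];
console.log(pickNextFeature(features).id); // 3
```

Because the choice depends only on the file contents, a fresh session can make the same decision without remembering anything from the previous one.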
Phase 4: Validation
What it does: Pauses for you to review the AI's work
When it triggers:
- After every N features (configurable)
- Automatically for security-related features
Your options:
- Approve - Continue to next feature
- Reject - Stop for manual fixes
- Skip - Continue without approval
- Feedback - Tell the AI what to change
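The two triggers described above can be sketched as a single check. Everything named here is hypothetical: how the harness decides that a feature is security-related is not documented in this guide, so the keyword list is only an assumption for illustration.

```javascript
// Hypothetical sketch of the two checkpoint triggers described above.
// The keyword list is an assumption; the harness may classify features differently.
const SECURITY_KEYWORDS = ["auth", "password", "encrypt", "token", "permission"];

function isSecurityRelated(description) {
  const text = description.toLowerCase();
  return SECURITY_KEYWORDS.some((word) => text.includes(word));
}

function shouldPauseForReview(completedCount, reviewEvery, description) {
  // Trigger 1: every N completed features (N comes from --review-every).
  const onSchedule =
    reviewEvery > 0 && completedCount > 0 && completedCount % reviewEvery === 0;
  // Trigger 2: security-related features always get a review.
  return onSchedule || isSecurityRelated(description);
}

console.log(shouldPauseForReview(3, 3, "List all tasks"));                   // true
console.log(shouldPauseForReview(4, 3, "List all tasks"));                   // false
console.log(shouldPauseForReview(4, 3, "Save passwords to encrypted file")); // true
```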
| Option | Description | Example |
|---|---|---|
| `-s, --spec <path>` | Path to your app specification file | `-s prompts/my_app.txt` |

| Option | Description | Default | Example |
|---|---|---|---|
| `-p, --project <dir>` | Where to create the project | `./agent_project` | `-p ./my_app` |
| `-m, --model <model>` | Which AI model to use | `claude-opus-4-5-20251101` | `-m claude-sonnet-4-5-20250929` |
| `-f, --features <n>` | Maximum features to generate | 50 | `-f 20` |
| `-i, --iterations <n>` | Maximum coding sessions (0=unlimited) | 0 | `-i 10` |
| `--subprocess` | Use Claude Code for auth | Off | `--subprocess` |

| Option | Description | Default |
|---|---|---|
| `--skip-vibe-plan` | Skip the exploration phase | Off |
| `--no-auto-resume` | Don't automatically continue | Auto-resume on |
| `-d, --delay <ms>` | Wait time between sessions | 3000 (3 seconds) |

| Option | Description |
|---|---|
| `--enable-rag` | Gather context from existing code |
| `--enable-validation` | Enable human review checkpoints |
| `--review-every <n>` | Review after every N features |
| `--enable-tests` | Run tests before each session |
| `--test-after-each` | Also run tests after each feature |
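If you launch runs from a script instead of typing them by hand, the options can be assembled from a plain object. The helper below is a hypothetical convenience, not something the Agent Harness ships. It uses only flags documented in the tables above, and the property names on the options object are made up for this example.

```javascript
// Hypothetical helper that turns an options object into the argument list for
// "npm start -- run ...". Only flags documented in the tables above are used.
function buildRunArgs(options) {
  const args = ["start", "--", "run", "-s", options.spec];
  if (options.project) args.push("--project", options.project);
  if (options.subprocess) args.push("--subprocess");
  if (options.features) args.push("--features", String(options.features));
  if (options.enableValidation) args.push("--enable-validation");
  if (options.reviewEvery) args.push("--review-every", String(options.reviewEvery));
  return args;
}

const args = buildRunArgs({
  spec: "prompts/app.txt",
  project: "./my_project",
  subprocess: true,
  features: 30,
});
console.log(["npm", ...args].join(" "));
// npm start -- run -s prompts/app.txt --project ./my_project --subprocess --features 30
```

Keeping the arguments as an array, not one long string, avoids quoting problems when a path contains spaces.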
Minimal run:
npm start -- run -s prompts/app.txt --subprocess
Full featured:
npm start -- run \
-s prompts/app.txt \
--project ./my_project \
--subprocess \
--features 30 \
--enable-rag \
--enable-validation \
--review-every 5 \
--enable-tests \
--test-after-each
Quick test (small project):
npm start -- run \
-s prompts/minimal_spec.txt \
--project ./test_project \
--subprocess \
--features 5 \
--iterations 3 \
--skip-vibe-plan
Check project status:
npm start -- status -p ./my_project
After running, you'll find these files in your project folder:
my_project/
├── feature_list.json # List of features and their status
├── app_spec.txt # Copy of your specification
├── vibe-plan-notes.md # AI's exploration thoughts
├── claude-progress.txt # Session log
├── human-feedback.md # Your feedback (if any)
├── validation-log.json # Review decisions
├── .git/ # Git repository
└── [your app files] # The actual code!
In feature_list.json, each feature has a status:
| Status | Meaning |
|---|---|
| `pending` | Not started yet |
| `in_progress` | Currently being worked on |
| `passing` | Completed successfully |
| `failing` | Attempted but has issues |
Every feature creates a git commit. You can see the history:
cd my_project
git log --oneline
To see what changed in a commit:
git show <commit-id>
📊 Progress:
Total features: 10
Completed: 4 (40%)
In progress: 1
Failed: 0
Pending: 5
[████████████░░░░░░░░░░░░░░░░░░] 40%
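A summary like the one above can be derived from the status values alone. The sketch below is an illustration, not the harness's reporting code: it assumes each feature object carries a `status` field, and it assumes a 30-character bar to match the sample output.

```javascript
// Hypothetical sketch: summarize feature statuses and draw a 30-character bar
// similar to the sample output above.
function summarize(features) {
  const count = (status) => features.filter((f) => f.status === status).length;
  const total = features.length;
  const completed = count("passing");
  const percent = total === 0 ? 0 : Math.round((completed / total) * 100);
  return {
    total,
    completed,
    inProgress: count("in_progress"),
    failed: count("failing"),
    pending: count("pending"),
    percent,
  };
}

function renderBar(percent, width = 30) {
  const filled = Math.round((percent / 100) * width);
  return `[${"█".repeat(filled)}${"░".repeat(width - filled)}] ${percent}%`;
}

const statuses = [
  "passing", "passing", "passing", "passing", "in_progress",
  "pending", "pending", "pending", "pending", "pending",
];
const summary = summarize(statuses.map((status) => ({ status })));
console.log(renderBar(summary.percent));
// [████████████░░░░░░░░░░░░░░░░░░] 40%
```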
Problem: Claude Code isn't installed properly.
Solution:
npm install -g @anthropic-ai/claude-code
If that doesn't work, try:
sudo npm install -g @anthropic-ai/claude-code
Problem: The API key isn't configured.
Solutions:
- If using Claude Max: Make sure you ran claude and logged in first
- If using API key: Set it in your terminal:
export ANTHROPIC_API_KEY=your-key-here
Problem: You don't have permission to install or run something.
Solution: On Mac/Linux, add sudo before the command:
sudo npm install -g @anthropic-ai/claude-code
Problem: Dependencies aren't installed.
Solution:
cd agent-harness
npm install
npm run build
Problem: You might be hitting rate limits or the AI is confused.
Solutions:
- Wait a few minutes and run again (it will resume)
- Check your API credits if using pay-per-use
- Try a simpler specification
Problem: The AI made mistakes.
Solutions:
- Enable validation: --enable-validation --review-every 1
- Provide more specific requirements in your spec
- Check the generated code and provide feedback
- Start with a simpler project
If you want to completely restart:
rm -rf ./agent_project
This permanently deletes the project folder, including its git history. Then run the command again.
| Term | Definition |
|---|---|
| Agent | An AI system that can take actions autonomously |
| API | Application Programming Interface - how programs talk to each other |
| API Key | A secret code that authenticates you to a service |
| CLI | Command Line Interface - text-based program interaction |
| Clone | Download a copy of a Git repository |
| Commit | A saved snapshot of code changes in Git |
| Dependencies | Other code libraries your project needs |
| Git | Version control system that tracks code changes |
| GitHub | Website that hosts Git repositories |
| Homebrew | Package manager for Mac |
| JSON | A text format for storing structured data |
| Node.js | Program that runs JavaScript on your computer |
| npm | Node Package Manager - installs JavaScript libraries |
| OAuth | A way to log in without sharing your password |
| RAG | Retrieval-Augmented Generation - giving AI relevant context |
| Repository (Repo) | A folder tracked by Git |
| Terminal | Text-based interface to control your computer |
| Token | A unit of text that AI models process |
| TypeScript | JavaScript with added type checking |
┌─────────────────────────────────────────────────────────────┐
│ Agent Harness │
├─────────────────────────────────────────────────────────────┤
│ Orchestrator (index.ts) │
│ ├── Vibe Plan Session → Generate exploration notes │
│ ├── Initializer Session → Generate feature_list.json │
│ └── Coder Sessions → Implement features incrementally │
├─────────────────────────────────────────────────────────────┤
│ Agent Modes │
│ ├── API Mode (agent.ts) → Direct Anthropic API calls │
│ └── Subprocess Mode (agent-subprocess.ts) → Claude Code │
├─────────────────────────────────────────────────────────────┤
│ Context Engineering │
│ ├── RAG (rag.ts) → Retrieve relevant code context │
│ ├── Vibe Plan (vibe-plan.ts) → Unstructured exploration │
│ └── Human Feedback → Incorporate review notes │
├─────────────────────────────────────────────────────────────┤
│ Validation │
│ ├── Human Checkpoints (validation.ts) │
│ └── Test Runner (test-runner.ts) │
├─────────────────────────────────────────────────────────────┤
│ Security (security.ts) │
│ ├── Command allowlist │
│ ├── Path boundary validation │
│ └── Dangerous pattern detection │
├─────────────────────────────────────────────────────────────┤
│ State Management │
│ └── feature_list.json (source of truth) │
└─────────────────────────────────────────────────────────────┘
- Initializer Agent (Session 1)
  - Reads application specification
  - Generates comprehensive feature list
  - Sets up project structure
  - Initializes git repository
- Coder Agent (Sessions 2+)
  - Reads progress from feature_list.json
  - Implements features incrementally
  - Tests and commits changes
  - Auto-continues between sessions
Defense-in-depth approach:
- Command Allowlist - Only specific commands can execute
- Path Boundary - All file operations restricted to project directory
- Dangerous Pattern Detection - Blocks fork bombs, pipe to shell, etc.
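The three layers can be sketched in a few lines. This is an illustration of the general technique only: the allowlist entries and patterns below are assumptions and do not reflect the actual contents of `security.ts`. It also does not resolve symlinks, which a stricter path check would need to do.

```javascript
const path = require("node:path");

// Hypothetical sketch of the three checks listed above. The allowlist entries
// and patterns are illustrative assumptions, not the contents of security.ts.
const ALLOWED_COMMANDS = new Set(["node", "npm", "git", "ls", "cat"]);
const DANGEROUS_PATTERNS = [
  /\|\s*(sh|bash)\b/,    // piping output into a shell
  /:\(\)\s*\{.*\};\s*:/, // classic bash fork bomb
];

function isCommandAllowed(commandLine) {
  // Layer 1: the executable must be on the allowlist.
  const executable = commandLine.trim().split(/\s+/)[0];
  if (!ALLOWED_COMMANDS.has(executable)) return false;
  // Layer 3: even allowed commands are rejected if they match a known-bad pattern.
  return !DANGEROUS_PATTERNS.some((pattern) => pattern.test(commandLine));
}

function isInsideProject(projectDir, targetPath) {
  // Layer 2: resolve the target and make sure it does not climb out of the root.
  const root = path.resolve(projectDir);
  const target = path.resolve(root, targetPath);
  const relative = path.relative(root, target);
  const climbsOut = relative === ".." || relative.startsWith(".." + path.sep);
  return !climbsOut && !path.isAbsolute(relative);
}

console.log(isCommandAllowed("npm install"));                 // true
console.log(isCommandAllowed("cat script.sh | bash"));        // false
console.log(isInsideProject("/tmp/app", "src/index.js"));     // true
console.log(isInsideProject("/tmp/app", "../secrets.txt"));   // false
```

Layering the checks means a command has to pass all of them: an allowed executable is still blocked if its arguments match a dangerous pattern or point outside the project.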
All state is externalized to files for multi-session continuity:
- feature_list.json - Source of truth for features and progress
- claude-progress.txt - Human-readable session log
- vibe-plan-notes.md - Exploration phase output
- human-feedback.md - Feedback for next session
- validation-log.json - Audit trail of reviews
- Git commits - Audit trail and rollback capability
- GitHub Issues: https://github.com/tripleyak/agent-harness/issues
- Claude Code Help: Run claude and type /help
- Anthropic Documentation: https://docs.anthropic.com
License: MIT
This project was inspired by Anthropic's autonomous-coding quickstart.