Use your existing AI subscriptions with Claude Code. Works with Anthropic Max, Gemini Advanced, ChatGPT Plus/Codex, Kimi, GLM, OllamaCloud — plus 580+ models via OpenRouter and local models for complete privacy.
Claudish (Claude-ish) is a CLI tool that allows you to run Claude Code with any AI model by proxying requests through a local Anthropic API-compatible server.
Supported Providers:
- Cloud: OpenRouter (580+ models), Google Gemini, OpenAI, MiniMax, Kimi, GLM, Z.AI, OllamaCloud, OpenCode Zen
- Local: Ollama, LM Studio, vLLM, MLX
- Enterprise: Vertex AI (Google Cloud)
Stop paying for multiple AI subscriptions. Claudish lets you use subscriptions you already have with Claude Code's powerful interface:
| Your Subscription | Command |
|---|---|
| Anthropic Max | Native support (just use `claude`) |
| Gemini Advanced | `claudish --model g@gemini-3-pro-preview` |
| ChatGPT Plus/Codex | `claudish --model oai@gpt-5.3` or `oai@gpt-5.3-codex` |
| Kimi | `claudish --model kimi@kimi-k2.5` |
| GLM | `claudish --model glm@GLM-4.7` |
| MiniMax | `claudish --model mm@minimax-m2.1` |
| OllamaCloud | `claudish --model oc@qwen3-next` |
100% Offline Option — Your code never leaves your machine:
```bash
claudish --model ollama@qwen3-coder:latest "your task"
```

Claudish is a BYOK (bring-your-own-key) AI coding assistant:
- ✅ Use API keys you already have
- ✅ No additional subscription fees
- ✅ Full cost control — pay only for what you use
- ✅ Works with any provider
- ✅ Switch models mid-session
- ✅ Multi-provider support - OpenRouter, Gemini, Vertex AI, OpenAI, OllamaCloud, and local models
- ✅ New routing syntax - Use `provider@model[:concurrency]` for explicit routing (e.g., `google@gemini-2.0-flash`)
- ✅ Native auto-detection - Models like `gpt-4o`, `gemini-2.0-flash`, `llama-3.1-70b` route to their native APIs automatically
- ✅ Direct API access - Google, OpenAI, MiniMax, Kimi, GLM, Z.AI, OllamaCloud, Poe with direct billing
- ✅ Vertex AI Model Garden - Access Google + partner models (MiniMax, Mistral, DeepSeek, Qwen, OpenAI OSS)
- ✅ Local model support - Ollama, LM Studio, vLLM, MLX with `ollama@`, `lmstudio@` syntax and concurrency control
- ✅ Cross-platform - Works with both Node.js and Bun (v1.3.0+)
- ✅ Universal compatibility - Use with `npx` or `bunx` - no installation required
- ✅ Interactive setup - Prompts for API key and model if not provided (zero config!)
- ✅ Monitor mode - Proxy to the real Anthropic API and log all traffic (for debugging)
- ✅ Protocol compliance - 1:1 compatibility with the Claude Code communication protocol
- ✅ Headless mode - Automatic print mode for non-interactive execution
- ✅ Quiet mode - Clean output by default (no log pollution)
- ✅ JSON output - Structured data for tool integration
- ✅ Real-time streaming - See Claude Code output as it happens
- ✅ Parallel runs - Each instance gets an isolated proxy
- ✅ Autonomous mode - Bypass all prompts with flags
- ✅ Context inheritance - Runs in the current directory with the same `.claude` settings
- ✅ Agent support - Use Claude Code agents in headless mode with `--agent`
```bash
# Shell script (Linux/macOS)
curl -fsSL https://raw.githubusercontent.com/MadAppGang/claudish/main/install.sh | bash

# Homebrew (macOS)
brew tap MadAppGang/claudish && brew install claudish

# npm
npm install -g claudish

# Bun
bun install -g claudish
```

Prerequisites:
- Claude Code - the Claude CLI must be installed
- At least one API key:
  - OpenRouter API Key - Access 100+ models (free tier available)
  - Google Gemini API Key - For direct Gemini access
  - OpenAI API Key - For direct OpenAI access
  - OllamaCloud API Key - For cloud-hosted Ollama models (`oc@` prefix)
  - Or local models (Ollama, LM Studio) - No API key needed
Use without installing:

```bash
npx claudish@latest --model x-ai/grok-code-fast-1 "your prompt"
bunx claudish@latest --model x-ai/grok-code-fast-1 "your prompt"
```

Install from source:

```bash
git clone https://github.com/MadAppGang/claudish.git
cd claudish
bun install && bun run build && bun link
```

```bash
# Navigate to your project directory
cd /path/to/your/project

# Install Claudish skill for automatic best practices
claudish --init

# Reload Claude Code to discover the skill
```

What this does:
- ✅ Installs the Claudish usage skill in `.claude/skills/claudish-usage/`
- ✅ Enables automatic sub-agent delegation
- ✅ Enforces file-based instruction patterns
- ✅ Prevents context window pollution
After running `--init`, Claude will automatically:
- Use sub-agents when you mention external models (Grok, GPT-5, etc.)
- Follow best practices for Claudish usage
- Suggest specialized agents for different tasks
```bash
# Just run it - will prompt for API key and model
claudish

# Enter your OpenRouter API key when prompted
# Select a model from the list
# Start coding!
```

```bash
# Set up environment
export OPENROUTER_API_KEY=sk-or-v1-...              # For OpenRouter models
export GEMINI_API_KEY=...                           # For direct Google API
export OPENAI_API_KEY=sk-...                        # For direct OpenAI API
export ANTHROPIC_API_KEY=sk-ant-api03-placeholder   # Required placeholder

# Run with auto-detected model
claudish --model gpt-4o "implement user authentication"   # → OpenAI
claudish --model gemini-2.0-flash "add tests"             # → Google

# Or with explicit provider
claudish --model openrouter@anthropic/claude-3.5-sonnet "review code"
```

Note: In interactive mode, if `OPENROUTER_API_KEY` is not set, you'll be prompted to enter it. This makes first-time usage super simple!
For AI agents running within Claude Code: Use the dedicated AI agent guide for comprehensive instructions on file-based patterns and sub-agent delegation.
```bash
# Print complete AI agent usage guide
claudish --help-ai

# Save guide to file for reference
claudish --help-ai > claudish-agent-guide.md
```

Quick Reference for AI Agents:

1. Get available models:

   ```bash
   # List all models or search
   claudish --models
   claudish --models gemini

   # Get top recommended models (JSON)
   claudish --top-models --json
   ```
2. Run Claudish through a sub-agent (recommended pattern):

   ```javascript
   // Don't run Claudish directly in the main conversation;
   // use the Task tool to delegate to a sub-agent.
   const result = await Task({
     subagent_type: "general-purpose",
     description: "Implement feature with Grok",
     prompt: `
       Use Claudish to implement the feature with the Grok model.

       STEPS:
       1. Create instruction file: /tmp/claudish-task-${Date.now()}.md
       2. Write feature requirements to the file
       3. Run: claudish --model x-ai/grok-code-fast-1 --stdin < /tmp/claudish-task-*.md
       4. Read the result and return ONLY a summary (2-3 sentences)

       DO NOT return the full implementation. Keep the response under 300 tokens.
     `
   });
   ```
3. File-based instruction pattern (avoids context pollution):

   ```javascript
   // Write instructions to a file
   const instructionFile = `/tmp/claudish-task-${Date.now()}.md`;
   const resultFile = `/tmp/claudish-result-${Date.now()}.md`;

   await Write({
     file_path: instructionFile,
     content: `
   # Task
   Your task description here

   # Output
   Write results to: ${resultFile}
     `
   });

   // Run Claudish with stdin
   await Bash(`claudish --model x-ai/grok-code-fast-1 --stdin < ${instructionFile}`);

   // Read the result
   const result = await Read({ file_path: resultFile });

   // Return a summary only
   return extractSummary(result);
   ```
Key Principles:
- ✅ Use file-based patterns to avoid context window pollution
- ✅ Delegate to sub-agents instead of running directly
- ✅ Return summaries only (not full conversation transcripts)
- ✅ Choose the appropriate model for the task (see `--models` or `--top-models`)
Resources:
- Full AI agent guide: `claudish --help-ai`
- Skill document: `skills/claudish-usage/SKILL.md` (in repository root)
- Model integration: `skills/claudish-integration/SKILL.md` (in repository root)
```bash
claudish [OPTIONS] <claude-args...>
```

| Flag | Description | Default |
|---|---|---|
| `-i, --interactive` | Run in interactive mode (persistent session) | Single-shot mode |
| `-m, --model <model>` | OpenRouter model to use | `x-ai/grok-code-fast-1` |
| `-p, --port <port>` | Proxy server port | Random (3000-9000) |
| `-q, --quiet` | Suppress `[claudish]` log messages | Quiet in single-shot |
| `-v, --verbose` | Show `[claudish]` log messages | Verbose in interactive |
| `--json` | Output in JSON format (implies `--quiet`) | `false` |
| `-d, --debug` | Enable debug logging to file | `false` |
| `--no-auto-approve` | Disable auto-approve (require prompts) | Auto-approve enabled |
| `--dangerous` | Pass `--dangerouslyDisableSandbox` | `false` |
| `--agent <agent>` | Use a specific agent (e.g., `frontend:developer`) | - |
| `--models` | List all models or search (e.g., `--models gemini`) | - |
| `--top-models` | Show top recommended programming models | - |
| `--list-agents` | List available agents in the current project | - |
| `--force-update` | Force refresh of the model cache | - |
| `--init` | Install the Claudish skill in the current project | - |
| `--help-ai` | Show the AI agent usage guide | - |
| `-h, --help` | Show help message | - |
| Variable | Description | Used For |
|---|---|---|
| `OPENROUTER_API_KEY` | OpenRouter API key | Default backend (100+ models) |
| `GEMINI_API_KEY` | Google Gemini API key | Direct Gemini access (`g@` prefix) |
| `VERTEX_API_KEY` | Vertex AI Express API key | Vertex AI Express mode (`v@` prefix) |
| `VERTEX_PROJECT` | GCP Project ID | Vertex AI OAuth mode (`v@` prefix) |
| `VERTEX_LOCATION` | GCP Region (default: `us-central1`) | Vertex AI regional endpoint |
| `OPENAI_API_KEY` | OpenAI API key | Direct OpenAI access (`oai@` prefix) |
| `OLLAMA_API_KEY` | OllamaCloud API key | OllamaCloud access (`oc@` prefix) |
| `ANTHROPIC_API_KEY` | Placeholder (any value) | Prevents Claude Code dialog |
| Variable | Description | Default |
|---|---|---|
| `GEMINI_BASE_URL` | Custom Gemini endpoint | `https://generativelanguage.googleapis.com` |
| `OPENAI_BASE_URL` | Custom OpenAI/Azure endpoint | `https://api.openai.com` |
| `OLLAMA_BASE_URL` | Ollama server URL | `http://localhost:11434` |
| `OLLAMACLOUD_BASE_URL` | Custom OllamaCloud endpoint | `https://ollama.com` |
| `LMSTUDIO_BASE_URL` | LM Studio server URL | `http://localhost:1234` |
| `VLLM_BASE_URL` | vLLM server URL | `http://localhost:8000` |
| `MLX_BASE_URL` | MLX server URL | `http://127.0.0.1:8080` |
| Variable | Description | Default |
|---|---|---|
| `CLAUDISH_MODEL` | Default model to use | `openai/gpt-5.3` |
| `CLAUDISH_PORT` | Default proxy port | Random (3000-9000) |
| `CLAUDISH_CONTEXT_WINDOW` | Override context window size | Auto-detected |
Important Notes:
- You MUST set `ANTHROPIC_API_KEY=sk-ant-api03-placeholder` (or any value). Without it, Claude Code will show a dialog.
- In interactive mode, if no API key is set, you'll be prompted to enter one.
Claudish uses `provider@model[:concurrency]` syntax for explicit routing, plus smart auto-detection for native providers:
```bash
# Explicit provider routing
claudish --model google@gemini-2.0-flash "quick task"
claudish --model openrouter@deepseek/deepseek-r1 "analysis"
claudish --model oai@gpt-4o "implement feature"
claudish --model ollama@llama3.2:3 "code review"   # 3 concurrent requests
```

| Shortcut | Provider | API Key | Example |
|---|---|---|---|
| `g@`, `google@` | Google Gemini | `GEMINI_API_KEY` | `g@gemini-2.0-flash` |
| `oai@` | OpenAI Direct | `OPENAI_API_KEY` | `oai@gpt-4o` |
| `or@`, `openrouter@` | OpenRouter | `OPENROUTER_API_KEY` | `or@deepseek/deepseek-r1` |
| `mm@`, `mmax@` | MiniMax Direct | `MINIMAX_API_KEY` | `mm@MiniMax-M2.1` |
| `kimi@`, `moon@` | Kimi Direct | `MOONSHOT_API_KEY` | `kimi@kimi-k2` |
| `glm@`, `zhipu@` | GLM Direct | `ZHIPU_API_KEY` | `glm@glm-4` |
| `zai@` | Z.AI Direct | `ZAI_API_KEY` | `zai@glm-4` |
| `llama@`, `lc@`, `meta@` | OllamaCloud | `OLLAMA_API_KEY` | `llama@llama-3.1-70b` |
| `oc@` | OllamaCloud | `OLLAMA_API_KEY` | `oc@llama-3.1-70b` |
| `zen@` | OpenCode Zen | (free) | `zen@grok-code` |
| `v@`, `vertex@` | Vertex AI | `VERTEX_API_KEY` | `v@gemini-2.5-flash` |
| `go@` | Gemini CodeAssist | (OAuth) | `go@gemini-2.5-flash` |
| `poe@` | Poe | `POE_API_KEY` | `poe@GPT-4o` |
| `ollama@` | Ollama (local) | (none) | `ollama@llama3.2` |
| `lms@`, `lmstudio@` | LM Studio (local) | (none) | `lms@qwen2.5-coder` |
| `vllm@` | vLLM (local) | (none) | `vllm@mistral-7b` |
| `mlx@` | MLX (local) | (none) | `mlx@llama-3.2-3b` |
When no provider is specified, Claudish auto-detects from model name:
| Model Pattern | Routes To | Example |
|---|---|---|
| `gemini-*`, `google/*` | Google Gemini | `gemini-2.0-flash` |
| `gpt-*`, `o1-*`, `o3-*` | OpenAI Direct | `gpt-4o` |
| `llama-*`, `meta-llama/*` | OllamaCloud | `llama-3.1-70b` |
| `abab-*`, `minimax/*` | MiniMax Direct | `abab-6.5` |
| `kimi-*`, `moonshot-*` | Kimi Direct | `kimi-k2` |
| `glm-*`, `zhipu/*` | GLM Direct | `glm-4` |
| `poe:*` | Poe | `poe:GPT-4o` |
| `claude-*`, `anthropic/*` | Native Anthropic | `claude-sonnet-4` |
| Unknown `vendor/model` | Error | Use `openrouter@vendor/model` |
```bash
# Auto-detected native routing (no prefix needed!)
claudish --model gemini-2.0-flash "quick task"    # → Google API
claudish --model gpt-4o "implement feature"       # → OpenAI API
claudish --model llama-3.1-70b "code review"      # → OllamaCloud

# Explicit provider routing
claudish --model google@gemini-2.5-pro "complex analysis"
claudish --model oai@o1 "complex reasoning"
claudish --model openrouter@deepseek/deepseek-r1 "deep analysis"

# OllamaCloud - cloud-hosted Llama models
claudish --model llama@llama-3.1-70b "code review"
claudish --model oc@llama-3.2-vision "analyze image"

# Vertex AI - Google Cloud
VERTEX_API_KEY=... claudish --model v@gemini-2.5-flash "task"
VERTEX_PROJECT=my-project claudish --model vertex@gemini-2.5-flash "OAuth mode"

# Local models with concurrency control
claudish --model ollama@llama3.2:3 "review"   # 3 concurrent requests
claudish --model ollama@llama3.2:0 "fast"     # No limit (bypass queue)

# Unknown vendors require explicit OpenRouter
claudish --model openrouter@qwen/qwen-2.5 "task"
claudish --model or@mistralai/mistral-large "analysis"
```

The old `prefix/model` syntax still works but shows deprecation warnings:
```bash
# Old (deprecated)                 → New (recommended)
claudish --model g/gemini-pro      → claudish --model g@gemini-pro
claudish --model oai/gpt-4o        → claudish --model oai@gpt-4o
claudish --model ollama/llama3.2   → claudish --model ollama@llama3.2
```

Top recommended models for development (v3.1.1):
| Model | Provider | Best For |
|---|---|---|
| `openai/gpt-5.3` | OpenAI | Default - most advanced reasoning |
| `minimax/minimax-m2.1` | MiniMax | Budget-friendly, fast |
| `z-ai/glm-4.7` | Z.AI | Balanced performance |
| `google/gemini-3-pro-preview` | Google | 1M context window |
| `moonshotai/kimi-k2-thinking` | MoonShot | Extended reasoning |
| `deepseek/deepseek-v3.2` | DeepSeek | Code specialist |
| `qwen/qwen3-vl-235b-a22b-thinking` | Alibaba | Vision + reasoning |
Vertex AI Partner Models (MaaS - Google Cloud billing):
| Model | Provider | Best For |
|---|---|---|
| `vertex/minimax/minimax-m2-maas` | MiniMax | Fast, budget-friendly |
| `vertex/mistralai/codestral-2` | Mistral | Code specialist |
| `vertex/deepseek/deepseek-v3-2-maas` | DeepSeek | Deep reasoning |
| `vertex/qwen/qwen3-coder-480b-a35b-instruct-maas` | Qwen | Agentic coding |
| `vertex/openai/gpt-oss-120b-maas` | OpenAI | Open-weight reasoning |
List all models:

```bash
claudish --models          # List all OpenRouter models
claudish --models gemini   # Search for specific models
claudish --top-models      # Show curated recommendations
```

Run specialized agents in headless mode with direct agent selection:
```bash
# Use the frontend developer agent
claudish --model x-ai/grok-code-fast-1 --agent frontend:developer "create a React button component"

# Use the API architect agent
claudish --model openai/gpt-5.3-codex --agent api-architect "design REST API for user management"

# Discover available agents in your project
claudish --list-agents
```

Agent Features:
- ✅ Direct agent selection - No need to ask Claude to use an agent
- ✅ Automatic prefixing - Adds `@agent-` automatically (`frontend:developer` → `@agent-frontend:developer`)
- ✅ Project-specific agents - Works with any agents installed in `.claude/agents/`
- ✅ Agent discovery - List all available agents with `--list-agents`
Claudish automatically shows critical information in the Claude Code status bar - no setup required!
Ultra-Compact Format: `directory • model-id • $cost • ctx%`
Visual Design:
- 🔵 Directory (bright cyan, bold) - Where you are
- 🟡 Model ID (bright yellow) - Actual OpenRouter model ID
- 🟢 Cost (bright green) - Real-time session cost from OpenRouter
- 🟣 Context (bright magenta) - % of context window remaining
- ⚪ Separators (dim) - Visual dividers
Examples:
- `claudish • x-ai/grok-code-fast-1 • $0.003 • 95%` - Using Grok, $0.003 spent, 95% context left
- `my-project • openai/gpt-5-codex • $0.12 • 67%` - Using GPT-5, $0.12 spent, 67% context left
- `backend • minimax/minimax-m2 • $0.05 • 82%` - Using MiniMax M2, $0.05 spent, 82% left
- `test • openrouter/auto • $0.01 • 90%` - Using any custom model, $0.01 spent, 90% left
Critical Tracking (Live Updates):
- 💰 Cost tracking - Real-time USD from Claude Code session data
- 📊 Context monitoring - Percentage of model's context window remaining
- ⚡ Performance optimized - Ultra-compact to fit with thinking mode UI
Thinking Mode Optimized:
- ✅ Ultra-compact - Directory limited to 15 chars (leaves room for everything)
- ✅ Critical first - Most important info (directory, model) comes first
- ✅ Smart truncation - Long directories shortened with "..."
- ✅ Space reservation - Reserves ~40 chars for Claude's thinking mode UI
- ✅ Color-coded - Instant visual scanning
- ✅ No overflow - Fits perfectly even with thinking mode enabled
Custom Model Support:
- ✅ ANY OpenRouter model - Not limited to the shortlist (e.g., `openrouter/auto`, custom models)
- ✅ Actual model IDs - Shows the exact OpenRouter model ID (no translation)
- ✅ Context fallback - Unknown models use a 100k context window (safe default)
- ✅ Shortlist optimized - Our recommended models have accurate context sizes
- ✅ Future-proof - Works with new models added to OpenRouter
How it works:
- Each Claudish instance creates a temporary settings file with a custom status line
- Settings are passed via the `--settings` flag (doesn't modify global Claude Code config)
- The status line uses a simple bash script with ANSI colors (no external dependencies!)
- Displays the actual OpenRouter model ID from the `CLAUDISH_ACTIVE_MODEL_NAME` env var
- Context tracking uses model-specific sizes for our shortlist, with a 100k fallback for others
- Temp files are automatically cleaned up when Claudish exits
- Each instance is completely isolated - run multiple in parallel!
Per-instance isolation:
- ✅ Doesn't modify `~/.claude/settings.json`
- ✅ Each instance has its own config
- ✅ Safe to run multiple Claudish instances in parallel
- ✅ Standard Claude Code unaffected
- ✅ Temp files auto-cleanup on exit
- ✅ No external dependencies (bash only, no jq!)
```bash
# Simple prompt
claudish "fix the bug in user.ts"

# Multi-word prompt
claudish "implement user authentication with JWT tokens"
```

```bash
# Auto-detected native routing (model name determines provider)
claudish --model gpt-4o "refactor entire API layer"   # → OpenAI
claudish --model gemini-2.0-flash "quick fix"         # → Google
claudish --model llama-3.1-70b "code review"          # → OllamaCloud

# Explicit provider routing (new @ syntax)
claudish --model google@gemini-2.5-pro "complex analysis"
claudish --model oai@o1 "deep reasoning task"
claudish --model openrouter@deepseek/deepseek-r1 "analysis"   # Unknown vendors need explicit OpenRouter

# Local models with concurrency control
claudish --model ollama@llama3.2 "code review"
claudish --model ollama@llama3.2:3 "parallel processing"      # 3 concurrent
claudish --model lmstudio@qwen2.5-coder "implement dashboard UI"
```

Auto-approve is enabled by default. For fully autonomous mode, add `--dangerous`:
```bash
# Basic usage (auto-approve already enabled)
claudish "delete unused files"

# Fully autonomous (auto-approve + sandbox disabled)
claudish --dangerous "install dependencies"

# Disable auto-approve if you want prompts
claudish --no-auto-approve "make important changes"
```

```bash
# Use a specific port
claudish --port 3000 "analyze codebase"

# Or set a default
export CLAUDISH_PORT=3000
claudish "your task"
```

```bash
# Verbose mode
claudish "debug issue" --verbose

# Custom working directory
claudish "analyze code" --cwd /path/to/project

# Multiple flags
claudish --model openai/gpt-5.3-codex "task" --verbose --debug
```

NEW! Claudish now includes a monitor mode to help you understand how Claude Code works internally.
```bash
# Enable monitor mode (requires a real Anthropic API key)
claudish --monitor --debug "implement a feature"
```

What Monitor Mode Does:
- ✅ Proxies to the REAL Anthropic API (not OpenRouter) - Uses your actual Anthropic API key
- ✅ Logs ALL traffic - Captures complete requests and responses
- ✅ Both streaming and JSON - Logs SSE streams and JSON responses
- ✅ Debug logs to file - Saves to `logs/claudish_*.log` when `--debug` is used
- ✅ Pass-through proxy - No translation, forwards as-is to Anthropic
When to use Monitor Mode:
- 🔍 Understanding Claude Code's API protocol
- 🐛 Debugging integration issues
- 📊 Analyzing Claude Code's behavior
- 🔬 Research and development
Requirements:
```bash
# Monitor mode requires a REAL Anthropic API key (not a placeholder)
export ANTHROPIC_API_KEY='sk-ant-api03-...'

# Use with --debug to save logs to file
claudish --monitor --debug "your task"

# Logs are saved to: logs/claudish_TIMESTAMP.log
```

Example Output:

```
[Monitor] Server started on http://127.0.0.1:8765
[Monitor] Mode: Passthrough to real Anthropic API
[Monitor] All traffic will be logged for analysis

=== [MONITOR] Claude Code → Anthropic API Request ===
{
  "model": "claude-sonnet-4.5",
  "messages": [...],
  "max_tokens": 4096,
  ...
}
=== End Request ===

=== [MONITOR] Anthropic API → Claude Code Response (Streaming) ===
event: message_start
data: {"type":"message_start",...}

event: content_block_start
data: {"type":"content_block_start",...}
...
=== End Streaming Response ===
```
Note: Monitor mode charges your Anthropic account (not OpenRouter). Use the `--debug` flag to save logs for analysis.
Claudish supports three output modes for different use cases:
Clean output with no `[claudish]` logs - perfect for piping to other tools:

```bash
# Quiet by default in single-shot mode
claudish "what is 2+2?"
# Output: 2 + 2 equals 4.

# Use in pipelines
claudish "list 3 colors" | grep -i blue

# Redirect to file
claudish "analyze code" > analysis.txt
```

Show all `[claudish]` log messages for debugging:
```bash
# Verbose mode
claudish --verbose "what is 2+2?"
# Output:
# [claudish] Starting Claude Code with openai/gpt-4o
# [claudish] Proxy URL: http://127.0.0.1:8797
# [claudish] Status line: dir • openai/gpt-4o • $cost • ctx%
# ...
# 2 + 2 equals 4.
# [claudish] Shutting down proxy server...
# [claudish] Done

# Interactive mode is verbose by default
claudish --interactive
```

Structured output perfect for automation and tool integration:
```bash
# JSON output (always quiet)
claudish --json "what is 2+2?"
# Output: {"type":"result","result":"2 + 2 equals 4.","total_cost_usd":0.068,"usage":{...}}

# Extract just the result with jq
claudish --json "list 3 colors" | jq -r '.result'

# Get cost and token usage
claudish --json "analyze code" | jq '{result, cost: .total_cost_usd, tokens: .usage.input_tokens}'

# Use in scripts
RESULT=$(claudish --json "check if tests pass" | jq -r '.result')
echo "AI says: $RESULT"

# Track costs across multiple runs
for task in task1 task2 task3; do
  claudish --json "$task" | jq -r '"\(.total_cost_usd)"'
done | awk '{sum+=$1} END {print "Total: $"sum}'
```

JSON Output Fields:
- `result` - The AI's response text
- `total_cost_usd` - Total cost in USD
- `usage.input_tokens` - Input tokens used
- `usage.output_tokens` - Output tokens used
- `duration_ms` - Total duration in milliseconds
- `num_turns` - Number of conversation turns
- `modelUsage` - Per-model usage breakdown
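The same JSON can also be consumed from a script instead of `jq`. A minimal sketch (the `ask` helper and `ClaudishResult` type are illustrative, not part of Claudish; field names follow the list above):

```typescript
// Run claudish --json from Node/Bun and parse the structured result.
import { execFile } from "node:child_process";
import { promisify } from "node:util";

const run = promisify(execFile);

// Hypothetical shape built from the documented output fields
interface ClaudishResult {
  result: string;
  total_cost_usd: number;
  usage: { input_tokens: number; output_tokens: number };
  duration_ms: number;
  num_turns: number;
}

async function ask(prompt: string): Promise<ClaudishResult> {
  const { stdout } = await run("claudish", ["--json", prompt]);
  return JSON.parse(stdout) as ClaudishResult;
}

const res = await ask("check if tests pass");
console.log(`${res.result} (cost: $${res.total_cost_usd})`);
```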
claudish "your prompt"
↓
1. Parse arguments (--model, --no-auto-approve, --dangerous, etc.)
2. Find available port (random or specified)
3. Start local proxy on http://127.0.0.1:PORT
4. Spawn: claude --auto-approve --env ANTHROPIC_BASE_URL=http://127.0.0.1:PORT
5. Proxy translates: Anthropic API → OpenRouter API
6. Stream output in real-time
7. Cleanup proxy on exit
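The key mechanism is step 4: Claude Code is pointed at the local proxy via `ANTHROPIC_BASE_URL`. A minimal sketch of that launch sequence (illustrative only, not Claudish's actual source; the stub server stands in for the real translation layer):

```typescript
import { spawn } from "node:child_process";
import { createServer } from "node:http";

// 2. Pick a random port in the documented 3000-9000 range
const port = 3000 + Math.floor(Math.random() * 6000);

// 3. Stand-in for the real proxy: the actual server translates
//    Anthropic-format requests into OpenRouter API calls.
const proxy = createServer((req, res) => {
  res.writeHead(501).end("translation layer goes here");
}).listen(port, "127.0.0.1");

// 4. Claude Code talks to the local proxy instead of api.anthropic.com
const claude = spawn("claude", ["your prompt"], {
  env: { ...process.env, ANTHROPIC_BASE_URL: `http://127.0.0.1:${port}` },
  stdio: "inherit", // 6. stream output in real time
});

// 7. Clean up the proxy on exit
claude.on("exit", () => proxy.close());
```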
Normal Mode (OpenRouter):

```
Claude Code → Anthropic API format → Local Proxy → OpenRouter API format → OpenRouter
                                          ↓
Claude Code ← Anthropic API format ← Local Proxy ← OpenRouter API format ← OpenRouter
```

Monitor Mode (Anthropic Passthrough):

```
Claude Code → Anthropic API format → Local Proxy (logs) → Anthropic API
                                          ↓
Claude Code ← Anthropic API format ← Local Proxy (logs) ← Anthropic API
```
Each claudish invocation:
- Gets a unique random port
- Starts isolated proxy server
- Runs independent Claude Code instance
- Cleans up on exit
This allows multiple parallel runs:
```bash
# Terminal 1
claudish --model x-ai/grok-code-fast-1 "task A"

# Terminal 2
claudish --model openai/gpt-5.3-codex "task B"

# Terminal 3
claudish --model minimax/minimax-m2 "task C"
```

NEW in v1.1.0: Claudish now fully supports models with extended thinking/reasoning capabilities (Grok, o1, etc.) with complete Anthropic Messages API protocol compliance.
Claudish includes a sophisticated Thinking Translation Model that aligns Claude Code's native thinking budget with the unique requirements of every major AI provider.
When you set a thinking budget in Claude (e.g., `budget: 16000`), Claudish automatically translates it:

| Provider | Model | Translation Logic |
|---|---|---|
| OpenAI | o1, o3 | Maps budget to `reasoning_effort` (minimal/low/medium/high) |
| Google | Gemini 3 | Maps to `thinking_level` (low/high) |
| Google | Gemini 2.x | Passes exact `thinking_budget` (capped at 24k) |
| xAI | Grok 3 Mini | Maps to `reasoning_effort` (low/high) |
| Qwen | Qwen 2.5 | Enables `enable_thinking` + exact budget |
| MiniMax | M2 | Enables `reasoning_split` (interleaved thinking) |
| DeepSeek | R1 | Automatically manages reasoning (params stripped for safety) |
This ensures you can use standard Claude Code thinking controls with ANY supported model, without worrying about API specificities.
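As a concrete illustration, here is a minimal sketch of such a translation layer. It is not Claudish's actual source: the provider keys are invented, the OpenAI tiers follow the "ultrathink"/"think hard"/"think" thresholds documented later in this README, and thresholds the README doesn't state (e.g., the Gemini 3 low/high cutoff) are placeholders.

```typescript
// Sketch: map a Claude thinking budget onto provider-specific parameters.
type ThinkingParams = Record<string, unknown>;

function translateThinkingBudget(provider: string, budgetTokens: number): ThinkingParams {
  switch (provider) {
    case "openai": // o1/o3: coarse effort levels instead of a token budget
      if (budgetTokens >= 32_000) return { reasoning_effort: "high" };
      if (budgetTokens >= 16_000) return { reasoning_effort: "medium" };
      return { reasoning_effort: "low" };
    case "gemini-3": // discrete low/high levels only (cutoff is a placeholder)
      return { thinking_level: budgetTokens >= 16_000 ? "high" : "low" };
    case "gemini-2": // accepts an exact budget, capped at 24k
      return { thinking_budget: Math.min(budgetTokens, 24_000) };
    case "grok-3-mini": // low/high effort (cutoff is a placeholder)
      return { reasoning_effort: budgetTokens >= 16_000 ? "high" : "low" };
    case "qwen-2.5": // needs an explicit enable flag plus the budget
      return { enable_thinking: true, thinking_budget: budgetTokens };
    case "minimax-m2": // interleaved thinking
      return { reasoning_split: true };
    case "deepseek-r1": // R1 manages reasoning itself; strip params for safety
    default:
      return {};
  }
}
```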
Some AI models (like Grok and OpenAI's o1) can show their internal reasoning process before providing the final answer. This "thinking" content helps you understand how the model arrived at its conclusion.
Claudish implements the Anthropic Messages API's interleaved-thinking protocol:
Thinking Blocks (Hidden):
- Contains model's reasoning process
- Automatically collapsed in Claude Code UI
- Shows "Claude is thinking..." indicator
- User can expand to view reasoning
Text Blocks (Visible):
- Contains final response
- Displayed normally
- Streams incrementally
- ✅ x-ai/grok-code-fast-1 - Grok's reasoning mode
- ✅ openai/gpt-5-codex - o1 reasoning (when enabled)
- ✅ openai/o1-preview - Full reasoning support
- ✅ openai/o1-mini - Compact reasoning
⚠️ Other models may gain reasoning support in the future
Streaming Protocol (V2 - Protocol Compliant):

```
1. message_start
2. content_block_start (text, index=0)        ← IMMEDIATE! (required)
3. ping
4. [If reasoning arrives]
   - content_block_stop (index=0)             ← Close initial empty block
   - content_block_start (thinking, index=1)  ← Reasoning
   - thinking_delta events × N
   - content_block_stop (index=1)
5. content_block_start (text, index=2)        ← Response
6. text_delta events × M
7. content_block_stop (index=2)
8. message_delta + message_stop
```
Critical: `content_block_start` must be sent immediately after `message_start`, before `ping`. This is required by the Anthropic Messages API protocol for proper UI initialization.
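To make the ordering concrete, here is a sketch that emits the V2 sequence as SSE frames (payloads abbreviated; illustrative only, not Claudish's actual source):

```typescript
// Format one Server-Sent Event frame
function sse(event: string, data: object): string {
  return `event: ${event}\ndata: ${JSON.stringify(data)}\n\n`;
}

function* v2Stream(): Generator<string> {
  yield sse("message_start", { type: "message_start" });
  // Required: open a text block immediately, before the first ping
  yield sse("content_block_start", { type: "content_block_start", index: 0, content_block: { type: "text", text: "" } });
  yield sse("ping", { type: "ping" });
  // If reasoning arrives: close the empty block, open a thinking block
  yield sse("content_block_stop", { type: "content_block_stop", index: 0 });
  yield sse("content_block_start", { type: "content_block_start", index: 1, content_block: { type: "thinking", thinking: "" } });
  yield sse("content_block_delta", { type: "content_block_delta", index: 1, delta: { type: "thinking_delta", thinking: "…" } });
  yield sse("content_block_stop", { type: "content_block_stop", index: 1 });
  // Then stream the visible response in a new text block
  yield sse("content_block_start", { type: "content_block_start", index: 2, content_block: { type: "text", text: "" } });
  yield sse("content_block_delta", { type: "content_block_delta", index: 2, delta: { type: "text_delta", text: "Hello" } });
  yield sse("content_block_stop", { type: "content_block_stop", index: 2 });
  yield sse("message_delta", { type: "message_delta", delta: { stop_reason: "end_turn" }, usage: { output_tokens: 1 } });
  yield sse("message_stop", { type: "message_stop" });
}
```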
Key Features:
- ✅ Separate thinking and text blocks (proper indices)
- ✅ `thinking_delta` vs `text_delta` event types
- ✅ Thinking content hidden by default
- ✅ Smooth transitions between blocks
- ✅ Full Claude Code UI compatibility
Before (v1.0.0 - No Thinking Support):
- Reasoning visible as regular text
- Confusing output with internal thoughts
- No progress indicators
- "All at once" message updates
After (v1.1.0 - Full Protocol Support):
- ✅ Reasoning hidden/collapsed
- ✅ Clean, professional output
- ✅ "Claude is thinking..." indicator shown
- ✅ Smooth incremental streaming
- ✅ Message headers/structure visible
- ✅ Protocol compliant with Anthropic Messages API
For complete protocol documentation, see:
- STREAMING_PROTOCOL.md - Complete SSE protocol spec
- PROTOCOL_FIX_V2.md - Critical V2 protocol fix (event ordering)
- COMPREHENSIVE_UX_ISSUE_ANALYSIS.md - Technical analysis
- THINKING_BLOCKS_IMPLEMENTATION.md - Implementation summary
Claudish now intelligently adapts to ANY reasoning model!
No more hardcoded lists or manual flags. Claudish dynamically queries OpenRouter metadata to enable thinking capabilities for any model that supports them.
1. Auto-Detection:
   - Automatically checks model capabilities at startup
   - Enables the Extended Thinking UI only when supported
   - Future-proof: Works instantly with new models (e.g., `deepseek-r1` or `minimax-m2`)
2. Smart Parameter Mapping:
   - Claude: Passes the token budget directly (e.g., 16k tokens)
   - OpenAI (o1/o3): Translates the budget to `reasoning_effort`:
     - "ultrathink" (≥32k) → `high`
     - "think hard" (16k-32k) → `medium`
     - "think" (<16k) → `low`
   - Gemini & Grok: Preserves thought signatures and XML traces automatically
3. Universal Compatibility:
   - Use "ultrathink" or "think hard" prompts with ANY supported model
   - Claudish handles the translation layer for you
NEW in v1.2.0: Claudish now intelligently manages token counting to support ANY context window size (from 128k to 2M+) while preserving Claude Code's native auto-compaction behavior.
Claude Code naturally assumes a fixed context window (typically 200k tokens for Sonnet).
- Small Models (e.g., Grok 128k): Claude might overuse context and crash.
- Massive Models (e.g., Gemini 2M): Claude would compact way too early (at 10% usage), wasting the model's potential.
Claudish implements a "Dual-Accounting" system:

1. Internal Scaling (For Claude):
   - We fetch the real context limit from OpenRouter (e.g., 1M tokens).
   - We scale reported token usage so Claude thinks 1M tokens is 200k.
   - Result: Auto-compaction triggers at the correct percentage of usage (e.g., 90% full) regardless of the actual limit - see the sketch below.
2. Accurate Reporting (For You):
   - The status line displays the real unscaled usage and real context %.
   - You see exact costs and limits, while Claude remains blissfully unaware and stable.
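The arithmetic is simple proportional scaling. A minimal sketch (illustrative; 200k is the window Claude Code assumes, per the problem statement above):

```typescript
// Sketch: report scaled usage to Claude Code, real usage to the status line.
const CLAUDE_ASSUMED_WINDOW = 200_000;

function scaleUsageForClaude(realTokensUsed: number, realContextLimit: number): number {
  // e.g., 900k used of a 1M window → reported as 180k of 200k (90% either way)
  return Math.round(realTokensUsed * (CLAUDE_ASSUMED_WINDOW / realContextLimit));
}

function realContextPercentRemaining(realTokensUsed: number, realContextLimit: number): number {
  // What the status line shows: the true, unscaled percentage
  return Math.round((1 - realTokensUsed / realContextLimit) * 100);
}
```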
Benefits:
- ✅ Works with ANY model size (128k, 1M, 2M, etc.)
- ✅ Unlocks massive context windows (Claude Code becomes 10x more powerful with Gemini!)
- ✅ Prevents crashes on smaller models (Grok)
- ✅ Native behavior (compaction just works)
```
mcp/claudish/
├── src/
│   ├── index.ts          # Main entry point
│   ├── cli.ts            # CLI argument parser
│   ├── proxy-server.ts   # Hono-based proxy server
│   ├── transform.ts      # API format translation (from claude-code-proxy)
│   ├── claude-runner.ts  # Claude CLI runner (creates temp settings)
│   ├── port-manager.ts   # Port utilities
│   ├── config.ts         # Constants and defaults
│   └── types.ts          # TypeScript types
├── tests/                # Test files
├── package.json
├── tsconfig.json
└── biome.json
```
Claudish uses a Hono-based proxy server inspired by claude-code-proxy:
- Framework: Hono - Fast, lightweight web framework
- API Translation: Converts Anthropic API format ↔ OpenAI format
- Streaming: Full support for Server-Sent Events (SSE)
- Tool Calling: Handles Claude's `tool_use` ↔ OpenAI's `tool_calls`
- Battle-tested: Based on production-ready claude-code-proxy implementation
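To make the translation concrete, here is a heavily simplified sketch of such an endpoint (non-streaming, no tool calls or thinking blocks; Claudish's real `transform.ts` covers far more):

```typescript
import { Hono } from "hono";

const app = new Hono();

// Claude Code POSTs Anthropic-format requests to /v1/messages
app.post("/v1/messages", async (c) => {
  const anthropicReq = await c.req.json();

  // Map Anthropic Messages fields onto the OpenAI Chat Completions shape
  const openaiReq = {
    model: process.env.CLAUDISH_MODEL,
    max_tokens: anthropicReq.max_tokens,
    messages: [
      ...(anthropicReq.system ? [{ role: "system", content: anthropicReq.system }] : []),
      ...anthropicReq.messages,
    ],
  };

  const upstream = await fetch("https://openrouter.ai/api/v1/chat/completions", {
    method: "POST",
    headers: {
      Authorization: `Bearer ${process.env.OPENROUTER_API_KEY}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify(openaiReq),
  });
  const completion = await upstream.json();

  // Translate the response back into Anthropic Messages format
  return c.json({
    type: "message",
    role: "assistant",
    content: [{ type: "text", text: completion.choices[0].message.content }],
    stop_reason: "end_turn",
  });
});

export default app;
```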
Why Hono?
- Native Bun support (no adapters needed)
- Extremely fast and lightweight
- Middleware support (CORS, logging, etc.)
- Works across Node.js, Bun, and Cloudflare Workers
```bash
# Install dependencies
bun install

# Development mode
bun run dev "test prompt"

# Build
bun run build

# Lint
bun run lint

# Format
bun run format

# Type check
bun run typecheck

# Run tests
bun test
```

Claudish includes a comprehensive snapshot testing system to ensure 1:1 compatibility with the official Claude Code protocol:
```bash
# Run snapshot tests (13/13 passing ✅)
bun test tests/snapshot.test.ts

# Full workflow: capture fixtures + run tests
./tests/snapshot-workflow.sh --full

# Capture new test fixtures from monitor mode
./tests/snapshot-workflow.sh --capture

# Debug SSE events
bun tests/debug-snapshot.ts
```

What Gets Tested:
- ✅ Event sequence (message_start → content_block_start → deltas → stop → message_delta → message_stop)
- ✅ Content block indices (sequential: 0, 1, 2, ...)
- ✅ Tool input streaming (fine-grained JSON chunks)
- ✅ Usage metrics (present in message_start and message_delta)
- ✅ Stop reasons (always present and valid)
- ✅ Cache metrics (creation and read tokens)
Documentation:
- Quick Start Guide - Get started with testing
- Snapshot Testing Guide - Complete testing documentation
- Implementation Details - Technical implementation summary
- Protocol Compliance Plan - Detailed compliance roadmap
```bash
# Link for global use
bun run install:global

# Now use anywhere
claudish "your task"
```

Install Claude Code:

```bash
npm install -g claude-code
# or visit: https://claude.com/claude-code
```

Set your API key:

```bash
export OPENROUTER_API_KEY=sk-or-v1-...
```

Or add it to your shell profile (`~/.zshrc`, `~/.bashrc`):

```bash
echo 'export OPENROUTER_API_KEY=sk-or-v1-...' >> ~/.zshrc
source ~/.zshrc
```

Specify a custom port:

```bash
claudish --port 3000 "your task"
```

Or increase the port range in `src/config.ts`.
Check OpenRouter API status:
Verify your API key works:
If the status line doesn't show the model name:

1. Check that the `--settings` flag is being passed:

   ```bash
   # Look for this in Claudish output:
   # [claudish] Instance settings: /tmp/claudish-settings-{timestamp}.json
   ```

2. Verify the environment variable is set:

   ```bash
   # Should be set automatically by Claudish
   echo $CLAUDISH_ACTIVE_MODEL_NAME
   # Should output something like: xAI/Grok-1
   ```

3. Test the status line command manually:

   ```bash
   export CLAUDISH_ACTIVE_MODEL_NAME="xAI/Grok-1"
   cat > /dev/null && echo "[$CLAUDISH_ACTIVE_MODEL_NAME] 📁 $(basename "$(pwd)")"
   # Should output: [xAI/Grok-1] 📁 your-directory-name
   ```

4. Check the temp settings file:

   ```bash
   # Files are created as /tmp/claudish-settings-*.json
   ls -la /tmp/claudish-settings-*.json 2>/dev/null | tail -1
   cat /tmp/claudish-settings-*.json | head -1
   ```

5. Verify bash is available:

   ```bash
   which bash
   # Should show the path to bash (usually /bin/bash or /usr/bin/bash)
   ```
Note: Temp settings files are automatically cleaned up when Claudish exits. If you see multiple files, you may have crashed instances - they're safe to delete manually.
| Feature | Claude Code | Claudish |
|---|---|---|
| Model | Anthropic models only | Any OpenRouter model |
| API | Anthropic API | OpenRouter API |
| Cost | Anthropic pricing | OpenRouter pricing |
| Setup | API key → direct | API key → proxy → OpenRouter |
| Speed | Direct connection | ~Same (local proxy) |
| Features | All Claude Code features | All Claude Code features |
When to use Claudish:
- ✅ Want to try different models (Grok, GPT-5, etc.)
- ✅ Need OpenRouter-specific features
- ✅ Prefer OpenRouter pricing
- ✅ Testing model performance
When to use Claude Code:
- ✅ Want latest Anthropic models only
- ✅ Need official Anthropic support
- ✅ Simpler setup (no proxy)
Contributions welcome! Please:
- Fork the repo
- Create a feature branch: `git checkout -b feature/amazing`
- Commit your changes: `git commit -m 'Add amazing feature'`
- Push to the branch: `git push origin feature/amazing`
- Open a Pull Request
MIT © MadAppGang
Claudish's proxy implementation is based on claude-code-proxy by @kiyo-e. We've adapted their excellent Hono-based API translation layer for OpenRouter integration.
Key contributions from claude-code-proxy:
- Anthropic ↔ OpenAI API format translation (`transform.ts`)
- Streaming response handling with Server-Sent Events
- Tool calling compatibility layer
- Clean Hono framework architecture
Thank you to the claude-code-proxy team for building a robust, production-ready foundation! 🙏
- GitHub: https://github.com/MadAppGang/claudish
- OpenRouter: https://openrouter.ai
- Claude Code: https://claude.com/claude-code
- Bun: https://bun.sh
- Hono: https://hono.dev
- claude-code-proxy: https://github.com/kiyo-e/claude-code-proxy
Made with ❤️ by MadAppGang