Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions .gitignore
Original file line number Diff line number Diff line change
Expand Up @@ -35,6 +35,7 @@ coverage/
# Temporary files
tmp/
temp/
.npm-install-hash

# Browser profiles
profiles/
Expand Down
39 changes: 33 additions & 6 deletions skills/dev-browser/SKILL.md
Original file line number Diff line number Diff line change
Expand Up @@ -15,17 +15,44 @@ Browser automation that maintains page state across script executions. Write sma

## Setup

Two modes available. Ask the user if unclear which to use.
```bash
./skills/dev-browser/server.sh &
```

### Standalone Mode (Default)
**Wait for the `Ready` message before running scripts.**

Launches a new Chromium browser for fresh automation sessions.
The server auto-detects the best browser mode based on user configuration at `~/.dev-browser/config.json`:

```bash
./skills/dev-browser/server.sh &
- **External Browser** (default when Chrome for Testing is installed): Uses Chrome for Testing via CDP. Browser stays open after automation.
- **Standalone**: Uses Playwright's built-in Chromium. Use `--standalone` flag to force this mode.

**Flags:**
- `--standalone` - Force standalone Playwright mode
- `--headless` - Run headless (standalone mode only)

### Configuration

Browser settings are configured in `~/.dev-browser/config.json`:

```json
{
"browser": {
"mode": "auto",
"path": "/Applications/Google Chrome for Testing.app/Contents/MacOS/Google Chrome for Testing"
}
}
```

Add `--headless` flag if user requests it. **Wait for the `Ready` message before running scripts.**
| Setting | Values | Description |
|---------|--------|-------------|
| `browser.mode` | `"auto"` (default), `"external"`, `"standalone"` | `auto` uses Chrome for Testing if found, otherwise Playwright |
| `browser.path` | Path string | Custom browser executable path (auto-detected if not set) |
| `browser.userDataDir` | Path string | Browser profile directory for external mode (uses browser's default if not set) |

**Auto-detection paths:**
- **macOS**: `/Applications/Google Chrome for Testing.app/Contents/MacOS/Google Chrome for Testing`
- **Linux**: `/opt/google/chrome-for-testing/chrome`, `/usr/bin/google-chrome-for-testing`
- **Windows**: `C:\Program Files\Google\Chrome for Testing\Application\chrome.exe`

### Extension Mode

Expand Down
177 changes: 177 additions & 0 deletions skills/dev-browser/docs/CONCURRENCY.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,177 @@
# Multi-Agent Concurrency Support

This document explains how dev-browser supports multiple concurrent agents and the design decisions behind the implementation.

## The Problem

When multiple AI agents (e.g., Claude Code sub-agents) run browser automation tasks in parallel, they need to avoid conflicts. The original dev-browser design assumed a single server on a fixed port, which creates a bottleneck:

> "dev-browser is in fact a single point of congestion now, nullifying the advantages of dev browser"
> — [PR #15 discussion](https://github.com/SawyerHood/dev-browser/pull/15#issuecomment-3698722432)

## Solution: Dynamic Port Allocation

Each agent automatically gets its own HTTP API server on a unique port:

```
Agent 1 ──► server (port 9222) ──┐
Agent 2 ──► server (port 9224) ──┼──► Shared Browser (CDP 9223)
Agent 3 ──► server (port 9226) ──┘
```

### How It Works

1. **Port Auto-Assignment**: When `port` is not specified, the server finds an available port in the configured range (default: 9222-9300, step 2)

2. **Port Discovery**: Server outputs `PORT=XXXX` to stdout, which agents parse to know which port to connect to

3. **Server Tracking**: Active servers are tracked in `~/.dev-browser/active-servers.json` for coordination

4. **Shared Browser**: In external browser mode, all servers connect to the same browser via CDP, minimizing resource usage

## Design Decisions

### Options Considered

#### Option 1: Manual Port Assignment (Rejected)

From [PR #15](https://github.com/SawyerHood/dev-browser/pull/15), the initial proposal was to add `--port` and `--cdp-port` CLI flags for manual assignment.

**Why rejected**: Requires agents to coordinate port selection, adds complexity to agent implementation, and creates potential for conflicts.

#### Option 2: Singleton Server with Named Pages (Rejected)

Have one persistent server handling all agents, using page names for isolation.

**Why rejected**: Incompatible with the plugin architecture where each agent spawns its own server process. Also creates a true single point of failure.

#### Option 3: Dynamic Port Allocation (Chosen)

Servers automatically discover and claim available ports.

**Why chosen**:
- Zero configuration required
- Agents don't need to coordinate
- Works with existing plugin architecture
- Each agent is isolated (failure doesn't affect others)
- Memory overhead is acceptable (~140MB per server)

### Memory Considerations

Each dev-browser server uses approximately:
- **Node.js + Playwright + Express**: ~140MB
- **Browser (if standalone mode)**: ~300MB additional

In external browser mode, multiple servers share one browser, making the per-agent overhead just ~140MB.

## Configuration

Create `~/.dev-browser/config.json` to customize behavior:

```json
{
"portRange": {
"start": 9222,
"end": 9300,
"step": 2
},
"cdpPort": 9223
}
```

| Option | Default | Description |
|--------|---------|-------------|
| `portRange.start` | 9222 | First port to try for HTTP API |
| `portRange.end` | 9300 | Last port to try |
| `portRange.step` | 2 | Port increment (avoids CDP port collision) |
| `cdpPort` | 9223 | Chrome DevTools Protocol port |

## Usage Examples

### Multiple Agents (External Browser Mode)

```bash
# Terminal 1: Start Chrome for Testing, then:
BROWSER_PATH="/path/to/chrome" npx tsx scripts/start-external-browser.ts
# Output: PORT=9222

# Terminal 2: Second agent
npx tsx scripts/start-external-browser.ts
# Output: PORT=9224

# Terminal 3: Third agent
npx tsx scripts/start-external-browser.ts
# Output: PORT=9226

# All agents share the same browser on CDP port 9223
```

### Multiple Agents (Standalone Mode)

```bash
# Terminal 1: First agent launches its own browser
npx tsx scripts/start-server.ts
# Output: PORT=9222

# Terminal 2: Second agent launches separate browser
npx tsx scripts/start-server.ts
# Output: PORT=9224
```

### Programmatic Usage

```typescript
import { serve, serveWithExternalBrowser } from "dev-browser";

// Port is automatically assigned
const server1 = await serve(); // Gets port 9222
const server2 = await serve(); // Gets port 9224

console.log(`Server 1 on port ${server1.port}`);
console.log(`Server 2 on port ${server2.port}`);

// Or with external browser
const external1 = await serveWithExternalBrowser();
const external2 = await serveWithExternalBrowser();
// Both connect to same browser on CDP 9223
```

## Troubleshooting

### "No available ports in range"

Too many servers running. Check active servers:

```bash
cat ~/.dev-browser/active-servers.json
```

Clean up stale entries (servers that crashed):

```bash
rm ~/.dev-browser/active-servers.json
```

### Port Conflicts

If a specific port is required, set `PORT` environment variable:

```bash
PORT=9250 npx tsx scripts/start-external-browser.ts
```

### Checking Server Status

```bash
# List all active servers
cat ~/.dev-browser/active-servers.json

# Test a specific server
curl http://localhost:9222/
# Returns: {"wsEndpoint":"ws://...","mode":"external-browser","port":9222}
```

## References

- [PR #15: Multi-port support discussion](https://github.com/SawyerHood/dev-browser/pull/15)
- [PR #20: External browser mode](https://github.com/SawyerHood/dev-browser/pull/20)
37 changes: 37 additions & 0 deletions skills/dev-browser/scripts/get-browser-config.ts
Original file line number Diff line number Diff line change
@@ -0,0 +1,37 @@
/**
* Output resolved browser configuration for shell scripts.
*
* Usage: npx tsx scripts/get-browser-config.ts
*
* Output format (shell-eval compatible):
* BROWSER_MODE="external"
* BROWSER_PATH="/path/to/chrome"
* BROWSER_USER_DATA_DIR="/path/to/profile"
*/

import { getResolvedBrowserConfig } from "@/config.js";

/**
* Shell-escape a string value for safe eval.
*/
function shellEscape(value: string): string {
// Use double quotes and escape special characters
return `"${value.replace(/"/g, '\\"')}"`;
}

try {
const config = getResolvedBrowserConfig();

// Output in shell-eval format with proper quoting
console.log(`BROWSER_MODE=${shellEscape(config.mode)}`);
console.log(`BROWSER_PATH=${shellEscape(config.path || "")}`);
// Only output userDataDir if explicitly configured
console.log(`BROWSER_USER_DATA_DIR=${shellEscape(config.userDataDir || "")}`);
} catch (err) {
// On error, output standalone mode as fallback
console.error(`Warning: ${err instanceof Error ? err.message : err}`);
console.log(`BROWSER_MODE="standalone"`);
console.log(`BROWSER_PATH=""`);
console.log(`BROWSER_USER_DATA_DIR=""`);
process.exit(0); // Don't fail - standalone is a valid fallback
}
89 changes: 89 additions & 0 deletions skills/dev-browser/scripts/start-external-browser.ts
Original file line number Diff line number Diff line change
@@ -0,0 +1,89 @@
/**
* Start dev-browser server connecting to an external browser via CDP.
*
* This mode is ideal for:
* - Chrome for Testing or other specific browser builds
* - Development workflows where you want the browser visible
* - Keeping the browser open after automation for manual inspection
* - Running multiple agents concurrently (each gets its own port automatically)
*
* Environment variables:
* PORT - HTTP API port (default: auto-assigned from 9222-9300)
* CDP_PORT - Browser's CDP port (default: 9223)
* BROWSER_PATH - Path to browser executable (for auto-launch)
* USER_DATA_DIR - Browser profile directory (default: ~/.dev-browser-profile)
* AUTO_LAUNCH - Whether to auto-launch browser if not running (default: true)
*
* Configuration file: ~/.dev-browser/config.json
* {
* "portRange": { "start": 9222, "end": 9300, "step": 2 },
* "cdpPort": 9223
* }
*
* Example with Chrome for Testing:
* BROWSER_PATH="/Applications/Google Chrome for Testing.app/Contents/MacOS/Google Chrome for Testing" \
* npx tsx scripts/start-external-browser.ts
*
* Multi-agent usage:
* # Terminal 1: First agent gets port 9222
* npx tsx scripts/start-external-browser.ts
* # Output: PORT=9222
*
* # Terminal 2: Second agent gets port 9224
* npx tsx scripts/start-external-browser.ts
* # Output: PORT=9224
*
* # Both agents share the same browser on CDP port 9223
*/

import { serveWithExternalBrowser } from "@/external-browser.js";
import { mkdirSync } from "fs";
import { join, dirname } from "path";
import { fileURLToPath } from "url";

const __dirname = dirname(fileURLToPath(import.meta.url));
const tmpDir = join(__dirname, "..", "tmp");

// Create tmp directory if it doesn't exist
mkdirSync(tmpDir, { recursive: true });

// Configuration from environment (PORT is optional - will be auto-assigned)
const port = process.env.PORT ? parseInt(process.env.PORT, 10) : undefined;
const cdpPort = process.env.CDP_PORT ? parseInt(process.env.CDP_PORT, 10) : undefined;
const browserPath = process.env.BROWSER_PATH;
// Only pass userDataDir if explicitly set - let browser use default profile otherwise
const userDataDir = process.env.USER_DATA_DIR || undefined;
const autoLaunch = process.env.AUTO_LAUNCH !== "false";

console.log("Starting dev-browser with external browser mode...");
console.log(` HTTP API port: ${port ?? "auto (dynamic)"}`);
console.log(` CDP port: ${cdpPort ?? "from config (default: 9223)"}`);
if (browserPath) {
console.log(` Browser path: ${browserPath}`);
}
console.log(` User data dir: ${userDataDir ?? "(default profile)"}`);
console.log(` Auto-launch: ${autoLaunch}`);
console.log(` Config: ~/.dev-browser/config.json`);
console.log("");

const server = await serveWithExternalBrowser({
port,
cdpPort,
browserPath,
userDataDir,
autoLaunch,
});

console.log("");
console.log(`Dev browser server started`);
console.log(` WebSocket: ${server.wsEndpoint}`);
console.log(` HTTP API: http://localhost:${server.port}`);
console.log(` Mode: ${server.mode}`);
console.log(` Tmp directory: ${tmpDir}`);
console.log("");
console.log("Ready");
console.log("");
console.log("Press Ctrl+C to stop (browser will remain open)");

// Keep the process running
await new Promise(() => {});
Loading