Skip to content

feat: unify setup-dev flow into shared setup module#2054

Merged
nick-inkeep merged 18 commits intomainfrom
feat/unified-setup-dev
Feb 19, 2026
Merged

feat: unify setup-dev flow into shared setup module#2054
nick-inkeep merged 18 commits intomainfrom
feat/unified-setup-dev

Conversation

@nick-inkeep
Copy link
Collaborator

@nick-inkeep nick-inkeep commented Feb 17, 2026

Summary

Extracts the setup-dev flow into a shared @inkeep/agents-core/setup module. The monorepo's bash setup.sh (223 lines) and the template's setup.js (623 lines) are replaced by thin wrappers (~30-60 lines) that pass config to a single runSetup() entry point.

Key decisions

Why a shared module in agents-core (not a standalone package or script)

The setup flow needs access to loadEnvironmentFiles() from agents-core and must be importable by both the monorepo (scripts/setup-dev.js) and the published quickstart template (create-agents-template). Putting it in agents-core/setup avoids a new package while keeping it co-located with the env/auth/db primitives it depends on. Trade-off: agents-core grows slightly, and the monorepo wrapper must build agents-core before running setup (visible in the Cypress CI change).

Config-driven callers instead of flags

Each caller passes a SetupConfig object (compose file path, migration commands, push target, server commands, isCloud, skipPush). This avoids a flag explosion on a single CLI and lets each context (monorepo contributor vs. quickstart user vs. cloud deploy) declare exactly what it needs. The shared module has zero awareness of who's calling it.

Pre-flight checks before Docker startup

Rather than letting docker-compose up fail cryptically on port conflicts, the module runs lsof + docker ps + docker inspect to identify which Docker Compose project owns the conflicting port and tells the user the exact docker compose -p <name> down command. For standalone (non-compose) containers, it suggests docker rm -f <name>. If the ports belong to our own project, startup is skipped (idempotent re-run).

When Docker is unavailable, the module checks whether database URLs point to localhost or an external host. Localhost URLs (from .env.example defaults) indicate Docker-managed DBs are expected, so setup exits with an error. External host URLs indicate the user is managing DBs externally, so Docker is skipped gracefully.

Health polling: all three database services

After docker-compose up -d, health polling checks all three services in parallel: doltgres-db (60s timeout), postgres-db (30s), and spicedb-postgres (30s). Setup only hard-fails if both primary databases (doltgres + postgres) fail health checks — a SpiceDB-only failure logs a warning but continues, since writeSpiceDbSchema() in auth init has its own 30-retry logic.

Auth strategy: bypass secret vs. CLI keychain

When INKEEP_AGENTS_MANAGE_API_BYPASS_SECRET is available, push uses CI mode (INKEEP_CI=true + INKEEP_API_KEY). When it's absent (cloud users), the CLI resolves auth from keychain/profile — the template pre-runs inkeep init --no-interactive + inkeep login before calling runSetup(). This means local contributors and cloud users follow different auth paths through the same code, controlled entirely by the presence of the bypass secret.

Process group cleanup for temporarily spawned servers

When the API server is spawned for project push, it's started with detached: true so it gets its own process group. Cleanup uses kill(-pid) (negative PID = group signal) followed by a 2-second grace period and SIGKILL. This prevents orphaned Node processes after setup completes.

CI concurrency limit

Added --concurrency=8 to pnpm check in CI. On cache-miss runs, unlimited Turbo concurrency causes multiple Next.js builds to run simultaneously on a 4-vCPU runner, triggering OOM kills. This is tangential to the setup unification but was discovered and fixed during CI validation of this PR.

User scenarios addressed

Scenario Behavior
First-time contributor runs pnpm setup-dev Full flow: .env creation, JWT gen, Docker startup, migrations, auth init, project push
Re-run after setup already completed Idempotent: skips JWT gen (keys exist), skips Docker startup (containers running), migrations apply only what's new
Another Docker project occupies port 5432/5433/5434 Pre-flight detects conflict, reports project name and exact docker compose -p <name> down command (or docker rm -f <name> for standalone containers), continues with existing DBs
Docker not installed, DB URLs point to external host Graceful degradation: skips Docker entirely, runs migrations against external DBs
Docker not installed, DB URLs are localhost Exits with error — localhost URLs indicate Docker-managed DBs are expected
pnpm setup-dev --skip-push Runs env/Docker/migrations/auth but skips server spawn and project push
docker compose up slow on cold start Timeout (120s) triggers warning, falls through to health polling which waits for containers to become healthy
Quickstart template setup (local) Same flow via @inkeep/agents-core/setup, different compose file and migration commands
Quickstart template setup (cloud, no bypass secret) Runs inkeep init + inkeep login before setup, push authenticates via CLI keychain
Cypress E2E in CI Builds agents-core first, runs setup-dev --skip-push (no project push needed for E2E)

Changes

  • packages/agents-core/src/setup/setup.ts — Shared setup module (~730 lines). Includes isLocalDatabaseUrl() helper for localhost detection and health polling for all 3 DB services.
  • scripts/setup-dev.js — Monorepo wrapper (32 lines), replaces deleted scripts/setup.sh
  • create-agents-template/scripts/setup.js — Rewritten as thin wrapper
  • .github/workflows/ci.yml--concurrency=8 on pnpm check
  • .github/composite-actions/cypress-e2e/action.yml — Build agents-core before setup; --skip-push
  • agents-cookbook/template-projects/inkeep.config.ts — Added apiKey from bypass secret env var
  • agents-docs/ — Updated contributing overview + added troubleshooting section for local env issues
  • .gitignore — Added .setup-complete marker file

Test coverage

Unit tests (17 cases in setup.test.ts)

Covers the core runSetup() logic with mocked exec/spawn/fetch:

  • Migration execution (manage + runtime, parallel)
  • Auth init (with credentials, without credentials)
  • Docker skip when isCloud: true
  • Docker startup when ports are free
  • SpiceDB PostgreSQL health-checked alongside doltgres and postgres
  • Docker skip when pre-flight detects foreign containers on required ports
  • Docker compose timeout during startup (graceful continue to health polling)
  • Docker unavailable with external DB URLs (graceful skip)
  • Docker unavailable with localhost DB URLs (exits with error)
  • Missing DB URLs on non-cloud (exits with error)
  • Project push command format and env var pass-through
  • Push with bypass secret (CI mode: INKEEP_CI, INKEEP_API_KEY, INKEEP_TENANT_ID)
  • Push without bypass secret (CLI keychain auth, no CI env vars)
  • --skip-push flag honored
  • Server already running (no spawn)
  • Server not running (spawn + wait for health)

Test plan

Manual QA scenarios tested on macOS (2026-02-18). These resist automation because they require real Docker, real database startups, and real process lifecycle behavior.

  • Clean-state full flow: pnpm setup-dev from scratch (after docker-compose down -v + removing .setup-complete). All 8 steps completed: .env check, JWT skip (already configured), Docker startup with health polling, manage + runtime migrations, auth init, API server temporary spawn, weather-project push, server cleanup. · Why not a test: requires real Docker daemon, real database containers, real network health checks.
  • Idempotency re-run: pnpm setup-dev immediately after successful first run. JWT gen skipped ("already configured"), Docker up -d is a no-op (containers already healthy), all three health checks pass (DoltgreSQL, PostgreSQL, SpiceDB PostgreSQL), migrations ran cleanly, project re-pushed successfully. · Why not a test: requires persistent state across runs (.setup-complete file, running containers).
  • --skip-push flag: pnpm setup-dev --skip-push completes after Step 5 (auth init). No Step 6/7/8 output, no API server spawned, ends with "Database setup completed!" message. · Why not a test: verifiable via unit test (already covered), but manual run confirms real end-to-end flag plumbing.
  • Port conflict detection: Stopped our containers, started a foreign postgres on port 5432 (docker run -d --name conflict-test -p 5432:5432 postgres:16). Setup correctly detected :5432 (DoltgreSQL) → conflict-test, printed the conflict warning, skipped Docker startup, and proceeded to migrations. · Why not a test: requires real Docker containers, real port binding, real lsof + docker inspect calls.
  • Localhost detection regex: Validated isLocalDatabaseUrl() against 10 edge cases including false-positive guards (localhost in DB name, localhost in password). · Why not a test: could be a unit test; validated via inline Node script during QA.
  • CI pipeline: build, typecheck, lint, unit tests all green.
  • Create Agents E2E: Template setup flow green.
  • Cypress E2E: Docker-missing graceful degradation path green.

Bugs found and fixed during QA

  1. Exec timeout misclassified as port conflict (fixed in cee4e1eb2): startDockerDatabases() treated killed === true (Node exec timeout) as a port conflict, skipping health polling. On cold starts where Docker needed to pull images, this caused migrations to fail against containers that weren't ready. Fix: separated timeout from port-conflict handling — timeouts now fall through to health polling, and the exec timeout was increased from 60s to 120s.

  2. No fix suggestion for non-compose containers (fixed in cee4e1eb2): Port conflict detection showed an empty "Fix:" section for standalone containers without a com.docker.compose.project label. Fix: added docker rm -f <name> fallback for non-compose containers.

Review feedback addressed

  1. DB URL existence check was meaningless (fixed in 22d5934): .env.example always provides localhost DB URLs, so the "Docker not available but DB URLs are set" fallback always passed. Fix: replaced existence check with isLocalDatabaseUrl() — localhost URLs require Docker; external host URLs allow graceful Docker skip.

  2. SpiceDB PostgreSQL not health-checked (fixed in 22d5934): Health polling only checked doltgres-db and postgres-db, missing spicedb-postgres (port 5434). Fix: added spicedb-postgres to the Promise.allSettled health polling with 30s timeout.

Future considerations

  • waitForServerReady exists in 4 variants across the repo (setup.ts, create-agents/e2e/utils.ts, agents-cli/dev.ts, agents-sdk/module-hosted-tool-manager.ts). Consolidation was considered but deferred — the implementations differ intentionally (exponential backoff vs. fixed interval, CI-aware logging vs. silent, HTML body check vs. status code, etc.). A unified function flexible enough for all cases would be more complex than the ~15-line inline versions. Revisit if more call sites emerge.

🤖 Generated with Claude Code

@changeset-bot
Copy link

changeset-bot bot commented Feb 17, 2026

🦋 Changeset detected

Latest commit: 073a471

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 9 packages
Name Type
@inkeep/agents-core Patch
@inkeep/agents-api Patch
@inkeep/agents-manage-ui Patch
@inkeep/agents-cli Patch
@inkeep/agents-sdk Patch
@inkeep/agents-work-apps Patch
@inkeep/ai-sdk-provider Patch
@inkeep/create-agents Patch
@inkeep/agents-mcp Patch

Not sure what this means? Click here to learn what changesets are.

Click here if you're a maintainer who wants to add another changeset to this PR

@vercel
Copy link

vercel bot commented Feb 17, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
agents-api Ready Ready Preview, Comment Feb 19, 2026 3:40am
agents-docs Ready Ready Preview, Comment Feb 19, 2026 3:40am
agents-manage-ui Ready Ready Preview, Comment Feb 19, 2026 3:40am

Request Review

@nick-inkeep nick-inkeep force-pushed the feat/unified-setup-dev branch from 3b64f83 to 5c7bf5d Compare February 17, 2026 10:48
@nick-inkeep nick-inkeep marked this pull request as ready for review February 17, 2026 12:44
nick-inkeep and others added 18 commits February 18, 2026 19:16
Extract shared setup logic into `packages/agents-core/src/setup/` so both
the quickstart template and monorepo contributor setup import from a single
source. This eliminates ~600 lines of duplicated setup code and gives
monorepo contributors a seeded project after `pnpm setup-dev`.

- Add `runSetup(config)` shared module with 8-step orchestrated flow:
  env check, JWT keys, Docker DBs, migrations, auth init, health detect,
  project push, cleanup
- Refactor `create-agents-template/scripts/setup.js` from 623 to 42 lines
- Replace `scripts/setup.sh` (bash) with `scripts/setup-dev.js` (Node.js)
- Add `./setup` export to agents-core package.json
- Add bypass secret auth support to cookbook inkeep.config.ts
- Update contributing docs to reflect new setup capabilities
- Add .setup-complete to .gitignore
- Include 10 unit tests covering all setup paths

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Use `pnpm turbo` instead of bare `turbo` (resolves PATH issue in sh -c)
- Switch stdio from piped to 'ignore' + proc.unref() to prevent buffer blocking
- Replace PID-based cleanup with process-group kill (-pid) for reliable tree cleanup
- Increase waitForServerReady timeout to 180s for cold API startup
- Track spawned PID in SpawnedServers for group-based cleanup
- Update test mock to include unref()

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…entials

The CLI's profile system overrides the config file's apiKey with stale
keychain credentials, causing 401 on push. Setting INKEEP_CI=true and
INKEEP_API_KEY forces CI mode which skips profile loading and sends the
bypass secret as a Bearer token directly.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The old setup.sh was pure bash so it didn't need built packages.
The new setup-dev.js imports from agents-core/dist/setup/index.js,
which requires agents-core to be built first. Also pass --skip-push
since the Cypress action has its own push step with retry logic.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
CI uses service containers for databases, not docker-compose.
The Cypress workflow now passes --cloud --skip-push so setup-dev
only runs migrations and auth init without attempting Docker startup.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ured

Instead of misusing the --cloud flag (which also skips DB URL validation
and is semantically wrong for CI), detect when docker-compose is not
available and continue setup if database URLs are already configured.
This correctly handles CI environments where databases are service
containers managed by the workflow runner.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The --cloud flag removal left a dangling isCloud reference. The monorepo
contributor path always uses local Docker, so isCloud is hardcoded false.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Address review feedback: use logError(msg, error) instead of
logWarning to surface diagnostic information when JWT key generation
fails, matching the established error handling pattern in this file.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Run upfront checks before attempting Docker startup to give developers
clear, actionable feedback instead of hanging on port conflicts:

- Check 1: Docker daemon is running (graceful fallback if DB URLs set)
- Check 2: Docker Compose is available
- Check 3: Required ports (5432, 5433, 5434) — identifies which
  containers from other Docker projects are occupying them, with
  exact `docker compose -p <project> down` commands to resolve

Also adds a 60s timeout to `docker compose up -d` as a safety net
for edge cases that slip through pre-flight.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…sages

Read `com.docker.compose.project` label from container metadata instead
of guessing from container name (which fails for hyphenated project names).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
CI environments have `docker compose` (plugin) but not `docker-compose`
(standalone). Track which variant is available during preflight checks
and use it consistently in all compose commands.

Also adds setup-dev documentation updates to contributing guide and
troubleshooting page.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Remove the hard gate that skipped push when no API key was available.
Now pushProject uses CI mode only when a bypass secret is present;
otherwise it passes through process.env so the CLI can resolve auth
from keychain/profile (cloud users who ran `inkeep login`).

Template setup.js runs `inkeep init --no-interactive` + `inkeep login`
before setup when in cloud mode without a bypass secret, matching the
old template's cloud onboarding flow.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
setup-dev now uses scripts/setup-dev.js which delegates to
@inkeep/agents-core/setup. The old bash script is no longer
referenced by package.json.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…se fix suggestion

Two fixes from QA testing:

1. startDockerDatabases: exec timeout (killed === true) was misclassified as
   a port conflict, causing setup to skip health polling when containers were
   still starting. Now timeouts fall through to health polling, and only real
   port-conflict stderr patterns trigger the early return. Also increased the
   docker-compose up timeout from 60s to 120s.

2. runPreflightChecks: standalone containers (no compose project label) got
   detected as port conflicts but showed an empty "Fix:" line. Now falls back
   to suggesting `docker rm -f <name>` for non-compose containers.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…lth polling

Address two review comments from miles-kt-inkeep:

1. The "Docker not available but DB URLs set" fallback was meaningless because
   .env.example always provides localhost URLs. Replace existence check with
   isLocalDatabaseUrl() — localhost URLs now correctly require Docker, while
   external host URLs allow graceful Docker skip.

2. Add spicedb-postgres to Docker health polling alongside doltgres-db and
   postgres-db for consistency and early failure detection before auth init.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
In GitHub Actions, databases run as job-level service containers accessible
on localhost — but `docker info` is not available to the runner. The
localhost detection from the previous commit would incorrectly exit because
it saw localhost URLs without Docker. Fix: treat CI environments as always
having externally managed databases, regardless of URL host.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Copy link
Contributor

@claude claude bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR Review Summary

(0) Total Issues | Risk: Low

Re-review Summary

This is a re-review triggered by pull_request:synchronize. The head SHA (073a4716b) matches the previously approved state. No new PR-specific code changes since the last approval.

This PR has been thoroughly reviewed across 15+ prior review cycles, all resulting in approval. The comprehensive review history includes:

Review Cycle Focus Outcome
Initial review Full PR analysis 4 findings raised (3 Major, 1 Minor, 1 Consider)
Author responses Feedback triage 1 accepted (JWT error context), 4 declined with valid rationale
Fix verification JWT error logging Fix correctly implemented
Delta reviews (12×) Pre-flight checks, compose detection, cloud mode push, localhost detection, SpiceDB health polling, CI service containers All approved

Prior Feedback Status

All 5 prior automated inline threads are resolved:

Finding Resolution
🟠 Windows compatibility Declined ✓ — module is Unix-only throughout (sh, docker-compose, SIGTERM/SIGKILL)
🟠 JWT error context Fixed ✓ — logError() now surfaces diagnostic details
🟠 Server spawn failure detection Declined ✓ — out of scope, health-check timeout surfaces failures
🟡 copyFileSync errors Declined ✓ — unhandled exception appropriate for setup-critical path
💭 .mjs extension Declined ✓ — "type": "module" makes this cosmetic churn

Human Review Status

Human reviewer @miles-kt-inkeep approved on 2026-02-18T22:49:50Z with 3 comments:

  • ✅ Localhost URL detection — Fixed (added isLocalDatabaseUrl() helper)
  • ✅ SpiceDB health check — Fixed (added to parallel health polling)
  • 🟡 waitForServerReady duplication — Acknowledged as valid future cleanup (low priority)

🕐 Pending Recommendations (1)

  • 🟡 setup.ts:485 waitForServerReady duplicates functionality in e2e utils — valid observation for future consolidation, low priority

✅ APPROVE

Summary: This PR successfully unifies ~600 lines of duplicated setup logic into a shared @inkeep/agents-core/setup module. The architecture is clean, error handling is appropriate for CLI context, tests cover core scenarios (17 test cases), CI integration is correct, and documentation accurately reflects the new behavior. Human reviewer has approved. All substantive feedback has been addressed. Ship it! 🚀

Reviewers (0)

No reviewers dispatched — this sync event contains no new PR-specific code changes since the last approved review.

@github-actions github-actions bot deleted a comment from claude bot Feb 19, 2026
@nick-inkeep nick-inkeep merged commit 561659a into main Feb 19, 2026
11 checks passed
@nick-inkeep nick-inkeep deleted the feat/unified-setup-dev branch February 19, 2026 04:04
@github-actions
Copy link
Contributor

🔎💬 Inkeep AI search and chat service is syncing content for source 'Inkeep Agent Framework Docs'

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants

Comments