feat: unify setup-dev flow into shared setup module#2054
Conversation
🦋 Changeset detectedLatest commit: 073a471 The changes in this PR will be included in the next version bump. This PR includes changesets to release 9 packages
Not sure what this means? Click here to learn what changesets are. Click here if you're a maintainer who wants to add another changeset to this PR |
|
The latest updates on your projects. Learn more about Vercel for GitHub.
|
3b64f83 to
5c7bf5d
Compare
Extract shared setup logic into `packages/agents-core/src/setup/` so both the quickstart template and monorepo contributor setup import from a single source. This eliminates ~600 lines of duplicated setup code and gives monorepo contributors a seeded project after `pnpm setup-dev`. - Add `runSetup(config)` shared module with 8-step orchestrated flow: env check, JWT keys, Docker DBs, migrations, auth init, health detect, project push, cleanup - Refactor `create-agents-template/scripts/setup.js` from 623 to 42 lines - Replace `scripts/setup.sh` (bash) with `scripts/setup-dev.js` (Node.js) - Add `./setup` export to agents-core package.json - Add bypass secret auth support to cookbook inkeep.config.ts - Update contributing docs to reflect new setup capabilities - Add .setup-complete to .gitignore - Include 10 unit tests covering all setup paths Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Use `pnpm turbo` instead of bare `turbo` (resolves PATH issue in sh -c) - Switch stdio from piped to 'ignore' + proc.unref() to prevent buffer blocking - Replace PID-based cleanup with process-group kill (-pid) for reliable tree cleanup - Increase waitForServerReady timeout to 180s for cold API startup - Track spawned PID in SpawnedServers for group-based cleanup - Update test mock to include unref() Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…entials The CLI's profile system overrides the config file's apiKey with stale keychain credentials, causing 401 on push. Setting INKEEP_CI=true and INKEEP_API_KEY forces CI mode which skips profile loading and sends the bypass secret as a Bearer token directly. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The old setup.sh was pure bash so it didn't need built packages. The new setup-dev.js imports from agents-core/dist/setup/index.js, which requires agents-core to be built first. Also pass --skip-push since the Cypress action has its own push step with retry logic. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
CI uses service containers for databases, not docker-compose. The Cypress workflow now passes --cloud --skip-push so setup-dev only runs migrations and auth init without attempting Docker startup. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ured Instead of misusing the --cloud flag (which also skips DB URL validation and is semantically wrong for CI), detect when docker-compose is not available and continue setup if database URLs are already configured. This correctly handles CI environments where databases are service containers managed by the workflow runner. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The --cloud flag removal left a dangling isCloud reference. The monorepo contributor path always uses local Docker, so isCloud is hardcoded false. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Address review feedback: use logError(msg, error) instead of logWarning to surface diagnostic information when JWT key generation fails, matching the established error handling pattern in this file. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Run upfront checks before attempting Docker startup to give developers clear, actionable feedback instead of hanging on port conflicts: - Check 1: Docker daemon is running (graceful fallback if DB URLs set) - Check 2: Docker Compose is available - Check 3: Required ports (5432, 5433, 5434) — identifies which containers from other Docker projects are occupying them, with exact `docker compose -p <project> down` commands to resolve Also adds a 60s timeout to `docker compose up -d` as a safety net for edge cases that slip through pre-flight. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…sages Read `com.docker.compose.project` label from container metadata instead of guessing from container name (which fails for hyphenated project names). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
CI environments have `docker compose` (plugin) but not `docker-compose` (standalone). Track which variant is available during preflight checks and use it consistently in all compose commands. Also adds setup-dev documentation updates to contributing guide and troubleshooting page. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Remove the hard gate that skipped push when no API key was available. Now pushProject uses CI mode only when a bypass secret is present; otherwise it passes through process.env so the CLI can resolve auth from keychain/profile (cloud users who ran `inkeep login`). Template setup.js runs `inkeep init --no-interactive` + `inkeep login` before setup when in cloud mode without a bypass secret, matching the old template's cloud onboarding flow. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
setup-dev now uses scripts/setup-dev.js which delegates to @inkeep/agents-core/setup. The old bash script is no longer referenced by package.json. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…se fix suggestion Two fixes from QA testing: 1. startDockerDatabases: exec timeout (killed === true) was misclassified as a port conflict, causing setup to skip health polling when containers were still starting. Now timeouts fall through to health polling, and only real port-conflict stderr patterns trigger the early return. Also increased the docker-compose up timeout from 60s to 120s. 2. runPreflightChecks: standalone containers (no compose project label) got detected as port conflicts but showed an empty "Fix:" line. Now falls back to suggesting `docker rm -f <name>` for non-compose containers. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…lth polling Address two review comments from miles-kt-inkeep: 1. The "Docker not available but DB URLs set" fallback was meaningless because .env.example always provides localhost URLs. Replace existence check with isLocalDatabaseUrl() — localhost URLs now correctly require Docker, while external host URLs allow graceful Docker skip. 2. Add spicedb-postgres to Docker health polling alongside doltgres-db and postgres-db for consistency and early failure detection before auth init. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
In GitHub Actions, databases run as job-level service containers accessible on localhost — but `docker info` is not available to the runner. The localhost detection from the previous commit would incorrectly exit because it saw localhost URLs without Docker. Fix: treat CI environments as always having externally managed databases, regardless of URL host. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
ed98c23 to
073a471
Compare
There was a problem hiding this comment.
PR Review Summary
(0) Total Issues | Risk: Low
Re-review Summary
This is a re-review triggered by pull_request:synchronize. The head SHA (073a4716b) matches the previously approved state. No new PR-specific code changes since the last approval.
This PR has been thoroughly reviewed across 15+ prior review cycles, all resulting in approval. The comprehensive review history includes:
| Review Cycle | Focus | Outcome |
|---|---|---|
| Initial review | Full PR analysis | 4 findings raised (3 Major, 1 Minor, 1 Consider) |
| Author responses | Feedback triage | 1 accepted (JWT error context), 4 declined with valid rationale |
| Fix verification | JWT error logging | Fix correctly implemented |
| Delta reviews (12×) | Pre-flight checks, compose detection, cloud mode push, localhost detection, SpiceDB health polling, CI service containers | All approved |
Prior Feedback Status
All 5 prior automated inline threads are resolved:
| Finding | Resolution |
|---|---|
| 🟠 Windows compatibility | Declined ✓ — module is Unix-only throughout (sh, docker-compose, SIGTERM/SIGKILL) |
| 🟠 JWT error context | Fixed ✓ — logError() now surfaces diagnostic details |
| 🟠 Server spawn failure detection | Declined ✓ — out of scope, health-check timeout surfaces failures |
| 🟡 copyFileSync errors | Declined ✓ — unhandled exception appropriate for setup-critical path |
| 💭 .mjs extension | Declined ✓ — "type": "module" makes this cosmetic churn |
Human Review Status
Human reviewer @miles-kt-inkeep approved on 2026-02-18T22:49:50Z with 3 comments:
- ✅ Localhost URL detection — Fixed (added
isLocalDatabaseUrl()helper) - ✅ SpiceDB health check — Fixed (added to parallel health polling)
- 🟡
waitForServerReadyduplication — Acknowledged as valid future cleanup (low priority)
🕐 Pending Recommendations (1)
- 🟡
setup.ts:485waitForServerReadyduplicates functionality in e2e utils — valid observation for future consolidation, low priority
✅ APPROVE
Summary: This PR successfully unifies ~600 lines of duplicated setup logic into a shared @inkeep/agents-core/setup module. The architecture is clean, error handling is appropriate for CLI context, tests cover core scenarios (17 test cases), CI integration is correct, and documentation accurately reflects the new behavior. Human reviewer has approved. All substantive feedback has been addressed. Ship it! 🚀
Reviewers (0)
No reviewers dispatched — this sync event contains no new PR-specific code changes since the last approved review.
|
🔎💬 Inkeep AI search and chat service is syncing content for source 'Inkeep Agent Framework Docs' |
Summary
Extracts the setup-dev flow into a shared
@inkeep/agents-core/setupmodule. The monorepo's bashsetup.sh(223 lines) and the template'ssetup.js(623 lines) are replaced by thin wrappers (~30-60 lines) that pass config to a singlerunSetup()entry point.Key decisions
Why a shared module in
agents-core(not a standalone package or script)The setup flow needs access to
loadEnvironmentFiles()fromagents-coreand must be importable by both the monorepo (scripts/setup-dev.js) and the published quickstart template (create-agents-template). Putting it inagents-core/setupavoids a new package while keeping it co-located with the env/auth/db primitives it depends on. Trade-off:agents-coregrows slightly, and the monorepo wrapper must buildagents-corebefore running setup (visible in the Cypress CI change).Config-driven callers instead of flags
Each caller passes a
SetupConfigobject (compose file path, migration commands, push target, server commands,isCloud,skipPush). This avoids a flag explosion on a single CLI and lets each context (monorepo contributor vs. quickstart user vs. cloud deploy) declare exactly what it needs. The shared module has zero awareness of who's calling it.Pre-flight checks before Docker startup
Rather than letting
docker-compose upfail cryptically on port conflicts, the module runslsof+docker ps+docker inspectto identify which Docker Compose project owns the conflicting port and tells the user the exactdocker compose -p <name> downcommand. For standalone (non-compose) containers, it suggestsdocker rm -f <name>. If the ports belong to our own project, startup is skipped (idempotent re-run).When Docker is unavailable, the module checks whether database URLs point to localhost or an external host. Localhost URLs (from
.env.exampledefaults) indicate Docker-managed DBs are expected, so setup exits with an error. External host URLs indicate the user is managing DBs externally, so Docker is skipped gracefully.Health polling: all three database services
After
docker-compose up -d, health polling checks all three services in parallel:doltgres-db(60s timeout),postgres-db(30s), andspicedb-postgres(30s). Setup only hard-fails if both primary databases (doltgres + postgres) fail health checks — a SpiceDB-only failure logs a warning but continues, sincewriteSpiceDbSchema()in auth init has its own 30-retry logic.Auth strategy: bypass secret vs. CLI keychain
When
INKEEP_AGENTS_MANAGE_API_BYPASS_SECRETis available, push uses CI mode (INKEEP_CI=true+INKEEP_API_KEY). When it's absent (cloud users), the CLI resolves auth from keychain/profile — the template pre-runsinkeep init --no-interactive+inkeep loginbefore callingrunSetup(). This means local contributors and cloud users follow different auth paths through the same code, controlled entirely by the presence of the bypass secret.Process group cleanup for temporarily spawned servers
When the API server is spawned for project push, it's started with
detached: trueso it gets its own process group. Cleanup useskill(-pid)(negative PID = group signal) followed by a 2-second grace period andSIGKILL. This prevents orphaned Node processes after setup completes.CI concurrency limit
Added
--concurrency=8topnpm checkin CI. On cache-miss runs, unlimited Turbo concurrency causes multiple Next.js builds to run simultaneously on a 4-vCPU runner, triggering OOM kills. This is tangential to the setup unification but was discovered and fixed during CI validation of this PR.User scenarios addressed
pnpm setup-devdocker compose -p <name> downcommand (ordocker rm -f <name>for standalone containers), continues with existing DBspnpm setup-dev --skip-pushdocker compose upslow on cold start@inkeep/agents-core/setup, different compose file and migration commandsinkeep init+inkeep loginbefore setup, push authenticates via CLI keychainagents-corefirst, runssetup-dev --skip-push(no project push needed for E2E)Changes
packages/agents-core/src/setup/setup.ts— Shared setup module (~730 lines). IncludesisLocalDatabaseUrl()helper for localhost detection and health polling for all 3 DB services.scripts/setup-dev.js— Monorepo wrapper (32 lines), replaces deletedscripts/setup.shcreate-agents-template/scripts/setup.js— Rewritten as thin wrapper.github/workflows/ci.yml—--concurrency=8onpnpm check.github/composite-actions/cypress-e2e/action.yml— Buildagents-corebefore setup;--skip-pushagents-cookbook/template-projects/inkeep.config.ts— AddedapiKeyfrom bypass secret env varagents-docs/— Updated contributing overview + added troubleshooting section for local env issues.gitignore— Added.setup-completemarker fileTest coverage
Unit tests (17 cases in
setup.test.ts)Covers the core
runSetup()logic with mockedexec/spawn/fetch:isCloud: trueINKEEP_CI,INKEEP_API_KEY,INKEEP_TENANT_ID)--skip-pushflag honoredTest plan
Manual QA scenarios tested on macOS (2026-02-18). These resist automation because they require real Docker, real database startups, and real process lifecycle behavior.
pnpm setup-devfrom scratch (afterdocker-compose down -v+ removing.setup-complete). All 8 steps completed: .env check, JWT skip (already configured), Docker startup with health polling, manage + runtime migrations, auth init, API server temporary spawn, weather-project push, server cleanup. · Why not a test: requires real Docker daemon, real database containers, real network health checks.pnpm setup-devimmediately after successful first run. JWT gen skipped ("already configured"), Dockerup -dis a no-op (containers already healthy), all three health checks pass (DoltgreSQL, PostgreSQL, SpiceDB PostgreSQL), migrations ran cleanly, project re-pushed successfully. · Why not a test: requires persistent state across runs (.setup-completefile, running containers).--skip-pushflag:pnpm setup-dev --skip-pushcompletes after Step 5 (auth init). No Step 6/7/8 output, no API server spawned, ends with "Database setup completed!" message. · Why not a test: verifiable via unit test (already covered), but manual run confirms real end-to-end flag plumbing.docker run -d --name conflict-test -p 5432:5432 postgres:16). Setup correctly detected:5432 (DoltgreSQL) → conflict-test, printed the conflict warning, skipped Docker startup, and proceeded to migrations. · Why not a test: requires real Docker containers, real port binding, reallsof+docker inspectcalls.isLocalDatabaseUrl()against 10 edge cases including false-positive guards (localhost in DB name, localhost in password). · Why not a test: could be a unit test; validated via inline Node script during QA.Bugs found and fixed during QA
Exec timeout misclassified as port conflict (fixed in
cee4e1eb2):startDockerDatabases()treatedkilled === true(Node exec timeout) as a port conflict, skipping health polling. On cold starts where Docker needed to pull images, this caused migrations to fail against containers that weren't ready. Fix: separated timeout from port-conflict handling — timeouts now fall through to health polling, and the exec timeout was increased from 60s to 120s.No fix suggestion for non-compose containers (fixed in
cee4e1eb2): Port conflict detection showed an empty "Fix:" section for standalone containers without acom.docker.compose.projectlabel. Fix: addeddocker rm -f <name>fallback for non-compose containers.Review feedback addressed
DB URL existence check was meaningless (fixed in
22d5934):.env.examplealways provides localhost DB URLs, so the "Docker not available but DB URLs are set" fallback always passed. Fix: replaced existence check withisLocalDatabaseUrl()— localhost URLs require Docker; external host URLs allow graceful Docker skip.SpiceDB PostgreSQL not health-checked (fixed in
22d5934): Health polling only checkeddoltgres-dbandpostgres-db, missingspicedb-postgres(port 5434). Fix: addedspicedb-postgresto thePromise.allSettledhealth polling with 30s timeout.Future considerations
waitForServerReadyexists in 4 variants across the repo (setup.ts,create-agents/e2e/utils.ts,agents-cli/dev.ts,agents-sdk/module-hosted-tool-manager.ts). Consolidation was considered but deferred — the implementations differ intentionally (exponential backoff vs. fixed interval, CI-aware logging vs. silent, HTML body check vs. status code, etc.). A unified function flexible enough for all cases would be more complex than the ~15-line inline versions. Revisit if more call sites emerge.🤖 Generated with Claude Code