diff --git a/docs/development/backend/mail.mdx b/docs/development/backend/mail.mdx
index 9486177..c540ea5 100644
--- a/docs/development/backend/mail.mdx
+++ b/docs/development/backend/mail.mdx
@@ -1,99 +1,198 @@
---
title: Email Integration Documentation
-description: Complete email system with Nodemailer, Pug templates, SMTP configuration, and type-safe helper methods for DeployStack Backend.
+description: Complete email system with Nodemailer, Pug templates, SMTP configuration, background job integration, and type-safe helper methods for DeployStack Backend.
---
+import { Callout } from 'fumadocs-ui/components/callout';
+
# Email Integration Documentation
-This document describes the email system integration in DeployStack, including the email service, template system, and usage examples.
+This document describes the email system integration in DeployStack, including the email service, template system, and recommended usage patterns.
## Overview
The email system provides a comprehensive solution for sending templated emails using:
+- **Background Job Queue**: Async email processing with automatic retry (Recommended)
- **Nodemailer**: For SMTP email delivery
- **Pug Templates**: For beautiful, maintainable email templates
- **Global Settings Integration**: Automatic SMTP configuration from global settings
- **Template Caching**: Performance optimization for template rendering
-## Architecture
+## Recommended: Sending Emails via Background Jobs
-```text
-src/email/
-├── emailService.ts # Main email service with SMTP integration
-├── templateRenderer.ts # Pug template compilation and rendering
-├── types.ts # TypeScript interfaces schemas
-├── index.ts # Module exports
-└── templates/
- ├── layouts/
- │ ├── base.pug # Main email layout
- │ ├── header.pug # Email header component
- │ └── footer.pug # Email footer component
- ├── welcome.pug # Welcome email template
- ├── password-reset.pug # Password reset template
- └── notification.pug # General notification template
-```
+
+<Callout>
+**For all user-facing email operations, use the background job queue instead of direct email sending.**
+</Callout>
+
-## SMTP Configuration
+The background job approach provides significant benefits over direct email sending.
-The email system automatically integrates with your existing SMTP global settings. Ensure the following settings are configured in the global settings:
+### Why Background Jobs?
-| Setting | Required | Description |
-|---------|----------|-------------|
-| `smtp.host` | ✅ | SMTP server hostname (e.g., smtp.gmail.com) |
-| `smtp.port` | ✅ | SMTP server port (587 for TLS, 465 for SSL) |
-| `smtp.username` | ✅ | SMTP authentication username |
-| `smtp.password` | ✅ | SMTP authentication password (encrypted) |
-| `smtp.secure` | ❌ | Use SSL/TLS connection (default: true) |
-| `smtp.from_name` | ❌ | Default sender name (default: DeployStack) |
-| `smtp.from_email` | ❌ | Default sender email (default: username) |
+**Performance Benefits:**
+- **Instant Response**: API endpoints return immediately without waiting for SMTP operations
+- **Non-Blocking**: Email sending doesn't delay critical user flows like registration or password reset
+- **Scalability**: Can queue hundreds of emails without impacting application performance
-## Basic Usage
+**Reliability Benefits:**
+- **Automatic Retry**: Failed emails automatically retry with exponential backoff (3 attempts by default)
+- **Error Isolation**: Email failures don't crash user-facing operations
+- **Rate Limiting**: Built-in support for respecting SMTP server rate limits
-```typescript
-import { EmailService } from '../email';
+**Observability Benefits:**
+- **Job Tracking**: All email operations tracked in the job queue database
+- **Comprehensive Logging**: Full logging of email operations, successes, and failures
+- **Status Monitoring**: Query job status and history for debugging
-const result = await EmailService.sendEmail({
+### Background Job Usage
+
+```typescript
+// RECOMMENDED: Queue email as background job
+await server.jobQueueService.createJob('send_email', {
to: 'user@example.com',
subject: 'Welcome to DeployStack',
template: 'welcome',
variables: {
userName: 'John Doe',
userEmail: 'user@example.com',
- loginUrl: 'https://app.deploystack.io/login'
+ loginUrl: 'https://cloud.deploystack.io/login'
}
});
+// Job is queued and returns immediately
+// Email will be sent by background worker within seconds
+```
+
+### Real-World Example: Registration Flow
+
+```typescript
+// In registration route - after user created successfully
+const jobQueueService = (server as any).jobQueueService;
+if (jobQueueService) {
+ await jobQueueService.createJob('send_email', {
+ to: user.email,
+ subject: 'Verify Your Email Address',
+ template: 'email-verification',
+ variables: {
+ userName: user.username,
+ userEmail: user.email,
+ verificationUrl: `https://cloud.deploystack.io/verify?token=${token}`,
+ expirationTime: '24 hours'
+ }
+ });
+ server.log.info(`Verification email queued for ${user.email}`);
+}
+
+// Registration completes instantly
+// Email sent by worker in background
+```
+
+### When to Use Background Jobs
+
+**Use background jobs for:**
+- User registration/verification emails
+- Password reset emails
+- Welcome emails
+- Notification emails
+- Batch email operations
+- Any email that doesn't require immediate confirmation
+
+### Monitoring Background Jobs
+
+```typescript
+// Check job status in database
+const jobs = await db.select()
+ .from(queueJobs)
+ .where(eq(queueJobs.type, 'send_email'))
+ .orderBy(desc(queueJobs.created_at))
+ .limit(10);
+
+// Job statuses: 'pending', 'processing', 'completed', 'failed'
+```
+
+For complete details on the job queue system, see the [Background Job Queue Documentation](/development/backend/job-queue).
+
+## Direct Email Sending (Advanced)
+
+<Callout>
+**Direct email sending is only recommended for specific use cases requiring immediate feedback.**
+</Callout>
+
+### When to Use Direct Sending
+
+Use `EmailService.sendEmail()` directly only for:
+- **Test Emails**: Admin panel SMTP testing requiring immediate success/failure feedback
+- **Synchronous Validation**: Flows where you must confirm email sent before proceeding
+- **Development/Debugging**: Local testing and troubleshooting
+
+### Direct Usage Example
+
+```typescript
+import { EmailService } from '../email';
+
+// Direct sending - blocks until complete
+const result = await EmailService.sendEmail({
+ to: 'admin@example.com',
+ subject: 'SMTP Test Email',
+ template: 'test',
+ variables: {
+ testDateTime: new Date().toISOString(),
+ adminUser: 'Admin',
+ appUrl: 'https://cloud.deploystack.io'
+ }
+}, request.log);
+
if (result.success) {
request.log.info('Email sent successfully');
+ return { success: true, messageId: result.messageId };
} else {
request.log.error('Failed to send email:', result.error);
+ return { success: false, error: result.error };
}
```
-## Type-Safe Helper Methods
+### Performance Consideration
-### Available Methods
-- `sendWelcomeEmail(options)` - Welcome email for new users
-- `sendPasswordResetEmail(options)` - Password reset instructions
-- `sendNotificationEmail(options)` - General notifications
+Direct email sending adds **2-5 seconds** of latency to your endpoint. This is acceptable for admin testing but **not acceptable** for user-facing operations like registration.
-```typescript
-// Example usage
-const result = await EmailService.sendWelcomeEmail({
- to: 'newuser@example.com',
- userName: 'Jane Smith',
- userEmail: 'newuser@example.com',
- loginUrl: 'https://app.deploystack.io/login'
-});
+## Architecture
+
+```text
+src/email/
+├── emailService.ts # Main email service with SMTP integration
+├── templateRenderer.ts # Pug template compilation and rendering
+├── types.ts # TypeScript interfaces and schemas
+├── index.ts # Module exports
+└── templates/
+ ├── layouts/
+ │ ├── base.pug # Main email layout
+ │ ├── header.pug # Email header component
+ │ └── footer.pug # Email footer component
+ ├── welcome.pug # Welcome email template
+ ├── email-verification.pug # Email verification template
+ ├── password-reset.pug # Password reset template
+ ├── password-changed.pug # Password change notification
+ ├── notification.pug # General notification template
+ └── test.pug # SMTP test template
+
+src/workers/
+└── emailWorker.ts # Background job worker for emails
```
-## Advanced Features
+## SMTP Configuration
+
+The email system automatically integrates with SMTP global settings. Configure these settings in the admin panel or global settings:
-- **Custom From Address**: Override default sender information
-- **Multiple Recipients**: Send to arrays of email addresses
-- **Attachments**: Include files with emails
-- **CC/BCC**: Additional recipient types supported
+| Setting | Required | Description |
+|---------|----------|-------------|
+| `smtp.enabled` | ✅ | Enable/disable email sending system-wide |
+| `smtp.host` | ✅ | SMTP server hostname (e.g., smtp.gmail.com) |
+| `smtp.port` | ✅ | SMTP server port (587 for TLS, 465 for SSL) |
+| `smtp.username` | ✅ | SMTP authentication username |
+| `smtp.password` | ✅ | SMTP authentication password (encrypted) |
+| `smtp.secure` | ❌ | Use SSL/TLS connection (default: true) |
+| `smtp.from_name` | ❌ | Default sender name (default: DeployStack) |
+| `smtp.from_email` | ❌ | Default sender email (default: username) |
## Template System
@@ -102,8 +201,11 @@ const result = await EmailService.sendWelcomeEmail({
| Template | Description | Required Variables |
|----------|-------------|-------------------|
| `welcome` | Welcome email for new users | `userName`, `userEmail`, `loginUrl` |
+| `email-verification` | Email address verification | `userName`, `userEmail`, `verificationUrl`, `expirationTime` |
| `password-reset` | Password reset instructions | `userName`, `resetUrl`, `expirationTime` |
+| `password-changed` | Password change notification | `userEmail`, `changeTime` |
| `notification` | General notification template | `title`, `message` |
+| `test` | SMTP configuration testing | `testDateTime`, `adminUser`, `appUrl` |
### Template Variables
@@ -119,11 +221,38 @@ To create custom templates:
1. Add new Pug template in `src/email/templates/`
2. Add TypeScript types in `src/email/types.ts`
-3. Use with `EmailService.sendEmail()`
+3. Use with background jobs or `EmailService.sendEmail()`
#### Color Guidelines
-When adding colors to email templates, follow the official DeployStack color system to maintain brand consistency. If you want to check what our primary colors are, visit the [UI Design System](/development/frontend/ui-design-system) page.
+When adding colors to email templates, follow the official DeployStack color system to maintain brand consistency. See the [UI Design System](/development/frontend/ui-design-system) page for color specifications.
+
+## Advanced Email Options
+
+Both background jobs and direct sending support advanced options:
+
+```typescript
+await jobQueueService.createJob('send_email', {
+ to: 'user@example.com',
+ subject: 'Important Notification',
+ template: 'notification',
+ variables: { title: 'Update', message: 'New features available' },
+
+ // Advanced options
+ from: {
+ name: 'DeployStack Support',
+ email: 'support@deploystack.io'
+ },
+ replyTo: 'support@deploystack.io',
+ cc: ['admin@deploystack.io'],
+ bcc: ['archive@deploystack.io'],
+ attachments: [{
+ filename: 'guide.pdf',
+ content: pdfBuffer,
+ contentType: 'application/pdf'
+ }]
+});
+```
## Utility Methods
@@ -147,12 +276,40 @@ if (status.success) {
## Error Handling
-The email service provides comprehensive error handling with specific error messages for:
+### Background Job Errors
+
+Background jobs automatically retry on failure:
+- **Attempt 1**: Immediate execution
+- **Attempt 2**: After 1 second (if first fails)
+- **Attempt 3**: After 2 seconds (if second fails)
+- **Final**: Job marked as 'failed' after 3 attempts
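+
+The schedule above can be sketched as a small helper (illustrative only; `emailRetryDelayMs` is a hypothetical name, not part of the queue implementation):
+
+```typescript
+// Illustrative sketch of the retry schedule described above.
+function emailRetryDelayMs(attempt: number): number {
+  // attempt 1 runs immediately; attempt 2 waits 1s; attempt 3 waits 2s
+  if (attempt <= 1) return 0;
+  return 1000 * 2 ** (attempt - 2);
+}
+```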
-- SMTP configuration issues
-- Missing templates
-- Invalid email addresses
-- Connection failures
+Check failed jobs in the database:
+
+```sql
+SELECT * FROM queue_jobs
+WHERE type = 'send_email'
+ AND status = 'failed'
+ORDER BY created_at DESC;
+```
+
+### Direct Sending Errors
+
+Direct sending provides immediate error feedback:
+
+```typescript
+const result = await EmailService.sendEmail(options, logger);
+if (!result.success) {
+ // Handle errors immediately
+ logger.error(`Email failed: ${result.error}`);
+
+ // Common errors:
+ // - 'SMTP not configured'
+ // - 'Invalid email address'
+ // - 'Connection timeout'
+ // - 'Authentication failed'
+}
+```
## Security & Performance
@@ -160,22 +317,68 @@ The email service provides comprehensive error handling with specific error mess
- Passwords encrypted in global settings
- Secure connections (TLS/SSL) supported
- Template security with escaped variable injection
+- No sensitive data in job queue payload
### Performance Features
- Template caching after first compilation
- Connection pooling (max 5 concurrent connections)
- Rate limiting (5 emails per 20 seconds)
+- Background processing doesn't block main thread
+
+## Migration Guide
+
+### Converting Existing Code to Background Jobs
+
+**Before (Blocking):**
+```typescript
+const result = await EmailService.sendWelcomeEmail({
+ to: user.email,
+ userName: user.name,
+ userEmail: user.email,
+ loginUrl: 'https://cloud.deploystack.io/login'
+}, request.log);
+```
+
+**After (Non-Blocking):**
+```typescript
+await server.jobQueueService.createJob('send_email', {
+ to: user.email,
+ subject: 'Welcome to DeployStack',
+ template: 'welcome',
+ variables: {
+ userName: user.name,
+ userEmail: user.email,
+ loginUrl: 'https://cloud.deploystack.io/login'
+ }
+});
+```
## Common Usage Patterns
### User Registration
-Send welcome emails after user account creation.
+Queue verification email as background job immediately after user creation.
### Password Reset
-Generate reset tokens and send secure reset links.
+Queue password reset email with secure token link.
### System Notifications
-Alert users about deployments, system updates, or important events.
+Queue notification emails for system events, deployments, or status changes.
+
+### Batch Operations
+Queue multiple emails with rate limiting to avoid SMTP throttling:
+
+```typescript
+for (let i = 0; i < users.length; i++) {
+ await jobQueueService.createJob('send_email', {
+ to: users[i].email,
+ subject: 'Monthly Newsletter',
+ template: 'notification',
+ variables: { ... }
+ }, {
+ scheduledFor: new Date(Date.now() + (i * 1000)) // 1 second apart
+ });
+}
+```
## API Reference
@@ -183,26 +386,26 @@ Alert users about deployments, system updates, or important events.
| Method | Description | Returns |
|--------|-------------|---------|
-| `sendEmail(options)` | Send an email using a template | `Promise` |
-| `sendWelcomeEmail(options)` | Send a welcome email | `Promise` |
-| `sendPasswordResetEmail(options)` | Send a password reset email | `Promise` |
-| `sendNotificationEmail(options)` | Send a notification email | `Promise` |
-| `testConnection()` | Test SMTP connection | `Promise<{success: boolean, error?: string}>` |
+| `sendEmail(options, logger)` | Send an email directly using a template | `Promise<{success: boolean, messageId?: string, error?: string}>` |
+| `testConnection(logger)` | Test SMTP connection | `Promise<{success: boolean, error?: string}>` |
| `getSmtpStatus()` | Check SMTP configuration status | `Promise<{configured: boolean, error?: string}>` |
| `refreshConfiguration()` | Reload SMTP configuration | `Promise` |
| `getAvailableTemplates()` | Get list of available templates | `string[]` |
| `validateTemplate(template, variables)` | Validate template and variables | `Promise` |
-### TemplateRenderer Methods
+### Job Queue Service Methods
| Method | Description | Returns |
|--------|-------------|---------|
-| `render(options)` | Render a template with variables | `Promise` |
-| `validateTemplate(template, variables)` | Validate template | `Promise` |
-| `getAvailableTemplates()` | Get available templates | `string[]` |
-| `clearCache()` | Clear template cache | `void` |
-| `getTemplateMetadata(template)` | Get template metadata | `{description?: string, requiredVariables?: string[]}` |
+| `createJob(type, payload, options?)` | Create a background job | `Promise` |
+| `getJobStatus(jobId)` | Get job status and details | `Promise` |
----
+## Related Documentation
+
+- [Background Job Queue](/development/backend/job-queue) - Complete job queue system documentation
+- [Global Settings](/development/backend/global-settings) - SMTP and system configuration
+- [API Documentation](/development/backend/api) - REST API patterns and standards
+
+## Summary
-For more information about global settings configuration, see [Global Settings](/development/backend/global-settings).
+**For all user-facing email operations, use background jobs.** This provides instant response times, automatic retry, and better reliability. Reserve direct `EmailService.sendEmail()` calls for admin testing and debugging scenarios requiring immediate feedback.
diff --git a/docs/development/frontend/ui-design-syntax-highlighter.mdx b/docs/development/frontend/ui-design-syntax-highlighter.mdx
new file mode 100644
index 0000000..506872f
--- /dev/null
+++ b/docs/development/frontend/ui-design-syntax-highlighter.mdx
@@ -0,0 +1,189 @@
+---
+title: Syntax Highlighting
+description: Guide for using the CodeHighlight component to display syntax-highlighted code blocks.
+sidebar: Syntax Highlighting
+---
+
+# Syntax Highlighting
+
+The `CodeHighlight` component provides syntax highlighting for code blocks using Prism.js. It's a reusable component that handles the highlighting automatically without requiring manual Prism.js imports.
+
+## Component Location
+
+```
+services/frontend/src/components/ui/code-highlight/
+├── CodeHighlight.vue # Main component
+└── index.ts # Export file
+```
+
+## Features
+
+- **Automatic syntax highlighting** using Prism.js
+- **Multiple language support** (JSON, JavaScript, TypeScript, Bash, YAML)
+- **Clean default theme** with proper contrast
+- **Error handling** for invalid code or unsupported languages
+- **Zero setup** - just import and use
+
+## Supported Languages
+
+The component comes pre-configured with these languages:
+- `json` - JSON data
+- `javascript` - JavaScript code
+- `typescript` - TypeScript code
+- `bash` - Shell scripts and commands
+- `yaml` - YAML configuration files
+
+## Basic Usage
+
+```vue
+<script setup lang="ts">
+// Import path follows the component location shown above.
+import { CodeHighlight } from '@/components/ui/code-highlight'
+</script>
+
+<template>
+  <CodeHighlight code="console.log('Hello, DeployStack')" language="javascript" />
+</template>
+```
+
+## Props
+
+| Prop | Type | Default | Description |
+|------|------|---------|-------------|
+| `code` | `string` | required | The code to highlight |
+| `language` | `string` | `'javascript'` | Language for syntax highlighting |
+
+## Examples
+
+### JSON Data
+
+```vue
+<script setup lang="ts">
+import { CodeHighlight } from '@/components/ui/code-highlight'
+
+// Format JSON before passing it to the component
+const payload = { name: 'deploystack', jobs: 3 }
+</script>
+
+<template>
+  <CodeHighlight :code="JSON.stringify(payload, null, 2)" language="json" />
+</template>
+```
+
+### JavaScript Code
+
+```vue
+<script setup lang="ts">
+import { CodeHighlight } from '@/components/ui/code-highlight'
+
+// Use a template literal for multi-line code strings
+const snippet = `function greet(name) {
+  return 'Hello, ' + name
+}`
+</script>
+
+<template>
+  <CodeHighlight :code="snippet" language="javascript" />
+</template>
+```
+
+### TypeScript Code
+
+```vue
+<script setup lang="ts">
+import { CodeHighlight } from '@/components/ui/code-highlight'
+
+const snippet = `interface User {
+  id: string
+  email: string
+}`
+</script>
+
+<template>
+  <CodeHighlight :code="snippet" language="typescript" />
+</template>
+```
+
+## Adding More Languages
+
+To add support for additional languages, edit the component and import the required Prism.js language component:
+
+```typescript
+// In CodeHighlight.vue
+import 'prismjs/components/prism-python'
+import 'prismjs/components/prism-sql'
+import 'prismjs/components/prism-css'
+```
+
+## Theme Customization
+
+The component uses the default Prism.js theme (`prism.css`). To use a different theme, change the import in `CodeHighlight.vue`:
+
+```typescript
+// Available themes:
+import 'prismjs/themes/prism.css' // Default light theme
+import 'prismjs/themes/prism-tomorrow.css' // Dark theme
+import 'prismjs/themes/prism-okaidia.css' // Monokai-like dark theme
+import 'prismjs/themes/prism-twilight.css' // Purple dark theme
+import 'prismjs/themes/prism-dark.css' // Simple dark theme
+```
+
+## Real-World Example
+
+Here's how the component can be used in a job details page to display a JSON payload (markup simplified for illustration):
+
+```vue
+<template>
+  <div v-if="job.payload">
+    <h3>Payload</h3>
+    <CodeHighlight
+      :code="JSON.stringify(job.payload, null, 2)"
+      language="json"
+    />
+  </div>
+</template>
+```
+
+## Error Handling
+
+The component handles errors gracefully:
+- If the specified language is not available, it displays the code without highlighting
+- If the code cannot be highlighted (syntax errors), it displays the raw code
+- No errors are thrown to the console
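+
+A minimal sketch of this fallback logic, assuming Prism's standard `highlight` API (the component's actual internals may differ):
+
+```typescript
+import Prism from 'prismjs';
+
+// Sketch: fall back to the raw code when the language is missing or highlighting throws.
+function highlightSafely(code: string, language: string): string {
+  const grammar = Prism.languages[language];
+  if (!grammar) return code; // language not loaded: render unhighlighted
+  try {
+    return Prism.highlight(code, grammar, language);
+  } catch {
+    return code; // highlighting failed: render raw code instead of throwing
+  }
+}
+```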
+
+## Best Practices
+
+1. **Format code before passing** - Use `JSON.stringify(data, null, 2)` for JSON
+2. **Choose the correct language** - Ensures proper syntax highlighting
+3. **Keep code snippets reasonable** - Very large code blocks may impact performance
+4. **Use template literals** for multi-line code strings in TypeScript
+
+## Related Documentation
+
+- [UI Design System](/development/frontend/ui-design-system) - Overall design patterns
diff --git a/docs/development/satellite/process-management.mdx b/docs/development/satellite/process-management.mdx
new file mode 100644
index 0000000..fd4f0a2
--- /dev/null
+++ b/docs/development/satellite/process-management.mdx
@@ -0,0 +1,344 @@
+---
+title: Process Management
+description: Technical implementation of stdio subprocess management for local MCP servers in DeployStack Satellite.
+sidebar: Satellite Development
+---
+
+import { Callout } from 'fumadocs-ui/components/callout';
+
+# stdio Process Management
+
+DeployStack Satellite implements stdio subprocess management for local MCP servers through the ProcessManager component. This system handles spawning, monitoring, and lifecycle management of MCP server processes with dual-mode operation for development and production environments.
+
+## Overview
+
+**Core Components:**
+- **ProcessManager**: Handles spawning, communication, and lifecycle of stdio-based MCP servers
+- **RuntimeState**: Maintains in-memory state of all processes with team-grouped tracking
+- **TeamIsolationService**: Validates team-based access control for process operations
+
+**Deployment Modes:**
+- **Development**: Direct spawn without isolation (cross-platform)
+- **Production**: nsjail isolation with resource limits (Linux only)
+
+## Process Spawning
+
+### Spawning Modes
+
+The system automatically selects the appropriate spawning mode based on environment:
+
+**Direct Spawn (Development):**
+- Standard Node.js `child_process.spawn()` without isolation
+- Full environment variable inheritance
+- No resource limits or namespace isolation
+- Works on all platforms (macOS, Windows, Linux)
+
+**nsjail Spawn (Production Linux):**
+- Resource limits: 50MB RAM, 60s CPU time, and one process per started MCP server
+- Namespace isolation: PID, mount, UTS, IPC
+- Filesystem isolation: Read-only mounts for `/usr`, `/lib`, `/lib64`, `/bin` with writable `/tmp`
+- Team-specific hostname: `mcp-{team_id}`
+- Non-root user (99999:99999)
+- Network access enabled
+
+<Callout>
+**Mode Selection**: The system uses `process.env.NODE_ENV === 'production' && process.platform === 'linux'` to determine isolation mode. This ensures development works seamlessly on all platforms while production deployments get full security.
+</Callout>
+
+### Process Configuration
+
+Processes are spawned using MCPServerConfig containing:
+- `installation_name`: Unique identifier in format `{server_slug}-{team_slug}-{installation_id}`
+- `installation_id`: Database UUID for the installation
+- `team_id`: Team owning the process
+- `command`: Executable command (e.g., `npx`, `node`)
+- `args`: Command arguments
+- `env`: Environment variables (credentials, configuration)
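+
+Sketched as a TypeScript shape (field names come from the list above; the exact types and example values are assumptions):
+
+```typescript
+// Assumed shape of MCPServerConfig based on the fields listed above.
+interface MCPServerConfig {
+  installation_name: string;       // `{server_slug}-{team_slug}-{installation_id}`
+  installation_id: string;         // database UUID for the installation
+  team_id: string;                 // team owning the process
+  command: string;                 // e.g. 'npx' or 'node'
+  args: string[];                  // command arguments
+  env: Record<string, string>;     // credentials and configuration
+}
+
+// Illustrative example values only.
+const example: MCPServerConfig = {
+  installation_name: 'filesystem-john-R36no6FGoMFEZO9nWJJLT',
+  installation_id: 'a1b2c3d4-0000-0000-0000-000000000000',
+  team_id: 'team_123',
+  command: 'npx',
+  args: ['-y', '@modelcontextprotocol/server-filesystem'],
+  env: { API_KEY: 'redacted' }
+};
+```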
+
+## MCP Handshake Protocol
+
+After spawning, processes must complete an MCP handshake before becoming operational:
+
+**Two-Step Process:**
+1. **Initialize Request**: Sent to process via stdin
+   - Protocol version: 2024-11-05
+ - Client info: deploystack-satellite v1.0.0
+ - Capabilities: roots.listChanged=false, sampling={}
+2. **Initialized Notification**: Sent after successful initialization response
+
+**Handshake Requirements:**
+- 30-second timeout (accounts for npx package downloads)
+- Response must include `serverInfo` with name and version
+- Process marked 'failed' and terminated if handshake fails
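+
+The two-step handshake can be sketched as the messages written to stdin (shapes follow the description above; treat the exact literals as assumptions):
+
+```typescript
+// Sketch of the two handshake messages (newline-delimited JSON-RPC over stdin).
+const initializeRequest = {
+  jsonrpc: '2.0',
+  id: 1, // requests carry an id so the response can be matched
+  method: 'initialize',
+  params: {
+    protocolVersion: '2024-11-05', // assumed protocol revision string
+    clientInfo: { name: 'deploystack-satellite', version: '1.0.0' },
+    capabilities: { roots: { listChanged: false }, sampling: {} }
+  }
+};
+
+// Sent after a successful initialize response; no id, so no reply is expected.
+const initializedNotification = { jsonrpc: '2.0', method: 'notifications/initialized' };
+
+// child.stdin.write(JSON.stringify(initializeRequest) + '\n');
+```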
+
+## stdio Communication Protocol
+
+### Message Format
+
+All communication uses newline-delimited JSON following JSON-RPC 2.0 specification:
+
+**stdin (Satellite → Process):**
+- Write JSON-RPC messages followed by `\n`
+- Requests include `id` field for response matching
+- Notifications omit `id` field (no response expected)
+
+**stdout (Process → Satellite):**
+- Buffer-based parsing accumulates chunks
+- Split on newlines to extract complete messages
+- Incomplete lines remain in buffer for next chunk
+- Parse complete lines as JSON
+
+**Message Types:**
+- **Requests** (with `id`): Expect response, tracked in active requests map
+- **Notifications** (no `id`): Fire-and-forget, no response tracking
+- **Responses**: Match `id` to active request, resolve or reject promise
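+
+The buffer-based parsing can be sketched as a standalone helper (hypothetical; the real ProcessManager does this inline on stdout `data` events):
+
+```typescript
+// Sketch: accumulate stdout chunks and extract complete newline-delimited JSON messages.
+function createLineParser() {
+  let buffer = '';
+  return function feed(chunk: string): unknown[] {
+    buffer += chunk;
+    const lines = buffer.split('\n');
+    buffer = lines.pop() ?? ''; // incomplete trailing line stays buffered
+    const messages: unknown[] = [];
+    for (const line of lines) {
+      if (!line.trim()) continue;
+      try {
+        messages.push(JSON.parse(line));
+      } catch {
+        // malformed line: skipped (and logged in the real implementation), never throws
+      }
+    }
+    return messages;
+  };
+}
+```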
+
+### Request/Response Handling
+
+**Active Request Tracking:**
+- Map of request ID → \{resolve, reject, timeout, startTime\}
+- Configurable timeout per request (default 30s)
+- Automatic cleanup on response or timeout
+
+**Request Flow:**
+1. Validate process status (must be 'starting' or 'running')
+2. Register timeout handler
+3. Write JSON-RPC message to stdin
+4. Wait for response via stdout parsing
+5. Resolve/reject promise based on response
+
+**Error Handling:**
+- Write errors: Immediate rejection
+- Timeout errors: Clean up active request, reject with timeout message
+- JSON-RPC errors: Extract `error.message` from response
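+
+The active-request tracking can be sketched as follows (assumed shape, not the actual ProcessManager code):
+
+```typescript
+// Sketch: pending requests keyed by JSON-RPC id, with a per-request timeout.
+type Pending = {
+  resolve: (value: unknown) => void;
+  reject: (error: Error) => void;
+  timer: ReturnType<typeof setTimeout>;
+};
+
+class RequestTracker {
+  private pending = new Map<number, Pending>();
+
+  register(id: number, timeoutMs = 30_000): Promise<unknown> {
+    return new Promise((resolve, reject) => {
+      const timer = setTimeout(() => {
+        this.pending.delete(id); // clean up before rejecting
+        reject(new Error(`Request ${id} timed out after ${timeoutMs}ms`));
+      }, timeoutMs);
+      this.pending.set(id, { resolve, reject, timer });
+    });
+  }
+
+  settle(response: { id: number; result?: unknown; error?: { message: string } }): void {
+    const entry = this.pending.get(response.id);
+    if (!entry) return; // unknown or already timed-out id
+    clearTimeout(entry.timer);
+    this.pending.delete(response.id);
+    if (response.error) entry.reject(new Error(response.error.message));
+    else entry.resolve(response.result);
+  }
+}
+```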
+
+## Process Lifecycle
+
+### Lifecycle States
+
+**starting:**
+- Process spawned with handlers attached
+- MCP handshake in progress
+- Accepts handshake messages only
+
+**running:**
+- Handshake completed successfully
+- Ready for JSON-RPC requests
+- Tools discovered and cached
+
+**terminating:**
+- Graceful shutdown initiated
+- Active requests cancelled
+- Awaiting process exit
+
+**terminated:**
+- Process exited
+- Removed from tracking maps
+
+**failed:**
+- Spawn or handshake failure
+- Not operational
+
+### Graceful Termination
+
+Termination follows a two-phase approach:
+
+1. **SIGTERM Phase**: Send graceful shutdown signal
+2. **SIGKILL Phase**: Force kill if timeout exceeded (default 10s)
+
+**Cleanup Operations:**
+- Cancel all active requests with rejection
+- Clear active requests map
+- Remove from tracking maps (by ID, by name, by team)
+- Emit 'processTerminated' event
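+
+The two-phase shutdown can be sketched as a standalone helper (hypothetical name; the 10-second default comes from the description above):
+
+```typescript
+import type { ChildProcess } from 'node:child_process';
+
+// Sketch: SIGTERM first, escalate to SIGKILL if the process outlives the timeout.
+function terminateGracefully(child: ChildProcess, timeoutMs = 10_000): Promise<void> {
+  return new Promise((resolve) => {
+    const forceKill = setTimeout(() => child.kill('SIGKILL'), timeoutMs);
+    child.once('exit', () => {
+      clearTimeout(forceKill);
+      resolve();
+    });
+    child.kill('SIGTERM');
+  });
+}
+```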
+
+## Auto-Restart System
+
+### Crash Detection
+
+The system detects crashes based on exit conditions:
+- Non-zero exit code
+- Process not in 'terminating' state
+- Unexpected signal termination
+
+### Restart Policy
+
+**Limits:**
+- Maximum 3 restart attempts in 5-minute window
+- After limit exceeded: Process marked 'permanently_failed' in RuntimeState
+
+**Backoff Delays:**
+- Process ran >60 seconds before crash: Immediate restart
+- Quick crashes: Exponential backoff (1s → 5s → 15s)
+
+**Restart Flow:**
+1. Detect crash with exit code and signal
+2. Check restart eligibility (3 attempts in 5 minutes)
+3. Apply backoff delay based on uptime
+4. Attempt restart via `spawnProcess()`
+5. Emit 'processRestarted' or 'restartLimitExceeded' event
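+
+The restart policy above can be sketched as a delay helper (illustrative; `restartDelayMs` is a hypothetical name):
+
+```typescript
+// Sketch: immediate restart after long uptime, exponential backoff for quick crashes.
+function restartDelayMs(uptimeMs: number, attempt: number): number {
+  if (uptimeMs > 60_000) return 0; // ran >60s before crashing: restart immediately
+  const backoff = [1_000, 5_000, 15_000]; // 1s → 5s → 15s for attempts 1-3
+  return backoff[Math.min(attempt - 1, backoff.length - 1)];
+}
+```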
+
+<Callout>
+**Permanently Failed State**: After 3 failed restart attempts, processes enter a permanently_failed state and are tracked separately for reporting. They will not be restarted automatically and require manual intervention.
+</Callout>
+
+## RuntimeState Integration
+
+RuntimeState maintains in-memory tracking of all MCP server processes:
+
+**Tracking Methods:**
+- By process ID (UUID)
+- By installation name (for lookups)
+- By team ID (for team-grouped operations)
+
+**RuntimeProcessInfo Fields:**
+- Extends ProcessInfo with: `installationId`, `installationName`, `teamId`
+- Health status: unknown/healthy/unhealthy
+- Last health check timestamp
+
+**Special Tracking:**
+- **Permanently Failed Map**: Separate storage for processes exceeding restart limits
+- **Team-Grouped Sets**: Map of team_id → Set of process IDs for heartbeat reporting
+
+**State Queries:**
+- Get all processes (includes permanently failed for reporting)
+- Get team processes (filter by team_id)
+- Get running team processes (status='running')
+- Get process count by status
+
+## Process Monitoring
+
+### Metrics Tracked
+
+Each process tracks operational metrics:
+- **Message count**: Total requests sent to process
+- **Error count**: Communication failures
+- **Last activity**: Timestamp of last message sent/received
+- **Uptime**: Calculated from start time
+- **Active requests**: Count of pending requests
+
+### Events Emitted
+
+The ProcessManager emits events for monitoring and integration:
+- `processSpawned`: New process started successfully
+- `processRestarted`: Process restarted after crash
+- `processTerminated`: Process shut down
+- `processExit`: Process exited (any reason)
+- `processError`: Spawn or runtime error
+- `serverNotification`: Notification received from MCP server
+- `restartLimitExceeded`: Max restart attempts reached
+- `restartFailed`: Restart attempt failed
+
+### Logging
+
+**stderr Handling:**
+- Logged at debug level (informational output, not errors)
+- MCP servers often write logs to stderr
+
+**stdout Parse Errors:**
+- Malformed JSON lines logged and skipped
+- Does not crash the process or satellite
+
+**Structured Logging:**
+- All operations include: `installation_name`, `installation_id`, `team_id`
+- Request tracking includes: `request_id`, `method`, `duration_ms`
+- Error context includes: error messages, exit codes, signals
+
+## Team Isolation
+
+### Installation Name Format
+
+Installation names follow strict format for team isolation:
+```
+{server_slug}-{team_slug}-{installation_id}
+```
+
+**Examples:**
+- `filesystem-john-R36no6FGoMFEZO9nWJJLT`
+- `context7-alice-S47mp8GHpNGFZP0oWKKMU`
+
+### Team Access Validation
+
+TeamIsolationService provides:
+- `extractTeamInfo()`: Parse installation name into components
+- `validateTeamAccess()`: Ensure request team matches process team
+- `isValidInstallationName()`: Validate name format
+
+**Team-Specific Features:**
+- RuntimeState groups processes by team_id
+- nsjail uses team-specific hostname: `mcp-{team_id}`
+- Heartbeat reports processes grouped by team
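+
+The name parsing can be sketched as follows (assumed rule: the first two segments are the server and team slugs, the remainder is the installation id; the real validation may be stricter):
+
+```typescript
+// Sketch of extractTeamInfo (assumed parsing logic, illustrative only).
+function extractTeamInfo(installationName: string) {
+  const parts = installationName.split('-');
+  if (parts.length < 3) return null; // must have server, team, and id segments
+  const [serverSlug, teamSlug, ...idParts] = parts;
+  return { serverSlug, teamSlug, installationId: idParts.join('-') };
+}
+```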
+
+## Performance Characteristics
+
+**Timing:**
+- Spawn time: 1-3 seconds (includes handshake and tool discovery)
+- Message latency: ~10-50ms for stdio communication
+- Handshake timeout: 30 seconds
+
+**Resource Usage:**
+- Memory per process: Base ~10-20MB (application-dependent, limited to 50MB in production)
+- Event-driven architecture: Handles multiple processes concurrently
+- CPU overhead: Minimal (background event loop processing)
+
+**Scalability:**
+- No hard limit on process count (bounded by system resources)
+- Team-grouped tracking enables efficient filtering
+- Permanent failure tracking prevents infinite restart loops
+
+## Development & Testing
+
+### Local Development
+
+**Development Mode:**
+- Uses direct spawn (no nsjail required)
+- Works on macOS, Windows, Linux
+- Full environment inheritance simplifies debugging
+
+**Debug Logging:**
+```bash
+# Enable detailed stdio communication logs
+LOG_LEVEL=debug npm run dev
+```
+
+### Testing Processes
+
+**Manual Testing Methods:**
+- `getAllProcesses()`: Inspect all active processes
+- `getServerStatus(installationName)`: Get detailed process status
+- `restartServer(installationName)`: Test restart functionality
+- `terminateProcess(processInfo)`: Test graceful shutdown
+
+**Platform Support:**
+- Development: All platforms (macOS/Windows/Linux)
+- Production: Linux only (nsjail requirement)
+
+## Security Considerations
+
+**Environment Injection:**
+- Credentials passed securely via environment variables
+- No credentials stored in process arguments or logs
+
+**Resource Limits (Production):**
+- nsjail enforces hard limits: 50MB RAM, 60s CPU, one process
+- Prevents resource exhaustion attacks
+
+**Namespace Isolation (Production):**
+- Complete process isolation per team
+- Separate PID, mount, UTS, IPC namespaces
+
+**Filesystem Jailing (Production):**
+- System directories mounted read-only
+- Only `/tmp` writable
+- Prevents filesystem tampering
+
+**Network Access:**
+- Enabled by default (MCP servers need external connectivity)
+- Can be disabled for higher security requirements
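+
+Taken together, the production sandbox roughly corresponds to an nsjail invocation like the one below. The flags shown are real nsjail options, but this exact combination is an illustrative assumption rather than the shipped configuration:
+
+```bash
+# Illustrative sketch only -- flag names are real nsjail options, but the
+# exact production flag set is an assumption.
+nsjail \
+  --hostname "mcp-${TEAM_ID}" \
+  --rlimit_as 50 \
+  --time_limit 60 \
+  --rlimit_nproc 1 \
+  --bindmount_ro /usr \
+  --bindmount /tmp \
+  -- /usr/bin/node server.js
+```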
+
+## Related Documentation
+
+- [Satellite Architecture Design](/development/satellite/architecture) - Overall system architecture
+- [Tool Discovery Implementation](/development/satellite/tool-discovery) - How tools are discovered from processes
+- [Team Isolation Implementation](/development/satellite/team-isolation) - Team-based access control
+- [Backend Communication](/development/satellite/backend-communication) - Integration with Backend commands
diff --git a/docs/development/satellite/tool-discovery.mdx b/docs/development/satellite/tool-discovery.mdx
index ee055dd..814c0d7 100644
--- a/docs/development/satellite/tool-discovery.mdx
+++ b/docs/development/satellite/tool-discovery.mdx
@@ -1,6 +1,6 @@
---
title: Tool Discovery Implementation
-description: Technical implementation of remote MCP server tool discovery in DeployStack Satellite - architecture, components, and development patterns.
+description: Technical implementation of MCP server tool discovery in DeployStack Satellite - unified architecture supporting both HTTP/SSE remote servers and stdio subprocess servers.
sidebar: Satellite Development
---
@@ -8,412 +8,309 @@ import { Callout } from 'fumadocs-ui/components/callout';
# Tool Discovery Implementation
-DeployStack Satellite implements automatic tool discovery from MCP servers, providing dynamic tool availability without manual configuration. This system enables MCP clients to discover and execute tools through the satellite's unified interface.
+DeployStack Satellite implements automatic tool discovery from MCP servers across both HTTP/SSE remote endpoints and stdio subprocess servers. This unified system provides dynamic tool availability without manual configuration, enabling MCP clients to discover and execute tools through the satellite's interface.
-**Current Implementation**: Tool discovery currently supports HTTP/SSE remote MCP servers only. Future implementation will add stdio tool discovery from locally spawned MCP server processes (see Phase 2 in [Architecture](/development/satellite/architecture)). Both transport types will use the same caching and namespacing approach.
+**Current Implementation**: Tool discovery fully supports both HTTP/SSE remote MCP servers and stdio subprocess servers through a unified architecture. The `UnifiedToolDiscoveryManager` coordinates discovery across both transport types, merging tools into a single cache for seamless client access.
For information about the overall satellite architecture, see [Satellite Architecture Design](/development/satellite/architecture). For details about the MCP transport protocols that expose discovered tools, see [MCP Transport Protocols](/development/satellite/mcp-transport).
-## Current Implementation: HTTP Tool Discovery
-
-This document describes the current HTTP-based tool discovery system. The same architectural patterns will be extended to support stdio transport in the future.
-
## Technical Overview
-### Discovery Architecture
+### Unified Discovery Architecture
-Tool discovery operates as a startup-time process that queries configured remote MCP servers, caches discovered tools in memory, and exposes them through the satellite's MCP transport layer:
+Tool discovery operates through three coordinated managers that handle different transport types and merge results:
```
┌─────────────────────────────────────────────────────────────────────────────────┐
-│ Tool Discovery Architecture │
+│ Unified Tool Discovery Architecture │
│ │
-│ ┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐ │
-│ │ Remote Tool │ │ HTTP Proxy │ │ MCP Protocol │ │
-│ │ Discovery Mgr │ │ Manager │ │ Handler │ │
-│ │ │ │ │ │ │ │
-│ │ • Startup Query │ │ • Server Config │ │ • tools/list │ │
-│ │ • In-Memory │ │ • SSE Parsing │ │ • tools/call │ │
-│ │ Cache │ │ • Header Mgmt │ │ • Namespacing │ │
-│ │ • Tool Mapping │ │ • Error Handle │ │ • Route Proxy │ │
-│ └─────────────────┘ └─────────────────┘ └─────────────────┘ │
-│ │
-│ ┌─────────────────────────────────────────────────────────────────────────┐ │
-│ │ Discovery Data Flow │ │
-│ │ │ │
-│ │ Startup → Query Servers → Parse Tools → Cache → Namespace → Expose │ │
-│ │ │ │ │ │ │ │ │ │
-│ │ Config HTTP POST SSE Parse Memory Prefix MCP API │ │
-│ └─────────────────────────────────────────────────────────────────────────┘ │
+│ ┌─────────────────────────────────────────────────────────────────┐ │
+│ │ UnifiedToolDiscoveryManager │ │
+│ │ │ │
+│ │ • Coordinates both HTTP and stdio discovery │ │
+│ │ • Merges tools from both managers │ │
+│ │ • Single interface for MCP clients │ │
+│ └─────────────────────────────────────────────────────────────────┘ │
+│ │ │ │
+│ ▼ ▼ │
+│ ┌───────────────────────────┐ ┌───────────────────────────┐ │
+│ │ RemoteToolDiscoveryManager │ │ StdioToolDiscoveryManager │ │
+│ │ │ │ │ │
+│ │ • HTTP/SSE servers │ │ • stdio subprocesses │ │
+│ │ • Startup discovery │ │ • Post-spawn discovery │ │
+│ │ • SSE parsing │ │ • JSON-RPC over stdin/out │ │
+│ │ • Static configuration │ │ • Process lifecycle aware │ │
+│ └───────────────────────────┘ └───────────────────────────┘ │
└─────────────────────────────────────────────────────────────────────────────────┘
```
### Core Components
-**RemoteToolDiscoveryManager:**
-- Queries remote MCP servers during satellite startup
-- Parses Server-Sent Events responses from external servers
-- Maintains in-memory cache of discovered tools with metadata
-- Provides namespaced tool access for conflict resolution
+**UnifiedToolDiscoveryManager:**
+- Coordinates discovery across HTTP/SSE and stdio transport types
+- Merges tools from both managers into unified cache
+- Routes discovery requests based on transport type
+- Provides single interface for MCP protocol handlers
-**HTTP Proxy Manager:**
-- Manages HTTP connections to external MCP servers
-- Handles server-specific headers and authentication
-- Processes both JSON and SSE response formats
-- Routes tool execution requests to appropriate servers
+**RemoteToolDiscoveryManager:**
+- Queries remote HTTP/SSE MCP servers during startup
+- Parses Server-Sent Events responses
+- Maintains in-memory cache with namespacing
+- Handles differential configuration updates
-**MCP Protocol Handler:**
-- Integrates cached tools into MCP transport layer
-- Handles tools/list requests with discovered tool metadata
-- Routes tools/call requests to correct remote servers
-- Manages tool name parsing and server resolution
+**StdioToolDiscoveryManager:**
+- Discovers tools from stdio subprocess MCP servers
+- Executes discovery after process spawn and handshake
+- Automatically clears tools on process termination
+- Tracks tools by server with namespacing
-## Discovery Process
+## Discovery Process by Transport Type
-### Startup Sequence
+### HTTP/SSE Discovery (Startup)
-Tool discovery executes during satellite initialization after HTTP Proxy Manager setup:
+Remote HTTP/SSE servers are discovered during satellite initialization:
```
-Server Start → Backend Connect → HTTP Proxy Init → Tool Discovery → Route Registration
- │ │ │ │ │
- Validate Test Conn Server Config Query Tools MCP Endpoints
- Config Required Load Enabled Parse Cache Ready to Serve
+Startup → Config Load → HTTP Servers → Query tools/list → Cache Tools → Ready
+ │ │ │ │ │ │
+ Init Enabled Only POST Request Parse Response Namespace Serve
```
-**Initialization Steps:**
-1. **HTTP Proxy Manager** loads server configurations from `mcp-servers.ts`
-2. **RemoteToolDiscoveryManager** queries each enabled server with `tools/list`
-3. **SSE Response Parsing** extracts tool definitions from Server-Sent Events
-4. **In-Memory Caching** stores tools with server association and namespacing
-5. **MCP Integration** exposes cached tools through transport endpoints
+**HTTP Discovery Flow:**
+1. Load enabled HTTP/SSE servers from dynamic configuration
+2. Query each server with `tools/list` JSON-RPC request
+3. Parse SSE or JSON responses
+4. Cache tools with namespacing (`server_slug-tool_name`)
+5. Expose through MCP transport endpoints
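+
+The SSE parsing in step 3 follows the standard event-stream shape: split the body by newlines, find the `data: ` field, and parse its payload as a JSON-RPC response. A minimal sketch (the actual parser in the HTTP Proxy Manager may handle more cases):
+
+```typescript
+// Minimal sketch of SSE response parsing for a tools/list response.
+function parseSSEResponse(sseText: string): unknown {
+  for (const line of sseText.split('\n')) {
+    if (line.startsWith('data: ')) {
+      // Extract and parse the JSON-RPC payload carried in the data field
+      return JSON.parse(line.slice('data: '.length));
+    }
+  }
+  throw new Error('No data field found in SSE response');
+}
+
+const sse = 'event: message\ndata: {"jsonrpc":"2.0","id":"1","result":{"tools":[]}}\n';
+const parsed = parseSSEResponse(sse) as { result: { tools: unknown[] } };
+console.log(parsed.result.tools.length); // → 0
+```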
-### Server Configuration
+### stdio Discovery (Post-Spawn)
-Remote MCP servers are configured in `services/satellite/src/config/mcp-servers.ts`:
+stdio subprocess servers are discovered after process spawning:
-```typescript
-servers: {
- 'context7': {
- name: 'context7',
- type: 'http',
- url: 'https://mcp.context7.com/mcp',
- enabled: true,
- headers: {
- 'Accept': 'application/json, text/event-stream'
- }
- }
-}
+```
+Process Spawn → Handshake → Running → Discover Tools → Cache → Auto-Cleanup
+ │ │ │ │ │ │
+ Backend Cmd Initialize Status tools/list Namespace On Exit
```
-**Configuration Properties:**
-- **name**: Server identifier for namespacing and routing
-- **type**: Transport type (currently 'http' only)
-- **url**: Remote MCP server endpoint URL
-- **enabled**: Boolean flag for server activation
-- **headers**: Custom HTTP headers for server compatibility
-
-### Discovery Query Process
+**stdio Discovery Flow:**
+1. Process spawned via Backend command
+2. MCP handshake completes (initialize + initialized)
+3. Discovery triggered automatically after handshake
+4. Tools cached with namespacing (`server_slug-tool_name`)
+5. Tools cleared automatically on process termination
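+
+The `tools/list` request in step 3 is a standard JSON-RPC 2.0 message written to the subprocess's stdin. The method name is standard MCP; the id scheme and newline-delimited framing shown here are assumptions for illustration:
+
+```typescript
+// Sketch of the JSON-RPC message used for stdio tool discovery.
+const toolsListRequest = {
+  jsonrpc: '2.0' as const,
+  id: 'discovery-1', // id scheme assumed for illustration
+  method: 'tools/list',
+  params: {},
+};
+
+// Over the stdio transport the message is serialized and newline-terminated:
+const wireFormat = JSON.stringify(toolsListRequest) + '\n';
+console.log(wireFormat.trim());
+```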
-The discovery manager queries each enabled server using standard MCP protocol:
+### Discovery Timing Differences
-```
-Discovery Manager Remote MCP Server
- │ │
- │──── POST /mcp ─────────────▶│ (tools/list request)
- │ │
- │◀─── SSE Response ──────────│ (Tool definitions)
- │ │
- │──── Parse Tools ───────────│ (Extract metadata)
- │ │
- │──── Cache Results ─────────│ (Store in memory)
-```
+**HTTP/SSE (Eager):**
+- Discovered at startup before serving requests
+- All HTTP tools available immediately
+- Configuration changes trigger rediscovery
-**Query Specifications:**
-- **Method**: HTTP POST with JSON-RPC 2.0 payload
-- **Headers**: Server-specific headers from configuration
-- **Timeout**: 45 seconds for documentation servers
-- **Response**: Server-Sent Events or JSON format
-- **Error Handling**: Graceful failure with logging
+**stdio (Lazy):**
+- Discovered after process spawn completes
+- Tools become available post-handshake
+- Process termination removes tools automatically
## Tool Caching Strategy
-### In-Memory Cache Design
+### Unified Cache Design
-Tools are cached in memory during startup for performance and reliability:
+Both transport types use identical caching and namespacing:
```typescript
-interface CachedTool {
- serverName: string; // Source server identifier
+interface UnifiedCachedTool {
+ serverName: string; // Installation name
originalName: string; // Tool name from server
- namespacedName: string; // Prefixed name (server-toolname)
+ namespacedName: string; // server_slug-tool_name
description: string; // Tool description
- inputSchema: object; // JSON Schema for parameters
+ inputSchema: object; // JSON Schema
+ transport: 'stdio' | 'http'; // Transport type for routing
+ discoveredAt?: Date; // Discovery timestamp (HTTP only)
}
```
**Cache Characteristics:**
-- **Startup Population**: Tools loaded once during initialization
-- **Memory Storage**: No persistent storage or database dependency
-- **Namespace Prefixing**: Prevents tool name conflicts between servers
-- **Metadata Preservation**: Complete tool definitions with schemas
+- **Unified Namespace**: Same format across both transport types
+- **Memory Storage**: No persistent storage or database
+- **Automatic Cleanup**: stdio tools removed on process exit
+- **Conflict Prevention**: server_slug ensures unique names
### Namespacing Strategy
-Tools are namespaced using server_slug for user-friendly names:
+Both HTTP and stdio tools use identical namespacing:
```
-Original Tool Name: "resolve-library-id"
-Server Slug: "context7"
-Namespaced Name: "context7-resolve-library-id"
-Internal Server: "context7-john-R36no6FGoMFEZO9nWJJLT"
+HTTP Tool Example:
+ Server Slug: "context7"
+ Original: "resolve-library-id"
+ Namespaced: "context7-resolve-library-id"
+
+stdio Tool Example:
+ Server Slug: "filesystem" (extracted from "filesystem-john-abc123")
+ Original: "read_file"
+ Namespaced: "filesystem-read_file"
```
**Namespacing Rules:**
- **Format**: `{server_slug}-{originalToolName}`
-- **Separator**: Single hyphen character
-- **User Display**: Friendly names using server_slug from configuration
-- **Internal Routing**: Uses full server name for team isolation
-- **Uniqueness**: Guaranteed unique names across all servers
+- **HTTP**: Uses `server_slug` from configuration
+- **stdio**: Extracts slug from installation name
+- **Routing**: Internal server names used for team isolation
+- **User Display**: Friendly namespaced names shown to clients
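+
+The rules above can be sketched as two small helpers. The slug-extraction rule (first segment of the installation name) is an assumption based on the `"filesystem-john-abc123"` example:
+
+```typescript
+// Sketch of the namespacing scheme described above.
+function extractServerSlug(installationName: string): string {
+  // "filesystem-john-abc123" -> "filesystem"
+  const dashIndex = installationName.indexOf('-');
+  return dashIndex === -1 ? installationName : installationName.substring(0, dashIndex);
+}
+
+function namespaceTool(serverSlug: string, originalToolName: string): string {
+  return `${serverSlug}-${originalToolName}`;
+}
+
+console.log(namespaceTool('context7', 'resolve-library-id')); // → context7-resolve-library-id
+console.log(namespaceTool(extractServerSlug('filesystem-john-abc123'), 'read_file')); // → filesystem-read_file
+```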
For team-based server resolution, see [Team Isolation Implementation](/development/satellite/team-isolation).
-## SSE Response Processing
+## Configuration Management
-### Server-Sent Events Parsing
+### Dynamic Configuration Updates
-Many MCP servers return responses in SSE format requiring specialized parsing:
+The unified manager handles configuration changes intelligently:
-```
-HTTP Response:
-Content-Type: text/event-stream
-
-event: message
-data: {"jsonrpc":"2.0","id":"1","result":{"tools":[...]}}
+**Differential Updates:**
+- Only discovers tools for added/modified servers
+- Preserves tools for unchanged servers
+- Removes tools for deleted servers
+- Minimizes network overhead and latency
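+
+A differential update reduces to a set comparison between the current cache and the incoming configuration. The sketch below illustrates the reconciliation logic; the `ServerConfig` shape and change detection by URL are simplifying assumptions:
+
+```typescript
+// Sketch of differential configuration reconciliation: only added or
+// modified servers are rediscovered, unchanged servers keep their tools.
+interface ServerConfig { name: string; url: string; }
+
+function diffConfigs(
+  current: Map<string, ServerConfig>,
+  incoming: ServerConfig[],
+): { addedOrModified: ServerConfig[]; removed: string[]; unchanged: string[] } {
+  const incomingNames = new Set(incoming.map((s) => s.name));
+  const addedOrModified = incoming.filter(
+    (s) => !current.has(s.name) || current.get(s.name)!.url !== s.url,
+  );
+  const removed = [...current.keys()].filter((name) => !incomingNames.has(name));
+  const unchanged = incoming
+    .filter((s) => current.has(s.name) && current.get(s.name)!.url === s.url)
+    .map((s) => s.name);
+  return { addedOrModified, removed, unchanged };
+}
+
+const current = new Map([
+  ['context7', { name: 'context7', url: 'https://mcp.context7.com/mcp' }],
+]);
+const result = diffConfigs(current, [
+  { name: 'context7', url: 'https://mcp.context7.com/mcp' },
+  { name: 'newserver', url: 'https://example.com/mcp' },
+]);
+console.log(result.addedOrModified.map((s) => s.name)); // → [ 'newserver' ]
+```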
-```
-
-**Parsing Implementation:**
-- **Line Processing**: Split response by newlines
-- **Data Extraction**: Extract content after `data: ` prefix
-- **JSON Parsing**: Parse extracted data as JSON-RPC response
-- **Error Handling**: Graceful failure for malformed responses
-
-### Response Format Handling
-
-The HTTP Proxy Manager handles both JSON and SSE response formats:
-
-```typescript
-const contentType = response.headers.get('content-type') || '';
-
-if (contentType.includes('text/event-stream')) {
- const sseText = await response.text();
- responseData = this.parseSSEResponse(sseText);
-} else {
- responseData = await response.json();
-}
-```
-
-**Format Detection:**
-- **Content-Type Header**: Determines response format
-- **SSE Processing**: Custom parser for event-stream responses
-- **JSON Fallback**: Standard JSON parsing for regular responses
-- **Error Recovery**: Handles parsing failures gracefully
+**Configuration Sources:**
+- HTTP/SSE: Static configuration from Backend polling
+- stdio: Dynamic spawning via Backend commands
+- Both: Support the three-tier configuration system
## Tool Execution Flow
-### Request Routing
+### Transport-Aware Routing
-Tool execution requests are routed through the discovery system:
+Tool execution routes to the correct transport based on discovery:
```
-MCP Client Satellite Remote Server
- │ │ │
- │──── tools/call ──────────▶│ │
- │ (context7-resolve...) │ │
- │ │──── Parse Name ────────────│
- │ │ (server: context7) │
- │ │ (tool: resolve...) │
- │ │ │
- │ │──── POST /mcp ─────────────▶│
- │ │ (resolve-library-id) │
- │ │ │
- │ │◀─── SSE Response ──────────│
- │ │ │
- │◀─── JSON Response ───────│ │
+MCP Client → tools/call → Parse Name → Lookup Tool → Route by Transport
+ │ │ │ │ │
+ Request Namespaced Extract Slug Get Cache stdio/HTTP/SSE
```
-**Routing Process:**
-1. **Name Parsing**: Extract server name and tool name from namespaced request
-2. **Server Resolution**: Locate target server configuration
-3. **Request Translation**: Convert namespaced call to original tool name
-4. **Proxy Execution**: Forward request to remote server
-5. **Response Processing**: Parse SSE response and return to client
-
-### Tool Name Resolution
+**HTTP Transport:**
+- Routes to remote HTTP/SSE endpoint
+- Uses HTTP Proxy Manager
+- Handles SSE streaming responses
-The MCP Protocol Handler parses namespaced tool names for routing:
-
-```typescript
-const dashIndex = namespacedToolName.indexOf('-');
-const serverName = namespacedToolName.substring(0, dashIndex);
-const originalToolName = namespacedToolName.substring(dashIndex + 1);
-```
-
-**Resolution Logic:**
-- **First Hyphen**: Separates server name from tool name
-- **Server Lookup**: Validates server exists and is enabled
-- **Tool Validation**: Confirms tool exists in cache
-- **Error Handling**: Returns descriptive errors for invalid requests
+**stdio Transport:**
+- Routes to local subprocess
+- Uses ProcessManager JSON-RPC
+- Communicates over stdin/stdout
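+
+Dispatch reduces to a lookup on the `transport` field recorded at discovery time. A minimal sketch — the handler signatures are assumptions, not the satellite's actual API:
+
+```typescript
+// Sketch of transport-aware routing keyed on the cached transport field.
+type Transport = 'stdio' | 'http';
+
+interface CachedToolEntry {
+  namespacedName: string;
+  transport: Transport;
+}
+
+function routeToolCall(
+  tool: CachedToolEntry,
+  handlers: Record<Transport, (t: CachedToolEntry) => string>,
+): string {
+  // Look up the handler for the tool's transport and dispatch
+  return handlers[tool.transport](tool);
+}
+
+const handlers = {
+  http: (t: CachedToolEntry) => `proxy ${t.namespacedName} via HTTP Proxy Manager`,
+  stdio: (t: CachedToolEntry) => `send ${t.namespacedName} over stdin/stdout`,
+};
+
+console.log(
+  routeToolCall({ namespacedName: 'context7-resolve-library-id', transport: 'http' }, handlers),
+); // → proxy context7-resolve-library-id via HTTP Proxy Manager
+```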
## Error Handling & Recovery
### Discovery Failures
-Tool discovery implements graceful failure handling:
-
-```
-Server Unreachable → Log Warning → Continue with Other Servers
-Parse Error → Log Details → Skip Malformed Tools
-Timeout → Log Timeout → Mark Server as Failed
-```
+Both managers implement graceful failure handling:
-**Failure Scenarios:**
-- **Network Errors**: Server unreachable or connection timeout
-- **Protocol Errors**: Invalid JSON-RPC responses or malformed data
-- **Parsing Errors**: SSE format issues or JSON parsing failures
-- **Configuration Errors**: Invalid server URLs or missing headers
+**HTTP Discovery:**
+- Server unreachable → Skip and continue
+- Parse errors → Log and skip malformed tools
+- Timeout → Mark server as failed
-### Runtime Error Recovery
+**stdio Discovery:**
+- Process not running → Error with status check
+- No tools returned → Empty array (valid response)
+- Communication failure → Process restart logic
-During tool execution, errors are handled at multiple levels:
+### Automatic Cleanup
-**HTTP Proxy Level:**
-- Connection failures with retry logic
-- Response parsing errors with fallback
-- Timeout handling with configurable limits
+stdio tools are automatically managed:
-**MCP Protocol Level:**
-- Invalid tool names with descriptive errors
-- Server resolution failures with available tool lists
-- JSON-RPC error propagation from remote servers
+**Process Lifecycle:**
+- **Spawn**: Tools discovered after handshake
+- **Running**: Tools available for execution
+- **Terminate**: Tools removed from cache automatically
## Development Considerations
-### Configuration Management
-
-Server configurations support environment variable substitution:
-
-```typescript
-headers: {
- 'Authorization': 'Bearer ${API_TOKEN}',
- 'Accept': 'application/json, text/event-stream'
-}
-```
-
-**Environment Processing:**
-- **Variable Substitution**: `${VAR_NAME}` replaced with environment values
-- **Missing Variables**: Warnings logged for undefined variables
-- **Security**: Sensitive tokens loaded from environment
-
### Debugging Support
-Comprehensive logging supports development and troubleshooting:
+The debug endpoint shows tools from both transport types:
-```
-[2025-09-10 16:04:40.695] INFO: Returning 2 cached tools from remote MCP servers
- component: "McpProtocolHandler"
- operation: "mcp_tools_list_success"
- tool_count: 2
- tools: ["context7-resolve-library-id", "context7-get-library-docs"]
+```bash
+curl http://localhost:3001/api/status/debug
```
-**Logging Categories:**
-- **Discovery Operations**: Server queries and tool caching
-- **Request Routing**: Tool name parsing and server resolution
-- **Response Processing**: SSE parsing and error handling
-- **Performance Metrics**: Response times and cache statistics
+**Debug Information:**
+- Tools grouped by transport type (HTTP/stdio)
+- Tools grouped by server name
+- Discovery statistics for both managers
+- Process status for stdio servers
-### Testing Strategies
-
-Tool discovery can be tested at multiple levels:
-
-**Unit Testing:**
-- SSE response parsing with various formats
-- Tool namespacing and name resolution logic
-- Configuration loading and validation
+
+<Callout type="warn">
+**Security Notice**: The debug endpoint exposes detailed system information. Disable it in production with `DEPLOYSTACK_STATUS_SHOW_MCP_DEBUG_ROUTE=false`.
+</Callout>
+
-**Integration Testing:**
-- End-to-end tool discovery with mock servers
-- MCP protocol compliance with real clients
-- Error handling with network failures
+### Testing Strategies
-**Manual Testing:**
+**Unified Testing:**
```bash
-# Test tool discovery
+# Test tool listing (shows both HTTP and stdio tools)
curl -X POST http://localhost:3001/mcp \
-H "Content-Type: application/json" \
-d '{"jsonrpc":"2.0","id":"1","method":"tools/list","params":{}}'
-# Test tool execution
+# Test HTTP tool execution
curl -X POST http://localhost:3001/mcp \
-H "Content-Type: application/json" \
-d '{"jsonrpc":"2.0","id":"2","method":"tools/call","params":{"name":"context7-resolve-library-id","arguments":{"libraryName":"react"}}}'
+
+# Test stdio tool execution
+curl -X POST http://localhost:3001/mcp \
+ -H "Content-Type: application/json" \
+ -d '{"jsonrpc":"2.0","id":"3","method":"tools/call","params":{"name":"filesystem-read_file","arguments":{"path":"/tmp/test.txt"}}}'
```
## Performance Characteristics
-### Startup Performance
-
-Tool discovery adds minimal startup overhead:
-
-- **Discovery Time**: 2-5 seconds for typical server configurations
-- **Memory Usage**: ~1KB per discovered tool in cache
-- **Network Overhead**: Single HTTP request per configured server
-- **Failure Impact**: Individual server failures don't block startup
+### HTTP/SSE Performance
-### Runtime Performance
+- **Discovery Time**: 2-5 seconds at startup
+- **Memory**: ~1KB per tool
+- **Overhead**: Single HTTP request per server
+- **Caching**: Persistent until configuration change
-Cached tools provide optimal runtime performance:
+### stdio Performance
-- **Tool Listing**: O(1) memory lookup for tools/list requests
-- **Tool Execution**: Single HTTP proxy request to remote server
-- **No Database**: Eliminates database queries for tool metadata
-- **Memory Efficiency**: Minimal memory footprint for tool cache
+- **Discovery Time**: 1-2 seconds post-spawn
+- **Memory**: ~1KB per tool
+- **Overhead**: Single JSON-RPC request per process
+- **Caching**: Automatic cleanup on process exit
-### Scalability Considerations
+### Scalability
-The current implementation scales well for typical usage:
-
-- **Server Limit**: No hard limit on configured servers
-- **Tool Limit**: Memory-bound by available system RAM
-- **Concurrent Requests**: Limited by HTTP proxy connection pool
-- **Cache Invalidation**: Requires restart for configuration changes
+**Combined Limits:**
+- No hard server limit for either transport
+- Memory-bound by total tool count
+- HTTP: Limited by network connection pool
+- stdio: Limited by system process limits
-**Implementation Status**: Tool discovery is fully implemented and operational. The system successfully discovers tools from remote HTTP MCP servers, caches them in memory, and exposes them through both standard HTTP and SSE streaming transport protocols.
+**Implementation Status**: Tool discovery is fully operational for both HTTP/SSE remote servers and stdio subprocess servers. The unified manager successfully coordinates discovery, merges tools, and routes execution requests to the appropriate transport.
## Future Enhancements
-### Dynamic Discovery
-
-Planned enhancements for production deployment:
+### Dynamic Capabilities
-- **Runtime Refresh**: Periodic tool discovery without restart
-- **Configuration Hot-Reload**: Dynamic server configuration updates
-- **Health Monitoring**: Automatic server availability checking
-- **Cache Persistence**: Optional disk-based cache for faster startup
+**Planned Features:**
+- Runtime refresh for HTTP servers without restart
+- Configuration hot-reload for both transport types
+- Health monitoring with automatic server detection
+- Tool versioning support
### Advanced Features
-Additional capabilities under consideration:
-
-- **Tool Versioning**: Support for versioned tool definitions
-- **Load Balancing**: Distribute requests across multiple server instances
-- **Circuit Breakers**: Automatic failure detection and recovery
-- **Metrics Collection**: Detailed usage and performance analytics
+**Under Consideration:**
+- Load balancing across multiple server instances
+- Circuit breakers for automatic failure recovery
+- Detailed usage and performance analytics
+- Cache persistence for faster startup (HTTP only)
-The tool discovery implementation provides a solid foundation for dynamic MCP server integration while maintaining simplicity and reliability for development and production use.
+The unified tool discovery implementation provides a solid foundation for multi-transport MCP server integration while maintaining simplicity and reliability for development and production use.