diff --git a/AGENTS.md b/AGENTS.md index 6e9aeb023..140afc1d4 100644 --- a/AGENTS.md +++ b/AGENTS.md @@ -27,6 +27,9 @@ - Prefer file-scoped changes over project-wide modifications when possible - Always review your own code for consistency, maintainability, and testability - Always ask for clarifications if the request is ambiguous or lacks sufficient context +- Write transport-agnostic commands that work in both stdio and HTTP modes +- Keep commands stateless and thread-safe for multi-user remote scenarios +- Test commands with different RBAC permissions for OBO scenarios ## Don't - Use `subscriptionId` parameter name @@ -41,6 +44,11 @@ - Skip error handling or comprehensive tests - Use dashes in command group names (use concatenated lowercase) - Make project-wide changes when file-scoped changes suffice +- Check transport type in commands (stdio vs HTTP) +- Store per-request state in command instance fields +- Access HttpContext directly from commands +- Make transport-specific decisions in command logic +- Assume single-user scenarios when implementing services ## Commands @@ -593,6 +601,63 @@ All new toolsets must be AOT-compatible or excluded from native builds: - Implement `BaseAzureResourceService` for efficient Resource Graph queries - Follow retry policy patterns with `RetryPolicyOptions` +## Remote MCP Server Architecture + +Azure MCP Server supports **stdio** (local) and **HTTP** (remote) transports with different authentication models. + +### Key Differences: Stdio vs Remote HTTP + +| Aspect | Stdio Mode | Remote HTTP Mode | +|--------|-----------|------------------| +| **Concurrency** | Single user | Multiple concurrent users | +| **State Management** | Can use instance fields | Must be stateless | +| **Deployment** | Local binaries | Cloud hosting (App Service, AKS) | +| **Configuration** | Simple (no auth) | Requires Entra ID app registration | + +### Authentication Strategies + +**On-Behalf-Of (OBO) Flow:** +- Per-user authorization with audit trails +- User's RBAC permissions enforced +- Requires API permissions and admin consent +- Command: `--run-as-remote-http-service --outgoing-auth-strategy UseOnBehalfOf` + +**Hosting Environment Identity:** +- Service-level permissions using Managed Identity +- Simpler configuration, no token exchange overhead +- All users share server's permissions +- Command: `--run-as-remote-http-service --outgoing-auth-strategy UseHostingEnvironmentIdentity` + +### Command Implementation for Remote Mode + +**Critical Requirements:** +- Write transport-agnostic commands (work in both stdio and HTTP modes) +- Use `IAzureTokenCredentialProvider` for all Azure authentication +- Keep commands stateless and thread-safe (no instance field state) +- Test with different RBAC permissions for OBO scenarios +- Provide context-aware error messages for remote scenarios + +**Key Patterns:** +```csharp +// ✅ Correct: Authentication provider handles both modes +var credential = await GetCredentialAsync(null, CancellationToken.None); +var armClient = new ArmClient(credential); + +// ❌ Wrong: Don't check transport type or access HttpContext +if (Environment.GetEnvironmentVariable("ASPNETCORE_URLS") != null) { } +var httpContext = _httpContextAccessor.HttpContext; +``` + +### Security Best Practices + +1. Always use HTTPS in production +2. Implement least privilege RBAC +3. Use OBO for multi-tenant scenarios (preserves user identity) +4. Secure configuration secrets with Azure Key Vault +5. Enable Application Insights for monitoring +6. Validate token claims (audience, issuer, scopes) +7. Use Managed Identity when possible + ## External MCP Server Integration The Azure MCP Server can proxy to external MCP servers via `registry.json`: diff --git a/servers/Azure.Mcp.Server/TROUBLESHOOTING.md b/servers/Azure.Mcp.Server/TROUBLESHOOTING.md index bec5a7a83..e7c1c07dc 100644 --- a/servers/Azure.Mcp.Server/TROUBLESHOOTING.md +++ b/servers/Azure.Mcp.Server/TROUBLESHOOTING.md @@ -823,9 +823,90 @@ On Windows, Azure CLI stores credentials in an encrypted format that cannot be a > > SSE was deprecated in MCP `2025-03-26` due to [security vulnerabilities and architectural limitations](https://blog.fka.dev/blog/2025-06-06-why-mcp-deprecated-sse-and-go-with-streamable-http/). Users must discontinue use of SSE transport mode and upgrade to version `0.4.0` or newer to maintain compatibility with current MCP clients. +The Azure MCP Server supports remote hosting over HTTP using the **Streamable HTTP transport protocol**. For detailed configuration and deployment guidance, see the [Remote MCP Server section in CONTRIBUTING.md](https://github.com/microsoft/mcp/blob/main/CONTRIBUTING.md#remote-mcp-server-streamable-http-transport). + ### Streamable HTTP Transport -See the [contributing guide](https://github.com/microsoft/mcp/blob/main/CONTRIBUTING.md#run-the-azure-mcp-server-in-http-mode) for running the Azure MCP server with Streamable HTTP Transport. +- See the [contributing guide](https://github.com/microsoft/mcp/blob/main/CONTRIBUTING.md#run-the-azure-mcp-server-in-http-mode) for running the Azure MCP server with Streamable HTTP Transport. + +### Common Issues + +#### 401 Unauthorized - Invalid Token + +**Causes:** Token expired, wrong audience, or missing bearer token. + +**Resolution:** +1. Verify token acquisition: + ```bash + az account get-access-token --resource api:// + ``` + +2. Validate token claims at [jwt.ms](https://jwt.ms): + - `aud`: Must match server's ClientId + - `tid`: Must match server's TenantId + - `scp`: Must include `Mcp.Tools.ReadWrite` + +#### 403 Forbidden - Insufficient Permissions + +**Causes:** Missing scope/app role or user not assigned to application. + +**Resolution:** +1. For delegated permissions, check scope in token: + ```bash + az ad app permission admin-consent --id + ``` + +2. Verify user assignment in Azure Portal: + - Entra ID → Enterprise Applications → [Your App] → Users and groups + +#### OBO Token Exchange Failures + +**Causes:** Server app missing API permissions or client token lacks scopes. + +**Resolution:** +1. Grant Azure Management API permissions: + ```bash + az ad app permission add \ + --id \ + --api https://management.azure.com/ \ + --api-permissions user_impersonation=Scope + + az ad app permission admin-consent --id + ``` + +2. Add `knownClientApplications` to server's app manifest: + ```json + { + "knownClientApplications": [""] + } + ``` + +#### Azure Service Access Denied + +**Causes:** Missing RBAC permissions on Azure resources. + +**Resolution:** + +**For OBO (per-user):** +```bash +az role assignment create \ + --assignee user@domain.com \ + --role "Storage Blob Data Reader" \ + --scope /subscriptions//resourceGroups/ +``` + +**For Hosting Environment (managed identity):** +```bash +IDENTITY_ID=$(az webapp identity show \ + --name \ + --resource-group \ + --query principalId -o tsv) + +az role assignment create \ + --assignee $IDENTITY_ID \ + --role Reader \ + --scope /subscriptions/ +``` ## Logging and Diagnostics diff --git a/servers/Azure.Mcp.Server/docs/new-command.md b/servers/Azure.Mcp.Server/docs/new-command.md index eaf02803f..db087c30f 100644 --- a/servers/Azure.Mcp.Server/docs/new-command.md +++ b/servers/Azure.Mcp.Server/docs/new-command.md @@ -2116,6 +2116,340 @@ var subscriptionResource = armClient.GetSubscriptionResource(new ResourceIdentif -**Prevention**: Test AOT compilation early in development using `./eng/scripts/Build-Local.ps1 -BuildNative` -**Note**: Toolsets excluded from AOT builds are still available in regular builds and deployments +## Remote MCP Server Considerations + +When implementing commands for Azure MCP, consider how they will behave in **remote HTTP mode** with multiple concurrent users. Remote MCP servers support both **stdio** (local) and **HTTP** (remote) transports with different authentication models. + +### Authentication Strategies + +Azure MCP Server supports two outgoing authentication strategies when running in remote HTTP mode: + +#### 1. On-Behalf-Of (OBO) Flow + +**Use when:** Per-user authorization required, multi-tenant scenarios, audit trail with individual user identities + +**How it works:** +- Client authenticates user with Entra ID and sends bearer token +- MCP server validates incoming token +- Server exchanges user's token for downstream Azure service tokens +- Each Azure API call uses user's identity and permissions + +**Command Implementation Impact:** +```csharp +// No changes needed in command code! +// Authentication provider automatically handles OBO token acquisition +var credential = await _tokenCredentialProvider.GetTokenCredentialAsync(tenant, cancellationToken); + +// This credential will use OBO flow when configured +// User's RBAC permissions enforced on Azure resources +``` + +**Testing Considerations:** +- Ensure test users have appropriate RBAC permissions on Azure resources +- Test with multiple users having different permission levels +- Verify audit logs show correct user identity + +#### 2. Hosting Environment Identity + +**Use when:** Simplified deployment, service-level permissions sufficient, single-tenant scenarios + +**How it works:** +- MCP server uses its own identity (Managed Identity, Service Principal, etc.) +- All downstream Azure calls use server's credentials +- Behaves like `DefaultAzureCredential` in local stdio mode + +**Command Implementation Impact:** +```csharp +// No changes needed in command code! +// Authentication provider automatically uses server's identity +var credential = await _tokenCredentialProvider.GetTokenCredentialAsync(tenant, cancellationToken); + +// This credential will use server's Managed Identity when configured +// Server's RBAC permissions apply to all users +``` + +**Testing Considerations:** +- Grant server identity (Managed Identity or test user) necessary RBAC permissions +- All users share same permission level in this mode + +### Transport-Agnostic Command Design + +Commands should be **transport-agnostic** - they work identically in stdio and HTTP modes: + +**Good:** +```csharp +public sealed class StorageAccountGetCommand : SubscriptionCommand +{ + private readonly IStorageService _storageService; + + public StorageAccountGetCommand( + IStorageService storageService, + ILogger logger) + : base(logger) + { + _storageService = storageService; + } + + public override async Task ExecuteAsync( + CommandContext context, + ParseResult parseResult) + { + var options = BindOptions(parseResult); + + // Authentication provider handles both stdio and HTTP scenarios + var accounts = await _storageService.GetStorageAccountsAsync( + options.Subscription!, + options.ResourceGroup, + options.RetryPolicy); + + // Standard response format works for all transports + context.Response.Results = ResponseResult.Create( + new(accounts ?? []), + StorageJsonContext.Default.CommandResult); + + return context.Response; + } +} +``` + +**Bad:** +```csharp +// ❌ Don't check environment or make transport-specific decisions +public override async Task ExecuteAsync(...) +{ + // ❌ Don't do this - defeats purpose of abstraction + if (Environment.GetEnvironmentVariable("ASPNETCORE_URLS") != null) + { + // Different behavior for HTTP mode + } + + // ❌ Don't access HttpContext directly in commands + var httpContext = _httpContextAccessor.HttpContext; + if (httpContext != null) + { + // ❌ Don't branch on HTTP vs stdio + } +} +``` + +### Service Layer Best Practices + +When implementing services that call Azure, use `IAzureTokenCredentialProvider`: + +```csharp +public class StorageService : BaseAzureService, IStorageService +{ + public StorageService( + ITenantService tenantService, + ILogger logger) + : base(tenantService, logger) + { + } + + public async Task> GetStorageAccountsAsync( + string subscription, + string? resourceGroup, + RetryPolicyOptions? retryPolicy, + CancellationToken cancellationToken = default) + { + // ✅ Use base class methods that handle authentication and ARM client creation + var armClient = await CreateArmClientAsync(tenant: null, retryPolicy); + + // ✅ CreateArmClientAsync automatically uses appropriate auth strategy: + // - OBO flow in remote HTTP mode with --outgoing-auth-strategy UseOnBehalfOf + // - Server identity in remote HTTP mode with --outgoing-auth-strategy UseHostingEnvironmentIdentity + // - Local identity in stdio mode (Azure CLI, VS Code, etc.) + + // ... Azure SDK calls + } +} +``` + +### Multi-User and Concurrency + +Remote HTTP mode supports **multiple concurrent users**: + +**Thread Safety:** +- All commands must be **stateless** and **thread-safe** +- Don't store per-request state in command instance fields +- Use constructor injection for singleton services only +- Per-request data flows through `CommandContext` and options + +**Good:** +```csharp +public sealed class SqlDatabaseListCommand : SubscriptionCommand +{ + private readonly ISqlService _sqlService; // ✅ Singleton service, thread-safe + + public SqlDatabaseListCommand( + ISqlService sqlService, + ILogger logger) + : base(logger) + { + _sqlService = sqlService; + } + + public override async Task ExecuteAsync( + CommandContext context, + ParseResult parseResult) + { + // ✅ Options created per-request, no shared state + var options = BindOptions(parseResult); + + // ✅ Service calls are async and don't store request state + var databases = await _sqlService.ListDatabasesAsync( + options.Subscription!, + options.ResourceGroup, + options.Server); + + return context.Response; + } +} +``` + +**Bad:** +```csharp +public sealed class BadCommand : SubscriptionCommand +{ + // ❌ Don't store per-request state in command fields + private CommandContext? _currentContext; + private BadCommandOptions? _currentOptions; + + public override async Task ExecuteAsync( + CommandContext context, + ParseResult parseResult) + { + // ❌ Race condition with multiple concurrent requests + _currentContext = context; + _currentOptions = BindOptions(parseResult); + + // ❌ Another request might overwrite these before we use them + await Task.Delay(100); + return _currentContext.Response; + } +} +``` + +### Tenant Context Handling + +Some commands need tenant ID for Azure calls. Handle this correctly for both modes: + +```csharp +public async Task> GetResourcesAsync( + string subscription, + string? tenant, + RetryPolicyOptions? retryPolicy) +{ + // ✅ ITenantService handles tenant resolution for all modes + // - In On Behalf Of mode: Validates tenant matches user's token + // - In hosting environment mode: Uses provided tenant or default + // - In stdio mode: Uses Azure CLI/VS Code default tenant + + var credential = await GetCredentialAsync(tenant, CancellationToken.None); + + // ✅ If tenant is null, service will use default tenant + // ✅ If tenant is provided, service validates it's accessible + + var armClient = new ArmClient(credential); + // ... rest of implementation +} +``` + +### Error Handling for Remote Scenarios + +Add appropriate error messages for remote HTTP scenarios: + +```csharp +protected override string GetErrorMessage(Exception ex) => ex switch +{ + RequestFailedException reqEx when reqEx.Status == 401 => + "Authentication failed. In remote mode, ensure your token has the required " + + "Mcp.Tools.ReadWrite scope and sufficient RBAC permissions on Azure resources.", + + RequestFailedException reqEx when reqEx.Status == 403 => + "Authorization failed. Your user account lacks the required RBAC permissions. " + + "In remote mode with On Behalf Of flow, permissions come from the authenticated user's identity. Learn more at https://learn.microsoft.com/entra/identity-platform/v2-oauth2-on-behalf-of-flow", + + InvalidOperationException invEx when invEx.Message.Contains("tenant") => + "Tenant mismatch. In remote OBO mode, the requested tenant must match your " + + "authenticated user's tenant ID.", + + _ => base.GetErrorMessage(ex) +}; +``` + +### Testing Commands for Remote Mode + +When writing tests, consider both transport modes: + +**Unit Tests** (Always Required): +- Mock all external dependencies +- Test command logic in isolation +- No Azure resources required +- Fast execution + +**Live Tests** (Required for Azure Service Commands): +- Test against real Azure resources +- Verify Azure SDK integration +- Validate RBAC permissions +- Test both stdio and HTTP modes + +**Example Live Test Setup:** +```csharp +// Live tests should work in both modes by using appropriate credentials +public class StorageCommandLiveTests : IAsyncLifetime +{ + private readonly TestSettings _settings; + + public async Task InitializeAsync() + { + _settings = TestSettings.Load(); + + // Test infrastructure supports both modes: + // - Stdio mode: Uses Azure CLI/VS Code credentials + // - HTTP mode: Can simulate OBO or hosting environment identity + } + + [Fact] + public async Task ListStorageAccounts_ReturnsAccounts() + { + // Test works identically in both stdio and HTTP modes + var result = await CallToolAsync( + "azmcp_storage_account_list", + new { subscription = _settings.SubscriptionId }); + + Assert.NotNull(result); + } +} +``` + +### Documentation Requirements for Remote Mode + +When documenting new commands, include remote mode considerations: + +**In azmcp-commands.md:** +```markdown +## azmcp storage account list + +Lists storage accounts in a subscription. + +### Permissions + +**Stdio Mode:** +- Requires authenticated Azure identity (Azure CLI, VS Code, Managed Identity) +- Uses your local RBAC permissions + +**Remote HTTP Mode (OBO):** +- Requires authenticated user with `Mcp.Tools.ReadWrite` scope +- Uses authenticated user's RBAC permissions +- Audit logs show individual user identity + +**Remote HTTP Mode (Hosting Environment):** +- Requires authenticated user with `Mcp.Tools.ReadWrite` scope +- Uses MCP server's Managed Identity RBAC permissions +- All users share server's permission level +``` + ## Checklist Before submitting: