
[FEATURE] User RBAC Authorization for Kagent #1270

@nujragan93

📝 Feature Summary

Add role-based access control (RBAC) authorization for Kagent API, UI, and CLI with OIDC group integration

❓ Problem Statement / Motivation

Current Limitation

    Kagent currently lacks authorization for its UI and API. All users have full access to all resources across all namespaces. The `NoopAuthorizer` allows everything - there are no permission checks.

    ## Security Risks
    This creates security risks in multi-tenant environments where:
    - Users should only see resources in their assigned namespaces
    - Different roles need different permission levels (admins vs readonly users)
    - Sensitive data (API keys, secrets) should be masked from non-admins
    - Unauthorized users must be prevented from creating, modifying, or deleting critical resources

    ## Who Is Affected
    - **Platform teams** running Kagent in shared/multi-tenant environments
    - **Security teams** requiring audit trails and access control
    - **End users** who need read-only access without modification permissions
    - **Administrators** who need to enforce least-privilege access

    ## Why This Is Needed
    - Compliance requirements for production deployments
    - Prevent accidental or malicious resource modifications
    - Enable safe delegation of access to junior team members or contractors
    - Support enterprise OIDC/SSO integration for unified access control

💡 Proposed Solution

Overview

    Implement Casbin-based RBAC authorization enforced at the API layer (applies to UI, CLI, and direct API calls).

    ## Core Components

    ### 1. Role System
    - **2 built-in roles**: `admin` (full access) and `readonly` (read-only)
    - Support for **multiple roles per user**
    - **Default policy**: authenticated users without explicit roles get `admin` access
    - Extensible architecture for custom roles via Casbin policy configuration

    ### 2. Permission Model
    ```
    ┌─────────────────┬───────┬────────────┐
    │ Resource/Action │ admin │ readonly   │
    ├─────────────────┼───────┼────────────┤
    │ Agent           │       │            │
    │   create        │ ✓     │ ✗          │
    │   get/list      │ ✓     │ ✓          │
    │   update        │ ✓     │ ✗          │
    │   delete        │ ✓     │ ✗          │
    │   invoke (chat) │ ✓     │ ✗          │
    ├─────────────────┼───────┼────────────┤
    │ ModelConfig     │       │            │
    │   create        │ ✓     │ ✗          │
    │   get/list      │ ✓     │ ✓ (masked) │
    │   update        │ ✓     │ ✗          │
    │   delete        │ ✓     │ ✗          │
    ├─────────────────┼───────┼────────────┤
    │ MCPServer       │       │            │
    │   create        │ ✓     │ ✗          │
    │   get/list      │ ✓     │ ✓          │
    │   update        │ ✓     │ ✗          │
    │   delete        │ ✓     │ ✗          │
    └─────────────────┴───────┴────────────┘
    ```
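
The matrix above can also be read as a plain lookup table. The following is a minimal sketch for illustration only: the names `PERMISSIONS` and `is_allowed` are hypothetical, and the proposal itself would express these rules as Casbin policy lines rather than application code.

```python
# Illustrative only: the permission matrix encoded as data.
# PERMISSIONS and is_allowed are hypothetical names, not Kagent APIs.
READ_ACTIONS = {"get", "list"}
WRITE_ACTIONS = {"create", "update", "delete"}

PERMISSIONS = {
    "admin": {
        "Agent": READ_ACTIONS | WRITE_ACTIONS | {"invoke"},
        "ModelConfig": READ_ACTIONS | WRITE_ACTIONS,
        "MCPServer": READ_ACTIONS | WRITE_ACTIONS,
    },
    "readonly": {
        "Agent": set(READ_ACTIONS),
        "ModelConfig": set(READ_ACTIONS),  # responses are masked for this role
        "MCPServer": set(READ_ACTIONS),
    },
}


def is_allowed(role: str, resource: str, action: str) -> bool:
    """Return True if the matrix grants `action` on `resource` to `role`."""
    return action in PERMISSIONS.get(role, {}).get(resource, set())
```

An unknown role or resource falls through to an empty set, so this sketch denies by default at the matrix level; the proposal's default-to-`admin` rule applies earlier, during role resolution.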

    ### 3. Authorization Architecture
    - **Casbin RBAC model** with group policy support
    - **API-level enforcement** (backend validates all requests)
    - **Secret masking** for readonly users (API keys, credentials → `****`)
    - **OIDC group integration** via group policy (`g, group:platform-team, admin`)
    - **Multi-role support** via Casbin's native grouping mechanism
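
Secret masking could be applied to API responses just before serialization. A minimal sketch, assuming a hypothetical set of sensitive field names (`SENSITIVE_KEYS`); the real `ModelConfig` schema and field names may differ.

```python
import copy

# Hypothetical field names; the actual ModelConfig schema may differ.
SENSITIVE_KEYS = {"apiKey", "apiKeySecret", "password", "token"}
MASK = "****"


def mask_secrets(obj, roles):
    """Return a copy of `obj`, masking sensitive values unless the caller holds `admin`."""
    if "admin" in roles:
        return copy.deepcopy(obj)
    return _mask(obj)


def _mask(value):
    # Walk dicts and lists recursively; replace values stored under sensitive keys.
    if isinstance(value, dict):
        return {k: MASK if k in SENSITIVE_KEYS else _mask(v) for k, v in value.items()}
    if isinstance(value, list):
        return [_mask(v) for v in value]
    return value
```

Because the function returns a new structure, the stored object is never mutated, and a user holding both `admin` and `readonly` still sees unmasked values.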

    ### 4. UI Integration
    - New endpoint: `GET /api/auth/permissions` returns the current user's roles and capabilities
    - Permission-based UI rendering (hide/show buttons based on role)
    - Backend still enforces all permissions (UI is UX-only)
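
The issue does not specify the response schema for `GET /api/auth/permissions`. One possible shape, sketched under that assumption (the `user`, `roles`, and `capabilities` keys and the helper names are hypothetical), is the union of actions granted by each of the user's roles:

```python
# Hypothetical response builder for GET /api/auth/permissions.
RESOURCES = ("Agent", "ModelConfig", "MCPServer")


def actions_for(role, resource):
    """Actions a single built-in role grants on a resource, per the permission matrix."""
    read = ["get", "list"]
    if role == "readonly":
        return read
    if role == "admin":
        extra = ["invoke"] if resource == "Agent" else []
        return read + ["create", "update", "delete"] + extra
    return []


def permissions_payload(user_id, roles):
    """Build a JSON-serializable summary of what the user may do."""
    capabilities = {}
    for resource in RESOURCES:
        allowed = set()
        for role in roles:
            allowed.update(actions_for(role, resource))
        capabilities[resource] = sorted(allowed)
    return {"user": user_id, "roles": sorted(roles), "capabilities": capabilities}
```

The UI could use `capabilities` to decide which buttons to render, while the backend continues to enforce every request independently.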

    ### 5. Policy Configuration
    **Model (model.conf):**
    ```ini
    [request_definition]
    r = sub, obj, act

    [policy_definition]
    p = sub, obj, act

    [role_definition]
    g = _, _

    [policy_effect]
    e = some(where (p.eft == allow))

    [matchers]
    m = g(r.sub, p.sub) && r.obj == p.obj && r.act == p.act
    ```

    **Policy (policy.csv):**
    ```csv
    # Admin permissions
    p, admin, Agent, get
    p, admin, Agent, list
    p, admin, Agent, create
    p, admin, Agent, update
    p, admin, Agent, delete
    p, admin, Agent, invoke
    # ... (full permissions for ModelConfig, MCPServer)

    # Readonly permissions
    p, readonly, Agent, get
    p, readonly, Agent, list
    p, readonly, ModelConfig, get
    p, readonly, ModelConfig, list
    p, readonly, MCPServer, get
    p, readonly, MCPServer, list

    # Group assignments (examples)
    # g, group:kagent-admins, admin
    # g, group:kagent-users, readonly
    ```
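
To make the matcher `g(r.sub, p.sub) && r.obj == p.obj && r.act == p.act` concrete, here is a small stand-in evaluator written without Casbin. It is a sketch of the semantics only, not Casbin's implementation: the function names are hypothetical, the policy is a shortened excerpt, and the example group assignments are uncommented for the demonstration.

```python
from collections import defaultdict

# Shortened excerpt of the policy above, with the example group lines enabled.
POLICY_CSV = """
p, admin, Agent, get
p, admin, Agent, invoke
p, readonly, Agent, get
p, readonly, Agent, list
g, group:kagent-admins, admin
g, group:kagent-users, readonly
"""


def load_policy(text):
    """Parse `p` rules into (sub, obj, act) tuples and `g` rules into a role graph."""
    perms, links = set(), defaultdict(set)
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        kind, *fields = [f.strip() for f in line.split(",")]
        if kind == "p" and len(fields) == 3:
            perms.add(tuple(fields))
        elif kind == "g" and len(fields) == 2:
            links[fields[0]].add(fields[1])
    return perms, links


def has_role(links, sub, role):
    """g(sub, role): true if sub equals role or inherits it transitively."""
    seen, stack = set(), [sub]
    while stack:
        cur = stack.pop()
        if cur == role:
            return True
        if cur not in seen:
            seen.add(cur)
            stack.extend(links.get(cur, ()))
    return False


def enforce(perms, links, sub, obj, act):
    """Allow if some policy rule matches (the `some(where (p.eft == allow))` effect)."""
    return any(
        has_role(links, sub, p_sub) and obj == p_obj and act == p_act
        for p_sub, p_obj, p_act in perms
    )
```

With this excerpt, `group:kagent-users` may `get` an `Agent` but not `invoke` it, and a subject with no `g` mapping is denied by the matcher itself.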

    ### 6. Role Resolution Flow
    1. User authenticates → Principal contains `User.ID`, `User.Email`, `User.Groups` (from OIDC)
    2. Casbin looks up all roles assigned to user's groups via `g` policies
    3. User receives **all matching roles** (not just first match)
    4. Permission check passes if **ANY** role grants the action
    5. If no roles match, default to `admin` role
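
The resolution steps above can be sketched as follows. This is an illustration under stated assumptions: OIDC groups are mapped to Casbin subjects with a `group:` prefix (as in the `g, group:platform-team, admin` example), and `resolve_roles`, `check`, `GROUP_ROLES`, and `ROLE_PERMS` are hypothetical names with example data.

```python
DEFAULT_ROLE = "admin"  # the proposal's default when no role matches

# Example data only; real mappings would come from the `g` and `p` policy lines.
GROUP_ROLES = {
    "group:kagent-admins": {"admin"},
    "group:kagent-users": {"readonly"},
}
ROLE_PERMS = {
    "admin": {("Agent", "get"), ("Agent", "invoke")},
    "readonly": {("Agent", "get")},
}


def resolve_roles(groups, group_roles=GROUP_ROLES):
    """Collect every role mapped to any of the user's groups (steps 2, 3 and 5)."""
    roles = set()
    for group in groups:
        roles |= group_roles.get(f"group:{group}", set())
    return roles or {DEFAULT_ROLE}


def check(groups, obj, act, group_roles=GROUP_ROLES, role_perms=ROLE_PERMS):
    """Pass if ANY resolved role grants the action (step 4)."""
    return any(
        (obj, act) in role_perms.get(role, set())
        for role in resolve_roles(groups, group_roles)
    )
```

Note the consequence of step 5 in this sketch: a user whose groups match no `g` policy resolves to `admin`, so group mappings need to be complete before the default is relied on in a multi-tenant setup.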

    ## Technical Details
    - Replace `NoopAuthorizer` with `CasbinAuthorizer`
    - Policy storage: ConfigMap (static), database adapter (dynamic), or runtime (OIDC mapping)
    - Casbin caching for performance (for example, via its `CachedEnforcer`)
    - Unit tests covering all role combinations
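
Swapping `NoopAuthorizer` for a policy-backed authorizer could look roughly like the sketch below. The interface shape (`check(subject, resource, action)`), the `PolicyAuthorizer` class, and `build_authorizer` are assumptions for illustration; Kagent's actual authorizer interface is not shown in this issue, and a real implementation would delegate to Casbin rather than a set lookup.

```python
from functools import lru_cache
from typing import Protocol


class Authorizer(Protocol):
    """Hypothetical interface; the real Kagent signature may differ."""

    def check(self, subject: str, resource: str, action: str) -> bool: ...


class NoopAuthorizer:
    """Current behavior: every request is allowed."""

    def check(self, subject, resource, action):
        return True


class PolicyAuthorizer:
    """Stand-in for a CasbinAuthorizer, with a per-instance decision cache."""

    def __init__(self, grants):
        self._grants = frozenset(grants)
        self.lookups = 0  # counts uncached evaluations, for demonstration
        self._cached = lru_cache(maxsize=1024)(self._evaluate)

    def _evaluate(self, subject, resource, action):
        self.lookups += 1
        return (subject, resource, action) in self._grants

    def check(self, subject, resource, action):
        return self._cached(subject, resource, action)


def build_authorizer(enabled, grants=()):
    """Keep NoopAuthorizer available as a fallback when RBAC is disabled."""
    return PolicyAuthorizer(grants) if enabled else NoopAuthorizer()
```

A cache like this would need to be invalidated whenever policy changes, which bears on the hot-reload question raised later in this issue.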

    ## Extensibility
    - Custom roles: Add policy lines (no code changes)
    - New resource types: Add to policy configuration
    - Future: Agent-to-agent authorization (Principal.Agent already available)

🔄 Alternatives Considered

    ### Alternative 1: OPA (Open Policy Agent)
    - **Pros**: Rego policy language, broader policy use cases, CNCF project
    - **Cons**: Steeper learning curve, more complex for simple RBAC, overkill for current needs
    - **Decision**: Casbin is simpler, battle-tested for RBAC, sufficient for requirements

    ### Alternative 2: Kubernetes RBAC Only
    - **Pros**: Native Kubernetes integration, no external library
    - **Cons**: Cannot control API-level actions (only CRD access), no secret masking, limited to Kubernetes resources
    - **Decision**: Need application-level authorization, not just K8s resource access

🎯 Affected Service(s)

UI Service

📚 Additional Context

Assumptions

    - Authentication (OIDC, JWT, login flow) is handled separately and provides a `Principal` with:
      - `User.ID` (from OIDC `sub` claim)
      - `User.Email` (from OIDC `email` claim)
      - `User.Groups` (from OIDC `groups` claim)
    - Roles are NOT in the Principal - they are resolved at authorization time via Casbin
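
As an illustration of the assumed `Principal` shape, the sketch below builds one from decoded OIDC claims. The class layout and `principal_from_claims` are hypothetical; only the claim-to-field mapping (`sub`, `email`, `groups`) comes from the assumptions above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    id: str              # from the OIDC `sub` claim
    email: str           # from the OIDC `email` claim
    groups: tuple = ()   # from the OIDC `groups` claim


@dataclass(frozen=True)
class Principal:
    user: User  # deliberately no `roles` field: roles are resolved at authorization time


def principal_from_claims(claims: dict) -> Principal:
    """Map already-verified token claims onto a Principal."""
    groups = claims.get("groups") or []
    if isinstance(groups, str):  # some providers emit a single string
        groups = [groups]
    return Principal(
        User(id=claims["sub"], email=claims.get("email", ""), groups=tuple(groups))
    )
```

Token verification itself is out of scope here, matching the assumption that authentication is handled separately.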

    ## Out of Scope
    - Authentication implementation (OIDC provider configuration)
    - Agent-to-agent authorization (future work)
    - Audit logging (separate enhancement)
    - Namespace-based resource filtering (all resources visible, but actions restricted)

    ## Risks & Mitigations
    | Risk | Impact | Mitigation |
    |------|--------|------------|
    | Casbin policy misconfiguration | High | Unit tests for all role combinations |
    | Performance impact of authz checks | Low | Casbin has built-in caching |
    | Policy ConfigMap size limits | Low | A ConfigMap can hold up to 1 MiB of data |
    | Breaking existing deployments | Medium | Keep `NoopAuthorizer` as a fallback option |

    ## Open Questions
    1. Should we support policy hot-reload without restart?
    2. Should authorization decisions be logged for compliance/audit?
    3. Should there be a UI for creating custom roles beyond `admin` and `readonly`?

🙋 Are you willing to contribute?

  • I am willing to submit a PR for this feature
