|
1 | | -# NeMoGuard Safety Rails Example |
| 1 | +# NeMoGuard Safety Rails with Caching |
2 | 2 |
|
3 | | -This example showcases the use of NVIDIA's NeMoGuard NIMs for comprehensive AI safety including content moderation, topic control, and jailbreak detection. |
| 3 | +This example demonstrates how to configure NeMo Guardrails with caching support for multiple NVIDIA NeMoGuard NIMs, including content safety, topic control, and jailbreak detection. |
4 | 4 |
|
5 | | -## Configuration Files |
| 5 | +## Features |
6 | 6 |
|
7 | | -- `config.yml` - Defines the models configuration including the main LLM and three NeMoGuard NIMs for safety checks |
8 | | -- `prompts.yml` - Contains prompt templates for content safety and topic control checks |
| 7 | +- **Content Safety Checks**: Validates content against 23 safety categories (input and output) |
| 8 | +- **Topic Control**: Ensures conversations stay within allowed topics (input) |
| 9 | +- **Jailbreak Detection**: Detects and prevents jailbreak attempts (input) |
| 10 | +- **Per-Model Caching**: Each safety model has its own dedicated cache instance |
| 11 | +- **Thread Safety**: Fully thread-safe for use in multi-threaded web servers |
| 12 | +- **Cache Statistics**: Optional performance monitoring for each model |
| 13 | + |
| 14 | +## Folder Structure |
| 15 | + |
| 16 | +- `config.yml` - Main configuration file with model definitions, rails configuration, and cache settings |
| 17 | +- `prompts.yml` - Prompt templates for content safety and topic control checks |
| 18 | + |
| 19 | +## Configuration Overview |
| 20 | + |
| 21 | +### Basic Configuration with Caching |
| 22 | + |
| 23 | +```yaml |
| 24 | +models: |
| 25 | +  - type: main |
| 26 | +    engine: nim |
| 27 | +    model: meta/llama-3.3-70b-instruct |
| 28 | + |
| 29 | +  - type: content_safety |
| 30 | +    engine: nim |
| 31 | +    model: nvidia/llama-3.1-nemoguard-8b-content-safety |
| 32 | +    cache: |
| 33 | +      enabled: true |
| 34 | +      maxsize: 10000 |
| 35 | +      stats: |
| 36 | +        enabled: true |
| 37 | + |
| 38 | +  - type: topic_control |
| 39 | +    engine: nim |
| 40 | +    model: nvidia/llama-3.1-nemoguard-8b-topic-control |
| 41 | +    cache: |
| 42 | +      enabled: true |
| 43 | +      maxsize: 10000 |
| 44 | +      stats: |
| 45 | +        enabled: true |
| 46 | + |
| 47 | +  - type: jailbreak_detection |
| 48 | +    engine: nim |
| 49 | +    model: jailbreak_detect |
| 50 | +    cache: |
| 51 | +      enabled: true |
| 52 | +      maxsize: 10000 |
| 53 | +      stats: |
| 54 | +        enabled: true |
| 55 | + |
| 56 | +rails: |
| 57 | +  input: |
| 58 | +    flows: |
| 59 | +      - jailbreak detection model |
| 60 | +      - content safety check input $model=content_safety |
| 61 | +      - topic safety check input $model=topic_control |
| 62 | + |
| 63 | +  output: |
| 64 | +    flows: |
| 65 | +      - content safety check output $model=content_safety |
| 66 | + |
| 67 | +  config: |
| 68 | +    jailbreak_detection: |
| 69 | +      nim_base_url: "https://ai.api.nvidia.com" |
| 70 | +      nim_server_endpoint: "/v1/security/nvidia/nemoguard-jailbreak-detect" |
| 71 | +      api_key_env_var: NVIDIA_API_KEY |
| 72 | +``` |
9 | 73 |
|
10 | 74 | ## NeMoGuard NIMs Used |
11 | 75 |
|
12 | | -1. **Content Safety** (`nvidia/llama-3.1-nemoguard-8b-content-safety`) - Checks for unsafe content across 23 safety categories |
13 | | -2. **Topic Control** (`nvidia/llama-3.1-nemoguard-8b-topic-control`) - Ensures conversations stay within allowed topics |
14 | | -3. **Jailbreak Detection** - Detects and prevents jailbreak attempts (configured via `nim_server_endpoint`) |
| 76 | +### 1. Content Safety (`nvidia/llama-3.1-nemoguard-8b-content-safety`) |
| 77 | + |
| 78 | +Checks for unsafe content across 23 safety categories including violence, hate speech, sexual content, and more. |
| 79 | + |
| 80 | +**Cache Configuration:** |
| 81 | + |
| 82 | +```yaml |
| 83 | +- type: content_safety |
| 84 | +  engine: nim |
| 85 | +  model: nvidia/llama-3.1-nemoguard-8b-content-safety |
| 86 | +  cache: |
| 87 | +    enabled: true |
| 88 | +    maxsize: 10000 |
| 89 | +    stats: |
| 90 | +      enabled: true |
| 91 | +``` |
| 92 | + |
| 93 | +### 2. Topic Control (`nvidia/llama-3.1-nemoguard-8b-topic-control`) |
| 94 | + |
| 95 | +Ensures conversations stay within allowed topics and prevents topic drift. |
| 96 | + |
| 97 | +**Cache Configuration:** |
| 98 | + |
| 99 | +```yaml |
| 100 | +- type: topic_control |
| 101 | +  engine: nim |
| 102 | +  model: nvidia/llama-3.1-nemoguard-8b-topic-control |
| 103 | +  cache: |
| 104 | +    enabled: true |
| 105 | +    maxsize: 10000 |
| 106 | +    stats: |
| 107 | +      enabled: true |
| 108 | +``` |
| 109 | + |
| 110 | +### 3. Jailbreak Detection (`jailbreak_detect`) |
| 111 | + |
| 112 | +Detects and prevents jailbreak attempts that try to bypass safety measures. |
| 113 | + |
| 114 | +**IMPORTANT**: For jailbreak detection caching to work, the `type` and `model` **MUST** be set to these exact values: |
| 115 | + |
| 116 | +- `type: jailbreak_detection` |
| 117 | +- `model: jailbreak_detect` |
| 118 | + |
| 119 | +**Cache Configuration:** |
| 120 | + |
| 121 | +```yaml |
| 122 | +- type: jailbreak_detection |
| 123 | +  engine: nim |
| 124 | +  model: jailbreak_detect |
| 125 | +  cache: |
| 126 | +    enabled: true |
| 127 | +    maxsize: 10000 |
| 128 | +    stats: |
| 129 | +      enabled: true |
| 130 | +``` |
| 131 | + |
| 132 | +The actual NIM endpoint is configured separately in the `rails.config` section: |
| 133 | + |
| 134 | +```yaml |
| 135 | +rails: |
| 136 | +  config: |
| 137 | +    jailbreak_detection: |
| 138 | +      nim_base_url: "https://ai.api.nvidia.com" |
| 139 | +      nim_server_endpoint: "/v1/security/nvidia/nemoguard-jailbreak-detect" |
| 140 | +      api_key_env_var: NVIDIA_API_KEY |
| 141 | +``` |
| 142 | + |
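As a rough illustration of how these three settings fit together, the sketch below joins the base URL with the endpoint path and reads the API key from the named environment variable. The helper `build_jailbreak_request` and the `Bearer` header layout are assumptions made for this example; they are not taken from the library's internal code.

```python
import os
from urllib.parse import urljoin

# Values copied from the `rails.config.jailbreak_detection` block above.
settings = {
    "nim_base_url": "https://ai.api.nvidia.com",
    "nim_server_endpoint": "/v1/security/nvidia/nemoguard-jailbreak-detect",
    "api_key_env_var": "NVIDIA_API_KEY",
}


def build_jailbreak_request(cfg: dict) -> tuple:
    """Hypothetical helper: return (url, headers) for the jailbreak NIM."""
    url = urljoin(cfg["nim_base_url"], cfg["nim_server_endpoint"])
    # The key is looked up by variable name, so it never appears in config.yml.
    api_key = os.environ.get(cfg["api_key_env_var"], "")
    headers = {"Content-Type": "application/json"}
    if api_key:
        headers["Authorization"] = f"Bearer {api_key}"  # assumed header layout
    return url, headers


url, headers = build_jailbreak_request(settings)
print(url)
```

Keeping the variable name rather than the key itself in the configuration means `config.yml` can be committed without exposing credentials.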
| 143 | +## How It Works |
| 144 | + |
| 145 | +1. **User Input**: When a user sends a message, it goes through multiple safety checks: |
| 146 | + - Jailbreak detection evaluates for manipulation attempts |
| 147 | + - Content safety checks for unsafe content |
| 148 | + - Topic control validates topic adherence |
| 149 | + |
| 150 | +2. **Caching**: Each model has its own cache: |
| 151 | + - First check: API call to NeMoGuard NIM, result cached |
| 152 | + - Subsequent identical inputs: Cache hit, no API call needed |
| 153 | + |
| 154 | +3. **Response Generation**: If all input checks pass, the main model generates a response |
| 155 | + |
| 156 | +4. **Output Check**: The response is checked by content safety before it is returned to the user |
| 157 | + |
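The four steps above can be sketched as a small pipeline. This is a simplified model, not the NeMo Guardrails implementation: `CachedCheck`, `run_pipeline`, and the keyword-based lambdas are hypothetical stand-ins for the NIM calls, and a plain dictionary stands in for the real cache.

```python
from typing import Callable, Dict, List


class CachedCheck:
    """One safety check with its own private cache (one instance per model)."""

    def __init__(self, name: str, check: Callable[[str], bool]) -> None:
        self.name = name
        self._check = check
        self._cache: Dict[str, bool] = {}
        self.api_calls = 0  # counts simulated calls to the safety model

    def __call__(self, text: str) -> bool:
        if text in self._cache:        # cache hit: no API call needed
            return self._cache[text]
        self.api_calls += 1            # cache miss: call the model, store result
        result = self._check(text)
        self._cache[text] = result
        return result


def run_pipeline(
    user_input: str,
    input_checks: List[CachedCheck],
    output_checks: List[CachedCheck],
    generate: Callable[[str], str],
) -> str:
    for check in input_checks:         # step 1: input rails, in configured order
        if not check(user_input):
            return f"Blocked by input rail: {check.name}"
    response = generate(user_input)    # step 3: main model
    for check in output_checks:        # step 4: output rails
        if not check(response):
            return f"Blocked by output rail: {check.name}"
    return response


# Toy checks: each returns True when the text is allowed.
jailbreak = CachedCheck(
    "jailbreak_detection",
    lambda t: "ignore all previous instructions" not in t.lower(),
)
content_safety = CachedCheck("content_safety", lambda t: "attack" not in t.lower())
topic_control = CachedCheck("topic_control", lambda t: "weather" not in t.lower())

input_rails = [jailbreak, content_safety, topic_control]
output_rails = [content_safety]
echo = lambda prompt: f"Echo: {prompt}"

first = run_pipeline("How do I reset my password?", input_rails, output_rails, echo)
second = run_pipeline("How do I reset my password?", input_rails, output_rails, echo)
print(first, jailbreak.api_calls, content_safety.api_calls, topic_control.api_calls)
```

In this sketch the second, identical request is answered entirely from the caches, so none of the call counters increase. The content safety counter is higher than the others because the same model checks both the input and the output.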
| 158 | +## Cache Configuration Options |
| 159 | + |
| 160 | +### Default Behavior (No Caching) |
| 161 | + |
| 162 | +Caching is **disabled** by default: a model without a `cache` block is not cached. |
| 163 | + |
| 164 | +### Enabling Cache |
| 165 | + |
| 166 | +Add cache configuration to any model definition: |
| 167 | + |
| 168 | +```yaml |
| 169 | +cache: |
| 170 | +  enabled: true  # Enable caching |
| 171 | +  maxsize: 10000  # Cache capacity (number of entries) |
| 172 | +  stats: |
| 173 | +    enabled: true  # Enable statistics tracking |
| 174 | +    log_interval: 300.0  # Log stats every 5 minutes (optional) |
| 175 | +``` |
| 176 | + |
| 177 | +### Cache Configuration Parameters |
| 178 | + |
| 179 | +- **enabled**: `true` to enable caching, `false` to disable |
| 180 | +- **maxsize**: Maximum number of entries in the cache (LRU eviction when full) |
| 181 | +- **stats.enabled**: Track cache hit/miss rates and performance metrics |
| 182 | +- **stats.log_interval**: How often to log statistics (in seconds, optional) |
| 183 | + |
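To make `maxsize` and `stats.enabled` concrete, here is a minimal sketch of a bounded cache with LRU eviction and hit/miss counters. The class `LRUCacheWithStats` is hypothetical and written for this example only; the library's own cache class may be structured differently.

```python
from collections import OrderedDict
from typing import Any, Hashable, Optional


class LRUCacheWithStats:
    """Bounded cache: evicts the least recently used entry once full."""

    def __init__(self, maxsize: int, stats_enabled: bool = False) -> None:
        if maxsize <= 0:
            raise ValueError("maxsize must be positive")
        self.maxsize = maxsize
        self.stats_enabled = stats_enabled
        self._data: "OrderedDict[Hashable, Any]" = OrderedDict()
        self.hits = 0
        self.misses = 0
        self.evictions = 0

    def get(self, key: Hashable) -> Optional[Any]:
        if key in self._data:
            self._data.move_to_end(key)      # mark as most recently used
            if self.stats_enabled:
                self.hits += 1
            return self._data[key]
        if self.stats_enabled:
            self.misses += 1
        return None

    def put(self, key: Hashable, value: Any) -> None:
        if key in self._data:
            self._data.move_to_end(key)
        self._data[key] = value
        if len(self._data) > self.maxsize:
            self._data.popitem(last=False)   # drop the least recently used entry
            if self.stats_enabled:
                self.evictions += 1

    def hit_rate(self) -> float:
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


cache = LRUCacheWithStats(maxsize=2, stats_enabled=True)
cache.put("a", "safe")
cache.put("b", "safe")
cache.get("a")            # hit; "a" becomes most recently used
cache.put("c", "unsafe")  # cache is full, so "b" is evicted
cache.get("b")            # miss
cache.get("a")            # hit
print(cache.hits, cache.misses, cache.evictions)
```

A low hit rate combined with a high eviction count would suggest that `maxsize` is too small for the workload.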
| 184 | +## Architecture |
| 185 | + |
| 186 | +Each NeMoGuard model gets its own dedicated cache instance, providing: |
| 187 | + |
| 188 | +- **Isolated cache management** per model |
| 189 | +- **Different cache capacities** for different models |
| 190 | +- **Model-specific performance tuning** |
| 191 | +- **Thread-safe concurrent access** |
| 192 | + |
| 193 | +This architecture allows you to: |
| 194 | + |
| 195 | +- Set larger caches for frequently used models |
| 196 | +- Disable caching for specific models |
| 197 | +- Monitor performance per model |
| 198 | + |
| 199 | +## Thread Safety |
| 200 | + |
| 201 | +The implementation is fully thread-safe: |
| 202 | + |
| 203 | +- **Concurrent Requests**: Safely handles multiple simultaneous safety checks |
| 204 | +- **Efficient Locking**: Uses RLock for minimal performance impact |
| 205 | +- **Atomic Operations**: Prevents duplicate LLM calls for the same content |
| 206 | + |
| 207 | +Suitable for: |
| 208 | + |
| 209 | +- Multi-threaded web servers (FastAPI, Flask, Django) |
| 210 | +- Concurrent request processing |
| 211 | +- High-traffic applications |
| 212 | + |
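The locking behaviour described in this section can be illustrated with a short sketch. It assumes the simplest possible design, a single `threading.RLock` held for both the lookup and the computation; `ThreadSafeCache` and `fake_safety_check` are hypothetical names, and the real implementation may lock at a finer granularity.

```python
import threading
from typing import Callable, Dict, List


class ThreadSafeCache:
    """Cache whose check-then-compute step runs under one reentrant lock."""

    def __init__(self) -> None:
        self._lock = threading.RLock()
        self._data: Dict[str, str] = {}

    def get_or_compute(self, key: str, compute: Callable[[str], str]) -> str:
        # Holding the lock across the lookup and the computation makes the
        # whole step atomic, so two threads cannot both miss on the same key.
        with self._lock:
            if key not in self._data:
                self._data[key] = compute(key)
            return self._data[key]


calls: List[str] = []


def fake_safety_check(text: str) -> str:
    calls.append(text)  # record every simulated model call
    return "safe"


cache = ThreadSafeCache()
results: List[str] = []


def worker() -> None:
    results.append(cache.get_or_compute("hello", fake_safety_check))


threads = [threading.Thread(target=worker) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(len(calls), len(results))
```

Eight threads request the same key, but the check function runs only once. The trade-off of this simple design is that unrelated keys also wait on the same lock while a computation is in progress.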
| 213 | +## Running the Example |
| 214 | + |
| 215 | +```bash |
| 216 | +export NVIDIA_API_KEY=your_api_key_here |
| 217 | + |
| 218 | +nemoguardrails server --config examples/configs/nemoguards_cache/ |
| 219 | +``` |
| 220 | + |
| 221 | +## Benefits |
| 222 | + |
| 223 | +1. **Performance**: Avoid redundant NeMoGuard API calls for repeated inputs |
| 224 | +2. **Cost Savings**: Reduce API usage for workloads with repeated inputs |
| 225 | +3. **Flexibility**: Enable caching per model based on usage patterns |
| 226 | +4. **Clean Architecture**: Each model has its own dedicated cache |
| 227 | +5. **Scalability**: Easy to add new models with different caching strategies |
| 228 | +6. **Observability**: Cache statistics help monitor effectiveness |
| 229 | + |
| 230 | +## Tips |
| 231 | + |
| 232 | +- Start with moderate cache sizes (5,000-10,000 entries) and adjust based on usage |
| 233 | +- Enable stats logging to monitor cache effectiveness |
| 234 | +- Jailbreak detection typically has high cache hit rates |
| 235 | +- Content safety caching is most effective for chatbots with common queries |
| 236 | +- Topic control benefits from caching when topics are well-defined |
| 237 | +- Adjust cache sizes independently for each model based on their usage patterns |
15 | 238 |
|
16 | 239 | ## Documentation |
17 | 240 |
|
|