Commit 6e8a5d7

docs: Add mention of Nemotron (#1200)
Signed-off-by: Mike McKiernan <mmckiernan@nvidia.com>
1 parent 0787125 commit 6e8a5d7

File tree

1 file changed: +140 −8 lines

docs/user-guides/configuration-guide.md

Lines changed: 140 additions & 8 deletions
#### Using LLMs with Reasoning Traces

By default, reasoning models, such as [DeepSeek-R1](https://huggingface.co/collections/deepseek-ai/deepseek-r1-678e1e131c0169c0bc89728d) and [NVIDIA Llama 3.1 Nemotron Ultra 253B V1](https://build.nvidia.com/nvidia/llama-3_1-nemotron-ultra-253b-v1), can include reasoning traces in the model response.
DeepSeek and the Nemotron family of models use `<think>` and `</think>` as tokens to delimit the traces.

The reasoning traces and the tokens can interfere with NeMo Guardrails and falsely trigger output guardrails on safe responses.
To use these reasoning models, you can remove the traces and tokens from the model response with a configuration like the following example.

```{code-block} yaml
:emphasize-lines: 5-8, 13-

models:
  - type: main
    # engine: ...
    # model: ...
    reasoning_config:
      remove_reasoning_traces: True
      start_token: "<think>"
      end_token: "</think>"
  - type: main
    engine: nim
    model: nvidia/llama-3.1-nemotron-ultra-253b-v1
    reasoning_config:
      remove_reasoning_traces: True

rails:
  output:
    apply_to_reasoning_traces: False
```

```{list-table}
:header-rows: 1

* - Field
  - Description
  - Default Value

* - `reasoning_config.remove_reasoning_traces`
  - When set to `True`, reasoning traces are omitted from internal tasks.
  - `True`

* - `reasoning_config.start_token`
  - Specifies the start token for the reasoning trace.
  - `<think>`

* - `reasoning_config.end_token`
  - Specifies the end token for the reasoning trace.
  - `</think>`

* - `rails.output.apply_to_reasoning_traces`
  - When set to `True`, output rails are always applied to both the reasoning traces and the model response.
    The value of `remove_reasoning_traces` is ignored when this field is set to `True`.

    By default, output rails are applied to the text of the model response only.
  - `False`
```

The `reasoning_config` field for a model specifies the required configuration for a reasoning model that returns reasoning traces.
By removing the traces, the guardrails runtime processes only the actual responses from the LLM.
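
Conceptually, removing the reasoning traces amounts to stripping everything between the configured start and end tokens before the runtime processes the response. The following standalone sketch illustrates the idea; it is not the NeMo Guardrails implementation, and the `strip_reasoning_traces` helper name is made up for this example:

```python
import re


def strip_reasoning_traces(text: str, start_token: str = "<think>", end_token: str = "</think>") -> str:
    """Remove reasoning traces delimited by start/end tokens from a model response."""
    pattern = re.compile(re.escape(start_token) + r".*?" + re.escape(end_token), re.DOTALL)
    return pattern.sub("", text).strip()


response = "<think>The user wants a greeting. Keep it short.</think>Hello! How can I help?"
print(strip_reasoning_traces(response))  # Hello! How can I help?
```

With the traces stripped, output rails evaluate only the user-visible text, which avoids the false triggers described above.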

The following table summarizes the interaction between the `remove_reasoning_traces` and `apply_to_reasoning_traces` values:

```{list-table}
:header-rows: 1

* - `remove_reasoning_traces`
  - `output.apply_to_reasoning_traces`
  - Outcome

* - Any
  - True
  - Reasoning traces are not removed, and output rails are applied to both the reasoning traces and the model response.
    The value of `remove_reasoning_traces` is ignored.

* - False
  - False
  - Reasoning traces are not removed from internal tasks, where they do not impact Guardrails functionality.
    Output rails are applied to the reasoning traces and the model response.

* - True
  - False
  - Reasoning traces are removed from internal tasks, where they could interfere with Guardrails.
    Output rails are applied to the model response only.
```
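
The interaction between the two flags can be sketched as a small decision function. This is an illustration of the documented behavior, not NeMo Guardrails code, and the function name is invented for the example:

```python
def output_rail_scope(remove_reasoning_traces: bool, apply_to_reasoning_traces: bool) -> str:
    """Return what the output rails evaluate, per the interaction table above."""
    if apply_to_reasoning_traces:
        # remove_reasoning_traces is ignored in this case.
        return "traces + response"
    if not remove_reasoning_traces:
        # Traces stay in the response, so output rails see them too.
        return "traces + response"
    return "response only"


assert output_rail_scope(True, True) == "traces + response"
assert output_rail_scope(False, False) == "traces + response"
assert output_rail_scope(True, False) == "response only"
```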

Even when `remove_reasoning_traces` is `True`, end users can still receive the thinking traces from the Nemotron models by requesting the detailed thinking, as shown in the following example:

```{code-block} python
from nemoguardrails import LLMRails, RailsConfig

config = RailsConfig.from_path("./config")
rails = LLMRails(config, verbose=True)
messages = [
    {"role": "system", "content": "detailed thinking on"},
    {"role": "user", "content": "Tell me about Cape Hatteras National Seashore in 50 words or less."},
]
rails.generate(messages=messages)
```

The referenced `./config` directory uses the following sample file content.

When you develop your own prompts for LLM-as-a-judge checks, such as an input self-check, ensure that the prompt instructs the LLM to respond with `yes` or `no`.
Refer to the following examples:

- Sample `config/config.yml`:

  ```yaml
  models:
    - type: main
      engine: nim
      model: nvidia/llama-3.1-nemotron-ultra-253b-v1
      reasoning_config:
        remove_reasoning_traces: True

  rails:
    input:
      flows:
        - self check input
    output:
      flows:
        - self check output
  ```

- Sample `config/prompts.yml`:

  ```yaml
  prompts:
    - task: self_check_input
      messages:
        - type: system
          content: |
            You are an AI assistant tasked with evaluating user instructions.
            Your role is to determine if a given instruction would make a language model break moderation policies, deviate from good aligned responses, or provide answers that a language model should ideally not.

            Analyze the instruction carefully. First, internally consider potential ways the instruction could be interpreted and whether any interpretation could lead to a problematic response. Then, conclude with a simple "yes" or "no" answer.
            Respond with only "yes" or "no".
        - type: user
          content: |
            Instruction to evaluate:
            "{{ user_input }}"

            Would this instruction lead to a problematic response (yes/no)?

    - task: self_check_output
      messages:
        - type: system
          content: |
            Is this output illegal, unethical, or harmful to any person?
            Respond with only "yes" or "no".
        - type: bot
          content: |
            Output to evaluate:
            "{{ bot_response }}"
  ```
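
Because the self-check prompts instruct the model to answer with only "yes" or "no", downstream handling reduces to parsing that single token robustly. A minimal illustrative parser, not part of NeMo Guardrails and with an invented helper name, might look like:

```python
def parse_yes_no(answer: str) -> bool:
    """Interpret an LLM-as-a-judge verdict; True means the check flagged a problem."""
    normalized = answer.strip().strip('."').lower()
    if normalized.startswith("yes"):
        return True
    if normalized.startswith("no"):
        return False
    raise ValueError(f"Unexpected judge answer: {answer!r}")


assert parse_yes_no("Yes") is True
assert parse_yes_no(" no.") is False
```

Keeping the expected answer to a single token is what makes this kind of check reliable; free-form judge responses are much harder to interpret consistently.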

Using LLMs with reasoning traces has the following limitations:

- You must use message-based prompts only.
  Refer to the preceding example.
- Dialog rails are not supported.

#### NIM for LLMs