Commit 7833b7d

feat: add Nemotron model support with message-based prompts
1 parent 36d625e commit 7833b7d

6 files changed

+765
-1
lines changed

Lines changed: 107 additions & 0 deletions
@@ -0,0 +1,107 @@
# Nemotron Message-Based Prompts

This directory contains configurations for using Nemotron models with NeMo Guardrails.

## Message-Based Prompts with Detailed Thinking

NeMo Guardrails uses message-based prompts for Nemotron models and enables "detailed thinking" only for specific internal tasks.

### Tasks with Detailed Thinking Enabled

The following internal tasks include a `detailed thinking on` system message:

- `generate_bot_message` - When generating the final response
- `generate_value` - When extracting information from user input
- Other complex reasoning tasks, such as flow generation and continuation

### Tasks without Detailed Thinking

The following tasks use standard system messages without detailed thinking:

- `generate_user_intent` - When detecting user intent
- `generate_next_steps` - When determining what bot actions to take
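
The split between the two task groups can be pictured with a small sketch. This is an illustration only, not how NeMo Guardrails builds its prompts: `DETAILED_THINKING_TASKS` and `build_task_messages` are hypothetical names, the set contains only the two tasks named explicitly above, and the fallback system text is an assumed placeholder.

```python
from typing import Dict, List

# Hypothetical illustration; not the NeMo Guardrails implementation.
# Only the two tasks named explicitly in the list above are included.
DETAILED_THINKING_TASKS = {"generate_bot_message", "generate_value"}


def build_task_messages(task: str, user_content: str) -> List[Dict[str, str]]:
    """Return a message list whose system message depends on the task."""
    if task in DETAILED_THINKING_TASKS:
        system_content = "detailed thinking on"
    else:
        # Assumed placeholder standing in for a standard system message.
        system_content = "You are a helpful assistant."
    return [
        {"role": "system", "content": system_content},
        {"role": "user", "content": user_content},
    ]


print(build_task_messages("generate_value", "My name is Ada.")[0]["content"])
print(build_task_messages("generate_user_intent", "Hello")[0]["content"])
```

Keeping the task names in one set makes it easy to see, and change, which tasks pay the extra latency of a reasoning trace.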

## Usage

To use Nemotron with NeMo Guardrails:

```python
from nemoguardrails import LLMRails, RailsConfig

# Load the configuration
config = RailsConfig.from_path("examples/configs/nemotron")

# Create the LLMRails instance
rails = LLMRails(config)

# Generate a response
response = rails.generate(messages=[
    {"role": "user", "content": "What is NeMo Guardrails?"}
])
print(response)
```

When using a task that has "detailed thinking on" enabled, the model will show its reasoning process:

```
{'role': 'assistant', 'content': '<think>\nOkay, the user is asking about NeMo Guardrails. Let me start by recalling what I know. NeMo is part of NVIDIA\'s tools, right? So, Guardrails must be a component related to that. I remember that NVIDIA has been working on AI frameworks and model development. Maybe Guardrails is part of the NeMo toolkit, which is used for building and training neural networks, especially for speech and language processing.\n\nWait, I think Guardrails are safety features or constraints that prevent models from generating harmful or inappropriate content. So, if NeMo Guardrails exist, they probably integrate these safety mechanisms into the model training or inference process. But I need to be precise here. I should check if NeMo Guardrails are specifically designed for their models like the ones in the NGC catalog.\n\nI remember that NVIDIA has LMOps tools, which might include Guardrails. Oh right, they announced RAPIDS Guardrails earlier, which is a library for adding safety features. Maybe NeMo Guardrails are a similar concept but tailored for the NeMo framework. So, they would allow developers to apply filters, classifiers, or rules to ensure the outputs are safe and comply with policies.\n\nBut wait, I should make sure not to confuse it with other guardrails. For example, some models use RLHF (Reinforcement Learning from Human Feedback), but Guardrails might be more about pre-defined rules or modular safety layers. The key points would be that NeMo Guardrails provide tools to manage model outputs, making it easier to deploy safe and responsible AI applications.\n\nAlso, considering the components of NeMo, which include pre-trained models and a framework for research, adding Guardrails would make sense as a feature to support ethical AI practices. 
They might include things like toxicity detection, fact-checking, or steering models away from sensitive topics.\n\nI should structure the answer to explain what NeMo is, introduce Guardrails as a safety component, mention their purpose, how they work (like modular components, customizable policies), and their benefits. Also, note that it\'s part of NVIDIA\'s efforts to provide tools for responsible AI deployment. Make sure to clarify if they are part of the main NeMo toolkit or an additional module, and maybe mention associated tools like RAPIDS Guardrails if relevant.\n\nWait, but I should verify if NeMo Guardrails are indeed a released feature or part of a specific version. If the information is not up-to-date, I need to state that clearly. The user might be referring to recent updates. Since my knowledge is current up to July 2024, if there were announcements around that time, include them. Otherwise, stick to known information.\n\nSo, putting it all together: NeMo Guardrails are safety features in NVIDIA\'s NeMo toolkit, designed to enforce ethical guidelines and prevent harmful outputs. They offer customizable policies, integration of safety models, and tools for auditing, helping developers deploy responsible AI applications. They might work alongside other NVIDIA tools to streamline the process of adding safety measures without compromising performance.\n</think>\n\nNeMo Guardrails is an open-source toolkit developed by NVIDIA that provides programmable guardrails for Large Language Models (LLMs). These guardrails are designed to make LLM-based applications safer and more reliable by controlling the output of the models in specific ways...'}
```

### Controlling Detailed Thinking in Final Responses

Because Nemotron is a hybrid reasoning model, users can toggle the "detailed thinking" feature for final responses, similar to how it works on build.nvidia.com.

#### Enabling Detailed Thinking via System Message

To enable detailed thinking in the response, include a system message with "detailed thinking on":

```python
response = rails.generate(messages=[
    {"role": "system", "content": "detailed thinking on"},
    {"role": "user", "content": "How is the weather today?"}
])
```
64+
65+
This will include the model's reasoning process in a `<think>...</think>` wrapper:
66+
67+
```
{'role': 'assistant',
 'content': '<think>\n</think>I\'m sorry, but I don\'t know the weather. I\'m a large language model, I don\'t have access to real-time information or your location. However, I can guide you on how to check the weather! You can check the weather forecast for your area by:...'}
```
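
If the reasoning and the final answer need to be handled separately on the client side, the `<think>...</think>` wrapper can be parsed out of `content`. The sketch below is a hypothetical helper (`split_reasoning` is not part of NeMo Guardrails) and assumes at most one `<think>` block per response.

```python
import re
from typing import Tuple

# Matches the first <think>...</think> block; DOTALL lets "." span newlines.
_THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(content: str) -> Tuple[str, str]:
    """Split a response into (reasoning, answer).

    Hypothetical helper: returns an empty reasoning string when the
    response has no <think> block or the block is empty.
    """
    match = _THINK_RE.search(content)
    if match is None:
        return "", content.strip()
    reasoning = match.group(1).strip()
    # Everything outside the matched block is treated as the answer.
    answer = (content[: match.start()] + content[match.end():]).strip()
    return reasoning, answer


reasoning, answer = split_reasoning(
    "<think>\n</think>I'm sorry, but I don't know the weather."
)
print(repr(reasoning))
print(answer)
```

Discarding the first element of the returned tuple gives the same visible effect as a response generated without detailed thinking.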

#### Standard Mode (No Detailed Thinking)

Without the special system message, the model provides direct responses without showing its reasoning:

```python
response = rails.generate(messages=[
    {"role": "user", "content": "How is the weather today?"}
])
```

Response:

```
{'role': 'assistant',
 'content': 'The weather! Unfortunately, I don\'t have real-time access to current weather conditions or your location. I\'m a large language model...'}
```
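
Since the only difference between the two modes is the presence of the `detailed thinking on` system message, the toggle can be wrapped in a small helper. This is a sketch under that assumption; `with_detailed_thinking` is a hypothetical name, not a NeMo Guardrails function.

```python
from typing import Dict, List

Message = Dict[str, str]


def with_detailed_thinking(messages: List[Message], enabled: bool) -> List[Message]:
    """Return a copy of `messages` with the toggle system message added or removed."""
    toggle = {"role": "system", "content": "detailed thinking on"}
    # Drop any existing toggle message first so the result is predictable.
    rest = [m for m in messages if m != toggle]
    return [toggle] + rest if enabled else rest


question = [{"role": "user", "content": "How is the weather today?"}]
print(with_detailed_thinking(question, enabled=True))
print(with_detailed_thinking(question, enabled=False))
```

The returned list can be passed as the `messages` argument of `rails.generate`, as in the examples in this section.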
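
To remove the thinking traces from the internal tasks, use the `remove_thinking_traces` configuration option. Note that the `config.yml` added in this commit spells the setting `remove_reasoning_traces` and nests it under `reasoning_config`.

<!-- TODO: add reference to the docs -->
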
```yaml
remove_thinking_traces: true
```

## Configuration Details

The `config.yml` file sets:

```yaml
models:
  - type: main
    engine: nim
    model: nvidia/llama-3.1-nemotron-ultra-253b-v1
```
Lines changed: 6 additions & 0 deletions
@@ -0,0 +1,6 @@
models:
  - type: main
    engine: nim
    model: nvidia/llama-3.1-nemotron-ultra-253b-v1
    reasoning_config:
      remove_reasoning_traces: False  # set to True to remove reasoning traces
