@Pouyanpi
Collaborator

Description

This PR adds support for Nemotron models in NeMo Guardrails with a message-based prompt approach. It implements a system for controlling detailed thinking traces both for internal tasks and in final user responses, similar to how it works on build.nvidia.com.

Key Changes

  • Added new nemotron.yml with message-based prompt templates
  • Implemented "detailed thinking on" system messages for complex reasoning tasks
  • Added system message handling in LLMRails to control thinking traces
  • Provided example configuration in examples/configs/nemotron

Features

Control Detailed Thinking via System Messages

As Nemotron is a hybrid reasoning model, users can toggle the "detailed thinking" feature for final responses:

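# Load the example configuration added in this PR
from nemoguardrails import LLMRails, RailsConfig

config = RailsConfig.from_path("examples/configs/nemotron")
rails = LLMRails(config)

# With detailed thinking enabled
response = rails.generate(messages=[
    {"role": "system", "content": "detailed thinking on"},
    {"role": "user", "content": "How is the weather today?"}
])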

# Result includes thinking process
{'role': 'assistant',
 'content': '<think>\n</think>I\'m sorry, but I don\'t know the weather...'}

# Without detailed thinking (standard mode)
response = rails.generate(messages=[
    {"role": "user", "content": "How is the weather today?"}
])

# Result without thinking process
{'role': 'assistant',
 'content': 'The weather! Unfortunately, I don\'t have real-time access...'}
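The toggle above amounts to prepending one system message. As a rough illustration, the sketch below wraps that in a small helper. The name `with_detailed_thinking` is hypothetical and not part of NeMo Guardrails; it only manipulates the message list and assumes the exact "detailed thinking on" string shown above.

```python
from typing import Dict, List

Message = Dict[str, str]

# System message content taken from the example above.
DETAILED_THINKING_ON = "detailed thinking on"


def _is_toggle(message: Message) -> bool:
    """True if the message is the detailed-thinking system message."""
    return (
        message.get("role") == "system"
        and message.get("content", "").strip().lower() == DETAILED_THINKING_ON
    )


def with_detailed_thinking(messages: List[Message], enabled: bool) -> List[Message]:
    """Return a new message list with the toggle applied (hypothetical helper)."""
    # Drop any existing toggle first so repeated calls do not stack messages.
    cleaned = [m for m in messages if not _is_toggle(m)]
    if enabled:
        return [{"role": "system", "content": DETAILED_THINKING_ON}] + cleaned
    return cleaned
```

The result could then be passed as `rails.generate(messages=with_detailed_thinking(messages, True))`; the input list itself is left unmodified.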

Configuration Options

Control detailed thinking for internal tasks:

models:
  - type: main
    engine: nim
    model: nvidia/llama-3.1-nemotron-ultra-253b-v1
    reasoning_config:
      remove_reasoning_traces: False  # set to True to remove traces
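
To make the effect of removing traces concrete, here is a minimal sketch of what stripping a reasoning trace from a response could look like. It is not the library's implementation; it assumes traces are delimited by `<think>` and `</think>` as in the sample output above, and `strip_reasoning_traces` is a hypothetical name.

```python
import re

# Assumed delimiters, taken from the sample output in this description.
THINK_BLOCK = re.compile(r"<think>.*?</think>", re.DOTALL)


def strip_reasoning_traces(text: str) -> str:
    """Remove every <think>...</think> span and trim surrounding whitespace."""
    return THINK_BLOCK.sub("", text).strip()
```

Text without any `<think>` block passes through unchanged apart from the whitespace trim.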

Documentation

Added README.md with examples and explanations of the prompt system and thinking control options.

Testing

Added tests to verify:

  • Nemotron models use message-based prompts
  • Tasks like generate_bot_message have "detailed thinking on"
  • Tasks like generate_user_intent don't have "detailed thinking on"
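
The checks listed above could be expressed roughly as follows. This is a sketch, not the tests added in the PR: the prompt lists are hypothetical stand-ins for the rendered templates, and the `type`/`content` keys are an assumption about how message-based prompts are represented.

```python
from typing import Dict, List


def has_detailed_thinking(prompt_messages: List[Dict[str, str]]) -> bool:
    """True if any system message in the prompt turns detailed thinking on."""
    return any(
        m.get("type") == "system" and "detailed thinking on" in m.get("content", "")
        for m in prompt_messages
    )


# Hypothetical stand-ins for the prompts of two tasks.
generate_bot_message_prompt = [
    {"type": "system", "content": "detailed thinking on"},
    {"type": "user", "content": "placeholder for the task instructions"},
]
generate_user_intent_prompt = [
    {"type": "system", "content": "placeholder for the task instructions"},
    {"type": "user", "content": "placeholder for the user input"},
]

assert has_detailed_thinking(generate_bot_message_prompt)
assert not has_detailed_thinking(generate_user_intent_prompt)
```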

@github-actions
Contributor

Documentation preview

https://nvidia.github.io/NeMo-Guardrails/review/pr-1199

@Pouyanpi Pouyanpi added the enhancement New feature or request label May 15, 2025
@Pouyanpi Pouyanpi self-assigned this May 15, 2025
@Pouyanpi Pouyanpi added this to the v0.14.0 milestone May 15, 2025
@Pouyanpi Pouyanpi force-pushed the feat/nemotron-reasoning-support branch from c3600f9 to 2117f0f Compare May 15, 2025 21:38
@codecov-commenter

codecov-commenter commented May 16, 2025

Codecov Report

Attention: Patch coverage is 91.66667% with 1 line in your changes missing coverage. Please review.

Project coverage is 68.65%. Comparing base (36d625e) to head (2b37d04).
Report is 3 commits behind head on develop.

Files with missing lines | Patch % | Lines
nemoguardrails/llm/prompts.py | 87.50% | 1 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #1199      +/-   ##
===========================================
+ Coverage    68.43%   68.65%   +0.21%     
===========================================
  Files          161      161              
  Lines        15943    15978      +35     
===========================================
+ Hits         10910    10969      +59     
+ Misses        5033     5009      -24     
Flag | Coverage Δ
python | 68.65% <91.66%> (+0.21%) ⬆️

Files with missing lines | Coverage Δ
nemoguardrails/rails/llm/llmrails.py | 87.21% <100.00%> (+0.09%) ⬆️
nemoguardrails/llm/prompts.py | 90.47% <87.50%> (-0.32%) ⬇️

... and 2 files with indirect coverage changes

Copilot AI left a comment

Pull Request Overview

This PR adds support for Nemotron models in NeMo Guardrails using message-based prompt templates with a detailed thinking feature. Key changes include adding new YAML prompt definitions for Nemotron and DeepSeek, implementing system message conversion in LLMRails, and updating test suites and configuration files accordingly.

Reviewed Changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated no comments.

Show a summary per file
File | Description
tests/test_system_message_conversion.py | Added tests to ensure system messages are converted correctly.
tests/test_nemotron_prompt_modes.py | Added tests verifying detailed thinking behavior for Nemotron prompts.
nemoguardrails/rails/llm/llmrails.py | Modified to convert system messages to SystemMessage events appropriately.
nemoguardrails/llm/prompts/nemotron_reasoning.yml | New prompt definitions with detailed thinking enabled in specific tasks.
nemoguardrails/llm/prompts/llama3.yml | Updated prompt definitions to include new model identifiers.
nemoguardrails/llm/prompts.py | Updated scoring logic in prompt selection for improved model matching.
examples/configs/nemotron/config.yml | New configuration file for Nemotron model support.
examples/configs/nemotron/README.md | Added documentation for using Nemotron message-based prompts.
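
The system message conversion mentioned for llmrails.py can be pictured with the sketch below. It is an illustration only, not the code in this PR: the `SystemMessage` event type comes from the summary above, while the `content` and `text` field names, the `UserMessage` type, and the function name `messages_to_events` are assumptions.

```python
from typing import Dict, List


def messages_to_events(messages: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Map chat messages to event dicts (hypothetical shapes, see lead-in)."""
    events: List[Dict[str, str]] = []
    for msg in messages:
        role = msg.get("role")
        if role == "system":
            # System messages become SystemMessage events instead of being dropped.
            events.append({"type": "SystemMessage", "content": msg["content"]})
        elif role == "user":
            # Assumed event shape for user turns.
            events.append({"type": "UserMessage", "text": msg["content"]})
        # Other roles are skipped in this sketch.
    return events
```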
Comments suppressed due to low confidence (1)

nemoguardrails/llm/prompts.py:108

  • [nitpick] The added 'break' after detecting a substring match may prevent subsequent models in the list from being evaluated. Consider reviewing the order of model checks or accumulating all candidate scores to ensure the best matching prompt is selected.
elif _model in model:
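
One way to read the suggestion about accumulating candidate scores is sketched below. This is not the logic in nemoguardrails/llm/prompts.py; `match_score` and `best_pattern` are hypothetical names, and the ranking rule (exact match first, then the longest substring match) is an assumption chosen for illustration.

```python
from typing import Iterable, Optional, Tuple


def match_score(pattern: str, model: str) -> Tuple[int, int]:
    """Rank a pattern against a model name; larger tuples are better matches."""
    if pattern == model:
        return (1, len(pattern))  # exact match always wins
    if pattern and pattern in model:
        return (0, len(pattern))  # substring match; longer is more specific
    return (-1, 0)  # no match


def best_pattern(patterns: Iterable[str], model: str) -> Optional[str]:
    """Score every candidate instead of stopping at the first substring hit."""
    best: Optional[str] = None
    best_score = (-1, 0)
    for pattern in patterns:
        score = match_score(pattern, model)
        if score > best_score:
            best, best_score = pattern, score
    return best
```

Because every candidate is scored, the result does not depend on the order of the list, which is the concern the nitpick raises about an early `break`.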

Collaborator

@mikemckiernan mikemckiernan left a comment

lgtm, thanks so much!

@Pouyanpi Pouyanpi force-pushed the feat/nemotron-reasoning-support branch from 8303cb1 to 7ce964d Compare May 16, 2025 15:13
Collaborator Author

@Pouyanpi Pouyanpi left a comment

LGTM!

Pouyanpi added 3 commits May 16, 2025 17:18
Signed-off-by: Pouyan <13303554+Pouyanpi@users.noreply.github.com>
@Pouyanpi Pouyanpi merged commit 07422c1 into develop May 17, 2025
19 checks passed
@Pouyanpi Pouyanpi deleted the feat/nemotron-reasoning-support branch May 17, 2025 15:38
Labels

enhancement New feature or request

6 participants