Description
Is this a new feature, an improvement, or a change to existing functionality?
Improvement
How would you describe the priority of this feature request
High
Please provide a clear description of the problem this feature solves
ReAct Agent: Strict Text Parsing Fails; Should Send Tool Schemas to LLM
Summary
The react_agent has two issues that cause unreliable tool calling:
- Strict regex parsing - The parser requires an exact text format; minor deviations cause parsing failures
- Not sending tool schemas to LLM - Without tool schemas in the API request, the LLM cannot return structured tool_calls
Issue 1: Strict Regex Parsing
The parser in output_parser.py#L75-L100 uses strict regex:
```python
regex = r"Action\s*\d*\s*:[\s]*(.*?)\s*Action\s*\d*\s*Input\s*\d*\s*:[\s]*(.*?)(?=\s*[\n|\s]\s*Observation\b|$)"
```
Minor format deviations cause parsing failures:
| LLM Output | Result |
|---|---|
| `Action: add\nAction Input: {"a": 2}` | OK |
| `Action: add\nInput: {"a": 2}` | FAILS - "Input" not "Action Input" |
| `action: add\nAction Input: {"a": 2}` | FAILS - lowercase "action" |
Quick fix: Make regex more lenient (case-insensitive, flexible whitespace, etc.)
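For illustration, a more forgiving pattern could look like the sketch below (illustrative only; the capture groups would need to stay compatible with the existing parser):
```python
import re

# Sketch of a more lenient pattern (assumption, not the current NAT regex):
# - re.IGNORECASE tolerates "action:" / "ACTION:"
# - "(?:Action\s*)?Input" accepts both "Action Input:" and "Input:"
LENIENT_REGEX = re.compile(
    r"Action\s*\d*\s*:\s*(.*?)\s*(?:Action\s*)?Input\s*\d*\s*:\s*(.*?)(?=\s*Observation\b|$)",
    re.IGNORECASE | re.DOTALL,
)

match = LENIENT_REGEX.search('action: add\nInput: {"a": 2}')
if match:
    action, action_input = match.group(1), match.group(2)  # "add", '{"a": 2}'
```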
Issue 2: Not Sending Tool Schemas to LLM
The LLM only populates tool_calls when it receives tool schemas via the API's tools parameter. Currently, react_agent doesn't send tool schemas - it only prompts for text format.
NAT uses LangChain internally. Compare how the two agents bind the LLM:
```python
# react_agent - only binds stop sequence, no tool schemas
# https://github.com/NVIDIA/NeMo-Agent-Toolkit/blob/main/src/nat/agent/react_agent/agent.py#L129
self.llm.bind(stop=["Observation:"])
# Result: tool_calls = [] (empty), must parse content with regex

# tool_calling_agent - binds tool schemas
# https://github.com/NVIDIA/NeMo-Agent-Toolkit/blob/main/src/nat/agent/tool_calling_agent/agent.py#L75
self.bound_llm = llm.bind_tools(tools)
# Result: tool_calls = [{name, args}] (structured data)
```
Key insight: bind_tools sends tool schemas in the API tools parameter, which triggers the LLM to return structured tool_calls. Without it, we must rely on text parsing.
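For context, what bind_tools puts on the wire is an OpenAI-style function schema in the request's tools parameter. Roughly (illustrative example, not NAT's exact payload):
```python
# What an OpenAI-style "tools" entry looks like in the request (illustrative example):
tools = [{
    "type": "function",
    "function": {
        "name": "add",
        "description": "Add two numbers.",
        "parameters": {
            "type": "object",
            "properties": {
                "a": {"type": "number"},
                "b": {"type": "number"},
            },
            "required": ["a", "b"],
        },
    },
}]
# With these schemas in the request, the model can respond with structured
# tool_calls instead of free-form "Action: ..." text.
```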
How These Issues Relate
Current flow:
```
┌─────────┐      No tool schemas sent      ┌─────────┐
│  Agent  │ ─────────────────────────────► │   LLM   │
│         │ ◄───────────────────────────── │         │
└─────────┘  {content: "Thought: ...\n     └─────────┘
     │         Action: add\n                     │
     │         Action Input: {a: 2}",            │ LLM doesn't know
     │        tool_calls: []}                     │ to use tool_calls
     ▼                                            │
Must regex-parse Action/Action Input ◄────────────┘
from content ──► Issue 1 (strict)        Issue 2 (no schemas)
```
Proposed flow:
```
┌─────────┐  {tools: [{name: "add",...}]}  ┌─────────┐
│  Agent  │ ─────────────────────────────► │   LLM   │
│         │ ◄───────────────────────────── │         │
└─────────┘  {content: "I need to add...", └─────────┘
     │        tool_calls: [{name: "add",
     │                      args: {a: 2}}]}
     │
     ├─► Action: tool_calls[0]["name"]  ──► Structured, no parsing
     ├─► Args: tool_calls[0]["args"]    ──► Structured, no parsing
     └─► Thought: content               ──► Optional, no parsing needed
```
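With tool_calls populated, the extraction step becomes a dictionary lookup. A minimal sketch (illustrative, not NAT's actual code; parse_with_regex stands in for the existing text parser as a fallback):
```python
# Illustrative sketch of consuming structured tool_calls (hypothetical helper names).
def extract_next_step(response):
    """Return (action, args, thought) from a LangChain AIMessage."""
    if getattr(response, "tool_calls", None):
        call = response.tool_calls[0]
        return call["name"], call["args"], response.content  # structured, no parsing
    # Fall back to the existing text parser when the model answers in plain text
    return parse_with_regex(response.content)  # hypothetical fallback helper
```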
Reproducer
Same model, same task - text parsing fails, native tool calling succeeds:
```python
#!/usr/bin/env python3
import asyncio, os, tempfile, random

from nat.builder.function import FunctionGroup
from nat.builder.framework_enum import LLMFrameworkEnum
from nat.cli.register_workflow import register_function_group
from nat.data_models.function import FunctionGroupBaseConfig
from nat.runtime.loader import load_workflow
from langchain_openai import ChatOpenAI
from langchain_core.tools import tool

# Use a tool the LLM cannot answer without calling (random number)
SECRET = random.randint(1000, 9999)


class SimpleToolConfig(FunctionGroupBaseConfig, name="secret_tool"):
    pass


@register_function_group(config_type=SimpleToolConfig, framework_wrappers=[LLMFrameworkEnum.LANGCHAIN])
async def secret_tool(config, builder):
    group = FunctionGroup(config=config)

    async def get_secret_number(input: dict) -> str:
        """Get the secret number. Must be called to know the value."""
        return f"The secret number is {SECRET}"

    group.add_function(name="get_secret_number", fn=get_secret_number, description=get_secret_number.__doc__)
    yield group


@tool
def get_secret_number() -> str:
    """Get the secret number. Must be called to know the value."""
    return f"The secret number is {SECRET}"


NAT_CONFIG = """
function_groups:
  secret_tool:
    _type: secret_tool
llms:
  nemotron:
    _type: azure_openai
    api_key: $NV_INFER
    model_name: nvidia/nvidia/Nemotron-3-Nano-30B-A3B
    azure_deployment: nvidia/nvidia/Nemotron-3-Nano-30B-A3B
    azure_endpoint: https://inference-api.nvidia.com
    api_version: latest
workflow:
  _type: react_agent
  llm_name: nemotron
  tool_names: [secret_tool]
"""


async def main():
    query = "What is the secret number?"

    # NAT react_agent (text parsing)
    with tempfile.NamedTemporaryFile(mode='w', suffix='.yaml', delete=False) as f:
        f.write(NAT_CONFIG)
        config_path = f.name

    async with load_workflow(config_path) as workflow:
        async with workflow.run(query) as runner:
            nat_answer = await runner.result()
    print(nat_answer)
    nat_failed = "Invalid Format" in nat_answer or "Missing 'Action" in nat_answer

    # Native tool calling (same model)
    llm = ChatOpenAI(base_url="https://inference-api.nvidia.com/v1",
                     model="nvidia/nvidia/Nemotron-3-Nano-30B-A3B",
                     api_key=os.getenv("NV_INFER"))
    llm_with_tools = llm.bind_tools([get_secret_number])
    response = llm_with_tools.invoke(query)

    print(f"react_agent (text parsing): {'FAILED' if nat_failed else 'OK'}")
    print(f"Native tool calling: OK - tool_calls={response.tool_calls}")


if __name__ == "__main__":
    asyncio.run(main())
```
Output:
```
[AGENT] Failed to parse agent output after 1 attempts, consider enabling or increasing parse_agent_response_max_retries
Failed to determine whether agent is calling a tool: list index out of range
Traceback (most recent call last):
File "/llmtech/data-explorer-agent/.venv/lib/python3.11/site-packages/nat/agent/react_agent/agent.py", line 252, in conditional_edge
agent_output = state.agent_scratchpad[-1]
~~~~~~~~~~~~~~~~~~~~~~^^^^
IndexError: list index out of range
[AGENT] Ending graph traversal
Invalid Format: Missing 'Action:' after 'Thought:'
react_agent (text parsing): FAILED
Native tool calling: OK - tool_calls=[{'name': 'get_secret_number', 'args': {}, 'id': 'chatcmpl-tool-910f1f1176ed49ea9a02d3b15a480202', 'type': 'tool_call'}]
```
References
NAT Code
- react_agent output parser - Current regex-based parsing
- react_agent prompt - Current text format instructions
- tool_calling_agent - Already uses bind_tools
External References
- OpenAI Function Calling Guide
- OpenAI Chat API - tools parameter
- LangChain bind_tools - Reference implementation
Describe your ideal solution
Suggestions
For Issue 1 (Quick Fix)
Make regex parsing more lenient:
- Case-insensitive matching
- Flexible whitespace
- Accept variations like "Input:" vs "Action Input:"
For Issue 2 (Better Fix)
Add option to use native tool calling:
```yaml
workflow:
  _type: react_agent
  llm_name: nemotron
  tool_names: [my_tools]
  use_native_tool_calling: true  # Use bind_tools + tool_calls
```
This would:
- Use llm.bind_tools() to send tool schemas via API
- Extract tool calls from response.tool_calls instead of regex parsing
- Still capture "Thought" reasoning in response.content
- Keep the Thought → Action → Observation loop intact
Note: NAT's tool_calling_agent already uses bind_tools - this approach could be adapted for react_agent.
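A rough sketch of how the flag could gate the binding step (attribute and variable names are illustrative, not NAT's actual internals):
```python
# Hypothetical wiring inside react_agent (illustrative only):
if config.use_native_tool_calling:
    # Send tool schemas so the model returns structured tool_calls
    bound_llm = llm.bind_tools(tools)
else:
    # Current behavior: text-format prompting with a stop sequence
    bound_llm = llm.bind(stop=["Observation:"])
```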
Additional context
No response
Code of Conduct
- I agree to follow this project's Code of Conduct
- I have searched the open feature requests and have found no duplicates for this feature request