ReAct Agent: Improve Text Parsing; Should Send Tool Schemas to LLM #1308

@daxiongshu

Description

Is this a new feature, an improvement, or a change to existing functionality?

Improvement

How would you describe the priority of this feature request?

High

Please provide a clear description of the problem this feature solves

ReAct Agent: Strict Text Parsing Fails; Should Send Tool Schemas to LLM

Summary

The react_agent has two issues that cause unreliable tool calling:

  1. Strict regex parsing - The parser requires exact text format; minor deviations fail
  2. Not sending tool schemas to LLM - Without tool schemas in the API request, the LLM cannot return structured tool_calls

Issue 1: Strict Regex Parsing

The parser in output_parser.py#L75-L100 uses a strict regex:

regex = r"Action\s*\d*\s*:[\s]*(.*?)\s*Action\s*\d*\s*Input\s*\d*\s*:[\s]*(.*?)(?=\s*[\n|\s]\s*Observation\b|$)"

Minor format deviations cause parsing failures:

LLM Output                            Result
Action: add\nAction Input: {"a": 2}   OK
Action: add\nInput: {"a": 2}          FAILS - "Input" not "Action Input"
action: add\nAction Input: {"a": 2}   FAILS - lowercase "action"
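These failures can be reproduced standalone against the pattern quoted above (only the regex itself comes from output_parser.py; the sample strings are the rows of the table):

```python
import re

# The strict pattern from output_parser.py (verbatim):
REGEX = (r"Action\s*\d*\s*:[\s]*(.*?)\s*Action\s*\d*\s*Input"
         r"\s*\d*\s*:[\s]*(.*?)(?=\s*[\n|\s]\s*Observation\b|$)")

samples = [
    'Action: add\nAction Input: {"a": 2}',  # exact format
    'Action: add\nInput: {"a": 2}',         # "Input" instead of "Action Input"
    'action: add\nAction Input: {"a": 2}',  # lowercase "action"
]
for text in samples:
    print("OK" if re.search(REGEX, text) else "FAILS")
# -> OK, FAILS, FAILS
```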

Quick fix: Make regex more lenient (case-insensitive, flexible whitespace, etc.)

Issue 2: Not Sending Tool Schemas to LLM

The LLM only populates tool_calls when it receives tool schemas via the API's tools parameter. Currently, react_agent doesn't send tool schemas - it only instructs the model, via the prompt, to emit the text format.

NAT uses LangChain internally. Compare how the two agents bind the LLM:

# react_agent - only binds stop sequence, no tool schemas
# https://github.com/NVIDIA/NeMo-Agent-Toolkit/blob/main/src/nat/agent/react_agent/agent.py#L129
self.llm.bind(stop=["Observation:"])
# Result: tool_calls = [] (empty), must parse content with regex

# tool_calling_agent - binds tool schemas
# https://github.com/NVIDIA/NeMo-Agent-Toolkit/blob/main/src/nat/agent/tool_calling_agent/agent.py#L75
self.bound_llm = llm.bind_tools(tools)
# Result: tool_calls = [{name, args}] (structured data)

Key insight: bind_tools sends tool schemas in the API tools parameter, which triggers the LLM to return structured tool_calls. Without it, we must rely on text parsing.
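For reference, this is roughly what the wire-level difference looks like. The dict below sketches an OpenAI-style chat completions request body with the tools field populated (the "add" schema is an illustrative example, not NAT's actual payload):

```python
# OpenAI-style chat completions request body with the `tools` field populated.
# The "add" schema here is a made-up example, not NAT's actual payload.
request_body = {
    "messages": [{"role": "user", "content": "What is 2 + 3?"}],
    "tools": [{
        "type": "function",
        "function": {
            "name": "add",
            "description": "Add two numbers.",
            "parameters": {
                "type": "object",
                "properties": {"a": {"type": "number"},
                               "b": {"type": "number"}},
                "required": ["a", "b"],
            },
        },
    }],
}
# With "tools" present the model can answer with structured tool_calls;
# without it, the model can only emit free text for the agent to regex-parse.
print(request_body["tools"][0]["function"]["name"])  # -> add
```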

How These Issues Relate

Current flow:
┌─────────┐  No tool schemas sent          ┌─────────┐
│  Agent  │ ─────────────────────────────► │   LLM   │
│         │ ◄───────────────────────────── │         │
└─────────┘  {content: "Thought: ...\n     └─────────┘
     │              Action: add\n               │
     │              Action Input: {a: 2}",      │ LLM doesn't know
     │        tool_calls: []}                   │ to use tool_calls
     ▼                                          │
  Must regex-parse Action/Action Input    ◄─────┘
  from content ──► Issue 1 (strict)              Issue 2 (no schemas)


Proposed flow:
┌─────────┐  {tools: [{name: "add",...}]}  ┌─────────┐
│  Agent  │ ─────────────────────────────► │   LLM   │
│         │ ◄───────────────────────────── │         │
└─────────┘  {content: "I need to add...", └─────────┘
     │        tool_calls: [{name: "add",
     │                      args: {a: 2}}]}
     │
     ├─► Action: tool_calls[0]["name"] ──► Structured, no parsing
     ├─► Args: tool_calls[0]["args"]   ──► Structured, no parsing
     └─► Thought: content              ──► Optional, no parsing needed
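With that flow, tool dispatch becomes plain dictionary access. A minimal sketch, using a hand-written dict shaped like a LangChain AIMessage in place of a real LLM reply (values are illustrative):

```python
# Stand-in for the LLM reply in the proposed flow, shaped like a LangChain
# AIMessage after bind_tools (values are illustrative):
response = {
    "content": "I need to add the two numbers.",        # the "Thought"
    "tool_calls": [{"name": "add", "args": {"a": 2}}],  # structured call
}

thought = response["content"]               # optional, no parsing needed
action = response["tool_calls"][0]["name"]  # structured, no regex
args = response["tool_calls"][0]["args"]    # structured, no regex
print(action, args)  # -> add {'a': 2}
```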

Reproducer

Same model, same task - text parsing fails, native tool calling succeeds:

#!/usr/bin/env python3
import asyncio, os, tempfile, random
from nat.builder.function import FunctionGroup
from nat.builder.framework_enum import LLMFrameworkEnum
from nat.cli.register_workflow import register_function_group
from nat.data_models.function import FunctionGroupBaseConfig
from nat.runtime.loader import load_workflow
from langchain_openai import ChatOpenAI
from langchain_core.tools import tool

# Use a tool the LLM cannot answer without calling (random number)
SECRET = random.randint(1000, 9999)

class SimpleToolConfig(FunctionGroupBaseConfig, name="secret_tool"):
    pass

@register_function_group(config_type=SimpleToolConfig, framework_wrappers=[LLMFrameworkEnum.LANGCHAIN])
async def secret_tool(config, builder):
    group = FunctionGroup(config=config)
    async def get_secret_number(input: dict) -> str:
        """Get the secret number. Must be called to know the value."""
        return f"The secret number is {SECRET}"
    group.add_function(name="get_secret_number", fn=get_secret_number, description=get_secret_number.__doc__)
    yield group

@tool
def get_secret_number() -> str:
    """Get the secret number. Must be called to know the value."""
    return f"The secret number is {SECRET}"

NAT_CONFIG = """
function_groups:
  secret_tool:
    _type: secret_tool
llms:
  nemotron:
    _type: azure_openai
    api_key: $NV_INFER
    model_name: nvidia/nvidia/Nemotron-3-Nano-30B-A3B
    azure_deployment: nvidia/nvidia/Nemotron-3-Nano-30B-A3B
    azure_endpoint: https://inference-api.nvidia.com
    api_version: latest
workflow:
  _type: react_agent
  llm_name: nemotron
  tool_names: [secret_tool]
"""

async def main():
    query = "What is the secret number?"

    # NAT react_agent (text parsing)
    with tempfile.NamedTemporaryFile(mode='w', suffix='.yaml', delete=False) as f:
        f.write(NAT_CONFIG)
        config_path = f.name
    async with load_workflow(config_path) as workflow:
        async with workflow.run(query) as runner:
            nat_answer = await runner.result()
    print(nat_answer)
    nat_failed = "Invalid Format" in nat_answer or "Missing 'Action" in nat_answer

    # Native tool calling (same model)
    llm = ChatOpenAI(base_url="https://inference-api.nvidia.com/v1",
                     model="nvidia/nvidia/Nemotron-3-Nano-30B-A3B",
                     api_key=os.getenv("NV_INFER"))
    llm_with_tools = llm.bind_tools([get_secret_number])
    response = llm_with_tools.invoke(query)

    print(f"react_agent (text parsing): {'FAILED' if nat_failed else 'OK'}")
    print(f"Native tool calling: OK - tool_calls={response.tool_calls}")

if __name__ == "__main__":
    asyncio.run(main())

Output:

[AGENT] Failed to parse agent output after 1 attempts, consider enabling or increasing parse_agent_response_max_retries
Failed to determine whether agent is calling a tool: list index out of range
Traceback (most recent call last):
  File "/llmtech/data-explorer-agent/.venv/lib/python3.11/site-packages/nat/agent/react_agent/agent.py", line 252, in conditional_edge
    agent_output = state.agent_scratchpad[-1]
                   ~~~~~~~~~~~~~~~~~~~~~~^^^^
IndexError: list index out of range
[AGENT] Ending graph traversal
Invalid Format: Missing 'Action:' after 'Thought:'

react_agent (text parsing): FAILED
Native tool calling: OK - tool_calls=[{'name': 'get_secret_number', 'args': {}, 'id': 'chatcmpl-tool-910f1f1176ed49ea9a02d3b15a480202', 'type': 'tool_call'}]

Describe your ideal solution

Suggestions

For Issue 1 (Quick Fix)

Make regex parsing more lenient:

  • Case-insensitive matching
  • Flexible whitespace
  • Accept variations like "Input:" vs "Action Input:"
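One possible lenient pattern along these lines (a sketch only, not a proposed final regex; the group names are illustrative):

```python
import re

# Case-insensitive, whitespace-tolerant, and accepts "Input:" as a
# fallback for "Action Input:".
LENIENT = re.compile(
    r"action\s*\d*\s*:\s*(?P<tool>.*?)\s*"
    r"(?:action\s+)?input\s*\d*\s*:\s*(?P<args>.*?)(?=\s*observation\b|\Z)",
    re.IGNORECASE | re.DOTALL,
)

for text in ('Action: add\nAction Input: {"a": 2}',
             'Action: add\nInput: {"a": 2}',
             'action: add\nAction Input: {"a": 2}'):
    m = LENIENT.search(text)
    print(m.group("tool"), "->", m.group("args"))
# All three samples now parse as: add -> {"a": 2}
```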

For Issue 2 (Better Fix)

Add option to use native tool calling:

workflow:
  _type: react_agent
  llm_name: nemotron
  tool_names: [my_tools]
  use_native_tool_calling: true  # Use bind_tools + tool_calls

This would:

  1. Use llm.bind_tools() to send tool schemas via API
  2. Extract tool calls from response.tool_calls instead of regex parsing
  3. Still capture "Thought" reasoning in response.content
  4. Keep the Thought → Action → Observation loop intact

Note: NAT's tool_calling_agent already uses bind_tools - this approach could be adapted for react_agent.
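Steps 1-4 can be sketched with stand-in objects (the Fake* classes mimic the shape of a LangChain chat model after bind_tools; none of these names are NAT's actual API):

```python
class FakeResponse:
    """Shaped like a LangChain AIMessage when tool schemas were sent."""
    def __init__(self, content, tool_calls):
        self.content = content        # step 3: the "Thought" lives here
        self.tool_calls = tool_calls  # step 2: structured calls, no regex

class FakeBoundLLM:
    """Stands in for llm.bind_tools(tools) (step 1)."""
    def invoke(self, messages):
        return FakeResponse("I should call add.",
                            [{"name": "add", "args": {"a": 2, "b": 3}}])

def react_step(bound_llm, tools_by_name, messages):
    """One Thought -> Action -> Observation turn (step 4's loop body)."""
    response = bound_llm.invoke(messages)
    if not response.tool_calls:        # no tool requested: final answer
        return response.content, None
    call = response.tool_calls[0]      # structured data, no parsing
    observation = tools_by_name[call["name"]](**call["args"])
    return response.content, observation

thought, obs = react_step(FakeBoundLLM(), {"add": lambda a, b: a + b}, [])
print(thought, obs)  # -> I should call add. 5
```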

Additional context

No response

Code of Conduct

  • I agree to follow this project's Code of Conduct
  • I have searched the open feature requests and have found no duplicates for this feature request

Metadata

Labels

Needs Triage (Need team to review and classify), feature request (New feature or request)
