-
Notifications
You must be signed in to change notification settings - Fork 690
feat: add pythonic tool call parser support #2726
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
WalkthroughAdds a Pythonic tool-call parser using rustpython-parser, introduces a dedicated config module for parser types and presets, updates routing to enable Pythonic parsing, and reorganizes public re-exports. Also updates imports and adds a Cargo dependency. Changes
Sequence Diagram(s)sequenceDiagram
autonumber
actor Caller
participant Parsers as parsers::detect_and_parse_tool_call
participant Router as parsers::try_tool_call_parse
participant Py as pythonic_parser
participant Json as json_parser
Caller->>Parsers: detect_and_parse_tool_call(message)
note right of Parsers: Build parser_map incl. "pythonic", presets
Parsers->>Router: try_tool_call_parse(message, ToolCallConfig)
alt format == Pythonic
Router->>Py: try_tool_call_parse_pythonic(message)
note over Py: Strip python tags<br/>Regex bracket extraction<br/>AST parse calls<br/>Serialize kwargs
Py-->>Router: Vec<ToolCallResponse>, remaining
else format == Json (etc.)
Router->>Json: try_tool_call_parse_json(message)
Json-->>Router: Vec<ToolCallResponse>, remaining
end
Router-->>Parsers: Results
Parsers-->>Caller: Results
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Poem
Tip 🔌 Remote MCP (Model Context Protocol) integration is now available!Pro plan users can now connect to remote MCP servers from the Integrations page. Connect with popular remote MCPs such as Notion and Linear to add more context to your reviews and chats. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. CodeRabbit Commands (Invoked using PR/Issue comments)Type Other keywords and placeholders
Status, Documentation and Community
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 11
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
lib/parsers/src/tool_calling/json_parser.rs (1)
139-146: Docstring contradicts behavior: code returns all items, not the last one.The “Note on List Handling” states “only the last item” is returned. Below, both list branches collect and return all items in order. This discrepancy is user-facing and can mislead integrators.
Either update docs or change behavior. If the new multi-tool path is intentional (it matches
tools.rsaggregate/stream helpers), update the docs:-/// When the input contains a list of tool calls (either with `parameters` or `arguments`), -/// only the **last item** in the list is returned. This design choice assumes that the -/// most recent tool call in a list is the one to execute. +/// When the input contains a list of tool calls (either with `parameters` or `arguments`), +/// all items are returned in order. Upstream callers can choose which tool(s) to execute.
🧹 Nitpick comments (13)
lib/parsers/Cargo.toml (1)
38-38: Confirm rustpython-parser version and feature flags; consider pinning with caret and documenting MSRV.0.4.0 appears to be the latest released version of rustpython-parser (published Aug 6, 2024). Double‑check if a newer 0.4.x exists before merging and whether you need any crate features (e.g., lalrpop, location) for AST ranges. Also consider using a caret requirement (
^0.4) unless you intentionally want a strict pin.As of today (August 26, 2025), lib.rs still lists 0.4.0 as latest. Please verify on crates.io/lib.rs:
If you decide to keep a strict pin, add a short comment explaining why (e.g., API stability, MSRV).
lib/parsers/src/tool_calling/json_parser.rs (3)
13-15: Remove “Remove this line” comments.These TODO-like comments are easy to forget and add noise. Since you’ve already removed the re-exports, delete these lines entirely.
Apply this diff:
-// Remove this line: -// pub use super::config::{JsonParserConfig, ToolCallConfig, ToolCallParserType}; -
282-289: Inline comments also claim “pop the last item” but code returns all items.The inline comments in both list branches say “pop the last item,” but the code pushes all items to
results.Apply this diff to align comments with behavior:
- // We pop the last item in the list to use. + // We return all items collected in order.And similarly in the Arguments list branch:
- // Again, we take the last item for processing. + // We return all items collected in order.Also applies to: 302-307
197-229: Token handling: consider validating config pairs and short‑circuiting once a match is found.You zip start/end tokens; if lengths diverge, some tokens are silently ignored. Also, once a match succeeds, you
break, which is good—ensure you cover all configured start-only tokens before giving up.
- Validate
tool_call_start_tokens.len() == tool_call_end_tokens.len()up front and warn if not.- Consider checking all start-only tokens first, then paired tokens, to avoid missing single-token matches if the first pair fails.
lib/parsers/src/tool_calling/tools.rs (1)
47-77: Stream conversion looks correct; ensure id semantics are consistent across parsers.You set
idin chunks toSome(parsed.id). The JSON parser uses UUIDs; the Pythonic parser currently usescall-<index>. Consider unifying id shape across parsers for downstream consumers.lib/parsers/src/tool_calling/mod.rs (1)
12-15: Re-exports align with the new structure.Nice consolidation. Follow-up: consider adding a module-level doc comment outlining when to choose JSON vs Pythonic formats and how to override presets.
lib/parsers/src/tool_calling/pythonic_parser.rs (3)
12-17: Strip all known Python tags; consider making this configurable.You remove
<|python_start|>/<|python_end|>, but elsewhere we reference<|python_tag|>. Add that tag here to avoid surprises, and consider passing tokens via config later.Apply this minimal diff:
fn strip_text(message: &str) -> String { // Remove unexpected python tags if any message .replace("<|python_start|>", "") .replace("<|python_end|>", "") + .replace("<|python_tag|>", "") }Longer-term: accept a config of start/end tokens similar to the JSON parser.
77-84: Unify call id format with JSON parser; avoid predictable ids.JSON parser uses UUIDs; here you use
call-<index>. Consider UUIDs for consistency.Apply this diff and import
uuid::Uuid:-use super::response::{CalledFunction, ToolCallResponse, ToolCallType}; +use super::response::{CalledFunction, ToolCallResponse, ToolCallType}; +use uuid::Uuid; @@ - res.push(ToolCallResponse { - id: format!("call-{}", idx + 1), + res.push(ToolCallResponse { + id: format!("call-{}", Uuid::new_v4()),
127-150: Tests pass for current behavior; update if you adopt numeric ints.If you switch to numeric ints in
const_expr, adjust assertions:- assert_eq!(args["a"], "1"); - assert_eq!(args["b"], "2"); + assert_eq!(args["a"], 1); + assert_eq!(args["b"], 2); @@ - assert_eq!(args["x"], "3"); + assert_eq!(args["x"], 3);lib/parsers/src/tool_calling/config.rs (3)
5-16: Make enum ergonomic and future-proof (derive Eq/Copy; consider non_exhaustive).Deriving Eq/PartialEq and Copy on a C-like enum improves matching in tests and config code. Marking it non_exhaustive avoids semver breakage when adding new formats.
Apply:
-#[derive(Clone, Debug, serde::Serialize, serde::Deserialize)] +#[derive(Clone, Copy, Debug, PartialEq, Eq, serde::Serialize, serde::Deserialize)] +#[non_exhaustive] pub enum ToolCallParserType {
38-49: Document and validate start/end token pairing to prevent silent misparsing.The JSON parser zips start and end token vectors; mismatched lengths silently drop extras. Add a lightweight validator to catch this early in debug builds.
Add a validator inside the impl:
impl ToolCallConfig { + /// Debug-only validation to ensure token arrays line up. + #[inline] + pub fn debug_validate(&self) { + #[cfg(debug_assertions)] + { + debug_assert_eq!( + self.json.tool_call_start_tokens.len(), + self.json.tool_call_end_tokens.len(), + "tool_call_start_tokens and tool_call_end_tokens must have equal length" + ); + } + }Then call
config.debug_validate()at the start of try_tool_call_parse (parsers.rs) before matching on the format.
18-36: Remove or Wire Up Unusedparallel_tool_calls_*_tokensFieldsI ran a grep search and confirmed that neither
parallel_tool_calls_start_tokensnorparallel_tool_calls_end_tokensis ever read or referenced outside of their declaration and default initialization inconfig.rs.• Since these fields aren’t consumed by any parser or JSON‐path logic, they serve no purpose in the current code.
• To minimize surface area and avoid dead code, it’s best to remove both fields (and their entries in theDefaultimpl).
• If you intend to implement parallel tool‐calls support soon, you can leave a// TODOin its place or document the planned usage—but avoid shipping unused public API.Let me know if you’d rather wire up the JSON‐path logic now instead of pruning these fields.
lib/parsers/src/tool_calling/parsers.rs (1)
15-18: Optional: Validate config invariants early.Call the proposed ToolCallConfig::debug_validate() here to catch mismatched token arrays during development.
match config.format { - ToolCallParserType::Json => { + ToolCallParserType::Json => { + config.debug_validate(); let (results, normal_content) = try_tool_call_parse_json(message, &config.json)?; Ok((results, normal_content)) }
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
⛔ Files ignored due to path filters (1)
Cargo.lockis excluded by!**/*.lock
📒 Files selected for processing (7)
lib/parsers/Cargo.toml(1 hunks)lib/parsers/src/tool_calling/config.rs(1 hunks)lib/parsers/src/tool_calling/json_parser.rs(1 hunks)lib/parsers/src/tool_calling/mod.rs(1 hunks)lib/parsers/src/tool_calling/parsers.rs(4 hunks)lib/parsers/src/tool_calling/pythonic_parser.rs(1 hunks)lib/parsers/src/tool_calling/tools.rs(1 hunks)
🧰 Additional context used
🧬 Code graph analysis (4)
lib/parsers/src/tool_calling/pythonic_parser.rs (2)
lib/parsers/src/tool_calling/json_parser.rs (5)
serde_json(85-85)serde_json(254-254)serde_json(270-270)serde_json(283-283)serde_json(302-302)lib/parsers/src/tool_calling/parsers.rs (1)
extract_name_and_args(72-75)
lib/parsers/src/tool_calling/tools.rs (1)
lib/parsers/src/tool_calling/parsers.rs (1)
detect_and_parse_tool_call(36-63)
lib/parsers/src/tool_calling/mod.rs (1)
lib/parsers/src/tool_calling/parsers.rs (2)
detect_and_parse_tool_call(36-63)try_tool_call_parse(9-33)
lib/parsers/src/tool_calling/parsers.rs (3)
lib/parsers/src/tool_calling/json_parser.rs (1)
try_tool_call_parse_json(155-311)lib/parsers/src/tool_calling/pythonic_parser.rs (1)
try_tool_call_parse_pythonic(105-125)lib/parsers/src/tool_calling/config.rs (1)
pythonic(131-136)
🪛 GitHub Actions: Pre Merge Validation of (ai-dynamo/dynamo/refs/pull/2726/merge) by ayushag-nv.
lib/parsers/src/tool_calling/pythonic_parser.rs
[error] 1-1: Command: pre-commit run --show-diff-on-failure --color=always --all-files. Trailing whitespace detected by the pre-commit 'trailing-whitespace' hook in lib/parsers/src/tool_calling/pythonic_parser.rs; pre-commit exited with code 1.
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: pre-merge-rust (lib/runtime/examples)
- GitHub Check: pre-merge-rust (lib/bindings/python)
- GitHub Check: pre-merge-rust (.)
- GitHub Check: Build and Test - dynamo
🔇 Additional comments (5)
lib/parsers/src/tool_calling/json_parser.rs (1)
10-10: Import path change looks good.Switching to
super::config::JsonParserConfigmatches the new module layout.lib/parsers/src/tool_calling/tools.rs (2)
10-42: Aggregate conversion preserves order and types. LGTM.The mapping to
ChatCompletionMessageToolCalllooks correct and stable. Good logging around parser selection.
4-5: No internal stale imports detectedI’ve run project-wide ripgrep searches and confirmed there are no remaining
tool_calling::json_parser::*ortool_calling::parsers::*imports in the codebase.
- All old re-exports have been removed internally.
- Before publishing this change, please manually verify that any external crates or downstream projects aren’t relying on the removed re-exported paths.
lib/parsers/src/tool_calling/mod.rs (1)
4-4: Module layout additions are good.Adding
configandpythonic_parsermodules here makes the surface coherent.Also applies to: 7-7
lib/parsers/src/tool_calling/parsers.rs (1)
69-70: Import path change looks good.Bringing JsonParserConfig from super::super::config keeps tests consistent with the new module layout.
Signed-off-by: ayushag <ayushag@nvidia.com>
Signed-off-by: ayushag <ayushag@nvidia.com>
Signed-off-by: ayushag <ayushag@nvidia.com>
Signed-off-by: ayushag <ayushag@nvidia.com>
Signed-off-by: ayushag <ayushag@nvidia.com>
Signed-off-by: mohammedabdulwahhab <furkhan324@berkeley.edu> Co-authored-by: Ryan McCormick <rmccormick@nvidia.com> Signed-off-by: ayushag <ayushag@nvidia.com>
Signed-off-by: Hannah Zhang <hannahz@nvidia.com> Signed-off-by: ayushag <ayushag@nvidia.com>
Signed-off-by: ayushag <ayushag@nvidia.com>
…glang (#2713) Signed-off-by: ayushag <ayushag@nvidia.com>
Signed-off-by: ayushag <ayushag@nvidia.com>
Signed-off-by: ayushag <ayushag@nvidia.com>
80cea68 to
3af260c
Compare
Signed-off-by: Ayush Agarwal <ayushag@nvidia.com>
|
please comment in the pythonic parser any existing limitations with the parser if it doesn't work completely |
|
Code Rabbit makes some excellent points. Particularly we can't use |
@grahamking sounds good. I will address those. |
Signed-off-by: ayushag <ayushag@nvidia.com>
Signed-off-by: ayushag <ayushag@nvidia.com>
Signed-off-by: ayushag <ayushag@nvidia.com>
Signed-off-by: ayushag <ayushag@nvidia.com>
|
Closing this in favor of : #2788 |
Pull Request Summary by devActivityMetricsAchievements
|
Overview:
Implements basic pythonic parser using AST parsing. Supports all three cases for function parsing
Details:
Where should the reviewer start?
Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)
Summary by CodeRabbit
New Features
Refactor
Chores