
Conversation

@bhuvan002 bhuvan002 commented Sep 3, 2025

Overview:

This PR adds a new field, max_thinking_tokens, under NvExt in the OpenAI schema.

Details:

This PR only adds the field; it currently acts as a passthrough/no-op. The actual enforcement logic will be added to the backends in separate future PRs.
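As an illustrative sketch (the model name and token value are made up), a chat request carrying the new nvext field could look like:

```json
{
  "model": "example-model",
  "messages": [{ "role": "user", "content": "Hello" }],
  "nvext": {
    "max_thinking_tokens": 1024
  }
}
```

Since the field is a passthrough for now, omitting it (or the whole nvext object) leaves behavior unchanged.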

Where should the reviewer start?

Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)

  • closes GitHub issue: #xxx

Summary by CodeRabbit

  • New Features
    • Added an optional “Max Thinking Tokens” parameter in request extensions and stop conditions to carry a thinking-token budget. It can be set via builders and is passed through the pipeline. Note: not enforced yet.
  • Tests
    • Added unit tests to validate parsing and propagation of the “Max Thinking Tokens” value from requests into stop conditions, covering both presence and absence scenarios.

coderabbitai bot commented Sep 3, 2025

Walkthrough

Introduces an optional max_thinking_tokens field across nvext, OpenAI stop-conditions provider, and common StopConditions. Adds a getter in the provider, plumbs the value through to StopConditions, and adds tests validating JSON extraction and propagation. No enforcement or behavioral changes; defaults to None when unspecified.

Changes

  • Protocol plumbing: StopConditions field (lib/llm/src/protocols/common.rs)
    Added pub max_thinking_tokens: Option<u32> to StopConditions with docs; no logic changes.
  • OpenAI provider passthrough (lib/llm/src/protocols/openai.rs)
    Added get_max_thinking_tokens(&self) -> Option<u32> on OpenAIStopConditionsProvider; forwards the value into common::StopConditions.
  • NvExt surface + builder + tests (lib/llm/src/protocols/openai/nvext.rs)
    Added pub max_thinking_tokens: Option<u32> to NvExt; the builder gains a setter; tests updated to cover the default None and an explicit Some(1024).
  • Integration test for extraction (lib/llm/tests/test_common_ext.rs)
    New test test_max_thinking_tokens_extraction verifies JSON parsing and propagation into StopConditions (Some when present, None when absent).

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant Client
  participant API as OpenAI API Layer
  participant Parser as NvExt Parser
  participant Provider as StopConditionsProvider
  participant StopCond as Common StopConditions

  Client->>API: Send request (may include nvext.max_thinking_tokens)
  API->>Parser: Parse nvext
  Parser-->>API: NvExt { max_thinking_tokens: Option<u32> }

  API->>Provider: Build stop conditions
  Provider->>Provider: get_max_thinking_tokens()
  Provider->>StopCond: Construct with max_thinking_tokens: Option<u32>
  StopCond-->>API: StopConditions (no enforcement)
  API-->>Client: Proceed with request handling (unchanged flow)

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Poem

A rabbit taps its clever paws—tick, tick—
Counting thoughts with a gentle trick.
A token budget, softly spun,
Optional now, the wiring done.
Through nvext fields the number hops—
Into StopConditions it neatly drops.
Future carrots: limit-thought crops! 🥕✨




@coderabbitai coderabbitai bot left a comment
Actionable comments posted: 0

🧹 Nitpick comments (4)
lib/llm/src/protocols/common.rs (1)

261-264: Good addition; passthrough field is safe and non-breaking

Adding max_thinking_tokens: Option<u32> to StopConditions with clear docs and default None keeps backward compatibility. No enforcement in apply_ignore_eos, which is fine for now.

If you expect 0 to be invalid, consider documenting that explicitly or validating upstream (e.g., only accept values ≥ 1). Also, when enforcement arrives, clarify interactions with min_tokens/max_tokens in the doc comment.
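A minimal standalone sketch of the shape this comment describes (the field name comes from the diff; the surrounding struct is abbreviated here, not the real StopConditions):

```rust
// Abbreviated stand-in for the real StopConditions struct; only the
// fields relevant to this review comment are shown.
#[derive(Debug, Default, PartialEq)]
pub struct StopConditions {
    pub max_tokens: Option<u32>,
    pub min_tokens: Option<u32>,
    /// Budget for "thinking" tokens; passthrough only, not yet enforced.
    pub max_thinking_tokens: Option<u32>,
}

fn main() {
    // Default construction leaves the new field as None, which is why the
    // addition is non-breaking for existing callers.
    let sc = StopConditions::default();
    assert_eq!(sc.max_thinking_tokens, None);

    // Callers that do set it simply carry the value through.
    let sc = StopConditions {
        max_thinking_tokens: Some(1024),
        ..Default::default()
    };
    assert_eq!(sc.max_thinking_tokens, Some(1024));
}
```

Because Option<u32> defaults to None, deserializing older payloads that lack the field keeps working unchanged.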

lib/llm/src/protocols/openai/nvext.rs (1)

104-109: NvExt passthrough looks correct and consistent with existing options

Serde defaults/skip and the builder strip_option pattern match the rest of NvExt. Field name and docs are clear.

Add lightweight validation to reject 0 (if a zero budget is not meaningful), using #[validate] as the other fields do.

@@
-    #[serde(default, skip_serializing_if = "Option::is_none")]
-    #[builder(default, setter(strip_option))]
-    pub max_thinking_tokens: Option<u32>,
+    #[serde(default, skip_serializing_if = "Option::is_none")]
+    #[builder(default, setter(strip_option))]
+    #[validate(range(min = 1))]
+    pub max_thinking_tokens: Option<u32>,

And a focused test:

@@
 mod tests {
@@
     fn test_valid_top_k_values() {
@@
     }
+
+    #[test]
+    fn test_invalid_max_thinking_tokens_value() {
+        let nv_ext = NvExt::builder().max_thinking_tokens(0).build().unwrap();
+        assert!(nv_ext.validate().is_err());
+    }
lib/llm/tests/test_common_ext.rs (1)

187-219: Nice propagation test from NvExt → StopConditions

Covers presence and absence. Consider mirroring this for the Completions flow for symmetry.

Here’s a minimal companion test:

@@
 #[test]
 fn test_max_thinking_tokens_extraction() {
@@
 }
+
+#[test]
+fn test_max_thinking_tokens_extraction_in_completions() {
+    use dynamo_llm::protocols::openai::completions::NvCreateCompletionRequest;
+    let json_str = r#"{
+        "model": "test-model",
+        "prompt": "Hello",
+        "nvext": { "max_thinking_tokens": 512 }
+    }"#;
+    let request: NvCreateCompletionRequest = serde_json::from_str(json_str).unwrap();
+    assert_eq!(request.nvext.as_ref().unwrap().max_thinking_tokens, Some(512));
+    let stop_conditions = request.extract_stop_conditions().unwrap();
+    assert_eq!(stop_conditions.max_thinking_tokens, Some(512));
+}
lib/llm/src/protocols/openai.rs (1)

71-75: Plumbing looks correct; future-proofing note

  • get_max_thinking_tokens() reads from nvext and returns Option<u32> as intended; using and_then is appropriate.
  • You propagate it into StopConditions without altering other behavior.

If you later introduce a root-level counterpart (CommonExt) for this field, mirror the precedence pattern used in get_ignore_eos (e.g., choose_with_deprecation) to avoid ambiguity.
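A standalone sketch of the and_then accessor pattern this comment refers to (the struct names mirror the diff but are simplified stand-ins, not the real request types):

```rust
// Simplified stand-ins for the real NvExt and request types.
struct NvExt {
    max_thinking_tokens: Option<u32>,
}

struct Request {
    nvext: Option<NvExt>,
}

impl Request {
    // Mirrors the getter described above: returns None when nvext is
    // absent or when the field itself is unset, flattening both cases.
    fn get_max_thinking_tokens(&self) -> Option<u32> {
        self.nvext.as_ref().and_then(|nv| nv.max_thinking_tokens)
    }
}

fn main() {
    let with_field = Request {
        nvext: Some(NvExt { max_thinking_tokens: Some(512) }),
    };
    let without_nvext = Request { nvext: None };

    assert_eq!(with_field.get_max_thinking_tokens(), Some(512));
    assert_eq!(without_nvext.get_max_thinking_tokens(), None);
}
```

Using and_then avoids a nested match and collapses "no nvext" and "nvext without the field" into a single None, which is exactly the passthrough semantics the PR wants.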

Also applies to: 161-161, 178-179

📜 Review details

Configuration used: .coderabbit.yaml

Review profile: CHILL

Plan: Pro


📥 Commits

Reviewing files that changed from the base of the PR and between c920cbd and b5c95be.

📒 Files selected for processing (4)
  • lib/llm/src/protocols/common.rs (1 hunks)
  • lib/llm/src/protocols/openai.rs (3 hunks)
  • lib/llm/src/protocols/openai/nvext.rs (4 hunks)
  • lib/llm/tests/test_common_ext.rs (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
lib/llm/tests/test_common_ext.rs (3)
lib/llm/src/protocols/openai.rs (2)
  • nvext (43-43)
  • nvext (53-53)
lib/llm/src/protocols/openai/nvext.rs (1)
  • nvext (21-21)
lib/llm/src/protocols/openai/chat_completions.rs (3)
  • nvext (79-81)
  • nvext (139-141)
  • nvext (256-258)
⏰ Context from checks skipped due to timeout of 90000ms (4)
  • GitHub Check: Build and Test - dynamo
  • GitHub Check: pre-merge-rust (lib/bindings/python)
  • GitHub Check: pre-merge-rust (lib/runtime/examples)
  • GitHub Check: pre-merge-rust (.)
🔇 Additional comments (1)
lib/llm/src/protocols/openai/nvext.rs (1)

166-166: Tests cover default and explicit values

Tests asserting the default None and an explicit positive value (1024) look good and exercise the builder ergonomics.

Also applies to: 182-183, 204-205

@bhuvan002 bhuvan002 enabled auto-merge (squash) September 3, 2025 22:02
Signed-off-by: Bhuvan Agrawal <11240550+bhuvan002@users.noreply.github.com>
@bhuvan002 bhuvan002 force-pushed the bhuvan/thinking-budget branch from b5c95be to 9fd65cf Compare September 4, 2025 01:58
@copy-pr-bot
Copy link

copy-pr-bot bot commented Sep 4, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.


@bhuvan002 bhuvan002 merged commit 8d30753 into main Sep 4, 2025
11 checks passed
@bhuvan002 bhuvan002 deleted the bhuvan/thinking-budget branch September 4, 2025 02:32
dillon-cullinan pushed a commit that referenced this pull request Sep 5, 2025
Signed-off-by: Bhuvan Agrawal <11240550+bhuvan002@users.noreply.github.com>
nnshah1 pushed a commit that referenced this pull request Sep 8, 2025
Signed-off-by: Bhuvan Agrawal <11240550+bhuvan002@users.noreply.github.com>
Signed-off-by: nnshah1 <neelays@nvidia.com>
bhuvan002 added a commit that referenced this pull request Sep 10, 2025
Signed-off-by: Bhuvan Agrawal <11240550+bhuvan002@users.noreply.github.com>
nv-nmailhot pushed a commit that referenced this pull request Sep 11, 2025
Signed-off-by: Bhuvan Agrawal <11240550+bhuvan002@users.noreply.github.com>