chore: frontend API changes for thinking budget #2848
Walkthrough

Introduces an optional `max_thinking_tokens` field across `NvExt`, the OpenAI stop-conditions provider, and the common `StopConditions`. Adds a getter in the provider, plumbs the value through to `StopConditions`, and adds tests validating JSON extraction and propagation. No enforcement or behavioral changes; the field defaults to `None` when unspecified.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    autonumber
    participant Client
    participant API as OpenAI API Layer
    participant Parser as NvExt Parser
    participant Provider as StopConditionsProvider
    participant Common as Common StopConditions
    Client->>API: Send request (may include nvext.max_thinking_tokens)
    API->>Parser: Parse nvext
    Parser-->>API: NvExt { max_thinking_tokens: Option<u32> }
    API->>Provider: Build stop conditions
    Provider->>Provider: get_max_thinking_tokens()
    Provider->>Common: Construct with max_thinking_tokens: Option<u32>
    Common-->>API: StopConditions (no enforcement)
    API-->>Client: Proceed with request handling (unchanged flow)
```
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~10 minutes
Actionable comments posted: 0
🧹 Nitpick comments (4)
lib/llm/src/protocols/common.rs (1)

261-264: Good addition; the passthrough field is safe and non-breaking.

Adding `max_thinking_tokens: Option<u32>` to `StopConditions` with clear docs and a default of `None` keeps backward compatibility. There is no enforcement in `apply_ignore_eos`, which is fine for now.

If you expect `0` to be invalid, consider documenting that explicitly or validating upstream (e.g., only accept values ≥ 1). Also, when enforcement arrives, clarify the interaction with `min_tokens`/`max_tokens` in the doc comment.

lib/llm/src/protocols/openai/nvext.rs (1)
104-109: The NvExt passthrough looks correct and consistent with existing options.

The serde defaults/skip and the builder `strip_option` pattern match the rest of `NvExt`. The field name and docs are clear. Consider adding lightweight validation to prevent `0` (if not meaningful) using `validator`, like other fields:

```diff
     #[serde(default, skip_serializing_if = "Option::is_none")]
     #[builder(default, setter(strip_option))]
+    #[validate(range(min = 1))]
     pub max_thinking_tokens: Option<u32>,
```

And a focused test:

```diff
 mod tests {
     fn test_valid_top_k_values() {
     }
+
+    #[test]
+    fn test_invalid_max_thinking_tokens_value() {
+        let nv_ext = NvExt::builder().max_thinking_tokens(0).build().unwrap();
+        assert!(nv_ext.validate().is_err());
+    }
```

lib/llm/tests/test_common_ext.rs (1)
187-219: Nice propagation test from NvExt → StopConditions.

Covers both presence and absence. Consider mirroring this for the Completions flow for symmetry. Here's a minimal companion test:

```diff
 #[test]
 fn test_max_thinking_tokens_extraction() {
 }
+
+#[test]
+fn test_max_thinking_tokens_extraction_in_completions() {
+    use dynamo_llm::protocols::openai::completions::NvCreateCompletionRequest;
+    let json_str = r#"{
+        "model": "test-model",
+        "prompt": "Hello",
+        "nvext": { "max_thinking_tokens": 512 }
+    }"#;
+    let request: NvCreateCompletionRequest = serde_json::from_str(json_str).unwrap();
+    assert_eq!(request.nvext.as_ref().unwrap().max_thinking_tokens, Some(512));
+    let stop_conditions = request.extract_stop_conditions().unwrap();
+    assert_eq!(stop_conditions.max_thinking_tokens, Some(512));
+}
```

lib/llm/src/protocols/openai.rs (1)
71-75: Plumbing looks correct; one future-proofing note.

`get_max_thinking_tokens()` reads from `nvext` and returns `Option<u32>` as intended; using `and_then` is appropriate. The value is propagated into `StopConditions` without altering other behavior.

If you later introduce a root-level counterpart (CommonExt) for this field, mirror the precedence pattern used in `get_ignore_eos` (e.g., `choose_with_deprecation`) to avoid ambiguity.

Also applies to: 161-161, 178-179
📒 Files selected for processing (4)

- lib/llm/src/protocols/common.rs (1 hunks)
- lib/llm/src/protocols/openai.rs (3 hunks)
- lib/llm/src/protocols/openai/nvext.rs (4 hunks)
- lib/llm/tests/test_common_ext.rs (1 hunks)
🔇 Additional comments (1)

lib/llm/src/protocols/openai/nvext.rs (1)

166-166: Tests cover default and explicit values.

Defaults asserting `None` and a positive value (1024) look good and exercise the builder ergonomics.

Also applies to: 182-183, 204-205
Signed-off-by: Bhuvan Agrawal <11240550+bhuvan002@users.noreply.github.com>
Signed-off-by: nnshah1 <neelays@nvidia.com>
Overview:

This PR adds a new field `max_thinking_tokens` under the NvExt in the OpenAI schema.

Details:

This only adds the field and currently acts as a passthrough/no-op. The actual enforcement logic will be added to the backends separately in future PRs.
Where should the reviewer start?
Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)