
Conversation

@paulhendricks (Member) commented Jun 25, 2025

Overview:

Refactoring to use async_openai::types::Logprobs

Details:

This change keeps the completions protocol consistent with the previous renaming and refactoring of the protocol types.

Where should the reviewer start?

Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)

  • closes GitHub issue: #xxx

Summary by CodeRabbit

  • New Features

    • Added support for the "content_filter" finish reason in completion responses, allowing more detailed feedback when content is filtered.
  • Refactor

    • Unified completion choice handling to use official async_openai types, replacing legacy custom structures.
    • Improved type safety and consistency by switching from string-based finish reasons to strongly typed enums throughout the completion and aggregation logic.
  • Bug Fixes

    • Enhanced propagation of finish reasons, ensuring accurate mapping and display in streaming and aggregated responses.

copy-pr-bot bot commented Jun 25, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.


coderabbitai bot commented Jun 26, 2025

Walkthrough

The changes refactor completion choice handling in the LLM codebase to use the official async_openai crate types for choices and finish reasons. Custom string-based and legacy structs are replaced with strongly typed enums and structs, ensuring type safety and simplifying conversions. The FinishReason enum is extended with a ContentFilter variant, and related conversions and mappings are updated accordingly.

Changes

  • lib/llm/src/engines.rs: Switched finish reason from the string "stop" to the enum CompletionFinishReason::Stop in EchoEngineFull's async completion logic.
  • lib/llm/src/protocols/common.rs: Added ContentFilter to FinishReason; implemented conversions to/from async_openai::types::CompletionFinishReason.
  • lib/llm/src/protocols/openai/chat_completions/delta.rs: Added mapping for FinishReason::ContentFilter to the OpenAI API's enum variant.
  • lib/llm/src/protocols/openai/completions.rs: Removed the legacy CompletionChoice struct; unified on async_openai::types::Choice and updated related traits and conversions.
  • lib/llm/src/protocols/openai/completions/aggregator.rs: Updated the aggregator to use async_openai::types::Choice and direct enum mappings for finish reasons; updated tests and conversions.
  • lib/llm/src/protocols/openai/completions/delta.rs: Refactored to use async_openai::types::Choice and enum-based finish reasons; updated method signatures and mappings.

Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Client
    participant Engine
    participant DeltaGen
    participant OpenAI_Types

    Client->>Engine: NvCreateCompletionRequest
    Engine->>DeltaGen: Stream characters, yield responses
    DeltaGen->>OpenAI_Types: Create Choice (typed finish reason)
    OpenAI_Types-->>DeltaGen: Typed Choice
    DeltaGen-->>Engine: Annotated response
    Engine-->>Client: Completion with typed finish reason
```

Poem

In fields of code, where enums grow,
The rabbits hopped and let old strings go.
With Choice and reason, strong and clear,
Type safety blooms—no bugs to fear!
Content filters join the dance,
As OpenAI types enhance.
🐇✨



@coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (1)
lib/llm/src/protocols/openai/completions.rs (1)

287-288: TODO tracked for logprobs aggregation.

The TODO comment indicates future work needed for aggregating logprobs. This aligns with the PR objective mentioning the use of async_openai::types::Logprobs.

Would you like me to create an issue to track the implementation of logprobs aggregation?

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c95031e and d5d1a52.

📒 Files selected for processing (6)
  • lib/llm/src/engines.rs (1 hunks)
  • lib/llm/src/protocols/common.rs (3 hunks)
  • lib/llm/src/protocols/openai/chat_completions/delta.rs (1 hunks)
  • lib/llm/src/protocols/openai/completions.rs (4 hunks)
  • lib/llm/src/protocols/openai/completions/aggregator.rs (9 hunks)
  • lib/llm/src/protocols/openai/completions/delta.rs (4 hunks)
🧰 Additional context used
🧠 Learnings (3)
📓 Common learnings
Learnt from: kthui
PR: ai-dynamo/dynamo#1424
File: lib/runtime/src/pipeline/network/egress/push_router.rs:204-209
Timestamp: 2025-06-13T22:07:24.843Z
Learning: The codebase uses async-nats version 0.40, not the older nats crate. Error handling should use async_nats::error::Error variants, not nats::Error variants.
lib/llm/src/engines.rs (1)
Learnt from: kthui
PR: ai-dynamo/dynamo#1424
File: lib/runtime/src/pipeline/network/egress/push_router.rs:204-209
Timestamp: 2025-06-13T22:07:24.843Z
Learning: The codebase uses async-nats version 0.40, not the older nats crate. Error handling should use async_nats::error::Error variants, not nats::Error variants.
lib/llm/src/protocols/openai/completions.rs (1)
Learnt from: ishandhanani
PR: ai-dynamo/dynamo#1626
File: lib/llm/src/preprocessor.rs:238-239
Timestamp: 2025-06-24T20:59:35.725Z
Learning: In lib/llm/src/preprocessor.rs, the `sampling_options` call in the `preprocess_request` method is placed in the common section after the match statement on `request.prompt_input_type()`, meaning it applies to both `PromptInput::Tokens` and `PromptInput::Text` request types.
🧬 Code Graph Analysis (1)
lib/llm/src/engines.rs (1)
lib/llm/src/protocols/openai/completions/delta.rs (1)
  • create_choice (81-112)
🔇 Additional comments (16)
lib/llm/src/protocols/openai/chat_completions/delta.rs (1)

206-208: LGTM! Clean addition of ContentFilter mapping.

The new mapping for ContentFilter follows the established pattern and correctly maps the internal enum variant to the OpenAI API equivalent.

lib/llm/src/engines.rs (1)

241-241: Excellent type safety improvement.

Replacing the string literal with the strongly typed CompletionFinishReason::Stop enum variant eliminates potential runtime errors from invalid finish reason strings and aligns with the broader refactoring to use async_openai types.

lib/llm/src/protocols/common.rs (3)

68-70: Well-structured enum extension.

The addition of the ContentFilter variant follows the established pattern with proper serde serialization support.


80-80: Consistent Display implementation.

The Display implementation for ContentFilter follows the same pattern as other variants.


100-125: Robust bidirectional enum conversions.

The From trait implementations correctly handle all enum variants with sensible mappings:

  • Multiple internal variants (EoS, Stop, Cancelled) map to OpenAI's Stop, which is appropriate
  • ContentFilter and Length map directly to their OpenAI equivalents
  • Error maps to Stop as a reasonable fallback
  • The reverse conversion is straightforward since OpenAI has fewer variants

This provides a clean interface between internal and external enum representations.
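The mapping described above can be sketched as a pair of From implementations. The enums below are simplified local stand-ins for the crate's internal FinishReason and for async_openai::types::CompletionFinishReason (the real types carry serde attributes and additional derives); only the variant mapping described in this review is reproduced.

```rust
// Local stand-ins for the types discussed in the review; the real code uses
// the internal FinishReason and async_openai::types::CompletionFinishReason.
#[derive(Debug, Clone, Copy, PartialEq)]
enum FinishReason {
    EoS,
    Stop,
    Length,
    Cancelled,
    ContentFilter,
    Error,
}

#[derive(Debug, Clone, Copy, PartialEq)]
enum CompletionFinishReason {
    Stop,
    Length,
    ContentFilter,
}

impl From<FinishReason> for CompletionFinishReason {
    fn from(reason: FinishReason) -> Self {
        match reason {
            // Several internal variants collapse onto OpenAI's Stop.
            FinishReason::EoS | FinishReason::Stop | FinishReason::Cancelled => Self::Stop,
            FinishReason::Length => Self::Length,
            FinishReason::ContentFilter => Self::ContentFilter,
            // Error has no OpenAI equivalent; Stop serves as the fallback.
            FinishReason::Error => Self::Stop,
        }
    }
}

impl From<CompletionFinishReason> for FinishReason {
    fn from(reason: CompletionFinishReason) -> Self {
        match reason {
            CompletionFinishReason::Stop => Self::Stop,
            CompletionFinishReason::Length => Self::Length,
            CompletionFinishReason::ContentFilter => Self::ContentFilter,
        }
    }
}

fn main() {
    // The outward conversion is lossy by design (EoS becomes Stop), while
    // the reverse is exact because OpenAI exposes fewer variants.
    assert_eq!(CompletionFinishReason::from(FinishReason::EoS), CompletionFinishReason::Stop);
    assert_eq!(FinishReason::from(CompletionFinishReason::ContentFilter), FinishReason::ContentFilter);
    println!("conversions ok");
}
```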

lib/llm/src/protocols/openai/completions/delta.rs (3)

85-85: Improved method signature with type safety.

Changing the parameter type from string to Option<async_openai::types::CompletionFinishReason> eliminates potential runtime errors from invalid finish reason strings and provides compile-time type checking.


100-105: Consistent use of official async_openai types.

Using async_openai::types::Choice instead of a custom struct aligns with the refactoring goals and ensures compatibility with the OpenAI API specification.


127-127: Simplified finish reason conversion.

The use of Into::into leverages the new trait implementations from common.rs, eliminating manual string conversions and reducing error-prone code.
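The call-site ergonomics these comments describe can be illustrated with a small sketch. Choice, the two enums, and create_choice below are simplified stand-ins, not the real async_openai types or the actual create_choice in delta.rs; they show only why a typed Option parameter plus `map(Into::into)` replaces manual string conversion.

```rust
// Simplified stand-ins; the real types live in async_openai::types and carry
// additional fields such as logprobs.
#[derive(Debug, Clone, Copy, PartialEq)]
enum FinishReason { Stop, Length, ContentFilter }

#[derive(Debug, Clone, Copy, PartialEq)]
enum CompletionFinishReason { Stop, Length, ContentFilter }

impl From<FinishReason> for CompletionFinishReason {
    fn from(r: FinishReason) -> Self {
        match r {
            FinishReason::Stop => Self::Stop,
            FinishReason::Length => Self::Length,
            FinishReason::ContentFilter => Self::ContentFilter,
        }
    }
}

#[derive(Debug)]
struct Choice {
    index: u32,
    text: String,
    finish_reason: Option<CompletionFinishReason>,
}

// Typed counterpart of the old string-based signature: the compiler now
// rejects invalid finish reasons instead of deferring to runtime parsing.
fn create_choice(index: u32, text: &str, finish_reason: Option<CompletionFinishReason>) -> Choice {
    Choice { index, text: text.to_string(), finish_reason }
}

fn main() {
    // `map(Into::into)` bridges the internal enum to the OpenAI-facing one.
    let internal: Option<FinishReason> = Some(FinishReason::Stop);
    let choice = create_choice(0, "hello", internal.map(Into::into));
    assert_eq!(choice.finish_reason, Some(CompletionFinishReason::Stop));
    println!("{choice:?}");
}
```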

lib/llm/src/protocols/openai/completions/aggregator.rs (4)

101-103: Proper index type handling.

The casting between u32 (OpenAI API) and u64 (internal storage) is handled consistently for both entry creation and DeltaChoice construction.


114-125: Type-safe finish reason handling.

The conversion from async_openai::types::CompletionFinishReason to internal FinishReason now uses pattern matching on typed enums instead of string parsing, eliminating potential runtime errors and improving maintainability. All OpenAI finish reason variants are properly handled.


160-171: Clean conversion to official OpenAI types.

The From<DeltaChoice> implementation correctly converts to async_openai::types::Choice using the new trait implementations, with proper index type casting back to u32 for the OpenAI API.


203-207: Well-updated test code.

The test code has been properly updated to:

  • Use async_openai::types::Choice instead of custom structs
  • Parse finish reasons through the internal enum with proper error handling
  • Assert against typed enum variants instead of strings
  • Maintain test coverage for the new type system

The tests now provide better validation of the typed enum behavior.

Also applies to: 216-221, 276-279, 307-310, 324-336, 360-371

lib/llm/src/protocols/openai/completions.rs (4)

52-52: LGTM! Type migration aligns with PR objectives.

The migration from legacy CompletionChoice to async_openai::types::Choice is consistent with the goal of using official async_openai types throughout the codebase.


79-83: LGTM! Trait implementation updated for the new type.

The ContentProvider implementation correctly adapts to use async_openai::types::Choice, maintaining the same content extraction logic.


204-218: LGTM! Method signature updated consistently.

The make_response method signature correctly reflects the new async_openai::types::Choice type, maintaining consistency with the CompletionResponse struct changes.


274-301: LGTM! Conversion implementation properly handles the new type.

The implementation correctly converts from the internal streaming response to async_openai::types::Choice. The safety comment for the u32 cast is appreciated, and the error handling for missing text is appropriate.

@paulhendricks enabled auto-merge (squash) June 26, 2025 15:49
@paulhendricks merged commit 7b7b6a6 into main June 26, 2025; 11 checks passed
@paulhendricks deleted the phendricks/refactor-choice-and-finish-reason branch June 26, 2025 15:49
