Include usage information in LengthFinishReasonError #1700

Closed
@dabure

Description

Confirm this is a feature request for the Python library and not the underlying OpenAI API.

  • This is a feature request for the Python library

Describe the feature or improvement you're requesting

Situation
When calling AsyncOpenAI(...).beta.chat.completions.parse(..., response_format=SomePydanticModel), the OpenAI library raises LengthFinishReasonError when finish_reason == "length" and raises ContentFilterFinishReasonError when finish_reason == "content_filter", without providing any information about the response itself.

Complication
Because the exception carries no information about the response, I cannot programmatically record anything about it. For example, I cannot access and track the usage object in the chat completion response.

Desired behavior
As a library user, I always want to know details about responses from LLM calls that cost me tokens. More specifically, I want to inspect usage to know how many tokens I "wasted" calling the LLM, for instance when max_tokens was set too low for the LLM to generate a complete structured output. This enables me to track and control costs.

I see two potential solutions:

  1. Stop raising exceptions for these scenarios and always return a chat completion object. I believe this is the behavior in the non-beta version of the chat completion call
  2. Return the response as an attribute in the exception object so that it can be used by the calling programmer
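A minimal sketch of option 2, using hypothetical stand-in types rather than the real openai classes, to illustrate how attaching the completion to the exception would let the caller recover token usage after a "length" finish reason:

```python
from dataclasses import dataclass


# Hypothetical stand-ins for the library's types, for illustration only.
@dataclass
class CompletionUsage:
    prompt_tokens: int
    completion_tokens: int
    total_tokens: int


@dataclass
class ChatCompletion:
    finish_reason: str
    usage: CompletionUsage


class LengthFinishReasonError(Exception):
    """Sketch: the exception keeps a reference to the completion it was raised for."""

    def __init__(self, completion: ChatCompletion) -> None:
        super().__init__("Could not parse response: finish_reason was 'length'")
        self.completion = completion


def parse(completion: ChatCompletion) -> ChatCompletion:
    # Simplified stand-in for the beta parse() finish_reason check.
    if completion.finish_reason == "length":
        raise LengthFinishReasonError(completion)
    return completion


# The caller can now track cost even when parsing fails:
try:
    parse(ChatCompletion("length", CompletionUsage(50, 100, 150)))
except LengthFinishReasonError as e:
    wasted = e.completion.usage.total_tokens
```

The same pattern would apply to ContentFilterFinishReasonError; the key design point is that the raw ChatCompletion travels with the exception instead of being discarded.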

Version used

  • 1.44.1 (latest on PyPI at the time of writing)


Metadata
Labels: enhancement (New feature or request)
