[Feature Request] Streamed [DONE] response in RestAPI should have token data #3061

Open
TNT3530 opened this issue Dec 10, 2024 · 0 comments
Labels: feature request (New feature or request)

🚀 Feature

Similar to a standard (non-streaming) request to the API, the final stream chunk should include a usage field giving prompt/generated token counts. This could be sent with the final [DONE] response in the streamed chunks, or as its own dedicated chunk.
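For illustration, a rough sketch of how a client could pick up such a field, assuming the server emits a final chunk shaped like OpenAI's usage object; the endpoint, model name, and field names are placeholders, not the project's actual API:

```python
import json
import requests

# Illustrative endpoint and payload; adjust for the actual server and model.
resp = requests.post(
    "http://localhost:8000/v1/chat/completions",
    json={
        "model": "my-model",
        "messages": [{"role": "user", "content": "Hello"}],
        "stream": True,
    },
    stream=True,
)

usage = None
for raw in resp.iter_lines():
    if not raw:
        continue
    line = raw.decode("utf-8")
    if not line.startswith("data: "):
        continue
    payload = line[len("data: "):]
    if payload.strip() == "[DONE]":
        break  # end-of-stream sentinel; today it carries no token counts
    chunk = json.loads(payload)
    # Proposed behavior: the last content chunk (or a dedicated final chunk)
    # would carry a "usage" object with prompt/completion token counts.
    if chunk.get("usage"):
        usage = chunk["usage"]

if usage:
    print(usage["prompt_tokens"], usage["completion_tokens"])
```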

Motivation

While the usage field in the standard response is very useful for tracking token consumption, streamed responses provide no such count for input tokens. Output tokens can be tallied from the streamed chunks, but the prompt token count can only be estimated client-side, which is inaccurate at best.

Alternatives

Counting tokens on the client side is an option (see the sketch below), but exact counts will be off because the client's tokenizer and chat template may differ from the server's.
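A minimal sketch of that workaround, assuming the Hugging Face transformers library and that the client can load the same tokenizer the server uses (the model name is illustrative):

```python
from transformers import AutoTokenizer

# Client-side estimate; any mismatch between this tokenizer/chat template
# and the server's makes the count diverge from the true prompt size.
tokenizer = AutoTokenizer.from_pretrained("my-model")  # illustrative name
messages = [{"role": "user", "content": "Hello"}]
prompt_ids = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
print(len(prompt_ids))  # approximate prompt token count
```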

Additional context

While this is out of spec for the OpenAI definition, it would be quite useful!

@TNT3530 TNT3530 added the feature request New feature or request label Dec 10, 2024