🚀 Feature
Similar to a standard request to the API, the final stream chunk should include a usage field reporting prompt and generated token counts. This could be sent with the final [DONE] response in the streamed chunks, or as its own dedicated chunk.
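A minimal sketch of what such a final chunk might look like, with field names borrowed from the non-streaming usage object (the exact chunk shape and values here are assumptions for illustration, not a spec):

```python
# Hypothetical final stream chunk carrying usage, modeled on the
# non-streaming response's usage object. Shape is an assumption.
final_chunk = {
    "id": "cmpl-123",
    "object": "text_completion.chunk",
    "choices": [],  # no new tokens; this chunk only carries usage
    "usage": {
        "prompt_tokens": 57,       # counted server-side, so exact
        "completion_tokens": 40,
        "total_tokens": 97,
    },
}

# A client could then read exact counts directly instead of estimating:
usage = final_chunk["usage"]
print(usage["prompt_tokens"], usage["completion_tokens"])
```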
Motivation
While the usage field in the standard response is very useful for tracking token consumption, streamed responses provide no such count for input tokens. Counting the received chunks works for output tokens, but the prompt token count can only be estimated client-side, which is inaccurate at best.
Alternatives
Counting tokens on the client side is an option, but exact counts will be off, since the tokenizer and chat template used by the server can differ from the client's.
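For comparison, a client-side estimate with a local tokenizer (tiktoken is just one possible choice here) looks roughly like the sketch below; the count drifts whenever the server's tokenizer or chat template differs from what the client assumes:

```python
# Client-side estimate only: the server's tokenizer and chat template
# may differ, so this count can drift from the true prompt token count.
import tiktoken

encoding = tiktoken.get_encoding("cl100k_base")  # assumed encoding

def estimate_prompt_tokens(messages):
    # Naive estimate: encode only the raw message contents, ignoring
    # the special tokens and template text the server inserts around them.
    return sum(len(encoding.encode(m["content"])) for m in messages)

messages = [{"role": "user", "content": "How many tokens is this?"}]
print(estimate_prompt_tokens(messages))  # an approximation, not exact
```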
Additional context
While this is outside the OpenAI API specification, it would be quite useful!