🚀 Feature
Similar to a standard request to the API, the final stream chunk should include a usage field reporting prompt and generated token counts. This could be sent with the final [DONE] response in the streamed chunks, or as its own dedicated chunk.
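A minimal sketch of what such a final chunk might look like, with field names borrowed from the non-streaming usage object (the exact chunk shape and values here are assumptions for illustration, not a spec):

```python
# Hypothetical final stream chunk carrying usage, modeled on the
# non-streaming response's usage object. Shape is an assumption.
final_chunk = {
    "id": "cmpl-123",
    "object": "text_completion.chunk",
    "choices": [],  # no new tokens; this chunk only carries usage
    "usage": {
        "prompt_tokens": 57,       # counted server-side, so exact
        "completion_tokens": 40,
        "total_tokens": 97,
    },
}

# A client could then read exact counts directly instead of estimating:
usage = final_chunk["usage"]
print(usage["prompt_tokens"], usage["completion_tokens"])
```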
Motivation
While the usage field in the standard response is very useful for tracking token consumption, streamed responses provide no such count for input tokens. Counting the received chunks works for output tokens, but the prompt token count can only be estimated client-side, which is inaccurate at best.
Alternatives
Counting tokens on the client side is an option, but exact counts will be off, since the tokenizer and chat template used by the server can differ from the client's.
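For comparison, a client-side estimate with a local tokenizer (tiktoken is just one possible choice here) looks roughly like the sketch below; the count drifts whenever the server's tokenizer or chat template differs from what the client assumes:

```python
# Client-side estimate only: the server's tokenizer and chat template
# may differ, so this count can drift from the true prompt token count.
import tiktoken

encoding = tiktoken.get_encoding("cl100k_base")  # assumed encoding

def estimate_prompt_tokens(messages):
    # Naive estimate: encode only the raw message contents, ignoring
    # the special tokens and template text the server inserts around them.
    return sum(len(encoding.encode(m["content"])) for m in messages)

messages = [{"role": "user", "content": "How many tokens is this?"}]
print(estimate_prompt_tokens(messages))  # an approximation, not exact
```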
Additional context
While this is outside the OpenAI API specification, it would be quite useful!