Report total input tokens with cached and non-cached breakdown #14524

@jacob314

Description

Because the uncached token count is not shown prominently, total token usage looks much more expensive than it really is: cached tokens are very cheap.

Gemini 3.0 makes a large number of small requests that make much better use of cached tokens, which makes the total token count very misleading.
We should instead show the number of non-cached tokens as the input token count, since that is the meaningful stat for most users.

We should probably still show the number of cached tokens, at least for API key users, but we need to make sure we don't do it in a way that confuses people about cost or overall efficiency.
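A rough sketch of the proposed breakdown, to make the arithmetic concrete. The `InputTokenUsage` shape, the function names, and the `showCached` flag are hypothetical and only for illustration; they are not the CLI's actual types or settings. The assumption is that the reported total input count already includes the cached tokens, so the non-cached count is the difference.

```typescript
// Hypothetical shape: field names are illustrative, not the CLI's real types.
interface InputTokenUsage {
  totalInputTokens: number; // all input tokens, cached + non-cached (assumed)
  cachedInputTokens: number; // portion of the input served from the cache
}

interface InputTokenBreakdown {
  nonCached: number;
  cached: number;
  total: number;
  cachedPercent: number;
}

function breakDownInputTokens(usage: InputTokenUsage): InputTokenBreakdown {
  const total = Math.max(0, usage.totalInputTokens);
  // Clamp so a malformed report can never produce a negative non-cached count.
  const cached = Math.min(Math.max(0, usage.cachedInputTokens), total);
  const nonCached = total - cached;
  const cachedPercent = total === 0 ? 0 : (cached * 100) / total;
  return { nonCached, cached, total, cachedPercent };
}

// Headline number is the non-cached count; the cached count is secondary
// and only shown when requested (for example, for API key users).
function formatInputTokens(usage: InputTokenUsage, showCached: boolean): string {
  const b = breakDownInputTokens(usage);
  const headline = `Input tokens: ${b.nonCached}`;
  if (!showCached || b.cached === 0) {
    return headline;
  }
  return `${headline} (+${b.cached} cached, ${b.cachedPercent.toFixed(1)}% of ${b.total} total)`;
}
```

Leading with the non-cached number keeps the headline stat aligned with what most users care about, while the parenthetical keeps the total recoverable for anyone who wants it.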
