How does the token usage counter work? #166
I am just curious about the token usage counter and how it's possible that we can get up to 2 million tokens of usage before compacting. The context window for GPT-5 is 400k, so I'm a little confused about how these numbers add up. Obviously I'm misunderstanding something; I just wanted someone to explain what's going on here, as I was curious.
Replies: 1 comment 1 reply
Hi @ThePipeFixer — great question! The number in the bottom bar is a running lifetime usage counter for the session, not the size of the prompt we're about to send. Every time the model finishes a turn we take the usage payload from the Responses API and add it to `total_token_usage`, which simply keeps summing non-cached input and output tokens via `TokenUsage::blended_total` and `TokenUsageInfo::append_last_usage` (codex-rs/core/src/protocol.rs:650, codex-rs/protocol/src/protocol.rs:612). The TUI then renders that cumulative figure in bold (codex-rs/tui/src/bottom_pane/chat_composer.rs:1631) using the helper that merges each new report into the running tally (codex-rs/tui/src/chatwidget.rs:12968).

The "(% left)" overlay is calculated differently: after each turn we look only at the most recent usage block, back out reasoning tokens that don't stay in context, and compare that value to the active model's context window to estimate remaining headroom (…).

As for compaction, there's a separate knob called ….

Hope that clears up the mismatch between the big number in the footer and the model's context window—let me know if anything's still fuzzy!
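To see how the footer number can outgrow any single context window, here is a minimal Rust sketch of that running tally, assuming a usage payload shaped like the Responses API report. The struct and method names echo the ones cited in the reply, but the definitions and numbers are illustrative, not the real codex-rs types:

```rust
// Sketch of the lifetime counter: each turn's usage report is folded into a
// session-wide sum, so the footer figure grows without bound even though no
// single prompt exceeds the context window.

#[derive(Debug, Default, Clone, Copy)]
struct TokenUsage {
    input_tokens: u64,
    cached_input_tokens: u64,
    output_tokens: u64,
}

impl TokenUsage {
    /// Non-cached input plus output: the figure that gets summed each turn.
    fn blended_total(&self) -> u64 {
        self.input_tokens.saturating_sub(self.cached_input_tokens) + self.output_tokens
    }
}

#[derive(Debug, Default)]
struct TokenUsageInfo {
    total_token_usage: u64,
    last_token_usage: Option<TokenUsage>,
}

impl TokenUsageInfo {
    /// Merge the latest per-turn usage report into the running tally.
    fn append_last_usage(&mut self, usage: TokenUsage) {
        self.total_token_usage += usage.blended_total();
        self.last_token_usage = Some(usage);
    }
}

fn main() {
    let mut info = TokenUsageInfo::default();
    // Two turns: the footer shows the sum over the whole session.
    info.append_last_usage(TokenUsage { input_tokens: 120_000, cached_input_tokens: 90_000, output_tokens: 4_000 });
    info.append_last_usage(TokenUsage { input_tokens: 150_000, cached_input_tokens: 140_000, output_tokens: 6_000 });
    println!("footer counter: {} tokens", info.total_token_usage);
}
```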
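And a companion sketch of the "(% left)" estimate, under the assumptions stated in the reply: only the most recent usage block counts, reasoning output that doesn't stay in context is backed out, and the remainder is compared to the model's context window. `LastUsage` and `percent_of_context_left` are hypothetical names, not the actual codex-rs API:

```rust
// Sketch of the headroom estimate: unlike the footer counter, this looks at
// one turn's usage, not the session total.

struct LastUsage {
    total_tokens: u64,
    reasoning_output_tokens: u64,
}

fn percent_of_context_left(last: &LastUsage, context_window: u64) -> u64 {
    if context_window == 0 {
        return 0;
    }
    // Tokens assumed to still occupy the context after this turn.
    let in_context = last.total_tokens.saturating_sub(last.reasoning_output_tokens);
    let remaining = context_window.saturating_sub(in_context);
    // Integer percentage, rounded down.
    remaining * 100 / context_window
}

fn main() {
    // 400k window, per the GPT-5 figure in the question.
    let last = LastUsage { total_tokens: 180_000, reasoning_output_tokens: 30_000 };
    println!("{}% left", percent_of_context_left(&last, 400_000)); // 62% left
}
```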
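Finally, the name of the compaction knob is missing from the reply above, so this last sketch uses a hypothetical `auto_compact_token_limit` setting purely to illustrate a threshold check; the real option name and semantics may differ:

```rust
// Sketch of an auto-compaction trigger: compare the in-context estimate
// against a configured limit. The field name here is a stand-in.

struct CompactionConfig {
    auto_compact_token_limit: Option<u64>, // None = manual compaction only
}

fn should_auto_compact(tokens_in_context: u64, cfg: &CompactionConfig) -> bool {
    match cfg.auto_compact_token_limit {
        Some(limit) => tokens_in_context >= limit,
        None => false,
    }
}

fn main() {
    let cfg = CompactionConfig { auto_compact_token_limit: Some(350_000) };
    // Once the in-context estimate crosses the threshold, the session gets
    // summarized/compacted instead of growing past the window.
    assert!(should_auto_compact(360_000, &cfg));
    assert!(!should_auto_compact(100_000, &cfg));
    println!("compaction check ok");
}
```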