How does the token usage counter work? #166
I am just curious about the token usage counter and how it's possible that we can get up to 2 million tokens of usage before compacting. The context window for GPT-5 is 400k, so I'm a little confused about how these numbers add up. Obviously I'm misunderstanding something; I just wanted someone to explain what's going on here, as I was curious.
Replies: 1 comment 1 reply
Hi @ThePipeFixer — great question! The number in the bottom bar is a running lifetime usage counter for the session, not the size of the prompt we're about to send. Every time the model finishes a turn we take the usage payload from the Responses API and add it to `total_token_usage`, which simply keeps summing non-cached input and output tokens via `TokenUsage::blended_total` and `TokenUsageInfo::append_last_usage` (codex-rs/core/src/protocol.rs:650, codex-rs/protocol/src/protocol.rs:612). The TUI then renders that cumulative figure in bold (codex-rs/tui/src/bottom_pane/chat_composer.rs:1631) using the helper that merges each new report into the running tally (codex-rs/tui/src/chatwidget.rs:12968).

The "(% left)" overlay is calculated differently: after each turn we look only at the most recent usage block, back out reasoning tokens that don't stay in context, and compare that value to the active model's context window to estimate remaining headroom (…).

As for compaction, there's a separate knob called ….

Hope that clears up the mismatch between the big number in the footer and the model's context window—let me know if anything's still fuzzy!
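To see how the footer number can outgrow any single context window, here is a minimal Rust sketch of that running tally, assuming a usage payload shaped like the Responses API report. The struct and method names echo the ones cited in the reply, but the definitions and numbers are illustrative, not the real codex-rs types:

```rust
// Sketch of the lifetime counter: each turn's usage report is folded into a
// session-wide sum, so the footer figure grows without bound even though no
// single prompt exceeds the context window.

#[derive(Debug, Default, Clone, Copy)]
struct TokenUsage {
    input_tokens: u64,
    cached_input_tokens: u64,
    output_tokens: u64,
}

impl TokenUsage {
    /// Non-cached input plus output: the figure that gets summed each turn.
    fn blended_total(&self) -> u64 {
        self.input_tokens.saturating_sub(self.cached_input_tokens) + self.output_tokens
    }
}

#[derive(Debug, Default)]
struct TokenUsageInfo {
    total_token_usage: u64,
    last_token_usage: Option<TokenUsage>,
}

impl TokenUsageInfo {
    /// Merge the latest per-turn usage report into the running tally.
    fn append_last_usage(&mut self, usage: TokenUsage) {
        self.total_token_usage += usage.blended_total();
        self.last_token_usage = Some(usage);
    }
}

fn main() {
    let mut info = TokenUsageInfo::default();
    // Two turns: the footer shows the sum over the whole session.
    info.append_last_usage(TokenUsage { input_tokens: 120_000, cached_input_tokens: 90_000, output_tokens: 4_000 });
    info.append_last_usage(TokenUsage { input_tokens: 150_000, cached_input_tokens: 140_000, output_tokens: 6_000 });
    println!("footer counter: {} tokens", info.total_token_usage);
}
```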
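And a companion sketch of the "(% left)" estimate, under the assumptions stated in the reply: only the most recent usage block counts, reasoning output that doesn't stay in context is backed out, and the remainder is compared to the model's context window. `LastUsage` and `percent_of_context_left` are hypothetical names, not the actual codex-rs API:

```rust
// Sketch of the headroom estimate: unlike the footer counter, this looks at
// one turn's usage, not the session total.

struct LastUsage {
    total_tokens: u64,
    reasoning_output_tokens: u64,
}

fn percent_of_context_left(last: &LastUsage, context_window: u64) -> u64 {
    if context_window == 0 {
        return 0;
    }
    // Tokens assumed to still occupy the context after this turn.
    let in_context = last.total_tokens.saturating_sub(last.reasoning_output_tokens);
    let remaining = context_window.saturating_sub(in_context);
    // Integer percentage, rounded down.
    remaining * 100 / context_window
}

fn main() {
    // 400k window, per the GPT-5 figure in the question.
    let last = LastUsage { total_tokens: 180_000, reasoning_output_tokens: 30_000 };
    println!("{}% left", percent_of_context_left(&last, 400_000)); // 62% left
}
```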
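Finally, the name of the compaction knob is missing from the reply above, so this last sketch uses a hypothetical `auto_compact_token_limit` setting purely to illustrate a threshold check; the real option name and semantics may differ:

```rust
// Sketch of an auto-compaction trigger: compare the in-context estimate
// against a configured limit. The field name here is a stand-in.

struct CompactionConfig {
    auto_compact_token_limit: Option<u64>, // None = manual compaction only
}

fn should_auto_compact(tokens_in_context: u64, cfg: &CompactionConfig) -> bool {
    match cfg.auto_compact_token_limit {
        Some(limit) => tokens_in_context >= limit,
        None => false,
    }
}

fn main() {
    let cfg = CompactionConfig { auto_compact_token_limit: Some(350_000) };
    // Once the in-context estimate crosses the threshold, the session gets
    // summarized/compacted instead of growing past the window.
    assert!(should_auto_compact(360_000, &cfg));
    assert!(!should_auto_compact(100_000, &cfg));
    println!("compaction check ok");
}
```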