fix: use the internal token usage tracker when using litellm #4183
Conversation
        start_time=0,
        end_time=0,
    )
    self._track_token_usage_internal(usage_info)
Usage object passed where dict expected causes AttributeError
The `_track_token_usage_internal` method expects a `dict[str, Any]` parameter and calls `.get()` on it. However, `usage_info` from `getattr(response, "usage", None)` is a litellm `Usage` object (a Pydantic model with attributes like `prompt_tokens`), not a dict. Pydantic models don't have a `.get()` method, so this will raise an `AttributeError` at runtime. Other providers (OpenAI, Anthropic, Azure, Gemini) correctly use extraction functions such as `_extract_openai_token_usage()` to convert the usage object to a dict before passing it to `_track_token_usage_internal`.
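A minimal sketch of one way to fix this, mirroring the `_extract_openai_token_usage()` pattern the other providers already use. The helper name `_extract_litellm_token_usage` and the fallback field list are assumptions for illustration, not code from the PR:

```python
from typing import Any


def _extract_litellm_token_usage(usage: Any) -> dict[str, Any]:
    """Normalize a litellm Usage object (or None) into a plain dict."""
    if usage is None:
        return {}
    # litellm's Usage is a Pydantic model; v2 models expose model_dump().
    if hasattr(usage, "model_dump"):
        return usage.model_dump()
    # Fallback: pull the standard token fields off the object directly.
    return {
        "prompt_tokens": getattr(usage, "prompt_tokens", 0),
        "completion_tokens": getattr(usage, "completion_tokens", 0),
        "total_tokens": getattr(usage, "total_tokens", 0),
    }
```

The call site would then pass `_extract_litellm_token_usage(getattr(response, "usage", None))` to `_track_token_usage_internal`, which can safely call `.get()` on the resulting dict.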
Token usage tracked multiple times per callback
The `_track_token_usage_internal(usage_info)` call is placed inside the `for callback in callbacks:` loop, meaning token usage is tracked once for each callback that has `log_success_event`. If multiple callbacks are registered, token counts will be inflated (doubled, tripled, and so on). The tracking call belongs outside the callback loop: usage from a single response should be counted exactly once, regardless of how many callbacks are notified.
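A simplified sketch of the corrected control flow, assuming the loop shape shown in the excerpt above; the standalone function and its `track`/`params` parameters are placeholders for the method's real context, not the PR's actual code:

```python
from typing import Any, Callable


def notify_callbacks_and_track(
    callbacks: list[Any],
    response: Any,
    params: dict[str, Any],
    track: Callable[[Any], None],
) -> None:
    """Notify every callback, then record token usage exactly once."""
    for callback in callbacks:
        if hasattr(callback, "log_success_event"):
            callback.log_success_event(
                kwargs=params,
                response_obj=response,
                start_time=0,
                end_time=0,
            )
    # Outside the loop: a single response contributes its usage once,
    # no matter how many callbacks were notified.
    track(getattr(response, "usage", None))
```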
Token tracking skipped when no callbacks provided
The internal token usage tracking call is placed inside the `if callbacks and len(callbacks) > 0:` block, so it only executes when callbacks are provided. If `call()` is invoked without callbacks (a common use case), `_track_token_usage_internal` is never called and the internal token counter stays at zero. This is inconsistent with the streaming path, where token tracking happens independently of callbacks (lines 930-932). The tracking logic belongs outside the callbacks conditional block.
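A sketch of the suggested structure, with tracking hoisted out of the callbacks conditional so it runs on every call; the function and its parameters are illustrative, not the PR's actual code:

```python
from typing import Any, Callable, Optional


def handle_non_streaming_success(
    response: Any,
    callbacks: Optional[list[Any]],
    track: Callable[[Any], None],
) -> None:
    """Track token usage unconditionally; callbacks are optional extras."""
    # Tracking runs first, so a missing or empty callbacks list can
    # never skip the internal accounting (matching the streaming path).
    track(getattr(response, "usage", None))

    if callbacks:
        for callback in callbacks:
            if hasattr(callback, "log_success_event"):
                callback.log_success_event(
                    kwargs={},
                    response_obj=response,
                    start_time=0,
                    end_time=0,
                )
```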
Hey @thiagomoretto, there is an active PR open for this: #4172

Thanks! Flagged here and stopped working on this in favor of yours.
Note
Ensures internal token accounting stays in sync when using LiteLLM.
- Adds `self._track_token_usage_internal(usage_info)` in the non-streaming callback path of `LLM` after logging success
- `test_llm_call_with_string_input_and_callbacks` now validates that `llm.get_token_usage_summary()` matches `TokenCalcHandler` metrics

Written by Cursor Bugbot for commit 2df8973.