
Conversation

@thiagomoretto
Contributor

@thiagomoretto thiagomoretto commented Jan 6, 2026

Note

Ensures internal token accounting stays in sync when using LiteLLM.

  • Calls self._track_token_usage_internal(usage_info) in the non-streaming callback path of LLM after logging success
  • Test test_llm_call_with_string_input_and_callbacks now validates that llm.get_token_usage_summary() matches the TokenCalcHandler metrics
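The assertion shape of that test can be sketched as follows. The classes below are minimal stand-ins, not the actual crewai LLM or TokenCalcHandler; only the equality being checked mirrors the PR description.

```python
# Minimal stand-ins (NOT the real crewai classes) sketching the check:
# the LLM's internal summary should match the TokenCalcHandler's metrics
# after one successful call.
class FakeTokenCalcHandler:
    def __init__(self):
        self.total_tokens = 0

    def log_success_event(self, usage: dict):
        self.total_tokens += usage.get("total_tokens", 0)

class FakeLLM:
    def __init__(self):
        self._total_tokens = 0

    def _track_token_usage_internal(self, usage: dict):
        self._total_tokens += usage.get("total_tokens", 0)

    def get_token_usage_summary(self) -> dict:
        return {"total_tokens": self._total_tokens}

usage = {"total_tokens": 25}
handler = FakeTokenCalcHandler()
llm = FakeLLM()
handler.log_success_event(usage)            # callback-side accounting
llm._track_token_usage_internal(usage)      # internal accounting
print(llm.get_token_usage_summary()["total_tokens"] == handler.total_tokens)  # True
```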

Written by Cursor Bugbot for commit 2df8973.


@cursor bot left a comment



start_time=0,
end_time=0,
)
self._track_token_usage_internal(usage_info)

Usage object passed where dict expected causes AttributeError

The _track_token_usage_internal method expects a dict[str, Any] parameter and calls .get() on it. However, usage_info from getattr(response, "usage", None) is a litellm Usage object (a Pydantic model with attributes like prompt_tokens), not a dict. Pydantic models don't have a .get() method, which will cause an AttributeError at runtime. Other providers (OpenAI, Anthropic, Azure, Gemini) correctly use extraction functions like _extract_openai_token_usage() that convert the usage object to a dict before passing it to _track_token_usage_internal.
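A minimal sketch of the kind of fix described above: convert the Usage-style object to a dict before tracking. The extractor name and key set are assumptions modeled on the `_extract_openai_token_usage()` pattern the comment mentions, not the actual crewai code.

```python
# Hypothetical sketch: _track_token_usage_internal expects a dict and
# calls .get() on it, so a litellm Usage-like object (attributes, no
# .get()) must be converted first. Names are illustrative.
from types import SimpleNamespace
from typing import Any

def _extract_litellm_token_usage(usage: Any) -> dict[str, Any]:
    """Convert a Usage-style object into the dict shape assumed by
    _track_token_usage_internal (key names are an assumption)."""
    return {
        "prompt_tokens": getattr(usage, "prompt_tokens", 0),
        "completion_tokens": getattr(usage, "completion_tokens", 0),
        "total_tokens": getattr(usage, "total_tokens", 0),
    }

# A Pydantic-like usage object exposes attributes, not .get():
usage_info = SimpleNamespace(prompt_tokens=10, completion_tokens=5, total_tokens=15)
usage_dict = _extract_litellm_token_usage(usage_info)
print(usage_dict.get("total_tokens"))  # dict access now works: 15
```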


start_time=0,
end_time=0,
)
self._track_token_usage_internal(usage_info)

Token usage tracked multiple times per callback

The _track_token_usage_internal(usage_info) call is placed inside the for callback in callbacks: loop, meaning token usage will be tracked once for each callback that has log_success_event. If multiple callbacks are registered, this will cause token counts to be inflated (doubled, tripled, etc.). The tracking call belongs outside the callback loop since token usage from a single response should only be counted once, regardless of how many callbacks are notified.
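The inflation can be illustrated with a small sketch; the counters and callback names below are hypothetical stand-ins for the real handlers.

```python
# Illustrative sketch of the double-counting bug (names are hypothetical;
# the real code notifies callback.log_success_event and then calls
# self._track_token_usage_internal).
usage = {"total_tokens": 100}
callbacks = ["cb_a", "cb_b", "cb_c"]  # stand-ins for registered handlers

# Buggy placement: tracking inside the callback loop
inflated_total = 0
for cb in callbacks:
    # ... cb.log_success_event(...) runs here ...
    inflated_total += usage["total_tokens"]  # counted once per callback

# Fixed placement: notify every callback, then track exactly once
fixed_total = 0
for cb in callbacks:
    pass  # ... cb.log_success_event(...) only ...
fixed_total += usage["total_tokens"]

print(inflated_total, fixed_total)  # 300 100
```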


start_time=0,
end_time=0,
)
self._track_token_usage_internal(usage_info)

Token tracking skipped when no callbacks provided

The internal token usage tracking call is placed inside the if callbacks and len(callbacks) > 0: block, so it only executes when callbacks are provided. If call() is invoked without callbacks (a common use case), _track_token_usage_internal is never called and the internal token counter stays at zero. This is inconsistent with the streaming path where token tracking happens independently of callbacks (lines 930-932). The tracking logic belongs outside the callbacks conditional block.
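The gating problem reduces to where the tracking call sits relative to the callbacks conditional; a hypothetical sketch (function and variable names are illustrative, not the actual crewai code):

```python
# Hypothetical sketch: tracking must run whether or not callbacks were
# supplied, mirroring the streaming path described above.
def handle_response(usage: dict, callbacks, tracked: list):
    if callbacks and len(callbacks) > 0:
        for cb in callbacks:
            pass  # notify callback (cb.log_success_event(...))
    # Tracking sits OUTSIDE the callbacks conditional
    tracked.append(usage)

tracked = []
handle_response({"total_tokens": 42}, None, tracked)   # no callbacks
handle_response({"total_tokens": 8}, ["cb"], tracked)  # with a callback
print(len(tracked))  # 2: usage tracked in both cases
```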


@Vidit-Ostwal
Contributor

Hey @thiagomoretto, There is an active PR opened for this, #4172
Feel free to review it and let me know if this needs to be updated in any case.

@thiagomoretto
Contributor Author

> Hey @thiagomoretto, There is an active PR opened for this, #4172. Feel free to review it and let me know if this needs to be updated in any case.

Thanks! Flagged here; I've stopped working on this in favor of yours.
