Fix token tracking race condition in threading-based async execution #4169

devin-ai-integration · 2026-01-03T17:28:04Z

Summary

This PR fixes the race condition described in issue #4168 where token tracking was inaccurate when multiple async tasks with async_execution=True from the same agent ran concurrently.

The root cause was that tokens_before was captured when tasks were queued (in the main thread), but tokens_after was captured sequentially in _process_async_tasks after calling future.result(). This caused later tasks to be credited with tokens from earlier tasks that ran in parallel.
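The misattribution described above can be reproduced in a small stand-alone sketch. Everything here is hypothetical (`FakeAgent`, its `total_tokens` counter, the `start` gate) and is not crewAI code. It only mirrors the pattern of reading tokens_before at queue time and tokens_after sequentially after each `future.result()`.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class FakeAgent:
    """Hypothetical agent with one cumulative token counter."""

    def __init__(self) -> None:
        self.total_tokens = 0
        self._counter_lock = threading.Lock()

    def run(self, cost: int, start: threading.Event) -> None:
        start.wait()  # hold every task until all of them are queued
        with self._counter_lock:
            self.total_tokens += cost


agent = FakeAgent()
start = threading.Event()
costs = {"task_a": 100, "task_b": 50}
pending = []

with ThreadPoolExecutor(max_workers=2) as pool:
    for name, cost in costs.items():
        # Buggy pattern: tokens_before is read in the main thread at queue time.
        tokens_before = agent.total_tokens
        pending.append((name, tokens_before, pool.submit(agent.run, cost, start)))
    start.set()  # release both tasks so they run in parallel

    deltas = {}
    for name, tokens_before, future in pending:
        future.result()
        # Buggy pattern: tokens_after is read sequentially, after other tasks
        # may already have added their own tokens to the shared counter.
        tokens_after = agent.total_tokens
        deltas[name] = tokens_after - tokens_before
```

In this sketch task_b only spent 50 tokens, yet its delta also includes the 100 tokens spent by task_a, so the per-task deltas add up to more than the agent actually used.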

The fix introduces:

- Per-agent locks to serialize async task execution for the same agent, ensuring token deltas can be accurately attributed
- Token capture callback that captures both tokens_before and tokens_after inside the worker thread (after acquiring the lock), not when the task is queued
- New data models (AgentTokenMetrics, TaskTokenMetrics, WorkflowTokenMetrics) for detailed per-task token tracking
- Updated TaskOutput to include per-task usage_metrics
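A minimal sketch of how the per-agent lock and the token capture callback could fit together, assuming a cumulative counter on the agent. The names `FakeAgent`, `run_with_token_capture`, and `agent_locks` are hypothetical and chosen for illustration; the actual `execute_async` implementation in this PR may differ.

```python
import threading
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor
from functools import partial
from typing import Callable


class FakeAgent:
    """Hypothetical agent with one cumulative token counter."""

    def __init__(self, role: str) -> None:
        self.role = role
        self.total_tokens = 0

    def work(self, cost: int) -> str:
        self.total_tokens += cost
        return f"{self.role} used {cost}"


def run_with_token_capture(
    work: Callable[[], str],
    capture_tokens: Callable[[], int],
    agent_lock: threading.Lock,
) -> tuple[str, int, int]:
    """Run in the worker thread: both snapshots are taken while holding the lock."""
    with agent_lock:
        tokens_before = capture_tokens()
        output = work()
        tokens_after = capture_tokens()
    return output, tokens_before, tokens_after


agent = FakeAgent("researcher")
# One lock per agent (keyed by id here, an assumption); other agents get their own lock.
agent_locks: dict[int, threading.Lock] = defaultdict(threading.Lock)

with ThreadPoolExecutor(max_workers=2) as pool:
    futures = [
        pool.submit(
            run_with_token_capture,
            partial(agent.work, cost),
            lambda: agent.total_tokens,
            agent_locks[id(agent)],
        )
        for cost in (100, 50)
    ]
    results = [future.result() for future in futures]

deltas = [after - before for _, before, after in results]
```

Because both snapshots happen inside the lock, each delta matches the tokens that task actually spent, whichever task the pool happens to run first. The cost is that tasks sharing an agent no longer overlap.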

Updates since last revision

Fixed lint errors across multiple files:
- B023 (late-binding closure issues) in _aexecute_tasks using default arguments
- W293 (whitespace in blank lines) in docstrings across crew.py, task.py, and usage_metrics.py
- B007/PERF102 in calculate_usage_metrics by using .values() instead of .items()
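For readers unfamiliar with these lint rules, here is a small generic illustration, not taken from the PR diff. B023 flags closures that read a loop variable at call time, and binding the value through a default argument fixes it. B007/PERF102 flag loops over `.items()` that never use the key. The variable names and values are made up.

```python
agents = ["researcher", "writer", "editor"]

# B023: each lambda looks up `agent` when called, after the loop has finished.
late_bound = []
for agent in agents:
    late_bound.append(lambda: agent)

# Fix: a default argument is evaluated when the lambda is defined,
# so each callable keeps the value from its own iteration.
bound = []
for agent in agents:
    bound.append(lambda agent=agent: agent)

late_results = [callback() for callback in late_bound]
bound_results = [callback() for callback in bound]

# B007/PERF102: iterate over .values() when the key is not needed.
token_counts = {"researcher": 120, "writer": 80}
total = 0
for count in token_counts.values():
    total += count
```

Here `late_results` contains the last loop value three times, while `bound_results` preserves one value per iteration.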

Review & Testing Checklist for Human

- Verify backward compatibility: The execute_async method signature changed to accept optional token_capture_callback and agent_execution_lock parameters. The return type also changed and may now be a tuple. Verify this doesn't break existing code that calls execute_async directly.
- Review the late-binding closure pattern in _execute_tasks (line ~1194): def create_token_callback(agent: Any = exec_data.agent). Ensure the default argument correctly captures the agent value in the loop.
- Review the async wrapper pattern in _aexecute_tasks (line ~964): async def _wrapped_task_execution(_task=task, _agent=agent, ...). Verify the default args correctly bind the loop variables.
- Test with real LLM calls: Run a crew with multiple async tasks from the same agent and verify token metrics are accurate (not duplicated across tasks).
- Check base branch: This PR is based on feat/per-user-token-tracing from PR #4132 (Feature: CrewAI Token Tracking Enhancement). Verify this is the intended target branch.

Recommended test plan:

1. Create a crew with 2+ async tasks assigned to the same agent
2. Run crew.kickoff() and inspect result.tasks_output[i].usage_metrics for each task
3. Verify each task shows distinct token counts (not cumulative/duplicated values)

Notes

- The per-agent lock intentionally reduces parallelism for tasks from the same agent to ensure correctness. Tasks from different agents still run concurrently.
- There's a known limitation in calculate_usage_metrics where agents with the same role cannot be distinguished (see TODO comment).
- One test job (tests 3.13) was cancelled in CI, but all other checks passed.

Link to Devin run: https://app.devin.ai/sessions/80968e6ecc774e45ad35f833cf8d2ea0
Requested by: João (joao@crewai.com)

Resolved 4 review comments from Cursor Bugbot:

1. Added token tracking for async tasks in _execute_tasks and _process_async_tasks
2. Fixed task key collision by including task_id in the key
3. Added token tracking for _aexecute_tasks paths (both sync and async)
4. Fixed agent metrics to be keyed by agent_id to handle multiple agents with same role

All async tasks now capture tokens_before/after and attach metrics properly. Task metrics now use unique keys to prevent overwriting. Agent metrics properly track separate agents with same role.

…sy23/crewAI-telemetry into feat/per-user-token-tracing

Resolved race condition where concurrent async tasks from the same agent would get incorrect token attribution. Solution wraps async task execution to capture tokens_after immediately upon task completion, before other concurrent tasks can interfere.

Changes:
- Wrapped async task execution to return (result, tokens_after) tuple
- Updated _aprocess_async_tasks to unwrap and use captured tokens_after
- Updated type hints for pending_tasks to reflect new signature

Note: Threading-based async_execution still has a similar race condition, as it's harder to wrap threaded execution. Will track separately.
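The wrapping idea in this commit can be sketched with plain asyncio. `FakeAgent`, `wrapped_execution`, and the sleep delays are hypothetical stand-ins and are not the code in _aexecute_tasks. The point is only that the snapshot is taken in the same coroutine step in which the task finishes, rather than later when results are collected.

```python
import asyncio


class FakeAgent:
    """Hypothetical agent with one cumulative token counter."""

    def __init__(self) -> None:
        self.total_tokens = 0

    async def run(self, cost: int, delay: float) -> str:
        await asyncio.sleep(delay)  # stands in for the LLM call
        self.total_tokens += cost
        return f"used {cost}"


async def wrapped_execution(agent: FakeAgent, cost: int, delay: float) -> tuple[str, int]:
    result = await agent.run(cost, delay)
    # No await between completion and the snapshot, so no other
    # coroutine can add tokens in between.
    return result, agent.total_tokens


async def main() -> tuple[list[int], list[int]]:
    agent = FakeAgent()
    tasks = [
        asyncio.create_task(wrapped_execution(agent, 100, 0.05)),
        asyncio.create_task(wrapped_execution(agent, 50, 0.01)),
    ]
    at_completion = []
    at_collection = []
    for task in tasks:
        _, tokens_after = await task
        at_completion.append(tokens_after)  # snapshot carried in the tuple
        at_collection.append(agent.total_tokens)  # what a late read would see
    return at_completion, at_collection


at_completion, at_collection = asyncio.run(main())
```

In this run the second task finishes first, so its carried snapshot reflects only its own tokens, while a read at collection time would already include both tasks.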

This commit fixes the race condition described in issue #4168 where token tracking was inaccurate when multiple async tasks from the same agent ran concurrently.

The fix introduces:

1. Per-agent locks to serialize async task execution for accurate token tracking when multiple async tasks from the same agent run concurrently
2. Token capture callback that captures both tokens_before and tokens_after inside the thread (after acquiring the lock), not when the task is queued
3. Updated _process_async_tasks to handle the new return type from execute_async which now returns (TaskOutput, tokens_before, tokens_after)

This ensures that token deltas are accurately attributed to each task even when multiple async tasks from the same agent overlap in execution.

Tests added:
- test_async_task_token_tracking_uses_per_agent_lock
- test_async_task_token_callback_captures_tokens_inside_thread
- test_async_task_per_agent_lock_serializes_execution

Co-Authored-By: João <joao@crewai.com>

cursor · 2026-01-03T17:28:07Z

You have run out of free Bugbot PR reviews for this billing cycle. This will reset on January 28.

To receive reviews on all of your PRs, visit the Cursor dashboard to activate Pro and start your 14-day free trial.

devin-ai-integration · 2026-01-03T17:28:08Z

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

⚙️ Control Options:

Disable automatic comment and CI monitoring

Co-Authored-By: João <joao@crewai.com>

Devasy and others added 8 commits December 20, 2025 18:08

feat: add detailed token metrics tracking for agents and tasks

56b538c

feat: enhance per-agent token metrics accuracy by aggregating task data

8586061

Merge branch 'main' into feat/per-user-token-tracing

a0c2662

Merge branch 'feat/per-user-token-tracing' of https://github.com/Deva…

9bbf53e

…sy23/crewAI-telemetry into feat/per-user-token-tracing

Merge branch 'main' into feat/per-user-token-tracing

314642f

devin-ai-integration bot and others added 3 commits January 3, 2026 17:31

Fix lint errors (B023, W293, B007, PERF102)

563e2ec

Co-Authored-By: João <joao@crewai.com>

Fix W293 lint errors in task.py docstrings

5b8e42c

Co-Authored-By: João <joao@crewai.com>

Fix W293 lint errors in usage_metrics.py docstrings

e022caa

Co-Authored-By: João <joao@crewai.com>
