feat(crewai): support tracing crewAI flows #14082
Bootstrap import analysis
Comparison of import times between this PR and base.
Summary
The average import time from this PR is: 277 ± 5 ms.
The average import time from base is: 280 ± 5 ms.
The import time difference between this PR and base is: -3.9 ± 0.2 ms.
Import time breakdown
The following import paths have shrunk:
Performance SLOs
Candidate: yunkim/crewai-flow (0ddc3d9)
🔵 No Baseline Data (24 suites)
🔵 coreapiscenario - 12/12 (2 unstable)
🔵 No baseline data available for this suite
02c9c18 to c28dfbf
sabrenner
left a comment
couple thoughts, about to look over the linking logic but looks great so far!
sabrenner
left a comment
did a closer pass at the span linking stuff, honestly i thought it made sense! just some clarification questions and points to guide some cleanup, but i'll take another look in a bit to double-check everything after you get a chance to address some of the comments 😄
ncybul
left a comment
LGTM overall, I think I may have found one bug that needs to be addressed and then had some other questions / comments!
nice! lgtm now, i think i understand the linking logic a lot better. good to approve when opened comments are resolved 😄
ncybul
left a comment
LGTM, awesome work this is super cool!
[MLOB-2806]

This PR adds support for APM and LLMObs tracing of CrewAI flows, including the overall flow execution and individual start/listener method execution, as well as span linking of which methods triggered other methods in the flow (includes support for linking conditional AND/OR and router method triggers).

(Due to some type hinting issues, I've removed `CrewAIIntegration` import from `llmobs/_integrations/__init__.py`. Python 3.8 has issues with subscripting WeakKeyDictionary, but CrewAI (and also LangGraph) only run on Python 3.9/3.10+ so we didn't know this was an issue until now)

Note (assist for reading test assertions): span linking/event tests in `test_crewai_llmobs.py` are based on a complex flow that looks like this:

<img width="974" height="627" alt="Screenshot 2025-07-30 at 4 23 59 PM" src="https://github.com/user-attachments/assets/c784fe1b-6574-4735-a632-ec63a245ffc9" />

Additional note: tested [manually](https://dd.datad0g.com/llm/traces?query=%40ml_app%3Aml-app%20%40event_type%3Aspan%20%40parent_id%3Aundefined&agg_m=count&agg_m_source=base&agg_t=count&fromUser=false&llmPanels=%5B%7B%22t%22%3A%22sampleDetailPanel%22%2C%22rEID%22%3A%22AwAAAZhc-Jpa5Z3YfgAAABhBWmhjLUpwYUFBRHBkZlpaLTBvekFBQUEAAAAkMDE5ODVjZjktYTNkYS00ZGEzLWE3NGYtOTQ1NmI4YWU2Mjg1AAAAfQ%22%7D%5D&spanId=14559885504901134809&start=1753820708729&end=1753907108729&paused=false) and verified that crews are compatible with being run inside flows (as is a documented use case in CrewAI):

<img width="707" height="685" alt="Screenshot 2025-07-30 at 4 25 48 PM" src="https://github.com/user-attachments/assets/cc694c63-e043-4d36-a58d-5d65b31616cc" />

## Checklist
- [x] PR author has checked that all the criteria below are met
- The PR description includes an overview of the change
- The PR description articulates the motivation for the change
- The change includes tests OR the PR description describes a testing strategy
- The PR description notes risks associated with the change, if any
- Newly-added code is easy to change
- The change follows the [library release note guidelines](https://ddtrace.readthedocs.io/en/stable/releasenotes.html)
- The change includes or references documentation updates if necessary
- Backport labels are set (if [applicable](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting))

## Reviewer Checklist
- [x] Reviewer has checked that all the criteria below are met
- Title is accurate
- All changes are related to the pull request's stated goal
- Avoids breaking [API](https://ddtrace.readthedocs.io/en/stable/versioning.html#interfaces) changes
- Testing strategy adequately addresses listed risks
- Newly-added code is easy to change
- Release note makes sense to a user of the library
- If necessary, author has acknowledged and discussed the performance implications of this PR as reported in the benchmarks PR comment
- Backport labels are set in a manner that is consistent with the [release branch maintenance policy](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting)

[MLOB-2806]: https://datadoghq.atlassian.net/browse/MLOB-2806?atlOrigin=eyJpIjoiNWRkNTljNzYxNjVmNDY3MDlhMDU5Y2ZhYzA5YTRkZjUiLCJwIjoiZ2l0aHViLWNvbS1KU1cifQ
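For readers trying to follow the AND/OR linking described above, here is a rough, self-contained sketch of one way trigger-based span links could be resolved. It is not the code in this PR and uses neither ddtrace nor CrewAI: `Listener`, `on_method_finished`, the method names, and the integer span IDs are all hypothetical, and the linking rules (an OR listener links to the one method that just finished, an AND listener links to all of its triggers once every one has finished) are an assumption about the intended behavior.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Listener:
    """Hypothetical model of a flow method that waits on other methods."""

    name: str
    condition: str  # "OR" or "AND"
    triggers: Tuple[str, ...]
    # For AND listeners: trigger method name -> span ID recorded so far.
    seen: Dict[str, int] = field(default_factory=dict)


def on_method_finished(
    method: str, span_id: int, listeners: List[Listener]
) -> Dict[str, List[int]]:
    """Return {listener name: span IDs to link from} for listeners fired now."""
    fired: Dict[str, List[int]] = {}
    for listener in listeners:
        if method not in listener.triggers:
            continue
        if listener.condition == "OR":
            # Assumption: an OR listener fires on any single trigger and
            # links only to the method that just finished.
            fired[listener.name] = [span_id]
        else:
            # Assumption: an AND listener waits for every trigger, then
            # links to all of them.
            listener.seen[method] = span_id
            if len(listener.seen) == len(listener.triggers):
                fired[listener.name] = [listener.seen[t] for t in listener.triggers]
                listener.seen.clear()
    return fired


listeners = [
    Listener("summarize", "OR", ("fetch_a", "fetch_b")),
    Listener("merge", "AND", ("fetch_a", "fetch_b")),
]
first = on_method_finished("fetch_a", 101, listeners)
second = on_method_finished("fetch_b", 202, listeners)
print(first)   # {'summarize': [101]}
print(second)  # {'summarize': [202], 'merge': [101, 202]}
```

In this toy model a router trigger could be represented by treating the router's returned label as just another trigger name, though how the PR actually handles routers is not shown here.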