fix(langchain): fix patching langchain throws an ImportError
#14687
Conversation
Bootstrap import analysis: comparison of import times between this PR and base.
Summary: the average import time from this PR is 209 ± 3 ms; the average import time from base is 214 ± 4 ms; the import time difference between this PR and base is -4.7 ± 0.2 ms.
Import time breakdown: the following import paths have shrunk.
Performance SLOs: comparing candidate sabrenner/langchain-patching-fix (a2c291c) with baseline main (e13717f).
📈 Performance Regressions (1 suite): iastaspectsospath (24/24 scenarios within SLO); the only flagged result is ospathsplitext_aspect time at +13.8% vs baseline, still well under its 10µs SLO.
🟡 Near SLO Breach (4 suites): djangosimple (30/30), errortrackingdjangosimple (6/6), flasksimple (18/18), otelspan (22/22); every scenario passes its time and memory SLOs, with several memory results within ~2% of their limits and memory roughly +5% vs baseline across the board.
Yun-Kim
left a comment
If the test suites are being run in duplicate, and considering langchain_community is fairly dependent on langchain_core, should we just combine the two test suites into one langchain test suite?
i tried this, but one of the ci jobs complained about it, at which point i was like well i guess it maybe makes sense to split them entirely, or do some weird thing where they have two different venvs in the riotfile but use the same test suite? but i wasn't sure if that would be more complicated than just splitting them 😭
Yun-Kim
left a comment
Couple questions but looks great!
Force-pushed from 894d78f to 20c8618
The backport to `3.15` failed. To backport manually, run these commands in your terminal:
# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add .worktrees/backport-3.15 3.15
# Navigate to the new working tree
cd .worktrees/backport-3.15
# Create a new branch
git switch --create backport-14687-to-3.15
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 967ff6dba3866f9362c8fc479717154d41e3760c
# Push it to GitHub
git push --set-upstream origin backport-14687-to-3.15
# Go back to the original working tree
cd ../..
# Delete the working tree
git worktree remove .worktrees/backport-3.15
Then, create a pull request where the base branch is `3.15` and the head branch is `backport-14687-to-3.15`.
Description
Changes our patching logic to only patch `langchain_core`, as we have not supported operations on `langchain` directly for some time and they are being removed in all langchain 1.0.0 releases. Patching `langchain` was causing `ImportError`s due to circular imports from how langchain switched some imports from inlined to top-level.
The reason we patched `langchain-community`, `langchain-openai`, and `langchain-pinecone` in the first place was that we could not patch the underlying `embed_query`, `embed_documents`, and `similarity_search` for embeddings and vectorstores, as the base classes are abstract. However, by patching `__init_subclass__`, we can successfully patch them on each concrete subclass, meaning we only need to patch `langchain_core`.
This is reflected in the change in `_monkey.py` and in importing `langchain_core` in the patching file.
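For illustration, here is a minimal, self-contained sketch of the `__init_subclass__` idea using toy classes (not the actual langchain_core types or the ddtrace implementation): the base class's subclass hook is wrapped so every concrete subclass gets its method patched at class-creation time.
```python
import abc
import functools


def traced(func):
    """Wrap a method; the print is a stand-in for starting a real span."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        print(f"span: {func.__qualname__}")
        return func(*args, **kwargs)

    return wrapper


class Embeddings(abc.ABC):
    """Toy stand-in for an abstract embeddings base class."""

    @abc.abstractmethod
    def embed_query(self, text):
        ...


# Hook subclass creation: every new subclass that defines its own
# embed_query gets that method wrapped the moment the class is created.
_original_init_subclass = Embeddings.__init_subclass__


def _patched_init_subclass(cls, **kwargs):
    _original_init_subclass(**kwargs)
    if "embed_query" in cls.__dict__:
        cls.embed_query = traced(cls.__dict__["embed_query"])


Embeddings.__init_subclass__ = classmethod(_patched_init_subclass)


class FakeEmbeddings(Embeddings):
    def embed_query(self, text):
        return [0.0, 0.0, 0.0]


FakeEmbeddings().embed_query("hello")  # prints "span: FakeEmbeddings.embed_query"
```
The point of the hook is that the abstract base never has a concrete method to wrap, but each subclass does, and the wrap happens once per subclass rather than requiring a separate patch per downstream package.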
Testing
Updated all riot lockfiles and ensured existing tests continued to pass.
To make the tests work (`__init_subclass__` did not play nicely with cached modules and patching/unpatching on every test), the `langchain_core` fixture is scoped to each module, as sketched below.
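For reference, a minimal sketch of what a module-scoped fixture can look like; the import path for the integration's patch helpers is an assumption of this sketch, not necessarily the exact test code in this PR.
```python
import pytest


@pytest.fixture(scope="module")
def langchain_core():
    # Assumed location of the integration's patch/unpatch helpers.
    from ddtrace.contrib.internal.langchain.patch import patch, unpatch

    patch()
    import langchain_core

    # Patch once per test module rather than per test, so repeated
    # patch/unpatch cycles don't fight Python's module cache.
    yield langchain_core

    unpatch()
```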
Additionally, running this script:
```python
from ddtrace.llmobs import LLMObs
LLMObs.enable(ml_app="sam-test")
from langchain_openai import ChatOpenAI
chat = ChatOpenAI(
    model="gpt-3.5-turbo",
    temperature=0.1,
)
resp = chat.invoke("What is the capital of France?")
```
with
```bash
$ python -m venv .venv
$ source .venv/bin/activate
$ pip install langchain_openai
$ pip install ddtrace
```
you should not see any spans being submitted, but with
```bash
$ pip install -e /path/to/dd-trace-py
```
checked out to this branch, you should see spans being submitted.
[Trace](https://dd.datad0g.com/llm/traces?query=%40ml_app%3Asam-test%20%40event_type%3Aspan%20%40parent_id%3Aundefined&agg_m=count&agg_m_source=base&agg_t=count&fromUser=false&sp=%5B%7B%22sp%22%3A%7B%22width%22%3A%22min%28100%25%2C%20max%28calc%28100%25%20-%20var%28--ui-page-left-offset%29%20-%2016px%29%2C%20900px%29%29%22%7D%2C%22p%22%3A%7B%22eventId%22%3A%22AwAAAZl4GYUiibFcWwAAABhBWmw0R1lVaUFBQVI1bGtTMlMtaUFBQUEAAAAkZDE5OTc4MTktYTZkYS00NzMxLTk1NTctZmZjNTk0MDc4NjM5AAAAGA%22%7D%2C%22i%22%3A%22llm-obs-panel%22%7D%5D&spanId=202569580851811194&start=1758653023818&end=1758656623818&paused=false)
using `ddtrace` without the fix (just the OpenAI span, missing the LangChain span; the output does not directly include the `ImportError`, but `breakpoint`ing in does reveal it).
[Trace](https://dd.datad0g.com/llm/traces?query=%40ml_app%3Asam-test%20%40event_type%3Aspan%20%40parent_id%3Aundefined&agg_m=count&agg_m_source=base&agg_t=count&fromUser=false&sp=%5B%7B%22sp%22%3A%7B%22width%22%3A%22min%28100%25%2C%20max%28calc%28100%25%20-%20var%28--ui-page-left-offset%29%20-%2016px%29%2C%20900px%29%29%22%7D%2C%22p%22%3A%7B%22eventId%22%3A%22AwAAAZl4Gcv15UA_aQAAABhBWmw0R2N2MUFBQXlvaEhFVkI0Y0FBQUEAAAAkZjE5OTc4MTktZDc5MS00ZDQyLTlhMWMtZTk2OGM5NTUzODYzAAAAGQ%22%7D%2C%22i%22%3A%22llm-obs-panel%22%7D%5D&spanId=12717218430539325303&start=1758653023818&end=1758656623818&paused=false)
using local `ddtrace` with the fix (includes the LangChain parent span).
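As a quick local sanity check, one option is to look for the flag that dd-trace-py patch modules conventionally set on a patched module; the exact attribute name here is an assumption of this sketch rather than a documented API.
```python
from ddtrace.llmobs import LLMObs

LLMObs.enable(ml_app="sam-test")  # enabling LLMObs triggers LLM integration patching

import langchain_core

# Patch modules in dd-trace-py typically mark the module once patching has run;
# if this prints True, the langchain_core patching took effect.
print(getattr(langchain_core, "_datadog_patch", False))
```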
Risks
Importing `langchain` only/directly will no longer trigger patching. That said, it hasn't been a valid way to use langchain for some time now.
Additional Notes
MLOB-3970
Fixes #14590
…ort 3.15] (#14739) Backport 967ff6d from #14687 to 3.15.