🐛 Bug Report: OpenAI Requests Not Traced When Sent from a LangGraph Node #2271

Open
1 task done
jemo21k opened this issue Nov 6, 2024 · 5 comments · May be fixed by #2273
Labels
bug Something isn't working

Comments

@jemo21k

jemo21k commented Nov 6, 2024

Which component is this bug for?

Langchain Instrumentation

📜 Description

In my LangChain-based application using LangGraph, OpenAI requests made with OpenAI's own client inside a LangGraph node are not traced. Specifically, when I call the OpenAI GPT-4o model from within a LangGraph node, the exported trace log contains no span for the OpenAI call and no associated LLM call metrics.

👟 Reproduction steps

Here is an example demonstrating the issue:

from openai import OpenAI
from dotenv import load_dotenv
import os
from typing import TypedDict
from langgraph.graph import StateGraph

from opentelemetry.sdk.trace.export import ConsoleSpanExporter
from traceloop.sdk import Traceloop

# Load environment variables
load_dotenv()

# Setup directories for logs (create them first so open() does not fail)
logs_dir = "path/to/logs"
langtrace_logs_dir = os.path.join(logs_dir, "traceloop")
os.makedirs(langtrace_logs_dir, exist_ok=True)
traceloop_log_file_path = os.path.join(langtrace_logs_dir, "traceloop_issue_example.log")
traceloop_log_file = open(traceloop_log_file_path, "w")

# Initialize Traceloop with ConsoleSpanExporter
exporter = ConsoleSpanExporter(out=traceloop_log_file)
Traceloop.init(disable_batch=True, exporter=exporter)

client = OpenAI()

# Define state for LangGraph
class State(TypedDict):
    request: str
    result: str

# Define a calculation node
def calculate(state: State):
    request = state["request"]
    completion = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": "You are a mathematician."},
            {"role": "user", "content": request}
        ]
    )
    return {"result": completion.choices[0].message.content}

# Create the workflow graph
workflow = StateGraph(State)
workflow.add_node("calculate", calculate)
workflow.set_entry_point("calculate")

langgraph = workflow.compile()

# Invoke the graph
user_request = "What's 5 + 5?"
result = langgraph.invoke(input={"request": user_request})

print(f"Request: {user_request}")
print(f"Result: {result['result']}")

In the trace logs, there are no gen_ai attributes or metrics for the OpenAI call. However, if I replace the OpenAI client with Langchain’s own OpenAI client, a span with LLM metrics is generated as expected.

👍 Expected behavior

When an OpenAI call is made within a LangGraph node, I expect to see tracing data that includes gen_ai attributes and other metrics associated with the LLM call, such as:

{
    "name": "openai.chat",
    "context": {
        "trace_id": "0x993046216162c1a4dddf7c3484062d26",
        "span_id": "0xcfeb41cffa9b1021",
        "trace_state": "[]"
    },
    "kind": "SpanKind.CLIENT",
    "parent_id": null,
    "start_time": "2024-11-06T14:30:28.540006Z",
    "end_time": "2024-11-06T14:30:31.680890Z",
    "status": {
        "status_code": "UNSET"
    },
    "attributes": {
        "llm.request.type": "chat",
        "gen_ai.system": "OpenAI",
        "gen_ai.request.model": "gpt-4o",
        "llm.headers": "None",
        "llm.is_streaming": false,
        "gen_ai.openai.api_base": "https://api.openai.com/v1/",
        "gen_ai.prompt.0.role": "system",
        "gen_ai.prompt.0.content": "You are a mathematician.",
        "gen_ai.prompt.1.role": "user",
        "gen_ai.prompt.1.content": "claculate 5 + 5",
        "gen_ai.response.model": "gpt-4o-2024-08-06",
        "gen_ai.openai.system_fingerprint": "fp_45cf54deae",
        "llm.usage.total_tokens": 39,
        "gen_ai.usage.completion_tokens": 15,
        "gen_ai.usage.prompt_tokens": 24,
        "gen_ai.completion.0.finish_reason": "stop",
        "gen_ai.completion.0.role": "assistant",
        "gen_ai.completion.0.content": "The result of \\(5 + 5\\) is \\(10\\)."
    },
    "events": [],
    "links": [],
    "resource": {
        "attributes": {
            "service.name": "/Users/muhammadkanaan/Desktop/work_repos/agent-analytics/insighter/src/test/openai/openai_llm_call.py"
        },
        "schema_url": ""
    }
}

👎 Actual Behavior with Screenshots

When running the code from within a LangGraph node, no tracing data is recorded for the OpenAI calls. Below is the actual log output captured during the execution:

{
    "name": "__start__.task",
    "context": {
        "trace_id": "0xc01627149044876058dc3ed2d419a5aa",
        "span_id": "0x1b2becef5d6179ab",
        "trace_state": "[]"
    },
    "kind": "SpanKind.INTERNAL",
    "parent_id": "0xc77e31c8e27e4710",
    "start_time": "2024-11-06T14:22:00.764176Z",
    "end_time": "2024-11-06T14:22:00.764384Z",
    "status": {
        "status_code": "UNSET"
    },
    "attributes": {
        "traceloop.association.properties.langgraph_step": 0,
        "traceloop.association.properties.langgraph_node": "__start__",
        "traceloop.association.properties.langgraph_triggers": [
            "__start__"
        ],
        "traceloop.association.properties.langgraph_path": [
            "__pregel_pull",
            "__start__"
        ],
        "traceloop.association.properties.langgraph_checkpoint_ns": "__start__:91fbf40a-29dd-606a-8186-ca138e2e7803",
        "traceloop.workflow.name": "LangGraph",
        "traceloop.entity.path": "",
        "traceloop.span.kind": "task",
        "traceloop.entity.name": "__start__",
        "traceloop.entity.input": "{\"inputs\": {\"request\": \"whats 5 + 5\"}, \"tags\": [\"graph:step:0\", \"langsmith:hidden\", \"langsmith:hidden\"], \"metadata\": {\"langgraph_step\": 0, \"langgraph_node\": \"__start__\", \"langgraph_triggers\": [\"__start__\"], \"langgraph_path\": [\"__pregel_pull\", \"__start__\"], \"langgraph_checkpoint_ns\": \"__start__:91fbf40a-29dd-606a-8186-ca138e2e7803\"}, \"kwargs\": {\"name\": \"__start__\"}}",
        "traceloop.entity.output": "{\"outputs\": {\"request\": \"whats 5 + 5\"}, \"kwargs\": {\"tags\": [\"graph:step:0\", \"langsmith:hidden\", \"langsmith:hidden\"]}}"
    },
    "events": [],
    "links": [],
    "resource": {
        "attributes": {
            "service.name": "/Users/muhammadkanaan/Desktop/work_repos/agent-analytics/insighter/src/test/openai/tmp.py"
        },
        "schema_url": ""
    }
}
{
    "name": "ChannelWrite<calculate,request,result>.task",
    "context": {
        "trace_id": "0xc01627149044876058dc3ed2d419a5aa",
        "span_id": "0x2e1ca943f99c35fd",
        "trace_state": "[]"
    },
    "kind": "SpanKind.INTERNAL",
    "parent_id": "0x822c13f276dd7dc7",
    "start_time": "2024-11-06T14:22:01.836989Z",
    "end_time": "2024-11-06T14:22:01.837304Z",
    "status": {
        "status_code": "UNSET"
    },
    "attributes": {
        "traceloop.association.properties.langgraph_step": 1,
        "traceloop.association.properties.langgraph_node": "calculate",
        "traceloop.association.properties.langgraph_triggers": [
            "start:calculate"
        ],
        "traceloop.association.properties.langgraph_path": [
            "__pregel_pull",
            "calculate"
        ],
        "traceloop.association.properties.langgraph_checkpoint_ns": "calculate:ba4fa053-c8d2-c4dd-5456-f56c9095719e",
        "traceloop.workflow.name": "LangGraph",
        "traceloop.entity.path": "calculate",
        "traceloop.span.kind": "task",
        "traceloop.entity.name": "ChannelWrite<calculate,request,result>",
        "traceloop.entity.input": "{\"inputs\": {\"result\": \"\\\\(5 + 5 = 10\\\\).\"}, \"tags\": [\"seq:step:2\", \"langsmith:hidden\"], \"metadata\": {\"langgraph_step\": 1, \"langgraph_node\": \"calculate\", \"langgraph_triggers\": [\"start:calculate\"], \"langgraph_path\": [\"__pregel_pull\", \"calculate\"], \"langgraph_checkpoint_ns\": \"calculate:ba4fa053-c8d2-c4dd-5456-f56c9095719e\"}, \"kwargs\": {\"name\": \"ChannelWrite<calculate,request,result>\"}}",
        "traceloop.entity.output": "{\"outputs\": {\"result\": \"\\\\(5 + 5 = 10\\\\).\"}, \"kwargs\": {\"tags\": [\"seq:step:2\", \"langsmith:hidden\"]}}"
    },
    "events": [],
    "links": [],
    "resource": {
        "attributes": {
            "service.name": "/Users/muhammadkanaan/Desktop/work_repos/agent-analytics/insighter/src/test/openai/tmp.py"
        },
        "schema_url": ""
    }
}
{
    "name": "calculate.task",
    "context": {
        "trace_id": "0xc01627149044876058dc3ed2d419a5aa",
        "span_id": "0x822c13f276dd7dc7",
        "trace_state": "[]"
    },
    "kind": "SpanKind.INTERNAL",
    "parent_id": "0xc77e31c8e27e4710",
    "start_time": "2024-11-06T14:22:00.766167Z",
    "end_time": "2024-11-06T14:22:01.837811Z",
    "status": {
        "status_code": "UNSET"
    },
    "attributes": {
        "traceloop.association.properties.langgraph_step": 1,
        "traceloop.association.properties.langgraph_node": "calculate",
        "traceloop.association.properties.langgraph_triggers": [
            "start:calculate"
        ],
        "traceloop.association.properties.langgraph_path": [
            "__pregel_pull",
            "calculate"
        ],
        "traceloop.association.properties.langgraph_checkpoint_ns": "calculate:ba4fa053-c8d2-c4dd-5456-f56c9095719e",
        "traceloop.workflow.name": "LangGraph",
        "traceloop.entity.path": "",
        "traceloop.span.kind": "task",
        "traceloop.entity.name": "calculate",
        "traceloop.entity.input": "{\"inputs\": {\"request\": \"whats 5 + 5\"}, \"tags\": [\"graph:step:1\"], \"metadata\": {\"langgraph_step\": 1, \"langgraph_node\": \"calculate\", \"langgraph_triggers\": [\"start:calculate\"], \"langgraph_path\": [\"__pregel_pull\", \"calculate\"], \"langgraph_checkpoint_ns\": \"calculate:ba4fa053-c8d2-c4dd-5456-f56c9095719e\"}, \"kwargs\": {\"name\": \"calculate\"}}",
        "traceloop.entity.output": "{\"outputs\": {\"result\": \"\\\\(5 + 5 = 10\\\\).\"}, \"kwargs\": {\"tags\": [\"graph:step:1\"]}}"
    },
    "events": [],
    "links": [],
    "resource": {
        "attributes": {
            "service.name": "/Users/muhammadkanaan/Desktop/work_repos/agent-analytics/insighter/src/test/openai/tmp.py"
        },
        "schema_url": ""
    }
}
{
    "name": "LangGraph.workflow",
    "context": {
        "trace_id": "0xc01627149044876058dc3ed2d419a5aa",
        "span_id": "0xc77e31c8e27e4710",
        "trace_state": "[]"
    },
    "kind": "SpanKind.INTERNAL",
    "parent_id": null,
    "start_time": "2024-11-06T14:22:00.763510Z",
    "end_time": "2024-11-06T14:22:01.838540Z",
    "status": {
        "status_code": "UNSET"
    },
    "attributes": {
        "traceloop.workflow.name": "LangGraph",
        "traceloop.entity.path": "",
        "traceloop.span.kind": "workflow",
        "traceloop.entity.name": "LangGraph",
        "traceloop.entity.input": "{\"inputs\": {\"request\": \"whats 5 + 5\"}, \"tags\": [], \"metadata\": {}, \"kwargs\": {\"name\": \"LangGraph\"}}",
        "traceloop.entity.output": "{\"outputs\": {\"request\": \"whats 5 + 5\", \"result\": \"\\\\(5 + 5 = 10\\\\).\"}, \"kwargs\": {\"tags\": []}}"
    },
    "events": [],
    "links": [],
    "resource": {
        "attributes": {
            "service.name": "/Users/muhammadkanaan/Desktop/work_repos/agent-analytics/insighter/src/test/openai/tmp.py"
        },
        "schema_url": ""
    }
}

As shown, the logs lack the expected gen_ai and llm attributes or metrics related to the OpenAI call, which would normally be included when using Langchain's OpenAI client directly.
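To check an exported log for this quickly, something like the following could help. It is only a sketch: it assumes the log file holds back-to-back JSON span objects as in the output above, and the helper names `iter_spans` and `spans_with_gen_ai` are made up for illustration.

```python
import json


def iter_spans(text: str):
    """Yield each JSON object from a string of back-to-back JSON documents."""
    decoder = json.JSONDecoder()
    pos = 0
    while pos < len(text):
        # Skip whitespace and newlines between consecutive objects.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos >= len(text):
            break
        obj, pos = decoder.raw_decode(text, pos)
        yield obj


def spans_with_gen_ai(text: str):
    """Return the names of spans that carry at least one gen_ai.* attribute."""
    return [
        span["name"]
        for span in iter_spans(text)
        if any(key.startswith("gen_ai.") for key in span.get("attributes", {}))
    ]


# Trimmed-down stand-in for the log above (only the fields the check needs).
sample = """
{"name": "calculate.task", "attributes": {"traceloop.span.kind": "task"}}
{"name": "LangGraph.workflow", "attributes": {"traceloop.span.kind": "workflow"}}
"""

print(spans_with_gen_ai(sample))  # [] because no span has gen_ai.* attributes
```

Run against the real log file contents, an empty list would confirm that no LLM span was exported.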

🤖 Python Version

Python 3.12.4

📃 Provide any additional context for the Bug.

langchain==0.2.16
langchain-cohere==0.1.9
langchain-community==0.2.17
langchain-core==0.2.41
langchain-experimental==0.0.65
langchain-openai==0.1.25
langchain-text-splitters==0.2.4
langgraph==0.2.23
langgraph-checkpoint==1.0.10

openai==1.47.0

traceloop-sdk==0.33.3

👀 Have you spent some time to check if this bug has been raised before?

  • I checked and didn't find a similar issue

Are you willing to submit PR?

Yes I am willing to submit a PR!

@dosubot dosubot bot added the bug Something isn't working label Nov 6, 2024
@traceloop traceloop deleted a comment from dosubot bot Nov 6, 2024
@nirga
Member

nirga commented Nov 6, 2024

Thanks for reporting @jemo21k! You wrote that you're willing to submit a PR - does it mean you have a fix for this? Or should we look into this?

@jemo21k
Author

jemo21k commented Nov 7, 2024

Hi @nirga,
Glad to help! I don't have a fix yet; I'm currently investigating.
I think you should look into it as well.

@thisthat

thisthat commented Nov 8, 2024

Hey 👋 I also bumped into the same issue. From a quick look at the code, the LangChain instrumentation suppresses downstream instrumentations by setting the SUPPRESS_LANGUAGE_MODEL_INSTRUMENTATION_KEY key in the context. See the code here.

That key is then used by the instruments to decide if they instrument the call or not. See the code here.

Was there any special reason for disabling downstream instrumentations? 🤔

EDIT
Enabling the downstream sensors causes OTel to fail during metric collection while freezing the attributes (see code here) because of an unhashable list in the attributes. The offending attribute has the key traceloop.association.properties.ls_stop. What is the purpose of this attribute?
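For reference, the failure mode can be reproduced without OTel at all. This is a simplified sketch, not the SDK code: `freeze` only mimics building a hashable key from the attribute items, the `ls_stop` value is made up, and `make_hashable` is one possible workaround that handles top-level lists only.

```python
def freeze(attributes: dict) -> frozenset:
    """Build a hashable key from attribute items, as a metrics aggregator might."""
    return frozenset(attributes.items())


def make_hashable(attributes: dict) -> dict:
    """Convert top-level list values to tuples so the items can be hashed."""
    return {
        key: tuple(value) if isinstance(value, list) else value
        for key, value in attributes.items()
    }


# Illustrative attributes; the list value is invented for this example.
attrs = {
    "gen_ai.system": "OpenAI",
    "traceloop.association.properties.ls_stop": ["\n"],
}

error = None
try:
    freeze(attrs)
except TypeError as exc:
    # A list inside an item tuple cannot be hashed, so freezing fails.
    error = str(exc)
    print(error)

# After converting lists to tuples, freezing succeeds.
key = freeze(make_hashable(attrs))
```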

@thisthat thisthat linked a pull request Nov 8, 2024 that will close this issue
4 tasks
@nirga
Member

nirga commented Nov 9, 2024

@thisthat we did that because these spans are (supposed to be?) already collected by the LangChain callbacks. Before, you would have gotten duplicate OpenAI spans (which results in counting tokens twice, for example). We should figure out why the callbacks are not producing the needed spans in this case.
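A small illustration of the double counting, using the 39 total tokens from the expected span in the issue. The span names and the `request_id` field are hypothetical; they only stand for "the same LLM call reported by two instrumentations".

```python
# Two spans describing the same underlying LLM request (hypothetical data).
spans = [
    {"name": "ChatOpenAI.chat", "request_id": "req-1", "total_tokens": 39},
    {"name": "openai.chat", "request_id": "req-1", "total_tokens": 39},
]

# Naively summing usage over all spans counts the request twice.
naive_total = sum(span["total_tokens"] for span in spans)

# Keeping one span per request gives the real usage.
unique_by_request = {span["request_id"]: span for span in spans}
deduped_total = sum(span["total_tokens"] for span in unique_by_request.values())

print(naive_total)    # 78
print(deduped_total)  # 39
```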

@thisthat

@nirga thanks for the answer. I believe the LangChain sensor should only provide visibility into the pipeline and not trace additional LLM calls. Otherwise, we would need to implement every LLM sensor twice: once for its individual calls and once for LangChain.
