Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

docs: manually instrumented chatbot #2730

Merged
merged 1 commit into from
Apr 2, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 3 additions & 0 deletions examples/manually-instrumented-chatbot/.gitignore
Original file line number Diff line number Diff line change
@@ -0,0 +1,3 @@
.venv
*egg-info/
__pycache__/
11 changes: 11 additions & 0 deletions examples/manually-instrumented-chatbot/Makefile
Original file line number Diff line number Diff line change
@@ -0,0 +1,11 @@
server:
uvicorn chat.app:app --reload

streamlit:
streamlit run app.py

format:
ruff check --fix . && ruff format .

typecheck:
pyright .
56 changes: 56 additions & 0 deletions examples/manually-instrumented-chatbot/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,56 @@
# Manually Instrument Your LLM Applications

This example shows how to manually instrument an LLM application using OpenTelemetry with OpenInference semantic conventions. Once instrumented, your application will produce trace data that can be collected and analyzed using Arize and Phoenix.

This implementation uses a FastAPI backend and a Streamlit frontend. It provides a reference not only for Python developers who wish to manually instrument their bespoke applications, but also for developers in statically compiled languages for which auto-instrumentation is not possible.


## Background

[OpenTelemetry](https://opentelemetry.io/) (OTel) is an open-source observability framework that provides APIs, libraries, and instrumentation to collect and export telemetry data (metrics, logs, and traces). OTel provides a unified and vendor-agnostic way to collect observability data, enabling developers to get a comprehensive view into their systems.

[OpenInference](https://github.com/Arize-ai/openinference) is a set of OTel-compatible conventions and plugins that facilitate tracing of AI and LLM applications. OpenInference is natively supported by Arize and Phoenix, but can be used with any OTel-compatible backend.

The easiest way to get started with OpenInference is to use one of our [auto-instrumentation libraries](https://github.com/Arize-ai/openinference?tab=readme-ov-file#instrumentation), which instrument your LLM orchestration framework or SDK of choice with a few lines of code. For bespoke applications, you may wish to manually instrument your application, meaning that you as the developer are responsible for constructing your traces and spans and ensuring that they adhere to the conventions defined in the [OpenInference specification](https://github.com/Arize-ai/openinference/tree/main/spec). For the sake of illustration, this example keeps the dependencies to a minimum and does not leverage any LLM orchestration frameworks, SDKs, or auto-instrumentation libraries.

💡 We recommend using auto-instrumentation when available. For example, if you are already making requests to the OpenAI API via the OpenAI Python SDK, it's easier to instrument those calls via OpenAI auto-instrumentation than to do so manually.

ℹ️ Note that OTel gives you the flexibility to mix and match auto-instrumentation with manually created traces and spans, so whether to adopt manual or automatic instrumentation is not a binary decision.


## Setup

Install dependencies with

```python
pip install -e .
```

Run the FastAPI server with

```
uvicorn chat.app:app
```

Run the Streamlit frontend with

```
streamlit run app.py
```

Run Phoenix with

```
python -m phoenix.server.main serve
```

Chat with the chatbot and watch your traces appear in Phoenix in real-time.


## Resources

The following resources may prove useful when manually instrumenting your application:

- [OpenInference Semantic Conventions](https://github.com/Arize-ai/openinference/blob/main/spec/semantic_conventions.md)
- [`openinference-semantic-conventions` Python source code](https://github.com/Arize-ai/openinference/tree/main/python/openinference-semantic-conventions)
- [`@arizeai/openinference-semantic-conventions` JavaScript source code](https://github.com/Arize-ai/openinference/tree/main/js/packages/openinference-semantic-conventions)
47 changes: 47 additions & 0 deletions examples/manually-instrumented-chatbot/app.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,47 @@
import json

import streamlit as st
from httpx import Client

from chat.types import Message, MessagesPayload, MessagesResponse

http_client = Client()


MESSAGES_ENDPOINT = "http://localhost:8000/messages/"


st.title("Chat")

if "messages" not in st.session_state:
st.session_state.messages = []

for message in st.session_state.messages:
with st.chat_message(message.role):
st.markdown(message.content)

if user_message_content := st.chat_input("Message"):
st.session_state.messages.append(Message(role="user", content=user_message_content))
payload = MessagesPayload(messages=st.session_state.messages)
with st.chat_message("user"):
st.markdown(user_message_content)
with st.chat_message("assistant"):
try:
response = http_client.post(
MESSAGES_ENDPOINT,
json=payload.model_dump(),
)
if not (200 <= response.status_code < 300):
raise Exception(response.content.decode("utf-8"))
except Exception as error:
try:
error_data = json.loads(str(error))
st.error("An error occurred")
st.json(error_data)
except json.JSONDecodeError:
st.error(f"An error occurred: {error}")
else:
messages_response = MessagesResponse.model_validate(response.json())
assistant_message = messages_response.message
st.markdown(assistant_message.content)
st.session_state.messages.append(assistant_message)
Empty file.
189 changes: 189 additions & 0 deletions examples/manually-instrumented-chatbot/chat/app.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,189 @@
import json
import os
from typing import Any, Dict, Iterator, List, Tuple

from fastapi import FastAPI, HTTPException
from httpx import AsyncClient
from openinference.semconv.trace import (
MessageAttributes,
OpenInferenceMimeTypeValues,
OpenInferenceSpanKindValues,
SpanAttributes,
)
from opentelemetry import trace as trace_api
from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter
from opentelemetry.sdk import trace as trace_sdk
from opentelemetry.sdk.trace.export import (
SimpleSpanProcessor,
)

from chat.types import Message, MessagesPayload, MessagesResponse

endpoint = "http://127.0.0.1:6006/v1/traces"
tracer_provider = trace_sdk.TracerProvider()
tracer_provider.add_span_processor(SimpleSpanProcessor(OTLPSpanExporter(endpoint)))
trace_api.set_tracer_provider(tracer_provider)
tracer = trace_api.get_tracer(__name__)


def getenv_or_raise(key: str) -> str:
if not (value := os.getenv(key)):
raise ValueError(f"Please set the {key} environment variable.")
return value


OPENAI_API_KEY = getenv_or_raise("OPENAI_API_KEY")
OPENAI_API_URL = "https://api.openai.com/v1/chat/completions"
OPENAI_MODEL = "gpt-4"

http_client = AsyncClient()
app = FastAPI()


class OpenAIException(HTTPException):
pass


@app.post("/messages/")
async def messages(messages_payload: MessagesPayload) -> MessagesResponse:
messages = messages_payload.messages
invocation_parameters = {"temperature": 0.1}
openai_payload = {
"model": OPENAI_MODEL,
**invocation_parameters,
"messages": [message.model_dump() for message in messages],
}
with tracer.start_as_current_span("OpenAI Async Chat Completion") as span:
for attribute_key, attribute_value in (
*_llm_span_kind_attributes(),
*_llm_model_name_attributes(OPENAI_MODEL),
*_llm_invocation_parameters_attributes(invocation_parameters),
*_input_attributes(openai_payload),
*_llm_input_messages_attributes(messages),
):
span.set_attribute(attribute_key, attribute_value)
response = await http_client.post(
OPENAI_API_URL,
headers={
"Content-Type": "application/json",
"Authorization": f"Bearer {OPENAI_API_KEY}",
},
json=openai_payload,
)
if not (200 <= response.status_code < 300):
raise OpenAIException(
status_code=500, detail=response.content.decode("utf-8")
)
span.set_status(trace_api.StatusCode.OK)
response_data = response.json()
assistant_message_content = response_data["choices"][0]["message"]["content"]
assistant_message = Message(
role="assistant",
content=assistant_message_content,
)
for (
attribute_key,
attribute_value,
) in (
*_output_attributes(response_data),
*_llm_output_message_attributes(assistant_message),
*_llm_token_usage_attributes(response_data),
):
span.set_attribute(attribute_key, attribute_value)
return MessagesResponse(message=assistant_message)


def _llm_span_kind_attributes() -> Iterator[Tuple[str, str]]:
"""
Yields the OpenInference span kind attribute for LLMs.
"""
yield SpanAttributes.OPENINFERENCE_SPAN_KIND, OpenInferenceSpanKindValues.LLM.value


def _llm_model_name_attributes(model_name: str) -> Iterator[Tuple[str, str]]:
"""
Yields the OpenInference model name attribute.
"""
yield SpanAttributes.LLM_MODEL_NAME, model_name


def _llm_invocation_parameters_attributes(
invocation_parameters: Dict[str, Any],
) -> Iterator[Tuple[str, str]]:
"""
Yields the OpenInference invocation parameters attribute as a JSON string.
"""
yield SpanAttributes.LLM_INVOCATION_PARAMETERS, json.dumps(invocation_parameters)


def _input_attributes(payload: Any) -> Iterator[Tuple[str, str]]:
"""
Yields the OpenInference input value attribute as a JSON string if the
payload can be serialized as JSON, otherwise as a string.
"""
try:
yield SpanAttributes.INPUT_VALUE, json.dumps(payload)
yield SpanAttributes.INPUT_MIME_TYPE, OpenInferenceMimeTypeValues.JSON.value
except json.JSONDecodeError:
yield SpanAttributes.INPUT_VALUE, str(payload)
yield SpanAttributes.INPUT_MIME_TYPE, OpenInferenceMimeTypeValues.TEXT.value


def _llm_input_messages_attributes(
messages: List[Message],
) -> Iterator[Tuple[str, str]]:
"""
Yields the OpenInference input messages attributes for each message in the list.
"""
for messages_index, message in enumerate(messages):
yield (
f"{SpanAttributes.LLM_INPUT_MESSAGES}.{messages_index}.{MessageAttributes.MESSAGE_ROLE}",
message.role,
)
yield (
f"{SpanAttributes.LLM_INPUT_MESSAGES}.{messages_index}.{MessageAttributes.MESSAGE_CONTENT}",
message.content,
)


def _output_attributes(payload: Any) -> Iterator[Tuple[str, str]]:
"""
Yields the OpenInference output value attribute as a JSON string if the
payload can be serialized as JSON, otherwise as a string.
"""
try:
yield SpanAttributes.OUTPUT_VALUE, json.dumps(payload)
yield SpanAttributes.OUTPUT_MIME_TYPE, OpenInferenceMimeTypeValues.JSON.value
except json.JSONDecodeError:
yield SpanAttributes.OUTPUT_VALUE, str(payload)
yield SpanAttributes.OUTPUT_MIME_TYPE, OpenInferenceMimeTypeValues.TEXT.value


def _llm_output_message_attributes(message: Message) -> Iterator[Tuple[str, str]]:
"""
Yields the OpenInference output message attributes.
"""
yield (
f"{SpanAttributes.LLM_OUTPUT_MESSAGES}.0.{MessageAttributes.MESSAGE_ROLE}",
message.role,
)
yield (
f"{SpanAttributes.LLM_OUTPUT_MESSAGES}.0.{MessageAttributes.MESSAGE_CONTENT}",
message.content,
)


def _llm_token_usage_attributes(
response_data: Dict[str, Any],
) -> Iterator[Tuple[str, int]]:
"""
Parses and yields token usage attributes from the response data.
"""
if not isinstance((usage := response_data.get("usage")), dict):
return
if prompt_tokens := usage.get("prompt_tokens"):
yield SpanAttributes.LLM_TOKEN_COUNT_PROMPT, prompt_tokens
if completion_tokens := usage.get("completion_tokens"):
yield SpanAttributes.LLM_TOKEN_COUNT_COMPLETION, completion_tokens
if total_tokens := usage.get("total_tokens"):
yield SpanAttributes.LLM_TOKEN_COUNT_TOTAL, total_tokens
18 changes: 18 additions & 0 deletions examples/manually-instrumented-chatbot/chat/types.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,18 @@
from typing import List, Literal

from pydantic import BaseModel

Role = Literal["system", "assistant", "user"]


class Message(BaseModel):
role: Role
content: str


class MessagesPayload(BaseModel):
messages: List[Message]


class MessagesResponse(BaseModel):
message: Message
24 changes: 24 additions & 0 deletions examples/manually-instrumented-chatbot/pyproject.toml
Original file line number Diff line number Diff line change
@@ -0,0 +1,24 @@
[project]
name = "chat"
version = "0.1.0"
dependencies = [
"opentelemetry-api",
"opentelemetry-sdk",
"opentelemetry-exporter-otlp",
"openinference-semantic-conventions",
"streamlit",
"httpx",
"pydantic",
"uvicorn",
"arize-phoenix",
]
requires-python = ">=3.8"

[project.optional-dependencies]
dev = [
"pyright",
"ruff",
]

[tool.ruff.lint]
extend-select = ["I"]
Loading