Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: adding trace i/o in langfuse openai integration #532

Merged
merged 10 commits into from
Apr 5, 2024

Conversation

noble-varghese
Copy link
Contributor

Description

The LangFuse OpenAI integration previously failed to include the input and output data in the trace body, limiting it only to the generation.

Key Changes:

  • Implemented sending input/output in the trace body.
  • Trace input/output is not updated when a trace_id or observation_id is provided in the inputs. This prevents inadvertent updates to nested traces.

@@ -325,6 +330,8 @@ async def _get_langfuse_data_from_async_streaming_response(
resource, responses
)

langfuse.trace(id=generation.trace_id, output=completion)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In these cases, we do not know anymore whether the trace was generated by the user becore and whether he provided the traceId himself. Can we somehow get that information here and only update in this case? Similar to how we do that with the input.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have modified this approach to get the is_nested_trace bool value to identify the state of the trace. If the trace is freshly created (i.e) if it is not nested, then we modify the input and the output.

@@ -505,7 +530,7 @@ async def _wrap_async(
start_time = _get_timestamp()
arg_extractor = OpenAiArgsExtractor(*args, **kwargs)

generation = _get_langfuse_data_from_kwargs(
generation, is_nested_trace = _get_langfuse_data_from_kwargs(
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was wondering if we could return a nested function bool from the existing function which would make the implementation easier to review and also possibly reduce the touch points in the code. If we change, we change on ly inside the function and rest should fall in place.

@noble-varghese noble-varghese self-assigned this Apr 5, 2024
@maxdeichmann
Copy link
Member

Implementation looks good to me. Could you add a test to the openai test function ensuring we only add io in the case discussed?

…ngfuse/langfuse-python into noble-varghese/fix-trace-input-output
…r in the input to openai. In that case the integration is expected to add the generation in the trace and not update the i/o
@noble-varghese
Copy link
Contributor Author

@maxdeichmann I have added a test for checking the scenario we discussed where the user provides a trace_id in the openai integration as well as updated other tests to verify the trace i/o to be validated again input and out to and from model.

@noble-varghese noble-varghese marked this pull request as ready for review April 5, 2024 11:36
@noble-varghese noble-varghese requested a review from a team April 5, 2024 11:36
@maxdeichmann maxdeichmann enabled auto-merge (squash) April 5, 2024 11:51
@maxdeichmann maxdeichmann merged commit ffc4694 into main Apr 5, 2024
11 checks passed
@maxdeichmann maxdeichmann deleted the noble-varghese/fix-trace-input-output branch April 5, 2024 11:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants