-
Notifications
You must be signed in to change notification settings - Fork 1
fix: Fix Duplicate Trace Calls for Async Functions and Optimize Tracer Implementation #473
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
…nput extraction and logging finalization - Introduced `_extract_function_inputs` to streamline input extraction for logging. - Added `_finalize_step_logging` to encapsulate step timing and logging logic. - Implemented `_handle_trace_completion` for improved trace completion handling. - Enhanced `trace_async` decorator to support both async functions and async generators with optimized logging. - Refactored existing tracing logic to utilize new helper functions for better maintainability and readability.
- Refactored client initialization logic into a new `_get_client` function for better lazy loading. - Ensured the Openlayer client is only created when needed, improving resource management. - Updated data streaming calls to utilize the new client retrieval method, enhancing code readability and maintainability.
latency=step.latency, | ||
) | ||
|
||
def _handle_trace_completion( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function is duplicating code.
For example, the ConfigLlmData
is created here and inside the create_step
function. Streaming to Openlayer is also happening in both places.
I imagine that the create_step
function should use _handle_trace_completion
inputs: dict, | ||
output: Any, | ||
start_time: float, | ||
exception: Optional[Exception] = None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
exception
is an unused argument. Either use it or remove from function signature
step_name: str, | ||
step_args: tuple, | ||
inference_pipeline_id: Optional[str] = None, | ||
**step_kwargs |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
step_args
and step_kwargs
are unused arguments. Either use or remove
) | ||
|
||
_finalize_step_logging(step, inputs, output, step.start_time, exception) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment applies to other function calls as well: some of these new functions, have a lot of arguments. When calling them, it would be better to use the keyword arguments (instead of positional arguments) – e.g., _finalize_step_logging(step=step, inputs=inputs, start_time=step.start_time, ...)
).
It's easier to make mistakes when using positional arguments and the code is more readable with kwargs. It would also make it easier to see that some kwargs are not being used (in this case, exception
is not being used by _finalize_step_logging
)
@@ -143,7 +154,143 @@ def add_chat_completion_step_to_trace(**kwargs) -> None: | |||
step.log(**kwargs) | |||
|
|||
|
|||
# ----------------------------- Helper functions for tracing ---------------------------- # |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a code-style comment: it's fine to separate helper functions with these comments. However, make sure that the sections are consistent and accurate. For example, _create_step_for_async_generator
is inside the "Helper functions for tracing" block, but it is not a helper function.
More broadly, we generally aim to declare functions in the order they are used. This helps with readability and makes it easier for others to follow the flow without jumping around the file.
For example, if a main
function calls func_1
, func_2
, and func_3
in that order, we would define them like this:
def main(...):
func_1()
func_2()
func_3()
def func_1(...):
pass
def func_2(...):
pass
def func_3(...):
pass
Fix: Eliminate Duplicate Trace Calls in Async Functions and Optimize Tracer Implementation
Problem Statement
The Openlayer Python SDK was experiencing duplicate trace calls when using the
@trace()
decorator on async functions. Instead of generating a single unified trace, async functions created multiple duplicate traces with different data, causing test failures and incorrect tracing behavior.Root Cause Analysis
The issue was traced to the
@trace_async()
decorator's incompatibility with async generator functions:with create_step()
) that interfered withyield
statementscoroutine
objects instead ofasync_generator
objectsSolution Implementation
1. Async Generator Detection and Handling
inspect.isasyncgenfunction(func)
at decoration time to identify async generatorsyield
statements remain at the top level of wrapper functions2. Code Optimization (80% Duplication Reduction)
Identified and eliminated massive code duplication across three wrapper functions:
Before: ~150 lines with 80% duplication
After: ~40 lines with extracted helper functions
New Helper Functions:
_extract_function_inputs()
- Centralized input extraction and cleaning_finalize_step_logging()
- Unified step timing and logging_handle_trace_completion()
- Consistent trace completion logic_create_step_for_async_generator()
- Specialized context manager for async generators3. Client Authentication Fix
_client
was initialized at import time before environment variables were set_get_client()
function4. Test Refactoring
Created comprehensive nested test scenario with single orchestrator method:
intelligent_assistant_main()
method with 7 nested operations