Access to tracing span during component run invocation #8385

Open
LastRemote opened this issue Sep 19, 2024 · 1 comment
Labels
P2 Medium priority, add to the next sprint if no P1 available

Comments

@LastRemote
Contributor

LastRemote commented Sep 19, 2024

Is your feature request related to a problem? Please describe.
We are developing several chatbot-like applications that need to stream responses from an LLM. One of the metrics we track is TTFT (time to first token), which indicates how long the user waits before seeing any output in the dialog box. However, because of the way tracing spans are handled in the pipeline, the run invocation inside the component has no direct access to the span, so we cannot log this information to the tracer.

Describe the solution you'd like
The simplest solution would be to expose the tracing span to the component's run() method. This could be a context variable that methods inside the component can access, but I am not confident about the exact approach.
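To make the context-variable idea concrete, here is a minimal sketch using only the standard library. `DemoSpan`, `component_span`, `current_span`, and `DemoGenerator` are hypothetical names invented for illustration; they are not part of Haystack's tracing API, and the real integration point in the pipeline may look quite different.

```python
import contextvars
from contextlib import contextmanager
from typing import Any, Dict, Iterator, Optional


class DemoSpan:
    """Hypothetical stand-in for a tracing span; not a Haystack class."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.tags: Dict[str, Any] = {}

    def set_tag(self, key: str, value: Any) -> None:
        self.tags[key] = value


# Holds the span of the component that is currently running, if any.
_current_span = contextvars.ContextVar("current_span", default=None)


@contextmanager
def component_span(name: str) -> Iterator[DemoSpan]:
    """What the pipeline could do around each component run (assumption)."""
    span = DemoSpan(name)
    token = _current_span.set(span)
    try:
        yield span
    finally:
        # Restore whatever span (or None) was active before this run.
        _current_span.reset(token)


def current_span() -> Optional[DemoSpan]:
    """Accessor that code inside a component could call."""
    return _current_span.get()


class DemoGenerator:
    """Hypothetical component whose run() reads the active span."""

    def run(self, prompt: str) -> Dict[str, Any]:
        span = current_span()
        if span is not None:
            span.set_tag("prompt_length", len(prompt))
        return {"replies": [prompt.upper()]}


with component_span("llm") as span:
    result = DemoGenerator().run("hello")
```

With this shape, the component's `run()` signature stays unchanged and the span is only visible while the component is executing.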

Describe alternatives you've considered
The only workaround right now is to call the low-level tracing SDKs directly inside the streaming callback function, using a special callback that uploads the timestamp upon receiving the first SSE (server-sent event).


@julian-risch julian-risch added the P2 Medium priority, add to the next sprint if no P1 available label Sep 19, 2024
@vblagoje
Member

vblagoje commented Sep 23, 2024

Note to self:

Langfuse calculates TTFT automatically when completion_start_time (the timestamp of the first token) is set on the generation span. In other words, just call update on the generation span with that key and value, e.g. completion_start_time=datetime.now().

This could be done by attaching a custom streaming callback to the chat generator (most likely from our LangfuseTracer), consuming the first token in the callback, and calling update on the generation span, perhaps directly in that callback.
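A rough sketch of that callback wrapping, under stated assumptions: `DemoGenerationSpan` is a hypothetical stand-in for the generation span (only its `update(**kwargs)` method matters here), `with_ttft` is an invented helper name, and chunks are plain strings rather than real streaming chunk objects. The sketch uses a timezone-aware timestamp instead of a bare `datetime.now()`; that is a choice made here, not something the note above prescribes.

```python
from datetime import datetime, timezone
from typing import Any, Callable, Dict, List, Optional


class DemoGenerationSpan:
    """Hypothetical stand-in for a generation span with an update() method."""

    def __init__(self) -> None:
        self.fields: Dict[str, Any] = {}
        self.update_calls = 0

    def update(self, **kwargs: Any) -> None:
        self.update_calls += 1
        self.fields.update(kwargs)


def with_ttft(
    span: DemoGenerationSpan,
    user_callback: Optional[Callable[[Any], None]] = None,
) -> Callable[[Any], None]:
    """Wrap a streaming callback so the first chunk stamps the span."""
    first_seen = False

    def callback(chunk: Any) -> None:
        nonlocal first_seen
        if not first_seen:
            first_seen = True
            # Record the arrival time of the first chunk exactly once.
            span.update(completion_start_time=datetime.now(timezone.utc))
        if user_callback is not None:
            user_callback(chunk)

    return callback


# Simulate a generator streaming three chunks into the wrapped callback.
span = DemoGenerationSpan()
received: List[str] = []
callback = with_ttft(span, received.append)
for chunk in ["Hel", "lo", "!"]:
    callback(chunk)
first_token_time = span.fields["completion_start_time"]
```

Wrapping rather than replacing the callback keeps any user-supplied streaming callback working as before.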

To be investigated: how we can eventually do this for async calls as well.

4 participants
@vblagoje @julian-risch @LastRemote and others