[BUG] llama_index errors get attached to the wrong span #1618

Closed
mikeldking opened this issue Oct 13, 2023 · 6 comments · Fixed by #1814
Labels: blocked, bug

Comments

@mikeldking
Contributor

Describe the bug
I ran the internal llama_index notebook using a trial Cohere API key. This causes a rate limit error, but the error is attached to the embedding span rather than the reranker span.
Screenshot 2023-10-13 at 12 23 01 PM

To Reproduce
Run the llama_index_tracing_example notebook with Cohere and a trial API key.

Expected behavior
The errors should be attached to the reranker span.

Environment (please complete the following information):

  • OS: macOS
  • Notebook Runtime: VS Code
  • Browser: Any
  • Version: 0.0.45
@mikeldking added the bug label on Oct 13, 2023
@mikeldking
Contributor Author

@tammy37 is also facing similar issues.
Screenshot 2023-10-24 at 5 05 26 PM

@mikeldking
Contributor Author

I think there are a few problems:

  • errors not propagated up the run tree so the root span doesn't error out
  • re-ranker errors are not captured on the reranker spans
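To make the first point concrete, here is a minimal sketch of propagating an error status from a failing span up to the root. The span names, the `parent_of` mapping, and the `propagate_errors` helper are all hypothetical and only illustrate the idea; this is not Phoenix's actual span model.

```python
from typing import Dict, Iterable, Optional


def propagate_errors(
    parent_of: Dict[str, Optional[str]], errored: Iterable[str]
) -> Dict[str, str]:
    """Mark each errored span and all of its ancestors as ERROR."""
    statuses = {span_id: "OK" for span_id in parent_of}
    for span_id in errored:
        current: Optional[str] = span_id
        # Walk up the tree; stop early if an ancestor is already marked.
        while current is not None and statuses[current] != "ERROR":
            statuses[current] = "ERROR"
            current = parent_of[current]
    return statuses


# Hypothetical run tree: child span id -> parent span id (None for the root).
parent_of = {
    "query": None,
    "retrieve": "query",
    "embedding": "retrieve",
    "reranking": "query",
}

statuses = propagate_errors(parent_of, errored=["reranking"])
```

With this tree, an error on the reranking span would also mark the root query span as errored, while the retrieve and embedding spans stay OK.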

@axiomofjoy
Contributor

axiomofjoy commented Nov 21, 2023

This issue is caused by an inconsistency in the trace_map passed by the LlamaIndex callback system. In most cases, when an error occurs, the exception event is a sibling that follows the callback event in which the exception occurred. When an error occurs in a re-ranker, however, the exception events are children of the event in which the exception occurred. The diagram below shows the trace_map for the relevant trace from the llama_index_tracing_example notebook. The embedding node is the node that precedes the exception events in a breadth-first traversal of the tree, so our current code associates the exception events with the embedding event.

Image
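The mis-association can be reproduced with a small sketch. It assumes the trace_map is a dict from a parent event id to a list of child event ids with a "root" entry; the event ids and tree layout below are made up for illustration, and `attach_to_preceding` is a simplified model of the behavior described above, not the actual Phoenix code.

```python
from collections import deque
from typing import Dict, List, Optional, Set

# Hypothetical trace_map: parent event id -> child event ids.
trace_map: Dict[str, List[str]] = {
    "root": ["query"],
    "query": ["retrieve", "reranking"],
    "retrieve": ["embedding"],
    "reranking": ["exception"],  # re-ranker case: exception is a child
}


def bfs_order(trace_map: Dict[str, List[str]], root: str = "root") -> List[str]:
    """Return event ids in breadth-first order starting from the root."""
    order: List[str] = []
    queue = deque([root])
    while queue:
        node = queue.popleft()
        order.append(node)
        queue.extend(trace_map.get(node, []))
    return order


def attach_to_preceding(
    trace_map: Dict[str, List[str]], exception_ids: Set[str]
) -> Dict[str, Optional[str]]:
    """Attach each exception to the last non-exception event seen in BFS order."""
    previous: Optional[str] = None
    owners: Dict[str, Optional[str]] = {}
    for event_id in bfs_order(trace_map):
        if event_id in exception_ids:
            owners[event_id] = previous
        else:
            previous = event_id
    return owners


def attach_to_parent(
    trace_map: Dict[str, List[str]], exception_ids: Set[str]
) -> Dict[str, str]:
    """Attach each exception to its parent event in the trace_map."""
    return {
        child: parent
        for parent, children in trace_map.items()
        for child in children
        if child in exception_ids
    }


by_order = attach_to_preceding(trace_map, {"exception"})
by_parent = attach_to_parent(trace_map, {"exception"})
```

In this layout the breadth-first order visits the embedding event immediately before the exception event, so the order-based heuristic picks the embedding event, while a parent lookup picks the reranking event.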

@axiomofjoy
Contributor

We need to make a change to LlamaIndex to either:

  • pass all exception events as children of the events that caused the exception, or
  • pass error information on the on_event_end hook for the event that caused the error, rather than as a distinct event.
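As a rough sketch of the second option, a handler could read the error from the end-of-event payload and mark the matching span directly. Everything here is an assumption for illustration: the `SketchHandler` class, the `Span` type, and the "exception" payload key are hypothetical, and the hook signatures are simplified rather than copied from LlamaIndex.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class Span:
    event_id: str
    event_type: str
    parent_id: Optional[str] = None
    status: str = "OK"
    error: Optional[str] = None


class SketchHandler:
    """Hypothetical handler; hook names mirror the callback hooks discussed above."""

    def __init__(self) -> None:
        self.spans: Dict[str, Span] = {}

    def on_event_start(
        self, event_type: str, event_id: str, parent_id: Optional[str] = None
    ) -> None:
        self.spans[event_id] = Span(event_id, event_type, parent_id)

    def on_event_end(
        self, event_type: str, event_id: str, payload: Optional[Dict[str, Any]] = None
    ) -> None:
        span = self.spans[event_id]
        # Assumed payload key: the error travels with the event that raised it.
        exc = (payload or {}).get("exception")
        if exc is not None:
            span.status = "ERROR"
            span.error = repr(exc)


handler = SketchHandler()
handler.on_event_start("embedding", "e1", parent_id="q1")
handler.on_event_end("embedding", "e1")
handler.on_event_start("reranking", "r1", parent_id="q1")
handler.on_event_end("reranking", "r1", payload={"exception": RuntimeError("rate limit")})
```

Because the error arrives on the same event id that started the span, no tree traversal is needed to decide which span owns it.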

@axiomofjoy
Contributor

I filed an issue with LlamaIndex here.

@axiomofjoy
Contributor

axiomofjoy commented Nov 29, 2023

To address this issue further, we need to improve visibility into error status codes at the trace level.
