[BUG] Token usage not calculated when streaming - LlamaIndex #5729
Comments
I have just noticed that this only happens when using Groq as the LLM, not when using Ollama, for example!
This is probably an issue with Groq itself not returning token counts when streaming. For comparison, OpenAI exhibits the same token-count issue when streaming, but it can be fixed via additional kwargs as shown below.

```python
chat_engine = CondensePlusContextChatEngine.from_defaults(
    index_doc.as_retriever(),
    llm=OpenAI(
        model="gpt-4o-mini",
        additional_kwargs={"stream_options": {"include_usage": True}},
    ),
)
```
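For reference, this is what that flag does at the raw OpenAI API level: with `include_usage` set, the final streamed chunk carries a `usage` object while all earlier chunks have `usage` set to `None`. A minimal sketch outside LlamaIndex, assuming an `OPENAI_API_KEY` is set in the environment:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment
stream = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Tell me a joke!"}],
    stream=True,
    stream_options={"include_usage": True},
)
for chunk in stream:
    # Only the final chunk has `usage` populated; it carries no content choices.
    if chunk.usage is not None:
        print(chunk.usage.prompt_tokens, chunk.usage.completion_tokens, chunk.usage.total_tokens)
```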
@MarouaneZhani We released an update in openinference-instrumentation-llama-index 3.1.1. Please give it a try and let us know if you have further questions. Thank you!
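To pick up the fix, upgrade the instrumentation package and confirm the installed version; a minimal check sketch (the version mentioned above is the only assumption):

```python
# Upgrade first, e.g.: pip install -U "openinference-instrumentation-llama-index>=3.1.1"
from importlib.metadata import version

print(version("openinference-instrumentation-llama-index"))  # expect 3.1.1 or newer
```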
@RogerHYang I have tested and it's working now as expected! Thanks for the quick fix!
Describe the bug
Token usage is not calculated/shown when using astream_chat in LlamaIndex.
To Reproduce
```python
from llama_index.core.chat_engine import CondensePlusContextChatEngine

# index_doc is an existing index and groqLLM a Groq LLM instance (defined elsewhere)
query_str = "Hello, Tell me a joke!"
chat_engine = CondensePlusContextChatEngine.from_defaults(
    index_doc.as_retriever(),
    llm=groqLLM,
)
responses = []
result = await chat_engine.astream_chat(query_str)
async for response in result.achat_stream:
    responses.append(response.delta)
```
Token calculation works correctly when using chat_engine.achat or chat_engine.chat, so the problem is only present when streaming.
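For comparison, a minimal sketch of the non-streaming calls (same chat_engine and query_str as in the snippet above) where token usage is reported:

```python
# Async, non-streaming: token counts show up as expected
response = await chat_engine.achat(query_str)
print(response.response)

# Sync, non-streaming works as well:
# response = chat_engine.chat(query_str)
```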
Expected behavior
Token usage should also be calculated when streaming.
Screenshots
(Screenshot: when streaming, the token information is missing.)
(Screenshot: when using chat or achat, token counts appear as expected.)