Support Ollama streamed responses #644
Conversation
lib/langchain/llm/ollama.rb
Outdated
JSON.parse(chunk_line)
rescue JSON::ParserError
  # In some cases the chunk exceeds the buffer size and the JSON parser fails.
  # TODO: How to better handle this?
It's not clear what we should do here, @andreibondarev. WDYT?
I wonder if the error just needs to be thrown? There's no graceful way to recover from this and still provide a useful response, I don't think?
Yeah, let's remove that and let it blow up until we figure out a way to handle this.
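For context, a minimal sketch of the direction settled on here, assuming the streaming handler splits each HTTP chunk into newline-delimited JSON lines (the helper name and sample payload are illustrative, not the PR's exact code): with the rescue removed, a truncated line raises JSON::ParserError instead of being silently skipped.

require "json"

# Illustrative helper: Ollama streams newline-delimited JSON objects.
# Without a rescue, a chunk that is cut off mid-line raises
# JSON::ParserError, making the failure visible rather than swallowed.
def each_parsed_chunk(chunk)
  chunk.split("\n").each do |chunk_line|
    parsed_chunk = JSON.parse(chunk_line) # raises on a partial line
    yield parsed_chunk
  end
end

each_parsed_chunk(%({"message":{"content":"Hey"},"done":false}\n)) do |json|
  puts json["message"]["content"]
end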
lib/langchain/llm/ollama.rb
Outdated
yield json_chunk, size if block
yield Langchain::LLM::OllamaResponse.new(parsed_chunk, model: parameters[:model]) if block_given?
Maybe we start referencing these classes directly here since it's all under the same namespace: OllamaResponse.new(...). It's cleaner that way, right?
Yep, it'll probably work too.
updated
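As a small illustration of the namespace point above (class bodies reduced to stubs, not the actual diff): inside the Langchain::LLM module, constant lookup already resolves OllamaResponse, so the fully qualified name is redundant.

module Langchain
  module LLM
    class OllamaResponse
      def initialize(raw_response, model: nil)
        @raw_response = raw_response
        @model = model
      end
    end

    class Ollama
      # Hypothetical helper name; the real call sits in the streaming path.
      def emit_chunk(parsed_chunk, parameters, &block)
        # Resolves to Langchain::LLM::OllamaResponse via the enclosing module,
        # so no Langchain::LLM:: prefix is needed.
        block.call(OllamaResponse.new(parsed_chunk, model: parameters[:model])) if block
      end
    end
  end
end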
prompt_tokens + completion_tokens if done?
end

def done?
Should this be private?
updated
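To make the done?/private exchange concrete, here is a sketch of the response-class shape implied by this diff, with field names taken from the Ollama payload shown in the IRB session below; everything apart from total_tokens and done? is an assumption:

module Langchain
  module LLM
    class OllamaResponse
      def initialize(raw_response, model: nil)
        @raw_response = raw_response
        @model = model
      end

      # Token totals only exist on the final chunk of a stream.
      def total_tokens
        prompt_tokens + completion_tokens if done?
      end

      def prompt_tokens
        @raw_response["prompt_eval_count"]
      end

      def completion_tokens
        @raw_response["eval_count"]
      end

      private

      # Ollama marks the last streamed chunk with "done" => true.
      def done?
        !!@raw_response["done"]
      end
    end
  end
end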
@dferrazm I just tried this but nothing was streamed to me:
irb(main):005> llm = Langchain::LLM::Ollama.new(url: ENV["OLLAMA_URL"])
=>
#<Langchain::LLM::Ollama:0x0000000128a3a3d0
...
irb(main):006* llm.chat messages: [{role:"user", content:"hey"}] do |chunk|
irb(main):007* puts chunk
irb(main):008> end
#<Langchain::LLM::OllamaResponse:0x000000012803b988>
=>
#<Langchain::LLM::OllamaResponse:0x0000000128038170
@model="llama3",
@prompt_tokens=nil,
@raw_response=
{"model"=>"llama3",
"created_at"=>"2024-05-29T15:56:04.473077Z",
"message"=>{"role"=>"assistant", "content"=>"Hey! How's it going?"},
"done_reason"=>"stop",
"done"=>true,
"total_duration"=>11215288666,
"load_duration"=>10900135000,
"prompt_eval_count"=>11,
"prompt_eval_duration"=>152727000,
"eval_count"=>8,
"eval_duration"=>156123000}> |
@andreibondarev you have to pass
chat: Added support for streamed responses.
complete: Fixed. It was not working at all. Now it works with both streamed and non-streamed responses.

To generate non-streamed responses, call the methods without passing a block. To generate streamed responses, pass a block. E.g.

Note: Passing the stream parameter to the methods will not have any effect anymore.

Closes #550
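A hedged usage sketch of the block-vs-no-block behavior described above (method names come from this PR; reading the chunk text through raw_response is an assumption based on the IRB session earlier in the thread):

llm = Langchain::LLM::Ollama.new(url: ENV["OLLAMA_URL"])

# Non-streamed: no block, a single OllamaResponse is returned when generation finishes.
response = llm.chat(messages: [{role: "user", content: "hey"}])

# Streamed: pass a block; it is called once per chunk as tokens arrive.
llm.chat(messages: [{role: "user", content: "hey"}]) do |chunk|
  # Assumes each chunk wraps one Ollama JSON object, so the incremental
  # text sits under "message" => "content".
  print chunk.raw_response.dig("message", "content")
end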