Open
Description
Currently, each streaming call returns the entire generation so far:
import asyncio

from mellea import start_session
from mellea.backends.types import ModelOption
from mellea.stdlib.base import CBlock


async def stream_chat(prompt: str) -> str:
    m = start_session()
    mot, _ = await m.backend.generate_from_context(
        action=CBlock(value=prompt), ctx=m.ctx, model_options={ModelOption.STREAM: True}
    )
    while not mot.is_computed():
        # Each call prints the full text generated so far, not just the new part.
        print(await mot.astream(), flush=True)
    print("\n\nFINAL ANSWER")
    print(mot.value)
    return str(mot.value)


if __name__ == "__main__":
    asyncio.run(stream_chat("Write a tight 8-line poem about granite and winter."))

Instead, we should be returning only the new text.
We could also return a StreamingResult object that contains both the full text and the new text. Either way, getting just the latest diff should be easy.
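To illustrate the proposal, here is a rough sketch of deriving the new text from cumulative snapshots. It does not touch mellea internals: the `full` and `delta` fields of `StreamingResult` and the `diff_snapshots` helper are hypothetical names, and the sketch assumes generation is append-only, so every snapshot starts with the previous one.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass(frozen=True)
class StreamingResult:
    """Hypothetical container: the full text so far plus the newest piece."""

    full: str
    delta: str


def diff_snapshots(snapshots: Iterable[str]) -> Iterator[StreamingResult]:
    """Turn cumulative snapshots into results that also carry the new text.

    Assumes each snapshot extends the previous one (append-only generation).
    """
    seen = ""
    for snapshot in snapshots:
        if not snapshot.startswith(seen):
            raise ValueError("snapshot does not extend the previous one")
        # Everything past the already-seen prefix is the new text.
        yield StreamingResult(full=snapshot, delta=snapshot[len(seen):])
        seen = snapshot


# Stand-in for successive cumulative values returned while streaming.
results = list(diff_snapshots(["Gran", "Granite", "Granite sleeps"]))
deltas = [r.delta for r in results]
```

As a stopgap on the caller side, the same idea could be applied in the loop above by remembering the previously printed text and printing only the suffix past it, assuming `astream()` keeps returning cumulative text as described here.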