fix(streaming): resolve word concatenation in streaming output rails #1259
- fix spacing loss in streaming output where words were concatenated without spaces
- refactor BufferStrategy interface with ChunkBatch named tuple for better API design
- improve variable names and add comprehensive Google-style documentation
- add robust exception handling for streaming generation tasks
- enhance test coverage
- remove unnecessary chunk splitting that defeated buffer strategy purpose

Fixes issue where `test_sequential_streaming_output_rails_allowed` failed due to output like "safeand complianthigh quality" instead of "safe and compliant high quality". The root cause was in `_run_output_rails_in_streaming` calling `split()` on chunks that already had proper spacing, then failing to reconstruct the original format.
Might resolve #1197. @andompesta, if you are able to verify that the issue is resolved, it'd be great. I've added a test based on #1197 in this commit.
Pull Request Overview
This PR fixes word concatenation issues in the streaming output rails by removing unnecessary `split()` calls and streaming raw chunks, refactors the buffer strategy API for clearer interfaces, and adds comprehensive tests for the new behavior.
- Remove splitting of chunk strings in `llmrails.py` and yield raw chunks to preserve spacing
- Introduce `ChunkBatch`, `process_stream()`, and `format_chunks()` in `buffer.py`
- Expand test coverage for streaming and buffer strategy behavior
Reviewed Changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.
File | Description |
---|---|
`tests/test_streaming.py` | Add sequential streaming output rails test to verify spacing |
`tests/test_buffer_strategy.py` | Update buffer strategy tests for new `ChunkBatch` interface |
`nemoguardrails/rails/llm/llmrails.py` | Swap out `split()` logic for direct chunk yielding |
`nemoguardrails/rails/llm/buffer.py` | Define `ChunkBatch`, refactor `BufferStrategy` and `RollingBuffer` |
Looks good, just some minor comments!
Great to have this fix. 👍
3412598 to 91159d5
Thank you @trebedea for the review. I applied your suggestions; they made it more readable 👍🏻
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## develop #1259 +/- ##
===========================================
+ Coverage 69.57% 69.64% +0.06%
===========================================
Files 161 161
Lines 16023 16055 +32
===========================================
+ Hits 11148 11181 +33
+ Misses 4875 4874 -1
It is using outdated code for llmrails.py, and it does not understand that our tests cover the 4 missing lines from buffer.py.
Problem
The test `test_sequential_streaming_output_rails_allowed` was failing with words being concatenated without spaces, producing output like "safeand complianthigh quality" instead of "safe and compliant high quality".
Root Cause
The issue was in the `_run_output_rails_in_streaming` method in `llmrails.py`. The method was calling `chunk_str_rep.split()`, which removed all whitespace, then tried to reconstruct spacing with `f" {word}"`, but this lost the original token format, including trailing spaces.
This created a design contradiction: the buffer strategy creates multi-word chunks (e.g., `"This is a funny "`, `"joke but "`) for efficient output rails processing, but then the code immediately split these back into individual words, defeating the purpose.
Solution
The fix was to not split the chunks at all. The buffer strategy already creates appropriately sized chunks for output rails processing, so we simply yield `user_output_chunks` directly from the buffer strategy.
Core Bug Fix
- Removed `split()` operations that were breaking word spacing
- `stream_first` and `not stream_first` code paths
Interface Improvements
- Refactored `BufferStrategy` with `process_stream()` method and `ChunkBatch` named tuple
- `processing_context` (for output rails) and `user_output_chunks` (for streaming)
- Replaced `Tuple[List[str], List[str]]` with self-documenting `ChunkBatch`
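To illustrate why a named tuple reads better than a bare `Tuple[List[str], List[str]]`, here is a minimal sketch. The field names `processing_context` and `user_output_chunks` come from the description above; the field order, the type annotations, and the `batch_tokens` helper are assumptions made for illustration and are not the actual `buffer.py` code.

```python
from typing import List, NamedTuple


class ChunkBatch(NamedTuple):
    """Illustrative stand-in for the ChunkBatch named tuple (field order assumed)."""

    # Chunks handed to the output rails for checking.
    processing_context: List[str]
    # Chunks that can be streamed to the user for this batch.
    user_output_chunks: List[str]


def batch_tokens(tokens: List[str], chunk_size: int) -> List[ChunkBatch]:
    """Hypothetical helper: group tokens into fixed-size batches without overlap."""
    batches = []
    for start in range(0, len(tokens), chunk_size):
        window = tokens[start : start + chunk_size]
        batches.append(
            ChunkBatch(processing_context=window, user_output_chunks=window)
        )
    return batches


tokens = ["This ", "is ", "a ", "funny ", "joke ", "but "]
batches = batch_tokens(tokens, chunk_size=4)

# Named access documents intent; positional unpacking still works because
# a NamedTuple is an ordinary tuple underneath.
for batch in batches:
    context, output = batch
    assert context is batch.processing_context
    assert output is batch.user_output_chunks

# Joining the user-facing chunks untouched preserves the original spacing.
streamed = "".join(chunk for b in batches for chunk in b.user_output_chunks)
print(repr(streamed))
```

Because the tuple is still unpackable, callers written against the old two-element return shape would keep working in this sketch, while new callers can use the field names.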
Test Coverage Improvements
- Expanded `tests/test_buffer_strategy.py` from 2 basic tests to 11 comprehensive tests
- (`process_stream()` vs `__call__()` parity)
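The spacing regression that these tests guard against can be reproduced without the library. The sketch below is a hypothetical reconstruction based only on the root-cause description: it assumes the old code emitted the first word of each chunk bare and every later word with a leading space. The function names `yield_split_words` and `yield_raw_chunks` are made up for this example.

```python
from typing import Iterator, List


def yield_split_words(chunks: List[str]) -> Iterator[str]:
    """Assumed old behavior: split each chunk, then try to re-add spacing."""
    for chunk in chunks:
        words = chunk.split()  # drops leading and trailing whitespace
        if not words:
            continue
        yield words[0]  # first word of every chunk gets no leading space
        for word in words[1:]:
            yield f" {word}"


def yield_raw_chunks(chunks: List[str]) -> Iterator[str]:
    """Fixed behavior: pass the buffered chunks through untouched."""
    yield from chunks


chunks = ["This is a funny ", "joke but "]
old_output = "".join(yield_split_words(chunks))
new_output = "".join(yield_raw_chunks(chunks))

print(repr(old_output))  # 'This is a funnyjoke but'
print(repr(new_output))  # 'This is a funny joke but '
```

The trailing space of the first chunk is the only separator between "funny" and "joke", so discarding it at the chunk boundary fuses the two words, while yielding the raw chunks keeps the text identical to the concatenated input.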