@@ -697,10 +697,18 @@ The following table describes the subfields for the `streaming` field:
 * - streaming.chunk_size
   - Specifies the number of tokens for each chunk.
     The toolkit applies output guardrails on each chunk of tokens.
+
+    Larger values provide more meaningful information for the rail to assess,
+    but can add latency while accumulating tokens for a full chunk.
+    The risk of higher latency is especially true if you specify `stream_first: False`.
   - `200`

 * - streaming.context_size
   - Specifies the number of tokens to keep from the previous chunk to provide context and continuity in processing.
+
+    Larger values provide continuity across chunks with minimal impact on latency.
+    Small values might fail to detect cross-chunk violations.
+    Specifying approximately 25% of `chunk_size` provides a good compromise.
   - `50`

 * - streaming.stream_first
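
The interplay between `chunk_size` and `context_size` described in the added lines can be illustrated with a minimal sketch. This is not the toolkit's implementation: the function names `chunk_with_context` and `windows_containing` are hypothetical, and the sketch assumes that a rail sees the trailing `context_size` tokens of the previous chunk followed by the current chunk.

```python
from typing import Iterator, List, Tuple


def chunk_with_context(
    tokens: List[str], chunk_size: int = 200, context_size: int = 50
) -> Iterator[Tuple[List[str], List[str]]]:
    """Yield (context, chunk) pairs (hypothetical helper, for illustration only).

    `chunk` holds up to `chunk_size` new tokens. `context` holds up to
    `context_size` tokens that immediately precede the chunk.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if context_size < 0:
        raise ValueError("context_size must not be negative")
    for start in range(0, len(tokens), chunk_size):
        context = tokens[max(0, start - context_size):start]
        chunk = tokens[start:start + chunk_size]
        yield context, chunk


def windows_containing(
    tokens: List[str], phrase: List[str], chunk_size: int, context_size: int
) -> int:
    """Count the windows (context + chunk) in which `phrase` appears contiguously."""
    hits = 0
    for context, chunk in chunk_with_context(tokens, chunk_size, context_size):
        window = context + chunk
        if any(
            window[i:i + len(phrase)] == phrase
            for i in range(len(window) - len(phrase) + 1)
        ):
            hits += 1
    return hits


# A two-token phrase that straddles the boundary between the first and second chunk.
tokens = ["a", "b", "c", "bad", "word", "d", "e", "f"]
missed = windows_containing(tokens, ["bad", "word"], chunk_size=4, context_size=0)
caught = windows_containing(tokens, ["bad", "word"], chunk_size=4, context_size=1)
```

With no carried-over context, neither window contains the full phrase, which mirrors the note that small `context_size` values might fail to detect cross-chunk violations. Carrying over a single token is enough in this toy case. The 25% guideline in the diff applies the same idea at realistic chunk sizes.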