[Continuous batching] Add finish reason to generation output #725

mzegla · 2024-08-01T15:25:01Z

This PR adds the generation finish reason to generation outputs. This makes it possible to support the finish_reason field in OpenAI completion and chat completion responses in OVMS.

src/cpp/src/sampler.hpp

ilya-lavrenov · 2024-08-02T14:10:16Z

src/cpp/include/openvino/genai/generation_handle.hpp

@@ -32,6 +32,12 @@ struct EncodedGenerationResult {
    GenerationStatus m_status = GenerationStatus::RUNNING;
 };

+enum class GenerationFinishReason {


Do we really need this new enum? GenerationStatus is a more generic status, which could be extended to support FINISHED_BY_LENGTH and FINISHED_BY_STOP in place of the generic FINISHED.

GenerationStatus is a status for the whole request, while GenerationFinishReason is per sequence. For example, with beam search or n > 1 we can read the generation status for the entire request (which would be finished) and get the finish reason for every output separately, in case some of the beams hit max_new_tokens and others stopped naturally, for example on an EOS token.

I also considered using SequenceStatus, but that class is internal, and we need one that is available to the user.
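The distinction can be sketched as follows. This is not the code from this PR: `SequenceOutput`, `RequestResult`, `classify_sequence`, `summarize_request`, and the enum value names other than `RUNNING` are hypothetical and exist only to show one request-level status alongside a finish reason per sequence.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Request-level status. RUNNING appears in the diff; FINISHED is the generic
// value mentioned in the review comment.
enum class GenerationStatus { RUNNING, FINISHED };

// Per-sequence finish reason. The value names are assumed for illustration.
enum class GenerationFinishReason { NONE, STOP, LENGTH };

// Hypothetical per-sequence output.
struct SequenceOutput {
    std::vector<std::int64_t> token_ids;
    GenerationFinishReason finish_reason = GenerationFinishReason::NONE;
};

// Hypothetical request result: one status, many sequences.
struct RequestResult {
    GenerationStatus status = GenerationStatus::RUNNING;
    std::vector<SequenceOutput> outputs;
};

// Decide why a single sequence stopped, if it stopped at all.
GenerationFinishReason classify_sequence(const std::vector<std::int64_t>& token_ids,
                                         std::int64_t eos_token_id,
                                         std::size_t max_new_tokens) {
    if (!token_ids.empty() && token_ids.back() == eos_token_id)
        return GenerationFinishReason::STOP;    // ended on the EOS token
    if (token_ids.size() >= max_new_tokens)
        return GenerationFinishReason::LENGTH;  // ran into the token limit
    return GenerationFinishReason::NONE;        // still generating
}

// The request counts as finished only when every sequence has a finish reason.
RequestResult summarize_request(const std::vector<std::vector<std::int64_t>>& sequences,
                                std::int64_t eos_token_id,
                                std::size_t max_new_tokens) {
    RequestResult result;
    bool all_finished = !sequences.empty();
    for (const auto& token_ids : sequences) {
        SequenceOutput output;
        output.token_ids = token_ids;
        output.finish_reason = classify_sequence(token_ids, eos_token_id, max_new_tokens);
        if (output.finish_reason == GenerationFinishReason::NONE)
            all_finished = false;
        result.outputs.push_back(output);
    }
    result.status = all_finished ? GenerationStatus::FINISHED : GenerationStatus::RUNNING;
    return result;
}
```

With two beams, one ending on the EOS token and one reaching the limit, the request as a whole reports FINISHED while the two outputs carry different reasons. That is the case a single request-level status cannot express.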

ilya-lavrenov self-assigned this Aug 1, 2024

ilya-lavrenov added this to the 2024.4 milestone Aug 1, 2024

mzegla added 3 commits August 2, 2024 11:05

introduce finish reason

50a33a0

set reason for partial push also

21e680e

beam search

5b3c185

mzegla force-pushed the finish_reason branch from a40948d to 5b3c185 Compare August 2, 2024 09:07

mzegla commented Aug 2, 2024

src/cpp/src/sampler.hpp (outdated, resolved)

Update src/cpp/src/sampler.hpp

14c7fa6

Wovchena approved these changes Aug 2, 2024


ilya-lavrenov reviewed Aug 2, 2024


ilya-lavrenov approved these changes Aug 6, 2024


ilya-lavrenov added this pull request to the merge queue Aug 6, 2024

Merged via the queue into openvinotoolkit:master with commit eb248db Aug 6, 2024
27 checks passed

mzegla deleted the finish_reason branch August 19, 2024 10:41
