[Continuous batching] Add finish reason to generation output #725

Merged
merged 4 commits into from
Aug 6, 2024

@mzegla mzegla commented Aug 1, 2024

This PR adds the generation finish reason to generation outputs, which lets OVMS populate the finish_reason field in its OpenAI-compatible completion and chat completion responses.
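As a rough illustration of how a server might turn the new value into the OpenAI `finish_reason` string, here is a minimal sketch. The enumerator names `NONE`, `STOP`, and `LENGTH` and the helper `to_openai_finish_reason` are assumptions made for this example; the diff in this conversation shows only the enum's name. The strings `"stop"` and `"length"` are the values the OpenAI API uses for a natural stop and for hitting the token limit.

```cpp
#include <string>

// Hypothetical mirror of the enum added by this PR. The enumerator names
// are assumptions; only the enum's name is visible in the diff.
enum class GenerationFinishReason {
    NONE,   // sequence has not finished yet
    STOP,   // finished naturally, e.g. on the EOS token
    LENGTH  // finished because max_new_tokens was reached
};

// Hypothetical helper, not part of any real API: maps a finish reason to
// the OpenAI-style finish_reason string. An empty string means "not
// finished"; a server would typically serialize that case as JSON null.
inline std::string to_openai_finish_reason(GenerationFinishReason reason) {
    switch (reason) {
        case GenerationFinishReason::STOP:
            return "stop";
        case GenerationFinishReason::LENGTH:
            return "length";
        case GenerationFinishReason::NONE:
            break;
    }
    return "";
}
```

Keeping the mapping in one function means the serving layer never needs to inspect internal sequence state; it only consumes the public enum.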

@ilya-lavrenov ilya-lavrenov self-assigned this Aug 1, 2024
@ilya-lavrenov ilya-lavrenov added this to the 2024.4 milestone Aug 1, 2024
src/cpp/src/sampler.hpp (outdated, resolved)
@@ -32,6 +32,12 @@ struct EncodedGenerationResult {
GenerationStatus m_status = GenerationStatus::RUNNING;
};

enum class GenerationFinishReason {
Contributor

Do we really need this new enum? GenerationStatus is the more generic status, and it could be extended with FINISHED_BY_LENGTH and FINISHED_BY_STOP in place of the generic FINISHED.

Collaborator Author

GenerationStatus describes the whole request, while GenerationFinishReason is per sequence. With beam search or n > 1, for example, we can read the generation status for the entire request (which would be finished) and get the finish reason for each output separately, in case some beams hit max_new_tokens while others stopped naturally, e.g. on the EOS token.

I also considered using SequenceStatus, but that class is internal and we need one that is available to the user.
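The distinction between request-level status and per-sequence finish reason can be sketched as follows. Everything here is illustrative: `RequestResultSketch` and `count_with_reason` are hypothetical names, the finish-reason enumerators are assumed, and the real result structures in the library may be laid out differently.

```cpp
#include <cstddef>
#include <vector>

// Request-level status. RUNNING appears in the diff and FINISHED is named
// in the review comment; the real enum may have more values.
enum class GenerationStatus { RUNNING, FINISHED };

// Per-sequence finish reason; enumerator names are assumptions.
enum class GenerationFinishReason { NONE, STOP, LENGTH };

// Hypothetical result shape: one status for the whole request and one
// finish reason per returned sequence (beam, or one of n samples).
struct RequestResultSketch {
    GenerationStatus status = GenerationStatus::RUNNING;
    std::vector<GenerationFinishReason> finish_reasons;
};

// Counts how many returned sequences ended for the given reason.
inline std::size_t count_with_reason(const RequestResultSketch& result,
                                     GenerationFinishReason reason) {
    std::size_t count = 0;
    for (GenerationFinishReason r : result.finish_reasons) {
        if (r == reason) {
            ++count;
        }
    }
    return count;
}
```

With three returned sequences, a request can be FINISHED as a whole while one sequence reports STOP and the other two report LENGTH, which a single request-level status value could not express.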

@ilya-lavrenov ilya-lavrenov added this pull request to the merge queue Aug 6, 2024
Merged via the queue into openvinotoolkit:master with commit eb248db Aug 6, 2024
27 checks passed
@mzegla mzegla deleted the finish_reason branch August 19, 2024 10:41
3 participants