Whisper pipeline: implement chunk streamer for long-form audio processing #1148

Open
wants to merge 1 commit into
base: master

Conversation

as-suvorov
Contributor

No description provided.

@as-suvorov as-suvorov added category: whisper Whisper pipeline category: Python API Python API for GenAI category: GenAI C++ API Changes in GenAI C++ public headers labels Nov 5, 2024
@as-suvorov as-suvorov added this to the 2025.0 milestone Nov 5, 2024
@github-actions github-actions bot added the category: sampling Sampling / Decoding algorithms label Nov 5, 2024
@as-suvorov as-suvorov changed the title Whisper pipeline: implement chunk streamer for long-form audio Whisper pipeline: implement chunk streamer for long-form audio processing Nov 5, 2024
@as-suvorov as-suvorov marked this pull request as ready for review November 5, 2024 16:38
class StreamerBase {
public:
/// @brief put is called every time new token is decoded,
/// @return bool flag to indicate whether generation should be stopped, if return true generation stops
virtual bool put(int64_t token) = 0;

/// @brief end is called at the end of generation. It can be used to flush cache if your own streamer has one
virtual void end() = 0;

virtual ~StreamerBase() = default;

It's better to move the destructor definition to a .cpp file and export this class.

That will help with RTTI issues on some platforms.
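
A minimal sketch of what this suggestion could look like, not the change actually made in the PR. The names StreamerBaseSketch, CollectingStreamer, and SKETCH_EXPORTS are hypothetical stand-ins (the real headers use OPENVINO_GENAI_EXPORTS), and the header and .cpp parts are shown in one block only to keep the example self-contained. The idea, under the Itanium C++ ABI used by GCC and Clang, is that an out-of-line virtual destructor becomes the class's key function, so the vtable and type info are emitted in a single translation unit of the library instead of in every user of the header.

```cpp
#include <cstddef>
#include <cstdint>
#include <memory>
#include <vector>

// Hypothetical export macro for this sketch only.
#if defined(_WIN32)
#  define SKETCH_EXPORTS __declspec(dllexport)
#else
#  define SKETCH_EXPORTS __attribute__((visibility("default")))
#endif

// --- would live in the public header ---
class SKETCH_EXPORTS StreamerBaseSketch {
public:
    // Returns true to ask the pipeline to stop generation.
    virtual bool put(int64_t token) = 0;
    // Called once at the end of generation.
    virtual void end() = 0;
    // Declared here, defined out of line below.
    virtual ~StreamerBaseSketch();
};

// --- would live in exactly one .cpp file of the library ---
StreamerBaseSketch::~StreamerBaseSketch() = default;

// --- user code: a streamer that stops after `limit` tokens ---
class CollectingStreamer : public StreamerBaseSketch {
public:
    explicit CollectingStreamer(std::size_t limit) : m_limit(limit) {}

    bool put(int64_t token) override {
        m_tokens.push_back(token);
        return m_tokens.size() >= m_limit;
    }

    void end() override { m_finished = true; }

    const std::vector<int64_t>& tokens() const { return m_tokens; }
    bool finished() const { return m_finished; }

private:
    std::size_t m_limit;
    std::vector<int64_t> m_tokens;
    bool m_finished = false;
};
```

With this layout, a dynamic_cast from StreamerBaseSketch* to a derived streamer relies on one copy of the base type info owned by the library, which is the kind of cross-module RTTI mismatch the comment appears to be guarding against.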

@@ -50,6 +51,7 @@ Config from_config_json_if_exists(const std::filesystem::path& models_path, cons
}

ov::genai::StreamerVariant get_streamer_from_map(const ov::AnyMap& config_map);
ov::genai::ChunkStreamerVariant get_chunk_streamer_from_map(const ov::AnyMap& config_map);

This is a Whisper-specific entity. Maybe we can move it to the Whisper files? The same applies in other places, like py_utils.hpp.

These utils are supposed to be generic.

@@ -76,6 +78,28 @@ class ConstructableStreamer: public StreamerBase {
}
};

class ConstructableChunkStreamer: public ChunkStreamerBase {

Move to src/python/py_whisper_pipeline.cpp?

@@ -105,4 +109,7 @@ class OPENVINO_GENAI_EXPORTS WhisperPipeline {
WhisperGenerationConfig get_generation_config() const;
void set_generation_config(const WhisperGenerationConfig& config);
};

OPENVINO_GENAI_EXPORTS std::pair<std::string, Any> chunk_streamer(ChunkStreamerVariant func);
OPENVINO_GENAI_EXPORTS std::pair<std::string, Any> whisper_generation_config(const WhisperGenerationConfig& config);

I think we can overload the existing functions, such as streamer and generation_config, instead of introducing Whisper-specific ones.

Example:

OPENVINO_GENAI_EXPORTS
std::pair<std::string, ov::Any> generation_config(const ImageGenerationConfig& generation_config);
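
For illustration, the overloading idea could be sketched as below, assuming C++17. This is not the library's code: std::any stands in for ov::Any, the two config structs and their fields are hypothetical placeholders, and the "generation_config" key string is an assumption made for the example.

```cpp
#include <any>
#include <string>
#include <typeinfo>
#include <utility>

// Hypothetical stand-ins for the real configuration types.
struct ImageGenerationConfigSketch {
    int num_inference_steps = 20;
};

struct WhisperGenerationConfigSketch {
    bool return_timestamps = false;
};

// One function name, one overload per config type. The caller uses the same
// spelling for every pipeline, and the stored type is recovered from the
// type-erased value on the receiving side.
inline std::pair<std::string, std::any> generation_config(const ImageGenerationConfigSketch& config) {
    return {"generation_config", config};
}

inline std::pair<std::string, std::any> generation_config(const WhisperGenerationConfigSketch& config) {
    return {"generation_config", config};
}
```

A pipeline receiving the pair can check the stored type (for example with std::any_cast) and reject a config meant for a different pipeline, so the public API keeps one name instead of growing a Whisper-specific variant.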
