
suppress or remove annoying print statement #40

Closed
BBC-Esq opened this issue Feb 25, 2024 · 6 comments

Comments


BBC-Esq commented Feb 25, 2024

Can we please have a way to remove this message? Every time I run the program from my Python script it checks for ffmpeg, which is fine, but I wish there were a way to remove or temporarily suppress the output. I have important messages printed to the command prompt when my program runs, and this clutters it up.

Also, is there a way to remove the FFMPEG requirement entirely? For example, the pyav library bundles ffmpeg when you pip install it.

https://pypi.org/project/av/

This is why the faster-whisper library uses it. See here:
[screenshot from the faster-whisper repository]

https://github.com/SYSTRAN/faster-whisper

Anyway, here is the printout that's annoying me:

ffmpeg version 6.1.1-full_build-www.gyan.dev Copyright (c) 2000-2023 the FFmpeg developers
built with gcc 12.2.0 (Rev10, Built by MSYS2 project)
configuration: --enable-gpl --enable-version3 --enable-static --pkg-config=pkgconf --disable-w32threads --disable-autodetect --enable-fontconfig --enable-iconv --enable-gnutls --enable-libxml2 --enable-gmp --enable-bzlib --enable-lzma --enable-libsnappy --enable-zlib --enable-librist --enable-libsrt --enable-libssh --enable-libzmq --enable-avisynth --enable-libbluray --enable-libcaca --enable-sdl2 --enable-libaribb24 --enable-libaribcaption --enable-libdav1d --enable-libdavs2 --enable-libuavs3d --enable-libzvbi --enable-librav1e --enable-libsvtav1 --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxavs2 --enable-libxvid --enable-libaom --enable-libjxl --enable-libopenjpeg --enable-libvpx --enable-mediafoundation --enable-libass --enable-frei0r --enable-libfreetype --enable-libfribidi --enable-libharfbuzz --enable-liblensfun --enable-libvidstab --enable-libvmaf --enable-libzimg --enable-amf --enable-cuda-llvm --enable-cuvid --enable-ffnvcodec --enable-nvdec --enable-nvenc --enable-dxva2 --enable-d3d11va --enable-libvpl --enable-libshaderc --enable-vulkan --enable-libplacebo --enable-opencl --enable-libcdio --enable-libgme --enable-libmodplug --enable-libopenmpt --enable-libopencore-amrwb --enable-libmp3lame --enable-libshine --enable-libtheora --enable-libtwolame --enable-libvo-amrwbenc --enable-libcodec2 --enable-libilbc --enable-libgsm --enable-libopencore-amrnb --enable-libopus --enable-libspeex --enable-libvorbis --enable-ladspa --enable-libbs2b --enable-libflite --enable-libmysofa --enable-librubberband --enable-libsoxr --enable-chromaprint
libavutil      58. 29.100 / 58. 29.100
libavcodec     60. 31.102 / 60. 31.102
libavformat    60. 16.100 / 60. 16.100
libavdevice    60.  3.100 / 60.  3.100
libavfilter     9. 12.100 /  9. 12.100
libswscale      7.  5.100 /  7.  5.100
libswresample   4. 12.100 /  4. 12.100
libpostproc    57.  3.100 / 57.  3.100
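Incidentally, ffmpeg itself can be told to suppress that banner: the `-hide_banner` and `-loglevel error` flags silence the version/configuration dump. A minimal sketch of invoking it that way (illustrative only, assuming ffmpeg is on the PATH; not WhisperS2T's actual code):

```python
import os
import shutil
import subprocess

def quiet_ffmpeg_cmd(src, dst):
    """Build an ffmpeg resample command with the version banner and
    configuration dump suppressed."""
    return [
        "ffmpeg",
        "-hide_banner",        # drop the version/configuration banner
        "-loglevel", "error",  # print only real errors
        "-y",                  # overwrite output without asking
        "-i", src,
        "-ac", "1",            # mono
        "-ar", "16000",        # 16 kHz sample rate
        dst,
    ]

cmd = quiet_ffmpeg_cmd("input.mp3", "input_16k.wav")
# Only run if ffmpeg is installed and the input file actually exists.
if shutil.which("ffmpeg") and os.path.exists("input.mp3"):
    subprocess.run(cmd, check=True)
```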
shashikg (Owner) commented:

In the future, all print statements will be replaced by a logger (so that users can set the logging level as needed).

I will suppress the above FFmpeg log in the next PR. There is no plan to remove FFmpeg; direct calls to FFmpeg are cleaner and faster than most wrappers like ffmpeg-python (openai-whisper uses it) or PyAV. Direct calls also make it easy to run the resampling command in the background in a separate thread.
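To illustrate the point about background resampling (a hypothetical sketch, not the library's actual implementation), a direct ffmpeg call can be handed to a worker thread so the caller stays free to transcribe files that are already converted:

```python
import os
import shutil
import subprocess
from concurrent.futures import ThreadPoolExecutor

def resample(src, dst):
    """Resample src to 16 kHz mono WAV with a direct ffmpeg call."""
    subprocess.run(
        ["ffmpeg", "-hide_banner", "-loglevel", "error", "-y",
         "-i", src, "-ac", "1", "-ar", "16000", dst],
        check=True,
    )
    return dst

executor = ThreadPoolExecutor(max_workers=2)
# Guard: only submit real work if ffmpeg and the input file exist.
if shutil.which("ffmpeg") and os.path.exists("talk.mp3"):
    # submit() returns immediately; the resample runs in the background
    # while the main thread transcribes earlier files.
    future = executor.submit(resample, "talk.mp3", "talk_16k.wav")
    # ...later, when the pipeline needs it: wav_path = future.result()
executor.shutdown(wait=True)
```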

BBC-Esq (Author) commented Feb 25, 2024

Just to play devil's advocate, don't you think that introduces unneeded complexity for users on various platforms? When I started learning about programming I didn't even know what a system "path" was, let alone how to install ffmpeg and add it to the PATH. Plus, there are a lot of different platforms out there, each with its own installation procedure. With something like pyav you simply pip install the library. How much of a slowdown are we talking about versus directly calling ffmpeg?

Anyways, if you're curious, here's how I did the resampling with pyav in my script:

    # Requires: import av; from pathlib import Path
    def convert_to_wav(self, audio_file):
        output_file = Path(audio_file).stem + "_converted.wav"
        output_path = Path(__file__).parent / output_file
        
        container = av.open(audio_file)
        stream = next(s for s in container.streams if s.type == 'audio')
        
        resampler = av.AudioResampler(
            format='s16',
            layout='mono',
            rate=16000,
        )
        
        output_container = av.open(str(output_path), mode='w')
        output_stream = output_container.add_stream('pcm_s16le', rate=16000)
        output_stream.layout = 'mono'
        
        for frame in container.decode(audio=0):
            frame.pts = None
            resampled_frames = resampler.resample(frame)
            if resampled_frames is not None:
                for resampled_frame in resampled_frames:
                    for packet in output_stream.encode(resampled_frame):
                        output_container.mux(packet)
        
        for packet in output_stream.encode(None):
            output_container.mux(packet)
        
        output_container.close()
        
        return str(output_path)

I couldn't get the resampling to work automatically using whisperS2T, so that's why I had to add pyav. Not sure if I did it wrong, though.

In another script of mine, resampling is avoided by recording directly at 16000 Hz mono from the beginning. That pertains to a voice-recorder feature, though, not to an audio file whose original sample rate could be anything.

import os
import gc
import torch
import pyaudio
import wave
import tempfile
from pathlib import Path
import whisper_s2t
from PySide6.QtCore import QThread, Signal
from utilities import my_cprint

class TranscriptionThread(QThread):
    transcription_complete = Signal(str)

    def __init__(self, audio_file, voice_recorder):
        super().__init__()
        self.audio_file = audio_file
        self.voice_recorder = voice_recorder

    def run(self):
        device = "cpu"
        compute_type = "float32"
        model_identifier = "ctranslate2-4you/whisper-small.en-ct2-float32"
        cpu_threads = max(4, os.cpu_count() - 4)
        model_kwargs = {
            'compute_type': compute_type,
            'model_identifier': model_identifier,
            'backend': 'CTranslate2',
            "device": device,
            "cpu_threads": cpu_threads,
        }
        # Expose the model on the recorder so ReleaseTranscriber() can find it
        self.model = whisper_s2t.load_model(**model_kwargs)
        self.voice_recorder.model = self.model

        out = self.model.transcribe_with_vad([self.audio_file],
                                             lang_codes=['en'],
                                             tasks=['transcribe'],
                                             initial_prompts=[None],
                                             batch_size=16)

        transcription_text = " ".join([_['text'] for _ in out[0]]).strip()

        my_cprint("Transcription completed.", 'white')
        self.transcription_complete.emit(transcription_text)
        Path(self.audio_file).unlink()
        self.voice_recorder.ReleaseTranscriber()

class RecordingThread(QThread):
    def __init__(self, voice_recorder):
        super().__init__()
        self.voice_recorder = voice_recorder

    def run(self):
        self.voice_recorder.record_audio()

class VoiceRecorder:
    def __init__(self, gui_instance, format=pyaudio.paInt16, channels=1, rate=16000, chunk=1024):
        self.gui_instance = gui_instance
        self.format, self.channels, self.rate, self.chunk = format, channels, rate, chunk
        self.is_recording, self.frames = False, []
        self.recording_thread = None
        self.transcription_thread = None

    def record_audio(self):
        p = pyaudio.PyAudio()
        stream = p.open(format=self.format, channels=self.channels, rate=self.rate, input=True, frames_per_buffer=self.chunk)
        self.frames = []
        while self.is_recording:
            data = stream.read(self.chunk, exception_on_overflow=False)
            self.frames.append(data)
        stream.stop_stream()
        stream.close()
        p.terminate()

    def save_audio(self):
        self.is_recording = False
        # Note: tempfile.mktemp is deprecated; NamedTemporaryFile(delete=False) is safer
        temp_file = Path(tempfile.mktemp(suffix=".wav"))
        with wave.open(str(temp_file), "wb") as wf:
            wf.setnchannels(self.channels)
            wf.setsampwidth(pyaudio.PyAudio().get_sample_size(self.format))
            wf.setframerate(self.rate)
            wf.writeframes(b"".join(self.frames))
        self.frames.clear()

        self.transcription_thread = TranscriptionThread(str(temp_file), self)
        self.transcription_thread.transcription_complete.connect(self.gui_instance.update_transcription)
        self.transcription_thread.start()

    def start_recording(self):
        if not self.is_recording:
            self.is_recording = True
            self.recording_thread = RecordingThread(self)
            self.recording_thread.start()

    def stop_recording(self):
        self.is_recording = False
        if self.recording_thread is not None:
            self.recording_thread.wait()
            self.save_audio()

    def ReleaseTranscriber(self):
        if hasattr(self, 'model'):
            if hasattr(self.model, 'model'):
                del self.model.model
            if hasattr(self.model, 'feature_extractor'):
                del self.model.feature_extractor
            if hasattr(self.model, 'hf_tokenizer'):
                del self.model.hf_tokenizer
            del self.model
        
        if torch.cuda.is_available():
            torch.cuda.empty_cache()
        gc.collect()
        my_cprint("Whisper model removed from memory.", 'red')

shashikg (Owner) commented Feb 25, 2024

How much of a slowdown are we discussing versus directly calling ffmpeg?

It depends on the file size. Say resampling takes 5 seconds per file for some 1-hour files, and you have 20 such files in a request: the overall reduction will be ~5*19 seconds, because WhisperS2T runs the ffmpeg command in a separate thread. It resamples the first audio file and sends it for transcription, and in parallel the remaining audio files are resampled in the background. The same thing can be done with PyAV (not sure, though -- it depends on whether the PyAV interface is blocking or non-blocking).
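The overlap described above can be sketched with stand-in functions (the fake_resample/fake_transcribe names are invented for illustration; the real steps would be the ffmpeg call and the Whisper model):

```python
from concurrent.futures import ThreadPoolExecutor

def fake_resample(name):
    # stand-in for the background ffmpeg resampling step
    return name.rsplit(".", 1)[0] + "_16k.wav"

def fake_transcribe(wav):
    # stand-in for sending the resampled file to the model
    return "transcript of " + wav

files = ["file%d.mp3" % i for i in range(5)]
results = []
with ThreadPoolExecutor(max_workers=1) as pool:
    # Queue every resample up front; the worker thread churns through
    # them while the main thread transcribes whichever file is ready.
    futures = [pool.submit(fake_resample, f) for f in files]
    for fut in futures:
        results.append(fake_transcribe(fut.result()))
```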

Plus, there's a lot of different platforms out there and different installation procedures for each one.

I couldn't get the resampling to work automatically using whisperS2T

Weird... What system are you using? What's the exact issue?

BBC-Esq (Author) commented Feb 25, 2024

Thanks for the explanation; it makes sense and is interesting to know. Every little bit helps, I suppose, when you're talking about improving overall speed. If I have time today I'll try to revert my script to what it was when I encountered the error, i.e. before I implemented pyav, but unfortunately that's difficult because I don't use any versioning workflow. I'm just using Notepad++... don't laugh. ;-) I do know that this script wasn't automatically resampling and was giving an error that it couldn't process the audio file without it being resampled first. Also, this may have been with a different file than the Sam Altman one. I'm testing it on .mp3, .wma, .flac, and .wav. Hope that helps.

shashikg (Owner) commented:

@BBC-Esq removed redundant ffmpeg logs: 33e305f

2 participants