I have the following frontend code, which captures microphone audio in the browser and sends it over a WebSocket:
```js
const webSocket = new WebSocket('ws://127.0.0.1:3000');

webSocket.onmessage = event => {
  console.log('Message from server:', event.data);
};

webSocket.onopen = () => {
  console.log('Connected to server');
};

webSocket.onclose = (event) => {
  console.log('Disconnected from server: ', event.code, event.reason);
};

webSocket.onerror = error => {
  console.error('Error:', error);
};

const constraints = { audio: true };
let recorder;

function start() {
  navigator.mediaDevices.getUserMedia(constraints).then(mediaStream => {
    // use the MediaStream Recording API
    recorder = new MediaRecorder(mediaStream);

    // fires every two seconds and passes a BlobEvent
    recorder.ondataavailable = event => {
      // get the Blob from the event
      const blob = event.data;
      // and send that blob to the server...
      webSocket.send(blob);
    };

    // make the dataavailable event fire every two seconds
    recorder.start(2000);
  });
}

function stop() {
  recorder.stop();
  webSocket.close(1000, "Finished sending audio");
}
```
It uses the MediaRecorder API to send an audio chunk every 2 seconds. This is received on the backend like this:
main.py:
```python
import asyncio
from io import BytesIO

import websockets

from ASR.ASR import ASR

_ASR = ASR("tiny", "auto", "int8")


async def handler(websocket):
    while True:
        try:
            # Receive binary data directly from the client
            data = await websocket.recv()
            # Handle the audio data with Whisper
            _ASR.process_audio(data)
            # Optionally, send an acknowledgment back to the client
            await websocket.send("Chunk received")
        except websockets.ConnectionClosed:
            print("Connection closed")
            break

# Start WebSocket server
start_server = websockets.serve(handler, "127.0.0.1", 3000)
asyncio.get_event_loop().run_until_complete(start_server)
asyncio.get_event_loop().run_forever()
```
ASR.py:
```python
from io import BytesIO
import re
from typing import List

from ASR.LocalAgreement import LocalAgreement
from faster_whisper import WhisperModel
import soundfile as sf


class ASR:
    audio_buffer: BytesIO = BytesIO()
    local_agreement = LocalAgreement()
    context: str = ""
    confirmed_sentences: List[str] = []

    def __init__(self, model_size: str, device="auto", compute_type="int8"):
        self.whisper_model = WhisperModel(model_size, device=device, compute_type=compute_type)

    def transcribe(self, audio_buffer: BytesIO, context: str):
        transcribed_text = ""
        segments, info = self.whisper_model.transcribe(audio_buffer)
        for segment in segments:
            transcribed_text += " " + segment.text
        return transcribed_text

    def process_audio(self, audio_chunk) -> str:
        # Append new audio data to the main buffer
        self.audio_buffer.write(audio_chunk)
        self.audio_buffer.seek(0)  # Reset buffer's position to the beginning
        transcribed_text = self.transcribe(self.audio_buffer, self.context)
        print("transcribed_text: " + transcribed_text)

        confirmed_text = self.local_agreement.confirm_tokens(transcribed_text)
        print(confirmed_text)

        punctuation = r"[.!?]"  # Regular expression pattern for ., !, or ?
        # Detect punctuation
        print("check punctuation: ", re.search(punctuation, confirmed_text))
        if re.search(punctuation, confirmed_text):
            split_sentence = re.split(f"({punctuation})", confirmed_text)
            # Join the punctuation back onto the respective parts of the sentence
            sentence = [split_sentence[i] + split_sentence[i + 1] for i in range(0, len(split_sentence) - 1, 2)]
            print("sentence", sentence)
            self.confirmed_sentences.append(sentence[-1])
            self.context = " ".join(self.confirmed_sentences)
            print("context added: " + self.context)
            # Clear the main audio buffer only after processing is complete
            self.audio_buffer = BytesIO()
        return confirmed_text
```
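For context on the buffering itself: `process_audio` leans on `BytesIO` keeping a single shared read/write position, so a `write()` issued after a full `read()` appends to the existing data. A minimal sketch of just that behaviour, independent of Whisper (the chunk bytes are placeholders, not real audio):

```python
from io import BytesIO

buf = BytesIO()

buf.write(b"chunk-1")   # position is now at the end of the written data
buf.seek(0)             # rewind so a reader sees the stream from the start
print(buf.read())       # b'chunk-1'; reading moves the position back to the end

buf.write(b"chunk-2")   # appends, because the position sits at the end again
buf.seek(0)
print(buf.read())       # b'chunk-1chunk-2'
```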
The issue happens when I try to clear the audio buffer. My thought is to clear the buffer every time I detect punctuation, meaning a sentence has ended. However, clearing the buffer throws the following error:
```
connection handler failed
Traceback (most recent call last):
  File "/Users/frederik/Uni/P7/P7Project/backend/.venv/lib/python3.12/site-packages/websockets/legacy/server.py", line 245, in handler
    await self.ws_handler(self)
  File "/Users/frederik/Uni/P7/P7Project/backend/./src/__main__.py", line 15, in handler
    _ASR.process_audio(data)
  File "/Users/frederik/Uni/P7/P7Project/backend/./src/ASR/ASR.py", line 30, in process_audio
    transcribed_text = self.transcribe(self.audio_buffer, self.context)
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/frederik/Uni/P7/P7Project/backend/./src/ASR/ASR.py", line 18, in transcribe
    segments, info = self.whisper_model.transcribe(audio_buffer)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/frederik/Uni/P7/P7Project/backend/.venv/lib/python3.12/site-packages/faster_whisper/transcribe.py", line 319, in transcribe
    audio = decode_audio(audio, sampling_rate=sampling_rate)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/frederik/Uni/P7/P7Project/backend/.venv/lib/python3.12/site-packages/faster_whisper/audio.py", line 46, in decode_audio
    with av.open(input_file, mode="r", metadata_errors="ignore") as container:
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "av/container/core.pyx", line 420, in av.container.core.open
  File "av/container/core.pyx", line 266, in av.container.core.Container.__cinit__
  File "av/container/core.pyx", line 286, in av.container.core.Container.err_check
  File "av/error.pyx", line 326, in av.error.err_check
av.error.InvalidDataError: [Errno 1094995529] Invalid data found when processing input: '<none>'
```
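As far as I can tell, the PyAV failure at the bottom can be reproduced outside the WebSocket pipeline whenever the buffer handed to faster-whisper no longer starts with a valid container header. A minimal sketch, assuming PyAV is installed (the buffer contents below are made up for illustration):

```python
from io import BytesIO

import av
import av.error

# Hypothetical stand-in for the cleared buffer: bytes that do not start with
# a container header (e.g. only MediaRecorder continuation data, no WebM header).
headerless_buffer = BytesIO(b"\x00" * 1024)

try:
    # Mirrors the av.open(...) call from faster_whisper's decode_audio in the traceback.
    av.open(headerless_buffer, mode="r", metadata_errors="ignore")
except av.error.InvalidDataError as exc:
    print("PyAV cannot probe a container:", exc)
```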