Commit

Merge pull request #4 from SYSTRAN/master
Updating the base directory
Jiltseb authored Jan 22, 2024
2 parents b516bc8 + 72ff979 commit 3477d86
Showing 2 changed files with 17 additions and 1 deletion.
4 changes: 3 additions & 1 deletion README.md
@@ -187,10 +187,12 @@ Here is a non exhaustive list of open-source projects using faster-whisper. Feel

* [whisper-ctranslate2](https://github.com/Softcatala/whisper-ctranslate2) is a command line client based on faster-whisper and compatible with the original client from openai/whisper.
* [whisper-diarize](https://github.com/MahmoudAshraf97/whisper-diarization) is a speaker diarization tool that is based on faster-whisper and NVIDIA NeMo.
* [whisper-standalone-win](https://github.com/Purfview/whisper-standalone-win) contains the portable ready to run binaries of faster-whisper for Windows.
* [whisper-standalone-win](https://github.com/Purfview/whisper-standalone-win) Standalone CLI executables of faster-whisper for Windows, Linux & macOS.
* [asr-sd-pipeline](https://github.com/hedrergudene/asr-sd-pipeline) provides a scalable, modular, end to end multi-speaker speech to text solution implemented using AzureML pipelines.
* [Open-Lyrics](https://github.com/zh-plus/Open-Lyrics) is a Python library that transcribes voice files using faster-whisper, and translates/polishes the resulting text into `.lrc` files in the desired language using OpenAI-GPT.
* [wscribe](https://github.com/geekodour/wscribe) is a flexible transcript generation tool supporting faster-whisper, it can export word level transcript and the exported transcript then can be edited with [wscribe-editor](https://github.com/geekodour/wscribe-editor)
* [aTrain](https://github.com/BANDAS-Center/aTrain) is a graphical user interface implementation of faster-whisper developed at the BANDAS-Center at the University of Graz for transcription and diarization in Windows ([Windows Store App](https://apps.microsoft.com/detail/atrain/9N15Q44SZNS2)) and Linux.
* [WhisperLive](https://github.com/collabora/WhisperLive) is a nearly-live implementation of OpenAI's Whisper which uses faster-whisper as the backend to transcribe audio in real-time.

## Model conversion

14 changes: 14 additions & 0 deletions faster_whisper/transcribe.py
@@ -1009,6 +1009,13 @@ def generate_with_fallback(
            decode_result = max(
                below_cr_threshold_results or all_results, key=lambda x: x[1]
            )
            # to pass final temperature for prompt_reset_on_temperature
            decode_result = (
                decode_result[0],
                decode_result[1],
                temperature,
                decode_result[3],
            )

        return decode_result
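
The hunk above replaces the third element of the chosen result with the last temperature tried, presumably so that a caller comparing it against `prompt_reset_on_temperature` can tell that every fallback was exhausted. The following is a minimal sketch of that pattern, not the library's implementation: the function name `pick_with_fallback`, the threshold parameters, the tuple layout `(tokens, avg_logprob, temperature, compression_ratio)`, and `fake_decode` are all assumptions made for illustration.

```python
from typing import Callable, List, Sequence, Tuple

# Assumed tuple layout: (tokens, avg_logprob, temperature, compression_ratio).
DecodeResult = Tuple[List[int], float, float, float]


def pick_with_fallback(
    decode: Callable[[float], DecodeResult],
    temperatures: Sequence[float],
    max_compression_ratio: float = 2.4,
    min_avg_logprob: float = -1.0,
) -> DecodeResult:
    """Try each temperature in order; assumes `temperatures` is non-empty."""
    all_results: List[DecodeResult] = []
    below_cr_threshold_results: List[DecodeResult] = []

    for temperature in temperatures:
        decode_result = decode(temperature)
        all_results.append(decode_result)
        _, avg_logprob, _, compression_ratio = decode_result

        too_repetitive = compression_ratio > max_compression_ratio
        if not too_repetitive:
            below_cr_threshold_results.append(decode_result)
        if not too_repetitive and avg_logprob >= min_avg_logprob:
            break  # accepted at this temperature
    else:
        # Every temperature failed: keep the attempt with the best avg_logprob,
        # preferring attempts that passed the compression-ratio check.
        decode_result = max(
            below_cr_threshold_results or all_results, key=lambda x: x[1]
        )
        # Stamp the result with the last temperature tried, as in the hunk above.
        decode_result = (
            decode_result[0],
            decode_result[1],
            temperature,
            decode_result[3],
        )

    return decode_result


def fake_decode(temperature: float) -> DecodeResult:
    # Hypothetical decoder whose score is always below the acceptance threshold.
    return ([1, 2, 3], -2.0 - temperature, temperature, 1.5)


result = pick_with_fallback(fake_decode, [0.0, 0.2, 0.4])
print(result)
```

In this sketch the best-scoring attempt was produced at temperature 0.0, yet the returned tuple carries 0.4. Without the re-stamping step, the caller would see 0.0 and could not distinguish a clean first-pass decode from a fully failed one.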

@@ -1186,6 +1193,13 @@ def find_alignment(
        words, word_tokens = tokenizer.split_to_word_tokens(
            text_tokens + [tokenizer.eot]
        )
        if len(word_tokens) <= 1:
            # return on eot only
            # >>> np.pad([], (1, 0))
            # array([0.])
            # This results in crashes when we lookup jump_times with float, like
            # IndexError: arrays used as indices must be of integer (or boolean) type
            return []
        word_boundaries = np.pad(np.cumsum([len(t) for t in word_tokens[:-1]]), (1, 0))
        if len(word_boundaries) <= 1:
            return []
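
The comment in the hunk above describes a NumPy dtype pitfall: the cumulative sum of an empty list is a float array, so padding it yields `array([0.])`, which cannot be used as an index. The snippet below is a small reproduction under stated assumptions. The helper `word_boundaries_for` and the sample `jump_times` values are hypothetical; only the `np.pad(np.cumsum(...), (1, 0))` expression is taken from the diff.

```python
import numpy as np


def word_boundaries_for(word_tokens):
    """Hypothetical helper wrapping the boundary expression from the diff."""
    return np.pad(np.cumsum([len(t) for t in word_tokens[:-1]]), (1, 0))


# Two words followed by an end-of-text token: boundaries are integers.
normal = word_boundaries_for([[11, 12], [13], [99]])

# Only the end-of-text token: the cumulative sum is empty, hence float64.
eot_only = word_boundaries_for([[99]])

# Made-up timestamps standing in for the real jump_times array.
jump_times = np.array([0.0, 0.02, 0.04, 0.06])

print(normal.dtype.kind, normal.tolist())      # integer kind, [0, 2, 3]
print(eot_only.dtype.kind, eot_only.tolist())  # float kind, [0.0]
print(jump_times[normal].tolist())             # integer indexing works

raised = False
try:
    jump_times[eot_only]  # float index array is rejected by NumPy
except IndexError as exc:
    raised = True
    print("IndexError:", exc)
```

Checking `len(word_tokens) <= 1` before computing the boundaries avoids creating the float array at all, which is the approach the diff takes.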
