Speech recognition

🌟	Support this project
	`bc1qs6qq0fkqqhp4whwq8u8zc5egprakvqxewr5pmx`
	`0x3147bEE3179Df0f6a0852044BFe3C59086072e12`
	`TKznmR65yhPt5qmYCML4tNSWFeeUkgYSEV`

JVM library for speech recognition, written in Kotlin and based on the C++ library whisper.cpp and the Silero ML model

Features

Recognizes speech in PCM audio data and returns the result as a string
Supports any sample rate and number of channels through resampling and downmixing
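
To illustrate what downmixing and resampling mean here, the following is a minimal sketch and not the library's actual implementation. The helper names (decodePcm16, downmixToMono, resampleLinear) are hypothetical, and the sketch assumes interleaved little-endian signed 16-bit PCM. The 16 kHz target in the usage note is chosen because whisper.cpp models expect 16 kHz mono input.

```kotlin
import java.nio.ByteBuffer
import java.nio.ByteOrder

// Illustration only: hypothetical helpers, not part of the library's API.

/** Decodes little-endian signed 16-bit PCM bytes into floats in [-1, 1). */
fun decodePcm16(bytes: ByteArray): FloatArray {
    val buffer = ByteBuffer.wrap(bytes).order(ByteOrder.LITTLE_ENDIAN).asShortBuffer()
    return FloatArray(buffer.remaining()) { buffer.get(it) / 32768f }
}

/** Averages interleaved channels into a single mono channel. */
fun downmixToMono(samples: FloatArray, channels: Int): FloatArray {
    require(channels > 0) { "channels must be positive" }
    val frames = samples.size / channels
    return FloatArray(frames) { frame ->
        var sum = 0f
        for (channel in 0 until channels) sum += samples[frame * channels + channel]
        sum / channels
    }
}

/** Resamples mono audio with plain linear interpolation (no anti-aliasing filter). */
fun resampleLinear(samples: FloatArray, fromRate: Int, toRate: Int): FloatArray {
    require(fromRate > 0 && toRate > 0) { "sample rates must be positive" }
    if (samples.isEmpty() || fromRate == toRate) return samples.copyOf()
    val outSize = (samples.size.toLong() * toRate / fromRate).toInt()
    val step = fromRate.toDouble() / toRate
    return FloatArray(outSize) { i ->
        val position = i * step
        val left = position.toInt().coerceAtMost(samples.lastIndex)
        val right = (left + 1).coerceAtMost(samples.lastIndex)
        val fraction = (position - left).toFloat()
        samples[left] + (samples[right] - samples[left]) * fraction
    }
}
```

For example, 44.1 kHz stereo input could be prepared with resampleLinear(downmixToMono(decodePcm16(bytes), 2), 44100, 16000). Linear interpolation is the simplest choice and can alias when downsampling; a production resampler would normally apply a low-pass filter first.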

Installation

Download latest release

Add library dependency

dependencies {
    // files(...) produces a file collection, which Gradle accepts as a dependency
    implementation(files("/path/to/jar"))
}

whisper.cpp

Unzip binaries
Download one of the models here or use any other compatible model

Silero

Add ONNX dependency

dependencies {
    implementation("com.microsoft.onnxruntime:onnxruntime:1.20.0")
}

Download

Usage

TL;DR

See the example module for implementation details

Call recognize to process the input data and get the recognized string

Step-by-step

Load binaries if you are going to use whisper.cpp

CPU

SpeechRecognition.Whisper.loadCPU(
    ggmlBase = "/path/to/ggml-base",
    ggmlCpu = "/path/to/ggml-cpu",
    ggml = "/path/to/ggml",
    speechRecognitionWhisper = "/path/to/speech-recognition-whisper",
)

CUDA

SpeechRecognition.Whisper.loadCUDA(
    ggmlBase = "/path/to/ggml-base",
    ggmlCpu = "/path/to/ggml-cpu",
    ggmlCuda = "/path/to/ggml-cuda",
    ggml = "/path/to/ggml",
    speechRecognitionWhisper = "/path/to/speech-recognition-whisper",
)

Create an instance

whisper.cpp

SpeechRecognition.Whisper.create(modelPath = "/path/to/model")

Silero

SpeechRecognition.Silero.create(modelPath = "/path/to/model")

Call minimumInputSize to get the buffer size the audio producer should use for real-time recognition
Call adjustTemperature to adjust the temperature parameter
Call recognize, passing the input data, sample rate, and number of channels as arguments
Call reset to reset the internal state - for example, when the audio source changes
Call close to release resources
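
The order of these calls can be sketched as follows. This is a minimal sketch under loud assumptions: the Recognizer interface and FakeRecognizer below are hypothetical stand-ins that reuse only the method names listed above, and the parameter and return types are guesses. The library's real signatures may differ, so refer to the example module for actual usage.

```kotlin
// Hypothetical stand-in: mirrors only the method names from this README.
// Parameter and return types are assumptions, not the library's real API.
interface Recognizer : AutoCloseable {
    fun minimumInputSize(sampleRate: Int, channels: Int): Int
    fun adjustTemperature(temperature: Float)
    fun recognize(pcmBytes: ByteArray, sampleRate: Int, channels: Int): String
    fun reset()
}

// Fake implementation so the sketch runs without native binaries or a model.
class FakeRecognizer : Recognizer {
    val calls = mutableListOf<String>()

    override fun minimumInputSize(sampleRate: Int, channels: Int): Int {
        calls += "minimumInputSize"
        // One second of 16-bit PCM; an arbitrary value chosen for this sketch.
        return sampleRate * channels * 2
    }

    override fun adjustTemperature(temperature: Float) {
        calls += "adjustTemperature"
    }

    override fun recognize(pcmBytes: ByteArray, sampleRate: Int, channels: Int): String {
        calls += "recognize"
        return "chunk of ${pcmBytes.size} bytes"
    }

    override fun reset() {
        calls += "reset"
    }

    override fun close() {
        calls += "close"
    }
}

/** Feeds audio to the recognizer in chunks, then resets and closes it. */
fun transcribe(recognizer: Recognizer, audio: ByteArray, sampleRate: Int, channels: Int): List<String> {
    val results = mutableListOf<String>()
    try {
        recognizer.adjustTemperature(0f)
        val chunkSize = recognizer.minimumInputSize(sampleRate, channels)
        var offset = 0
        while (offset < audio.size) {
            val end = minOf(offset + chunkSize, audio.size)
            results += recognizer.recognize(audio.copyOfRange(offset, end), sampleRate, channels)
            offset = end
        }
        // The source is exhausted, so clear the internal state before any reuse.
        recognizer.reset()
    } finally {
        // Always release resources, even if recognition throws.
        recognizer.close()
    }
    return results
}
```

Wrapping the calls in try/finally ensures close runs even when recognition fails, which matters when the instance holds native resources.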
