Hello is a project aimed at democratizing audio processing and transcription services. It provides an automated solution for monitoring folders, processing audio files, and generating transcriptions using either Faster Whisper (self-hosted) or Groq Whisper.
- 📁 Automatic folder monitoring for new audio files
- 🔄 Real-time audio processing and transcription
- 🗄️ SQLite database for storing processed files and transcriptions
- 📊 Performance tracking and statistics
- 🌐 FastAPI server for status updates and file searching
- 🔌 Support for multiple transcription providers (Faster Whisper and Groq Whisper)
- 📄 CSV export of transcription data
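The folder-monitoring feature can be pictured as a simple polling loop; the sketch below is an illustration only (the function names, extension list, and polling approach are assumptions, not the project's actual implementation):

```python
import time
from pathlib import Path
from typing import List, Set

# Hypothetical set of extensions the watcher would consider audio.
AUDIO_EXTENSIONS = {".mp3", ".wav", ".m4a", ".ogg", ".flac"}

def find_new_audio_files(folder: Path, seen: Set[Path]) -> List[Path]:
    """Return audio files in `folder` that have not been seen before."""
    new_files = [
        p for p in sorted(folder.iterdir())
        if p.suffix.lower() in AUDIO_EXTENSIONS and p not in seen
    ]
    seen.update(new_files)
    return new_files

def watch(folder: Path, interval: float = 5.0) -> None:
    """Poll `folder` and hand each new file to the transcription pipeline."""
    seen: Set[Path] = set()
    while True:
        for path in find_new_audio_files(folder, seen):
            print(f"New recording: {path.name}")  # the real pipeline would transcribe here
        time.sleep(interval)
```

In practice a filesystem-event library could replace polling, but the contract is the same: each new audio file is detected once and handed to the transcriber.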
- Python 3.8+
- CUDA-compatible GPU (for Faster Whisper)
- NVIDIA CUDA Toolkit 12.x
- cuBLAS for CUDA 12
- cuDNN 8 for CUDA 12 (NVIDIA Archive)
- FFmpeg
Option 1: Use Docker
The libraries are pre-installed in official NVIDIA CUDA Docker images:
- `nvidia/cuda:12.0.0-runtime-ubuntu20.04`
- `nvidia/cuda:12.0.0-runtime-ubuntu22.04`
Option 2: Install with pip (Linux only)
```bash
pip install nvidia-cublas-cu12 nvidia-cudnn-cu12
export LD_LIBRARY_PATH=`python3 -c 'import os; import nvidia.cublas.lib; import nvidia.cudnn.lib; print(os.path.dirname(nvidia.cublas.lib.__file__) + ":" + os.path.dirname(nvidia.cudnn.lib.__file__))'`
```
Note: Ensure you're using cuDNN 8, as version 9+ may cause issues.
Option 3: Download from Purfview's repository (Windows & Linux)
Download the required NVIDIA libraries from Purfview's whisper-standalone-win repository. Extract the archive and add the library directory to your system's PATH.
For detailed installation instructions, refer to the official NVIDIA documentation.
1. Clone the repository:

   ```bash
   git clone https://github.com/namastexlabs/hello.git
   cd hello
   ```
2. Create a virtual environment:

   ```bash
   python -m venv venv
   source venv/bin/activate  # On Windows, use `venv\Scripts\activate`
   ```
3. Install the required packages:

   ```bash
   pip install -r requirements.txt
   ```
4. Set up environment variables: create a `.env` file in the project root and add the following:

   ```env
   GROQ_API_KEYS=your_groq_api_key1,your_groq_api_key2
   RECORDINGS_PATH=./recordings
   ```
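Since `GROQ_API_KEYS` accepts several comma-separated keys, the application presumably rotates among them; a minimal sketch of parsing that variable (the helper name is hypothetical):

```python
import os
from typing import List

def parse_api_keys(raw: str) -> List[str]:
    """Split a comma-separated GROQ_API_KEYS value into individual keys,
    tolerating stray whitespace and empty entries."""
    return [key.strip() for key in raw.split(",") if key.strip()]

# In the application this would read os.environ["GROQ_API_KEYS"].
keys = parse_api_keys("your_groq_api_key1, your_groq_api_key2")
```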
Run the main script with the desired options:

```bash
python main.py --language pt --provider faster_whisper --model_size large-v3
```
- `--language`: Language code for transcription (default: pt)
- `--provider`: Transcription provider (choices: groq, faster_whisper; default: faster_whisper)
- `--model_size`: Model size for Faster Whisper (default: large-v3)
- `--device`: Device for Faster Whisper (default: cuda)
- `--compute_type`: Compute type for Faster Whisper (default: float16)
- `--log-level`: Set the logging level (choices: DEBUG, INFO, WARNING, ERROR, CRITICAL; default: INFO)
- `--clean-stats`: Clean the transcription stats database
- `--stats-db`: Path to the stats database (default: transcription_stats.db)
- `--database`: Path to the main database (default: processed_files.db)
For a full list of Faster Whisper-specific options, run:

```bash
python main.py --help
```
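The arguments above map onto a standard `argparse` setup; the following is a hedged reconstruction for illustration (the real `main.py` likely defines additional Faster Whisper options):

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    """Rebuild the CLI described above; a sketch, not the project's actual parser."""
    parser = argparse.ArgumentParser(description="Audio transcription service")
    parser.add_argument("--language", default="pt",
                        help="Language code for transcription")
    parser.add_argument("--provider", choices=["groq", "faster_whisper"],
                        default="faster_whisper", help="Transcription provider")
    parser.add_argument("--model_size", default="large-v3",
                        help="Model size for Faster Whisper")
    parser.add_argument("--device", default="cuda",
                        help="Device for Faster Whisper")
    parser.add_argument("--compute_type", default="float16",
                        help="Compute type for Faster Whisper")
    parser.add_argument("--log-level", default="INFO",
                        choices=["DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL"],
                        help="Set the logging level")
    parser.add_argument("--clean-stats", action="store_true",
                        help="Clean the transcription stats database")
    parser.add_argument("--stats-db", default="transcription_stats.db",
                        help="Path to the stats database")
    parser.add_argument("--database", default="processed_files.db",
                        help="Path to the main database")
    return parser

args = build_parser().parse_args(["--language", "en", "--provider", "groq"])
```

Unspecified flags fall back to the documented defaults, so `args.model_size` here is still `"large-v3"`.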
- `/healthz`: Health check endpoint
- `/status`: Get current processing status
- `/search-files`: Search processed files with optional filters
- `/api-key-status`: Check the status of API keys
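Behind `/search-files`, the SQLite lookup with optional filters can be sketched with the stdlib `sqlite3` module; the schema and column names below are assumptions for illustration, not the project's actual ones:

```python
import sqlite3
from typing import List, Optional

# Hypothetical schema mirroring "processed files and transcriptions".
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE processed_files "
    "(filename TEXT, transcription TEXT, processed_at TEXT)"
)
conn.executemany(
    "INSERT INTO processed_files VALUES (?, ?, ?)",
    [
        ("meeting.wav", "weekly standup notes", "2024-01-10"),
        ("call.mp3", "customer support call", "2024-01-11"),
    ],
)

def search_files(conn: sqlite3.Connection, text: Optional[str] = None) -> List[str]:
    """Return filenames, optionally filtered by a substring of the transcription."""
    query = "SELECT filename FROM processed_files"
    params = ()
    if text:
        query += " WHERE transcription LIKE ?"
        params = ("%" + text + "%",)
    return [row[0] for row in conn.execute(query, params)]
```

Parameterized queries (`?` placeholders) keep the filter safe against SQL injection, which matters for an endpoint that accepts user-supplied search strings.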
The example project (TODO) will showcase an end-to-end solution that:
- Captures office activity
- Transcribes recordings every few minutes
- Saves timestamped database records
- Provides API access to transcriptions
This setup aims to facilitate easier access to transcriptions for agent systems.
- Implement MONITOR_FOLDER environment variable for dynamic folder monitoring
- Develop a user interface for easier management and visualization
- Implement real-time audio streaming and transcription
- Optimize performance for large-scale deployments
- Develop plugins for popular audio recording software
Contributions are welcome! Please feel free to submit a Pull Request.
This error occurs when the environment does not have the CUDA toolkit installed or properly configured. Ensure that you have the CUDA toolkit installed and that your environment variables are correctly set up.
You can download the CUDA toolkit from the NVIDIA website.
Join our Discord community to discuss the project, get help, and contribute: https://discord.gg/MXa5GsVcCB
This project is licensed under the MIT License - see the LICENSE file for details.
- Faster Whisper for the efficient transcription engine
- Groq for their Whisper API
- All contributors and supporters of the project