Commit

Refactor microphone pipeline to remove unnecessary dependencies and improve overall logic #785
davidmezzetti committed Sep 27, 2024
1 parent 05fe040 commit ea730ff
Showing 9 changed files with 275 additions and 84 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/build.yml
@@ -34,7 +34,7 @@ jobs:
java-version: "8"

- name: Install dependencies - Linux
-run: sudo apt-get update && sudo apt-get install libsndfile1 portaudio19-dev
+run: sudo apt-get update && sudo apt-get install libsndfile1 portaudio19
if: matrix.os == 'ubuntu-latest'

- name: Install dependencies - macOS
2 changes: 1 addition & 1 deletion docker/base/Dockerfile
@@ -21,7 +21,7 @@ ENV LANG=C.UTF-8
RUN \
# Install required packages
apt-get update && \
-apt-get -y --no-install-recommends install libgomp1 libsndfile1 portaudio19-dev gcc g++ python${PYTHON_VERSION} python${PYTHON_VERSION}-dev python3-pip && \
+apt-get -y --no-install-recommends install libgomp1 libsndfile1 portaudio19 gcc g++ python${PYTHON_VERSION} python${PYTHON_VERSION}-dev python3-pip && \
rm -rf /var/lib/apt/lists && \
\
# Install txtai project and dependencies
4 changes: 2 additions & 2 deletions docs/install.md
@@ -130,13 +130,13 @@ Additional environment specific prerequisites are below.

### Linux

-The AudioStream and Microphone pipelines require the [PortAudio](https://people.csail.mit.edu/hubert/pyaudio) system library. The Transcription pipeline requires the [SoundFile](https://github.com/bastibe/python-soundfile#installation) system library.
+The AudioStream and Microphone pipelines require the [PortAudio](https://python-sounddevice.readthedocs.io/en/0.5.0/installation.html) system library. The Transcription pipeline requires the [SoundFile](https://github.com/bastibe/python-soundfile#installation) system library.

### macOS

Older versions of Faiss have a runtime dependency on `libomp` for macOS. Run `brew install libomp` in this case.

-The AudioStream and Microphone pipelines require the [PortAudio](https://people.csail.mit.edu/hubert/pyaudio) system library.
+The AudioStream and Microphone pipelines require the [PortAudio](https://python-sounddevice.readthedocs.io/en/0.5.0/installation.html) system library. Run `brew install portaudio`.

### Windows

11 changes: 11 additions & 0 deletions docs/pipeline/audio/texttospeech.md
@@ -15,6 +15,16 @@ from txtai.pipeline import TextToSpeech
# Create and run pipeline
tts = TextToSpeech()
tts("Say something here")

+# Stream audio - incrementally generates snippets of audio
+yield from tts(
+"Say something here. And say something else",
+streaming=True
+)
+
+# Generate audio using a speaker id
+tts = TextToSpeech("neuml/vctk-vits-onnx")
+tts("Say something here", speaker=42)
```

See the link below for a more detailed example.
@@ -27,6 +37,7 @@ This pipeline is backed by ONNX models from the Hugging Face Hub. The following

- [ljspeech-jets-onnx](https://huggingface.co/NeuML/ljspeech-jets-onnx)
- [ljspeech-vits-onnx](https://huggingface.co/NeuML/ljspeech-vits-onnx)
+- [vctk-vits-onnx](https://huggingface.co/NeuML/vctk-vits-onnx)

## Configuration-driven example

2 changes: 0 additions & 2 deletions setup.py
@@ -56,9 +56,7 @@
extras["pipeline-audio"] = [
"onnx>=1.11.0",
"onnxruntime>=1.11.0",
-"pyaudio>=0.2.14",
"scipy>=1.4.1",
-"speechrecognition>=3.10.4",
"sounddevice>=0.5.0",
"soundfile>=0.10.3.post1",
"ttstokenizer>=1.0.0",
4 changes: 2 additions & 2 deletions src/python/txtai/pipeline/audio/audiostream.py
@@ -12,7 +12,7 @@
import sounddevice as sd

SOUNDDEVICE = True
-except ImportError:
+except (ImportError, OSError):
SOUNDDEVICE = False

from ..base import Pipeline
@@ -35,7 +35,7 @@ def __init__(self, rate=22050):
"""

if not SOUNDDEVICE:
-raise ImportError('AudioStream pipeline is not available - install "pipeline" extra to enable')
+raise ImportError("SoundDevice library not installed or portaudio library not found")

# Sampler rate
self.rate = rate
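
Note on the hunk above: widening the guard to `(ImportError, OSError)` matters because `sounddevice` can be installed as a Python package and still fail at import time with an `OSError` when the PortAudio shared library is missing. The following is a minimal standalone sketch of that optional-dependency pattern, not the txtai implementation. The helper `available` and the class `AudioStreamSketch` are hypothetical names used only for illustration.

```python
import importlib


def available(module_name):
    """
    Returns True if the named module imports cleanly, False otherwise.
    Hypothetical helper, not part of txtai.
    """

    try:
        importlib.import_module(module_name)
        return True
    except (ImportError, OSError):
        # ImportError: the Python package is not installed
        # OSError: the package is installed but a native library failed to load
        return False


# Module-level flag, mirroring the SOUNDDEVICE flag in the diff
SOUNDDEVICE = available("sounddevice")


class AudioStreamSketch:
    """
    Illustrative stand-in for a pipeline that depends on sounddevice.
    """

    def __init__(self, rate=22050):
        # Fail at construction time with a message covering both failure modes
        if not SOUNDDEVICE:
            raise ImportError("SoundDevice library not installed or portaudio library not found")

        # Sample rate
        self.rate = rate
```

Deferring the failure to `__init__` keeps the module importable on machines without audio support, so only code that actually constructs the pipeline sees the error.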