Commit 39a821d

dangusev and Nash0x7E2 authored
Update Deepgram plugin to use SDK v5.0.0 (#98)
* Update Deepgram plugin to use SDK v5.0.0
* Merge test_realtime and test_stt and update the remaining tests
* Make deepgram.STT.start() idempotent
* Clean up unused import
* Use uv as the default package manager > pip

---------

Co-authored-by: Neevash Ramdial (Nash) <mail@neevash.dev>
1 parent 2013be5 commit 39a821d
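
The commit message mentions making `deepgram.STT.start()` idempotent, but the changed implementation is not shown on this page. As a rough illustration only, one common way to get that behavior in asyncio code is a lock plus a "started" flag. The class and method names below (`IdempotentStarter`, `_connect`) are hypothetical and are not taken from the plugin.

```python
import asyncio


class IdempotentStarter:
    """Hypothetical stand-in for a streaming client; NOT the plugin's actual class."""

    def __init__(self) -> None:
        self._start_lock = asyncio.Lock()
        self._started = False
        self.connect_calls = 0  # counts how often the expensive connect step runs

    async def _connect(self) -> None:
        # Placeholder for opening a network connection (assumption for the sketch).
        self.connect_calls += 1
        await asyncio.sleep(0)

    async def start(self) -> None:
        # The lock serializes concurrent callers; the flag turns repeats into no-ops.
        async with self._start_lock:
            if self._started:
                return
            await self._connect()
            self._started = True


async def demo() -> int:
    client = IdempotentStarter()
    # Three concurrent calls plus one later call should connect exactly once.
    await asyncio.gather(client.start(), client.start(), client.start())
    await client.start()
    return client.connect_calls


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Setting the flag only after `_connect()` succeeds means a failed first attempt can be retried by calling `start()` again.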

6 files changed: +682 -749 lines changed

plugins/deepgram/README.md

Lines changed: 13 additions & 9 deletions
@@ -1,23 +1,24 @@
 # Deepgram Speech-to-Text Plugin
 
-A high-quality Speech-to-Text (STT) plugin for GetStream that uses the Deepgram API.
+A high-quality Speech-to-Text (STT) plugin for Vision agents that uses the Deepgram API.
 
 ## Installation
 
 ```bash
-pip install getstream-plugins-deepgram
+uv add vision-agents-plugins-deepgram
 ```
 
 ## Usage
 
 ```python
-from getstream.plugins.deepgram import DeepgramSTT
+from vision_agents.plugins import deepgram
+from getstream.video.rtc.track_util import PcmData
 
 # Initialize with API key from environment variable
-stt = DeepgramSTT()
+stt = deepgram.STT()
 
 # Or specify API key directly
-stt = DeepgramSTT(api_key="your_deepgram_api_key")
+stt = deepgram.STT(api_key="your_deepgram_api_key")
 
 # Register event handlers
 @stt.on("transcript")
@@ -29,6 +30,7 @@ def on_partial(text, user, metadata):
     print(f"Partial transcript from {user}: {text}")
 
 # Process audio
+pcm_data = PcmData(samples=b"\x00\x00" * 1000, sample_rate=48000, format="s16")
 await stt.process_audio(pcm_data)
 
 # When done
@@ -37,14 +39,16 @@ await stt.close()
 
 ## Configuration Options
 
-- `api_key`: Deepgram API key (default: reads from DEEPGRAM_API_KEY environment variable)
-- `options`: Deepgram LiveOptions for configuring the transcription
+- `api_key`: Deepgram API key (default: reads from `DEEPGRAM_API_KEY` environment variable)
+- `options`: Deepgram options for configuring the transcription.
+See the [Deepgram Listen V1 Connect API documentation](https://github.com/deepgram/deepgram-python-sdk/blob/main/websockets-reference.md#%EF%B8%8F-parameters) for more details.
 - `sample_rate`: Sample rate of the audio in Hz (default: 16000)
 - `language`: Language code for transcription (default: "en-US")
-- `keep_alive_interval`: Interval in seconds to send keep-alive messages (default: 5.0)
+- `keep_alive_interval`: Interval in seconds to send keep-alive messages (default: 1.0s)
+- `connection_timeout`: Timeout in seconds to wait for the Deepgram connection to be established (default: 15.0s)
 
 ## Requirements
 
 - Python 3.10+
-- deepgram-sdk>=4.5.0
+- deepgram-sdk>=5.0.0,<5.1
 - numpy>=2.2.6,<2.3
plugins/deepgram/pyproject.toml

Lines changed: 1 addition & 1 deletion
@@ -12,7 +12,7 @@ requires-python = ">=3.10"
 license = "MIT"
 dependencies = [
     "vision-agents",
-    "deepgram-sdk==4.8.1",
+    "deepgram-sdk>=5.0.0,<5.1",
     "numpy>=2.2.6,<2.3",
 ]
 
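
The dependency change replaces the exact pin `==4.8.1` with the range `>=5.0.0,<5.1`, which admits patch releases in the 5.0 series and rejects 5.1 and later. The sketch below illustrates that range with a deliberately simplified comparison. It assumes plain numeric versions such as `5.0.3` and does not implement the full PEP 440 rules that real installers apply. The helper names are made up for this example.

```python
from typing import Tuple


def parse_release(version: str) -> Tuple[int, int, int]:
    """Parse 'X', 'X.Y', or 'X.Y.Z' into a 3-tuple, padding missing parts with 0."""
    parts = [int(piece) for piece in version.split(".")]
    if not 1 <= len(parts) <= 3:
        raise ValueError(f"unsupported version string: {version!r}")
    parts += [0] * (3 - len(parts))
    return (parts[0], parts[1], parts[2])


def satisfies_new_pin(version: str) -> bool:
    """Return True if version falls in the range >=5.0.0,<5.1 (simplified check)."""
    return parse_release("5.0.0") <= parse_release(version) < parse_release("5.1")


if __name__ == "__main__":
    for candidate in ("4.8.1", "5.0.0", "5.0.3", "5.1.0"):
        print(candidate, satisfies_new_pin(candidate))
```

Padding to three components keeps `5.1` and `5.1.0` equal under tuple comparison, so the upper bound excludes both.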
