Commit 10ef0e1

Nash0x7E2 and dangusev authored
Moondream Detection API (#136)
* Basic structure setup and stubbing * Basic person detection and test * multi-object detection * Clean up and focus in on only detection * Further simplification * Rebase latest main commit ec32383 Author: Neevash Ramdial (Nash) <mail@neevash.dev> Date: Mon Oct 27 15:51:53 2025 -0600 mypy clean up (#130) commit c52fe4c Author: Neevash Ramdial (Nash) <mail@neevash.dev> Date: Mon Oct 27 15:28:00 2025 -0600 remove turn keeping from example (#129) commit e1072e8 Merge: 5bcffa3 fea101a Author: Yarik <43354956+yarikdevcom@users.noreply.github.com> Date: Mon Oct 27 14:28:05 2025 +0100 Merge pull request #106 from tjirab/feat/20251017_gh-labeler feat: Github pull request labeler commit 5bcffa3 Merge: 406673c bfe888f Author: Thierry Schellenbach <thierry@getstream.io> Date: Sat Oct 25 10:56:27 2025 -0600 Merge pull request #119 from GetStream/fix-screensharing Fix screensharing commit bfe888f Merge: 8019c14 406673c Author: Thierry Schellenbach <thierry@getstream.io> Date: Sat Oct 25 10:56:15 2025 -0600 Merge branch 'main' into fix-screensharing commit 406673c Author: Stefan Blos <stefan.blos@gmail.com> Date: Sat Oct 25 03:03:10 2025 +0200 Update README (#118) * Changed README to LaRaes version * Remove arrows from table * Add table with people & projects to follow * Update images and links in README.md commit 3316908 Author: Tommaso Barbugli <tbarbugli@gmail.com> Date: Fri Oct 24 23:48:06 2025 +0200 Simplify TTS plugin and audio utils (#123) - Simplified TTS plugin - AWS Polly TTS plugin - OpenAI TTS plugin - Improved audio utils commit 8019c14 Author: Max Kahan <max.kahan@getstream.io> Date: Fri Oct 24 17:32:26 2025 +0100 remove video forwarder lazy init commit ca62d37 Author: Max Kahan <max.kahan@getstream.io> Date: Thu Oct 23 16:44:03 2025 +0100 use correct codec commit 8cf8788 Author: Max Kahan <max.kahan@getstream.io> Date: Thu Oct 23 14:27:18 2025 +0100 rename variable to fix convention commit 33fd70d Author: Max Kahan <max.kahan@getstream.io> Date: Thu Oct 23 
14:24:42 2025 +0100 unsubscribe from events commit 3692131 Author: Max Kahan <max.kahan@getstream.io> Date: Thu Oct 23 14:19:53 2025 +0100 remove nonexistent type commit c5f68fe Author: Max Kahan <max.kahan@getstream.io> Date: Thu Oct 23 14:10:07 2025 +0100 cleanup tests to fit style commit 8b3c61a Author: Max Kahan <max.kahan@getstream.io> Date: Thu Oct 23 13:55:08 2025 +0100 clean up resources when track cancelled commit d8e08cb Author: Max Kahan <max.kahan@getstream.io> Date: Thu Oct 23 13:24:55 2025 +0100 fix track republishing in agent commit 0f8e116 Author: Max Kahan <max.kahan@getstream.io> Date: Wed Oct 22 15:37:11 2025 +0100 add tests commit 08e6133 Author: Max Kahan <max.kahan@getstream.io> Date: Wed Oct 22 15:25:37 2025 +0100 ensure video track dimensions are an even number commit 6a725b0 Merge: 5f001e0 5088709 Author: Thierry Schellenbach <thierry@getstream.io> Date: Thu Oct 23 15:23:58 2025 -0600 Merge pull request #122 from GetStream/cleanup_stt Cleanup STT commit 5088709 Author: Thierry Schellenbach <thierry@getstream.io> Date: Thu Oct 23 15:23:34 2025 -0600 cleanup of stt commit f185120 Author: Thierry Schellenbach <thierry@getstream.io> Date: Thu Oct 23 15:08:42 2025 -0600 more cleanup commit 05ccbfd Author: Thierry Schellenbach <thierry@getstream.io> Date: Thu Oct 23 14:51:48 2025 -0600 cleanup commit bb834ca Author: Thierry Schellenbach <thierry@getstream.io> Date: Thu Oct 23 14:28:53 2025 -0600 more cleanup for stt commit 7a3f2d2 Author: Thierry Schellenbach <thierry@getstream.io> Date: Thu Oct 23 14:11:35 2025 -0600 more test cleanup commit ad7f4fe Author: Thierry Schellenbach <thierry@getstream.io> Date: Thu Oct 23 14:10:57 2025 -0600 cleanup test commit 9e50cdd Author: Thierry Schellenbach <thierry@getstream.io> Date: Thu Oct 23 14:03:45 2025 -0600 large cleanup commit 5f001e0 Merge: 95a03e4 5d204f3 Author: Thierry Schellenbach <thierry@getstream.io> Date: Thu Oct 23 12:01:52 2025 -0600 Merge pull request #121 from GetStream/fish_stt [AI-201] 
Fish speech to text (partial) commit 5d204f3 Author: Thierry Schellenbach <thierry@getstream.io> Date: Thu Oct 23 11:48:16 2025 -0600 remove ugly tests commit ee9a241 Author: Thierry Schellenbach <thierry@getstream.io> Date: Thu Oct 23 11:46:19 2025 -0600 cleanup commit 6eb8270 Author: Thierry Schellenbach <thierry@getstream.io> Date: Thu Oct 23 11:23:00 2025 -0600 fix 48khz support commit 3b90548 Author: Thierry Schellenbach <thierry@getstream.io> Date: Thu Oct 23 10:59:08 2025 -0600 first attempt at fish stt, doesnt entirely work just yet commit 95a03e4 Merge: b90c9e3 b4c0da8 Author: Tommaso Barbugli <tbarbugli@gmail.com> Date: Thu Oct 23 10:11:39 2025 +0200 Merge branch 'main' of github.com:GetStream/Vision-Agents commit b90c9e3 Author: Tommaso Barbugli <tbarbugli@gmail.com> Date: Wed Oct 22 23:28:28 2025 +0200 remove print and double event handling commit b4c0da8 Merge: 3d06446 a426bc2 Author: Thierry Schellenbach <thierry@getstream.io> Date: Wed Oct 22 15:08:51 2025 -0600 Merge pull request #117 from GetStream/openrouter [AI-194] Openrouter commit a426bc2 Author: Thierry Schellenbach <thierry@getstream.io> Date: Wed Oct 22 15:03:10 2025 -0600 skip broken test commit ba6c027 Author: Thierry Schellenbach <thierry@getstream.io> Date: Wed Oct 22 14:50:23 2025 -0600 almost working openrouter commit 0b1c873 Author: Thierry Schellenbach <thierry@getstream.io> Date: Wed Oct 22 14:47:12 2025 -0600 almost working, just no instruction following commit ce63233 Author: Thierry Schellenbach <thierry@getstream.io> Date: Wed Oct 22 14:35:53 2025 -0600 working memory for openai commit 149e886 Author: Thierry Schellenbach <thierry@getstream.io> Date: Wed Oct 22 13:32:43 2025 -0600 todo commit e0df1f6 Author: Thierry Schellenbach <thierry@getstream.io> Date: Wed Oct 22 13:20:38 2025 -0600 first pass at adding openrouter commit 3d06446 Merge: 4eb8ef4 ef55d66 Author: Thierry Schellenbach <thierry@getstream.io> Date: Wed Oct 22 13:20:11 2025 -0600 Merge branch 'main' of 
github.com:GetStream/Vision-Agents commit 4eb8ef4 Author: Thierry Schellenbach <thierry@getstream.io> Date: Wed Oct 22 13:20:01 2025 -0600 cleanup ai plugin instructions commit ef55d66 Author: Thierry Schellenbach <thierry@getstream.io> Date: Wed Oct 22 12:54:33 2025 -0600 Add link to stash_pomichter for spatial memory commit 9c9737f Merge: c954409 390c45b Author: Thierry Schellenbach <thierry@getstream.io> Date: Tue Oct 21 19:45:09 2025 -0600 Merge pull request #115 from GetStream/fish [AI-195] Fish support commit 390c45b Author: Thierry Schellenbach <thierry@getstream.io> Date: Tue Oct 21 19:44:37 2025 -0600 cleannup commit 1cc1cf1 Author: Thierry Schellenbach <thierry@getstream.io> Date: Tue Oct 21 19:42:03 2025 -0600 happy tests commit 8163d32 Author: Thierry Schellenbach <thierry@getstream.io> Date: Tue Oct 21 19:39:21 2025 -0600 fix gemini rule following commit ada3ac9 Author: Thierry Schellenbach <thierry@getstream.io> Date: Tue Oct 21 19:20:18 2025 -0600 fish tts commit 61a26cf Author: Thierry Schellenbach <thierry@getstream.io> Date: Tue Oct 21 16:44:03 2025 -0600 attempt at fish commit c954409 Merge: ab27e48 c71da10 Author: Thierry Schellenbach <thierry@getstream.io> Date: Tue Oct 21 14:18:15 2025 -0600 Merge pull request #104 from GetStream/bedrock [AI-192] - Bedrock, AWS & Nova commit c71da10 Author: Tommaso Barbugli <tbarbugli@gmail.com> Date: Tue Oct 21 22:00:25 2025 +0200 maybe commit b5482da Author: Tommaso Barbugli <tbarbugli@gmail.com> Date: Tue Oct 21 21:46:15 2025 +0200 debugging commit 9a36e45 Author: Thierry Schellenbach <thierry@getstream.io> Date: Tue Oct 21 13:14:58 2025 -0600 echo environment name commit 6893968 Author: Thierry Schellenbach <thierry@getstream.io> Date: Tue Oct 21 12:53:58 2025 -0600 more debugging commit c35fc47 Author: Thierry Schellenbach <thierry@getstream.io> Date: Tue Oct 21 12:45:44 2025 -0600 add some debug info commit 0d6d3fd Author: Thierry Schellenbach <thierry@getstream.io> Date: Tue Oct 21 12:03:13 2025 -0600 
run test fix commit c3a31bd Author: Thierry Schellenbach <thierry@getstream.io> Date: Tue Oct 21 11:52:25 2025 -0600 log cache hit commit 04554ae Author: Thierry Schellenbach <thierry@getstream.io> Date: Tue Oct 21 11:48:03 2025 -0600 fix glob commit 7da96db Author: Thierry Schellenbach <thierry@getstream.io> Date: Tue Oct 21 11:33:56 2025 -0600 mypy commit 186053f Merge: 4b540c9 ab27e48 Author: Thierry Schellenbach <thierry@getstream.io> Date: Tue Oct 21 11:17:17 2025 -0600 happy tests commit 4b540c9 Author: Thierry Schellenbach <thierry@getstream.io> Date: Tue Oct 21 10:20:04 2025 -0600 happy tests commit b05a60a Author: Thierry Schellenbach <thierry@getstream.io> Date: Tue Oct 21 09:17:45 2025 -0600 add readme commit 71affcc Author: Thierry Schellenbach <thierry@getstream.io> Date: Tue Oct 21 09:13:01 2025 -0600 rename to aws commit d2eeba7 Author: Thierry Schellenbach <thierry@getstream.io> Date: Mon Oct 20 21:32:01 2025 -0600 ai tts instructions commit 98a4f9d Author: Thierry Schellenbach <thierry@getstream.io> Date: Mon Oct 20 16:49:00 2025 -0600 small edits commit ab27e48 Author: Tommaso Barbugli <tbarbugli@gmail.com> Date: Mon Oct 20 21:42:04 2025 +0200 Ensure user agent is initialized before joining the call (#113) * ensure user agent is initialized before joining the call * wip commit 3cb339b Author: Tommaso Barbugli <tbarbugli@gmail.com> Date: Mon Oct 20 21:22:57 2025 +0200 New conversation API (#102) * trying to resurrect * test transcription events for openai * more tests for openai and gemini llm * more tests for openai and gemini llm * update py-client * wip * ruff * wip * ruff * snap * another way * another way, a better way * ruff * ruff * rev * ruffit * mypy everything * brief * tests * openai dep bump * snap - broken * nothingfuckingworks * message id * fix test * ruffit commit cb6f00a Author: Thierry Schellenbach <thierry@getstream.io> Date: Mon Oct 20 13:18:03 2025 -0600 use qwen commit f84b2ad Author: Thierry Schellenbach 
<thierry@getstream.io> Date: Mon Oct 20 13:02:24 2025 -0600 fix tests commit e61acca Author: Thierry Schellenbach <thierry@getstream.io> Date: Mon Oct 20 12:50:40 2025 -0600 testing and linting commit 5f4d353 Author: Thierry Schellenbach <thierry@getstream.io> Date: Mon Oct 20 12:34:14 2025 -0600 working commit c2a15a9 Merge: a310771 1025a42 Author: Thierry Schellenbach <thierry@getstream.io> Date: Mon Oct 20 11:40:00 2025 -0600 Merge branch 'main' of github.com:GetStream/Vision-Agents into bedrock commit a310771 Author: Thierry Schellenbach <thierry@getstream.io> Date: Mon Oct 20 11:39:48 2025 -0600 wip commit b4370f4 Author: Thierry Schellenbach <thierry@getstream.io> Date: Mon Oct 20 11:22:43 2025 -0600 something isn't quite working commit 2dac975 Author: Thierry Schellenbach <thierry@getstream.io> Date: Mon Oct 20 10:30:04 2025 -0600 add the examples commit 6885289 Author: Thierry Schellenbach <thierry@getstream.io> Date: Sun Oct 19 20:19:42 2025 -0600 ai realtime docs commit a0fa3cc Author: Thierry Schellenbach <thierry@getstream.io> Date: Sun Oct 19 18:48:06 2025 -0600 wip commit b914fc3 Author: Thierry Schellenbach <thierry@getstream.io> Date: Sun Oct 19 18:40:22 2025 -0600 fix ai llm commit b5b00a7 Author: Thierry Schellenbach <thierry@getstream.io> Date: Sun Oct 19 17:11:26 2025 -0600 work audio input commit ac72260 Author: Thierry Schellenbach <thierry@getstream.io> Date: Sun Oct 19 16:47:19 2025 -0600 fix model id commit 2b5863c Author: Thierry Schellenbach <thierry@getstream.io> Date: Sun Oct 19 16:32:54 2025 -0600 wip on bedrock commit 8bb4162 Author: Thierry Schellenbach <thierry@getstream.io> Date: Fri Oct 17 15:22:03 2025 -0600 next up the connect method commit 7a21e4e Author: Thierry Schellenbach <thierry@getstream.io> Date: Fri Oct 17 14:12:00 2025 -0600 nova progress commit 16e8ba0 Author: Thierry Schellenbach <thierry@getstream.io> Date: Fri Oct 17 13:16:00 2025 -0600 docs for bedrock nova commit 1025a42 Author: Bart Schuijt 
<schuijt.bart@gmail.com> Date: Fri Oct 17 21:05:45 2025 +0200 fix: Update .env.example for Gemini Live (#108) commit e12112d Author: Thierry Schellenbach <thierry@getstream.io> Date: Fri Oct 17 11:49:07 2025 -0600 wip commit fea101a Author: Bart Schuijt <schuijt.bart@gmail.com> Date: Fri Oct 17 09:25:55 2025 +0200 workflow file update commit bb2d74c Author: Bart Schuijt <schuijt.bart@gmail.com> Date: Fri Oct 17 09:22:33 2025 +0200 initial commit commit d2853cd Author: Thierry Schellenbach <thierry@getstream.io> Date: Thu Oct 16 19:44:59 2025 -0600 always remember pep 420 commit 30a8eca Author: Thierry Schellenbach <thierry@getstream.io> Date: Thu Oct 16 19:36:58 2025 -0600 start of bedrock branch commit fc032bf Author: Tommaso Barbugli <tbarbugli@gmail.com> Date: Thu Oct 16 09:17:42 2025 +0200 Remove cli handler from examples (#101) commit 39a821d Author: Dan Gusev <dangusev92@gmail.com> Date: Tue Oct 14 12:20:41 2025 +0200 Update Deepgram plugin to use SDK v5.0.0 (#98) * Update Deepgram plugin to use SDK v5.0.0 * Merge test_realtime and test_stt and update the remaining tests * Make deepgram.STT.start() idempotent * Clean up unused import * Use uv as the default package manager > pip --------- Co-authored-by: Neevash Ramdial (Nash) <mail@neevash.dev> commit 2013be5 Author: Tommaso Barbugli <tbarbugli@gmail.com> Date: Mon Oct 13 16:57:37 2025 +0200 ensure chat works with default types (#99) * remove extra av * Okay detection * further cleanup for detection * Experimenting with HF version * Move processing to CPU for MPS (CUDA/Model limit) * Basic test for inference, device selction and model load * Rename public detection classes * Extract moondream video track to a common file * Use util video track instead * Update plugins/moondream/vision_agents/plugins/moondream/moondream_cloud_processor.py Co-authored-by: Dan Gusev <dangusev92@gmail.com> * avoid swallowing too many exceptions Co-authored-by: Dan Gusev <dangusev92@gmail.com> * clean up * Extract detection logic 
to utils * ruff and mypy clean up * Update public exports * Fix test imports * Clean up remaining issues * Doc string clean up * Clean up readme * Update plugins/moondream/README.md * Update plugins/moondream/README.md --------- Co-authored-by: Dan Gusev <dangusev92@gmail.com>
1 parent 4bc269b commit 10ef0e1

13 files changed: +1884 -73 lines changed

agents-core/pyproject.toml

Lines changed: 1 addition & 1 deletion
@@ -23,7 +23,7 @@ requires-python = ">=3.10"
 dependencies = [
     "getstream[webrtc,telemetry]>=2.5.7",
     "python-dotenv>=1.1.1",
-    "pillow>=11.3.0",
+    "pillow>=10.4.0", # Compatible with moondream SDK (<11.0.0)
     "numpy>=1.24.0",
     "mcp>=1.16.0",
     "colorlog>=6.10.1",

plugins/moondream/README.md

Lines changed: 172 additions & 0 deletions
@@ -0,0 +1,172 @@
# Moondream Plugin

This plugin provides Moondream 3 detection capabilities for vision-agents, enabling real-time zero-shot object detection on video streams. Choose between cloud-hosted or local processing depending on your needs.

## Installation

```bash
uv add vision-agents-plugins-moondream
```

## Choosing the Right Processor

### CloudDetectionProcessor (Recommended for Most Users)
- **Use when:** You want a simple setup with no infrastructure management
- **Pros:** No model download, no GPU required, automatic updates
- **Cons:** Requires API key, 2 RPS rate limit by default (can be increased)
- **Best for:** Development, testing, low-to-medium volume applications

### LocalDetectionProcessor (For Advanced Users)
- **Use when:** You need higher throughput, have your own GPU infrastructure, or want to avoid rate limits
- **Pros:** No rate limits, no API costs, full control over hardware
- **Cons:** Requires GPU for best performance, model download on first use, infrastructure management
- **Best for:** Production deployments, high-volume applications, DigitalOcean Gradient AI GPUs, or custom infrastructure

## Quick Start

### Using CloudDetectionProcessor (Hosted)

The `CloudDetectionProcessor` uses Moondream's hosted API. It requires an API key and is limited to 2 RPS (requests per second) by default; contact the Moondream team to request a higher limit.

```python
from vision_agents.plugins import moondream
from vision_agents.core import Agent

# Create a cloud processor with detection
processor = moondream.CloudDetectionProcessor(
    api_key="your-api-key",  # or set MOONDREAM_API_KEY env var
    detect_objects="person",  # or ["person", "car", "dog"] for multiple
    fps=30
)

# Use in an agent
agent = Agent(
    processors=[processor],
    llm=your_llm,  # placeholder for your LLM instance
    # ... other components
)
```
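
The hosted API defaults to 2 RPS while the example above sets `fps=30`, so it may help to think about how calls get spaced out. The following is a minimal sketch of a minimum-interval throttle in plain Python. `MinIntervalThrottle` and its `clock` parameter are hypothetical names used for illustration only; they are not part of the plugin, which may already pace requests internally.

```python
import time
from typing import Callable, Optional


class MinIntervalThrottle:
    """Allow at most `max_rps` calls per second by enforcing a minimum gap.

    Hypothetical helper for illustration; not part of the Moondream plugin.
    """

    def __init__(
        self, max_rps: float, clock: Callable[[], float] = time.monotonic
    ) -> None:
        if max_rps <= 0:
            raise ValueError("max_rps must be positive")
        self._min_gap = 1.0 / max_rps  # seconds required between allowed calls
        self._clock = clock  # injectable so the behaviour can be tested
        self._last_allowed: Optional[float] = None

    def allow(self) -> bool:
        """Return True if a call may be made now, and record it if so."""
        now = self._clock()
        if self._last_allowed is None or now - self._last_allowed >= self._min_gap:
            self._last_allowed = now
            return True
        return False
```

A frame loop could call `throttle.allow()` before each request and skip the frame when it returns `False`, which keeps the request rate at or below the configured limit without blocking.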

### Using LocalDetectionProcessor (On-Device)

If you are running on your own infrastructure or using a service like DigitalOcean's Gradient AI GPUs, you can use the `LocalDetectionProcessor`, which downloads the model from HuggingFace and runs it on-device. By default it uses CUDA when available for best performance. Performance will vary depending on your specific hardware configuration.

**Note:** The moondream3-preview model is gated and requires HuggingFace authentication:
- Request access at https://huggingface.co/moondream/moondream3-preview
- Set `HF_TOKEN` environment variable: `export HF_TOKEN=your_token_here`
- Or run: `huggingface-cli login`

```python
from vision_agents.plugins import moondream
from vision_agents.core import Agent

# Create a local processor (no API key needed)
processor = moondream.LocalDetectionProcessor(
    detect_objects=["person", "car", "dog"],
    conf_threshold=0.3,
    device="cuda",  # or "mps" / "cpu"; omit to auto-detect
    fps=30
)

# Use in an agent
agent = Agent(
    processors=[processor],
    llm=your_llm,  # placeholder for your LLM instance
    # ... other components
)
```
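
Since the model is gated, a missing token tends to surface only when the download starts. As a rough sketch, a startup check along these lines could report the problem earlier. `find_hf_token` is a hypothetical helper, not part of the plugin, and it only inspects the `HF_TOKEN` variable mentioned above; a login done through `huggingface-cli login` is stored elsewhere and would not be detected by it.

```python
import os
from typing import Mapping, Optional


def find_hf_token(env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return the HF_TOKEN value if set and non-empty, otherwise None.

    Hypothetical helper for illustration. It does not validate the token
    and does not see credentials saved by `huggingface-cli login`.
    """
    env = os.environ if env is None else env
    token = env.get("HF_TOKEN", "").strip()
    return token or None


if __name__ == "__main__":
    if find_hf_token() is None:
        print("HF_TOKEN is not set; gated model downloads may fail "
              "unless you have logged in with huggingface-cli.")
```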

### Detect Multiple Objects

```python
# Detect multiple object types with zero-shot detection
processor = moondream.CloudDetectionProcessor(
    api_key="your-api-key",
    detect_objects=["person", "car", "dog", "basketball"],
    conf_threshold=0.3
)

# Access results for LLM
state = processor.state()
print(state["detections_summary"])  # "Detected: 2 persons, 1 car"
print(state["detections_count"])  # Total number of detections
print(state["last_image"])  # PIL Image for vision models
```
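
To make the relationship between `conf_threshold` and the summary string concrete, here is a minimal sketch of how a summary in the style shown above could be assembled. This is an assumption-laden illustration rather than the plugin's code: `summarize_detections` is a hypothetical function, and the detection dictionaries with `label` and `confidence` keys are an assumed shape.

```python
from collections import Counter
from typing import Iterable, Mapping, Union

Detection = Mapping[str, Union[str, float]]


def summarize_detections(
    detections: Iterable[Detection], conf_threshold: float = 0.3
) -> str:
    """Drop low-confidence detections and count the rest per label.

    Hypothetical helper; the dict keys "label" and "confidence" are assumed.
    """
    kept = [d["label"] for d in detections if d["confidence"] >= conf_threshold]
    if not kept:
        return "No objects detected"
    counts = Counter(kept)  # keeps first-seen order of labels
    # Naive pluralisation: append "s" when the count is not 1.
    parts = [f"{n} {label}{'s' if n != 1 else ''}" for label, n in counts.items()]
    return "Detected: " + ", ".join(parts)
```

Raising `conf_threshold` shrinks the summary, because detections below the threshold are discarded before counting.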

## Configuration

### CloudDetectionProcessor Parameters

- `api_key`: str - API key for the Moondream Cloud API. If not provided, the processor attempts to read it from the `MOONDREAM_API_KEY` environment variable.
- `detect_objects`: str | List[str] - Object(s) to detect using zero-shot detection. Can be any object name like "person", "car", "basketball". Default: `"person"`
- `conf_threshold`: float - Confidence threshold for detections (default: 0.3)
- `fps`: int - Frame processing rate (default: 30)
- `interval`: int - Processing interval in seconds (default: 0)
- `max_workers`: int - Thread pool size for CPU-intensive operations (default: 10)

**Rate Limits:** By default, the Moondream Cloud API has a 2 RPS (requests per second) rate limit. Contact the Moondream team to request a higher limit.

### LocalDetectionProcessor Parameters

- `detect_objects`: str | List[str] - Object(s) to detect using zero-shot detection. Can be any object name like "person", "car", "basketball". Default: `"person"`
- `conf_threshold`: float - Confidence threshold for detections (default: 0.3)
- `fps`: int - Frame processing rate (default: 30)
- `interval`: int - Processing interval in seconds (default: 0)
- `max_workers`: int - Thread pool size for CPU-intensive operations (default: 10)
- `device`: str - Device to run inference on ('cuda', 'mps', or 'cpu'). Auto-detects CUDA, then MPS (Apple Silicon), then defaults to CPU. Default: `None` (auto-detect)
- `model_name`: str - Hugging Face model identifier (default: "moondream/moondream3-preview")
- `options`: AgentOptions - Model directory configuration. If not provided, the default options are used, whose model directory defaults to `tempfile.gettempdir()`.

**Performance:** Performance will vary depending on your hardware configuration. CUDA is recommended for best performance on NVIDIA GPUs. The model will be downloaded from HuggingFace on first use.
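
The fallback order described for `device` (CUDA, then MPS, then CPU) can be sketched as a small pure function. This is an illustration under stated assumptions, not the plugin's implementation: `pick_device` is a hypothetical name, and the availability flags are passed in so the logic can be exercised without PyTorch installed.

```python
from typing import Optional


def pick_device(
    requested: Optional[str], cuda_available: bool, mps_available: bool
) -> str:
    """Resolve the inference device, honouring an explicit request first.

    Hypothetical helper mirroring the documented order: CUDA, MPS, CPU.
    """
    if requested is not None:
        return requested  # the caller chose a device explicitly
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"


# With PyTorch installed, the flags could be supplied like this:
#   import torch
#   device = pick_device(
#       None,
#       torch.cuda.is_available(),
#       torch.backends.mps.is_available(),
#   )
```

Keeping the decision separate from the hardware probes makes the fallback order easy to unit test on machines without a GPU.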

## Video Publishing

The processor publishes annotated video frames with bounding boxes drawn on detected objects:

```python
processor = moondream.CloudDetectionProcessor(
    api_key="your-api-key",
    detect_objects=["person", "car"]
)

# The track will show:
# - Green bounding boxes around detected objects
# - Labels with confidence scores
# - Real-time annotation overlay
```
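
Drawing a box on a frame requires pixel coordinates. The sketch below shows one way that conversion might look, assuming detections carry normalized coordinates in the range 0 to 1 under the keys `x_min`, `y_min`, `x_max`, and `y_max`. Both that shape and the `to_pixel_box` name are assumptions for illustration; the plugin's own annotation code is not shown in this README and may differ.

```python
from typing import Mapping, Tuple


def to_pixel_box(
    box: Mapping[str, float], width: int, height: int
) -> Tuple[int, int, int, int]:
    """Convert a normalized box to integer pixel corners (x1, y1, x2, y2).

    Hypothetical helper; assumes normalized keys x_min, y_min, x_max, y_max.
    """

    def clamp(value: float) -> float:
        # Keep coordinates inside the frame even if the input drifts slightly.
        return min(max(value, 0.0), 1.0)

    x1 = int(clamp(box["x_min"]) * width)
    y1 = int(clamp(box["y_min"]) * height)
    x2 = int(clamp(box["x_max"]) * width)
    y2 = int(clamp(box["y_max"]) * height)
    return x1, y1, x2, y2
```

The resulting corners are in the form that a rectangle-drawing call such as OpenCV's `cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 255, 0), 2)` expects, where `(0, 255, 0)` is green in BGR order.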

## Testing

The plugin includes a test suite:

```bash
# Run all tests
pytest plugins/moondream/tests/ -v

# Run specific test categories
pytest plugins/moondream/tests/ -k "inference" -v
pytest plugins/moondream/tests/ -k "annotation" -v
pytest plugins/moondream/tests/ -k "state" -v
```

## Dependencies

### Required
- `vision-agents` - Core framework
- `moondream` - Moondream SDK for cloud API (CloudDetectionProcessor only)
- `numpy>=2.0.0` - Array operations
- `pillow>=10.4.0` - Image processing
- `opencv-python>=4.8.0` - Video annotation
- `aiortc` - WebRTC support

### LocalDetectionProcessor Additional Dependencies
- `torch` - PyTorch for model inference
- `transformers` - HuggingFace transformers library for model loading
- `accelerate` - Required for `device_map` and device management

## Links

- [Moondream Documentation](https://docs.moondream.ai/)
- [Vision Agents Documentation](https://visionagents.ai/)
- [GitHub Repository](https://github.com/GetStream/Vision-Agents)

plugins/moondream/py.typed

Whitespace-only changes.

plugins/moondream/pyproject.toml

Lines changed: 43 additions & 0 deletions
@@ -0,0 +1,43 @@
[build-system]
requires = ["hatchling", "hatch-vcs"]
build-backend = "hatchling.build"

[project]
name = "vision-agents-plugins-moondream"
dynamic = ["version"]
description = "Moondream 3 vision processor plugin for Vision Agents"
readme = "README.md"
requires-python = ">=3.10"
license = "MIT"
dependencies = [
    "vision-agents",
    "numpy>=2.0.0",
    "pillow>=10.4.0",
    "opencv-python>=4.8.0",
    "moondream>=0.1.1", # Now compatible with vision-agents pillow>=10.4.0
    "transformers>=4.40.0", # For local model loading
    "torch>=2.0.0", # PyTorch for model inference
    "accelerate>=0.20.0", # Required for device_map and device management
]

[project.urls]
Documentation = "https://visionagents.ai/"
Website = "https://visionagents.ai/"
Source = "https://github.com/GetStream/Vision-Agents"

[tool.hatch.version]
source = "vcs"
raw-options = { root = "..", search_parent_directories = true, fallback_version = "0.0.0" }

[tool.hatch.build.targets.wheel]
packages = [".", "vision_agents"]

[tool.uv.sources]
vision-agents = { workspace = true }

[dependency-groups]
dev = [
    "pytest>=8.4.1",
    "pytest-asyncio>=1.0.0",
]
