This is a fun side project to build a custom voice assistant in JavaScript that sounds like Jarvis from Iron Man.
It's been tested on Mac and Raspberry Pi and should run on pretty much any platform if installed properly. It uses the latest OpenAI Realtime speech-to-speech API via Node.js + WebSocket, and uses OpenWakeWord and Python for wake word detection.
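As a rough illustration of what talking to the Realtime API over a raw WebSocket involves, here is a minimal sketch of the two audio-carrying messages. The helper names `encodeAudioAppend` and `decodeAudioDelta` are hypothetical, and the event names (`input_audio_buffer.append`, `response.audio.delta`) are assumptions based on the beta Realtime API, so check them against the current OpenAI docs before relying on them.

```javascript
// Hypothetical helpers; event names are assumed from the beta Realtime API
// and may differ in newer API versions.

// Wrap a raw PCM chunk (a Buffer from the microphone) as a client event.
function encodeAudioAppend(pcmChunk) {
  return JSON.stringify({
    type: 'input_audio_buffer.append',
    audio: pcmChunk.toString('base64'),
  });
}

// Return the PCM audio carried by a server message,
// or null if the message is any other kind of event.
function decodeAudioDelta(rawMessage) {
  const event = JSON.parse(rawMessage.toString());
  if (event.type !== 'response.audio.delta') return null;
  return Buffer.from(event.delta, 'base64');
}
```

With a WebSocket client, each microphone chunk would be sent as `socket.send(encodeAudioAppend(chunk))`, and any non-null result of `decodeAudioDelta(message)` would be written to the speaker.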
OpenAI has no JavaScript SDK for this, so I thought it might be helpful to show how simple it is for JavaScript developers to build this without an SDK. Rather than use an NPM module for speaker/microphone, I found it far better to use sox via child process and stdin/stdout.
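The sox approach can be sketched roughly as follows. This is not the project's actual code: the function names are hypothetical, and the audio format (raw 16-bit signed PCM, mono, 24 kHz) is an assumption chosen to match what the Realtime API commonly expects.

```javascript
const { spawn } = require('node:child_process');

// Assumed audio format: raw 16-bit signed PCM, mono, 24 kHz.
const FORMAT = ['-t', 'raw', '-r', '24000', '-e', 'signed-integer', '-b', '16', '-c', '1'];

// sox arguments: read the default input device (-d), write raw PCM to stdout (-).
function recordArgs() {
  return ['-q', '-d', ...FORMAT, '-'];
}

// sox arguments: read raw PCM from stdin (-), play on the default output device (-d).
function playArgs() {
  return ['-q', ...FORMAT, '-', '-d'];
}

// Start recording; the returned process emits PCM chunks on its stdout.
function startMicrophone() {
  return spawn('sox', recordArgs(), { stdio: ['ignore', 'pipe', 'inherit'] });
}

// Start playback; write PCM chunks to the returned process's stdin.
function startSpeaker() {
  return spawn('sox', playArgs(), { stdio: ['pipe', 'ignore', 'inherit'] });
}
```

For example, `startMicrophone().stdout.on('data', (chunk) => { /* forward chunk */ })` reads microphone audio, and `startSpeaker().stdin.write(chunk)` plays it back. Keeping audio I/O in a child process means there are no native Node.js addons to compile, which tends to matter on a Raspberry Pi.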
Let's connect! Reach me at gaberogan.com.
This project is not built for production, and you may encounter bugs.
You may have to troubleshoot if you decide to follow the README. Using an AI editor like Cline or Cursor can help a lot with troubleshooting.
- Create `.env` with `OPENAI_API_KEY` and (optional) `GOOGLE_API_KEY` for Google Search
- Install `sox`, e.g. `brew install sox`
- Install NVM + Node.js 22 (other versions untested)
- Install pyenv + Python 3.12 (other versions untested)
- Run `pip install -r requirements.txt`
- Run `python scripts/download_models.py`
- If using Raspberry Pi, download missing audio libraries
- Run `npm i`
- Run `npm start`
- Say "Hey Jarvis, how are you?"
- (optional) For more wake words, see https://github.com/fwartner/home-assistant-wakewords-collection
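Because the steps above depend on a correctly filled `.env`, a small preflight check can save some troubleshooting. The sketch below is not part of the project: `missingEnv` and `preflight` are hypothetical names, and it assumes `process.loadEnvFile`, which is available in Node.js 22.

```javascript
// Return the names of required variables that are missing or blank in `env`.
function missingEnv(env, required = ['OPENAI_API_KEY']) {
  return required.filter((name) => !env[name] || env[name].trim() === '');
}

// Load .env if present, then report any missing variables.
function preflight() {
  try {
    process.loadEnvFile('.env');
  } catch {
    // No readable .env file; fall back to variables already in the environment.
  }
  const missing = missingEnv(process.env);
  if (missing.length > 0) {
    console.error(`Missing environment variables: ${missing.join(', ')}`);
    return false;
  }
  return true;
}
```

`GOOGLE_API_KEY` is left out of the required list because it is optional.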
- Clean up Quickstart steps
- Implement a memory tool to save things like location to memory
- Use "Jarvis" as the wake word instead of "Hey Jarvis" (may need TFLite)
- Fix AEC with PulseAudio on Raspberry Pi