- Speech Content Understanding: Parse the user's input speech content.
- Speech Playback: Play synthesized speech output.
- Semantic Recognition: Understand the semantics within the speech content.
- Speech Dialogue API: Engage in intelligent dialogue through speech.
- Button Control: Support interaction via button controls.
- Screen Display: Display relevant information on the screen.
- Register and Obtain API Keys
export OPENAI_API_KEY=your-openai-key
export SPEECH_KEY=your-azure-key
export SPEECH_REGION=your-azure-key-region
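A missing or misspelled variable is easy to overlook until the first API call fails, so it can help to check the three settings before starting. The following is a minimal sketch, not part of the project itself: the helper name `missing_api_settings` is hypothetical, and it only assumes the three variable names exported above.

```python
import os

# The three variable names exported in the step above.
REQUIRED_VARS = ("OPENAI_API_KEY", "SPEECH_KEY", "SPEECH_REGION")


def missing_api_settings(env=None):
    """Return the required variable names that are unset or blank.

    Hypothetical helper for illustration; `env` defaults to os.environ
    and can be replaced with a plain dict for testing.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_api_settings()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("All API settings are present.")
```

Running this in the same shell where the `export` lines were entered should report whether any of the keys still needs to be set.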
- Configuring the Runtime Environment
Prepare a device such as the MaixII-Sense, or any device with a microphone, speakers, a display (with framebuffer driver support), and buttons. A device that has only some of these components also works.
sudo bash -x build_environment.sh
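Since a device may have only some of the listed components, it can be useful to see what is actually present before running the assistant. The sketch below is an illustration under stated assumptions, not the project's own detection logic: it assumes a Linux system where the framebuffer appears as `/dev/fb0` and where the ALSA command-line tools `arecord` and `aplay` indicate recording and playback support. The function name `detect_peripherals` is hypothetical.

```python
import os
import shutil


def detect_peripherals(fb_path="/dev/fb0"):
    """Report which optional components appear to be available.

    Assumptions (not taken from the project): the display is a Linux
    framebuffer device at `fb_path`, and audio capture/playback rely on
    the ALSA utilities `arecord` and `aplay` being on PATH.
    """
    return {
        "display": os.path.exists(fb_path),
        "microphone_tool": shutil.which("arecord") is not None,
        "speaker_tool": shutil.which("aplay") is not None,
    }


if __name__ == "__main__":
    for name, available in detect_peripherals().items():
        print(f"{name}: {'found' if available else 'not found'}")
```

A result of "not found" only means that this particular check failed; the device may still expose the component under a different path or through a different tool.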
- Running the Program
Run:
python3 voice_assistant.py
or, with root privileges:
sudo python3 voice_assistant.py