We showcase Astra, a voice agent powered by TEN, to demonstrate how TEN enables intuitive and seamless conversational interactions.
- Agora App ID and App Certificate (read here on how)
- Azure's speech-to-text and text-to-speech API keys
- OpenAI API key
- Docker
- Node.js (LTS) v18
If you are on Apple Silicon, uncheck the "Use Rosetta for x86_64/amd64 emulation on Apple Silicon" option in Docker; otherwise the server will not work.
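The tool prerequisites above can be checked from a terminal before going further. The following is a minimal sketch, not part of the project: `node_major` and `check_prereqs` are hypothetical helper names, and treating v18 as a minimum (rather than an exact requirement) is an assumption.

```shell
# Hypothetical helper: extract the major version from a string such as "v18.19.0".
node_major() {
  version=${1#v}          # drop the leading "v"
  echo "${version%%.*}"   # keep everything before the first dot
}

# Hypothetical helper: succeed only if docker and node are on PATH
# and the installed Node.js major version is at least 18 (assumed minimum).
check_prereqs() {
  command -v docker >/dev/null 2>&1 || { echo "docker not found" >&2; return 1; }
  command -v node >/dev/null 2>&1 || { echo "node not found" >&2; return 1; }
  major=$(node_major "$(node --version)")
  [ "$major" -ge 18 ] || { echo "Node.js v18 expected, found v$major" >&2; return 1; }
}
```

Running `check_prereqs && echo "prerequisites look fine"` prints the message only when both tools are found. It does not verify the API keys or the Docker Rosetta setting.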
# Create manifest.json from the example
cp ./agents/manifest.json.example ./agents/manifest.json
// Feel free to edit prompt and greeting in manifest.json
"property": {
"base_url": "",
"api_key": "<openai_api_key>",
"frequency_penalty": 0.9,
"model": "gpt-3.5-turbo",
"max_tokens": 512,
"prompt": "", // prompt
"proxy_url": "",
"greeting": "Astra agent connected. How can I help you today?", // greeting
"max_memory_length": 10
}
# In CLI, pull Docker image and mount the target directory
docker run -itd -v $(pwd):/app -w /app -p 8080:8080 --name astra_agents_dev ghcr.io/rte-design/astra_agents_build
# Windows Git Bash
# docker run -itd -v //$(pwd):/app -w //app -p 8080:8080 --name astra_agents_dev ghcr.io/rte-design/astra_agents_build
# Enter container
docker exec -it astra_agents_dev bash
# Build the agent
make build
# In the same CLI window, set env variables
export AGORA_APP_ID=<your_agora_appid>
export AGORA_APP_CERTIFICATE=<your_agora_app_certificate>
# OpenAI API key
export OPENAI_API_KEY=<your_openai_api_key>
# Azure STT key and region
export AZURE_STT_KEY=<your_azure_stt_key>
export AZURE_STT_REGION=<your_azure_stt_region>
# Azure TTS key and region
export AZURE_TTS_KEY=<your_azure_tts_key>
export AZURE_TTS_REGION=<your_azure_tts_region>
# Run server on port 8080
make run-server
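Since the server depends on all of the variables exported above, it can help to confirm they are set before running `make run-server`. The sketch below is an illustration only: `check_env` is a hypothetical helper, not something the project provides. The variable names are the ones listed in this section.

```shell
# Hypothetical helper: report every required variable that is unset or empty.
# Returns 0 when all are set, 1 otherwise.
check_env() {
  missing=0
  for name in AGORA_APP_ID AGORA_APP_CERTIFICATE OPENAI_API_KEY \
              AZURE_STT_KEY AZURE_STT_REGION AZURE_TTS_KEY AZURE_TTS_REGION; do
    # Indirect lookup of the variable named by $name (POSIX-compatible).
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

Inside the container, `check_env && make run-server` would then start the server only when nothing is missing. The helper checks only that each value is non-empty, not that the keys are valid.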
Open a separate Terminal tab and run the commands:
# Create a .env file from example
cd playground
cp .env.example .env
# Install dependencies and start dev environment in localhost:3000
npm install && npm run dev
Open localhost:3000 in your browser. You should see a voice agent just like Astra, but with your own customizations.
The voice agent is an excellent starting point for further exploration. It incorporates various extensions, some of which are interchangeable. Feel free to select the ones that best suit your needs.
Extension | Feature | Description |
---|---|---|
openai_chatgpt | LLM | GPT-4o, GPT-4 Turbo, GPT-3.5 Turbo |
elevenlabs_tts | Text-to-speech | ElevenLabs text to speech converts text to audio |
azure_tts | Text-to-speech | Azure text to speech converts text to audio |
azure_stt | Speech-to-text | Azure speech to text converts audio to text |
chat_transcriber | Transcriber | A utility extension that forwards chat logs into the channel |
agora_rtc | Transporter | A low-latency transporter powered by Agora RTC |
interrupt_detector | Interrupter | A utility extension that helps interrupt the agent |
Now that you’ve created your first AI agent, the creativity doesn’t stop here. To develop more capable agents, you’ll need a deeper understanding of how TEN works under the hood. Please refer to the TEN service documentation.
Before we dive further, be sure to star our repository and get instant notifications for all new releases!
- Discord: Ideal for sharing your applications and engaging with the community.
- GitHub Discussions: Perfect for providing feedback and asking questions.
- GitHub Issues: Best for reporting bugs and proposing new features. Refer to our contribution guidelines for more details.
- X (formerly Twitter): Great for sharing your agents and interacting with the community.
Contributions are welcome! Please read the contribution guidelines first.
This project is licensed under the Apache 2.0 License - see the LICENSE file for details.