
TEN Agent is a conversational AI agent powered by TEN that integrates Gemini 2.0 Live, OpenAI Realtime, RTC, and more. It can see, hear, and speak in real time, and it is fully compatible with popular workflow platforms such as Dify and Coze.


TEN-framework/TEN-Agent



Voice agent: Astra

Astra is a voice agent powered by TEN. It demonstrates how TEN can be used to build intuitive, seamless conversational interactions.


How to build the voice agent locally

Prerequisites

Docker setting on Apple Silicon

If you are on Apple Silicon, uncheck the "Use Rosetta for x86_64/amd64 emulation on Apple Silicon" option in Docker. Otherwise, the server will not work.
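
As a rough aid for the prerequisite above, the following sketch reports whether the current host looks like Apple Silicon, so you know whether the Rosetta option applies to you. The helper name is_apple_silicon is hypothetical and not part of this repository. The check assumes that `uname -s` prints Darwin and `uname -m` prints arm64 on such machines; a shell that itself runs under Rosetta may report x86_64 instead.

```shell
# Hypothetical helper, not part of the repository.
# Succeeds when the given OS name and machine type look like Apple Silicon.
# With no arguments, it falls back to what `uname` reports for this host.
is_apple_silicon() {
  local os="${1:-$(uname -s)}"
  local arch="${2:-$(uname -m)}"
  [ "$os" = "Darwin" ] && [ "$arch" = "arm64" ]
}

if is_apple_silicon; then
  echo "Apple Silicon detected: uncheck the Rosetta emulation option in Docker."
fi
```

The script only detects the platform. The Rosetta option itself still has to be changed by hand in the Docker settings.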

1. Create manifest.json

# Create manifest.json from the example
cp ./agents/manifest.json.example ./agents/manifest.json

2. Modify prompt and greeting

// Feel free to edit prompt and greeting in manifest.json
"property": {
    "base_url": "",
    "api_key": "<openai_api_key>",
    "frequency_penalty": 0.9,
    "model": "gpt-3.5-turbo",
    "max_tokens": 512,
    "prompt": "", // prompt
    "proxy_url": "",
    "greeting": "Astra agent connected. How can I help you today?", // greeting
    "max_memory_length": 10
}
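
The `//` annotations in the snippet above are there for explanation only. Assuming manifest.json is parsed as strict JSON, which does not allow comments, a quick syntax check after editing can catch a stray comment or trailing comma. The sketch below is one way to do that. The helper name is_valid_json is hypothetical, and the check assumes python3 is available on your machine.

```shell
# Hypothetical helper, not part of the repository.
# Succeeds when the given file parses as JSON.
# Assumes python3 is installed; json.tool is part of its standard library.
is_valid_json() {
  python3 -m json.tool "$1" >/dev/null 2>&1
}

# Usage (run from the repository root after editing the file):
#   is_valid_json ./agents/manifest.json || echo "manifest.json is not valid JSON"
```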

3. Create agent in Docker container

# In CLI, pull Docker image and mount the target directory
docker run -itd -v "$(pwd)":/app -w /app -p 8080:8080 --name astra_agents_dev ghcr.io/rte-design/astra_agents_build

# Windows Git Bash
# docker run -itd -v //$(pwd):/app -w //app -p 8080:8080 --name astra_agents_dev ghcr.io/rte-design/astra_agents_build

# Enter container
docker exec -it astra_agents_dev bash

# Create agent
make build

4. Export env variables and start server

# In the same CLI window, set env variables
export AGORA_APP_ID=<your_agora_appid>
export AGORA_APP_CERTIFICATE=<your_agora_app_certificate>

# OpenAI API key
export OPENAI_API_KEY=<your_openai_api_key>

# Azure STT key and region
export AZURE_STT_KEY=<your_azure_stt_key>
export AZURE_STT_REGION=<your_azure_stt_region>

# Azure TTS key and region
export AZURE_TTS_KEY=<your_azure_tts_key>
export AZURE_TTS_REGION=<your_azure_tts_region>

# Run server on port 8080
make run-server
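
Because the server depends on every variable exported above, it can help to confirm they are all set before running make run-server. The following is a minimal sketch of such a check. The helper name missing_env_vars and the REQUIRED_VARS list are illustrative and not part of the repository; the variable names themselves are taken from this step.

```shell
# Hypothetical helper, not part of the repository.
# Prints the name of each listed variable that is unset or empty in the
# exported environment, and returns non-zero if any are missing.
missing_env_vars() {
  local rc=0 name
  for name in "$@"; do
    if [ -z "$(printenv "$name")" ]; then
      echo "$name"
      rc=1
    fi
  done
  return "$rc"
}

# Variable names as listed in step 4.
REQUIRED_VARS="AGORA_APP_ID AGORA_APP_CERTIFICATE OPENAI_API_KEY AZURE_STT_KEY AZURE_STT_REGION AZURE_TTS_KEY AZURE_TTS_REGION"

# Usage (inside the container, after the exports):
#   missing_env_vars $REQUIRED_VARS && make run-server
```

printenv only sees exported variables, so a value assigned without export is reported as missing.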

5. Connect voice agent UI to server

Open a separate terminal tab and run the following commands:

# Create a .env file from example
cd playground
cp .env.example .env

# Install dependencies and start the dev environment on localhost:3000
npm install && npm run dev
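
If you repeat this step later, a plain cp overwrites any changes you have made to .env. The sketch below shows one way to copy the example file only when the target does not exist yet. The helper name copy_if_missing is hypothetical and not part of the repository.

```shell
# Hypothetical helper, not part of the repository.
# Copies src to dst only when dst does not exist yet, so local edits survive.
copy_if_missing() {
  local src="$1" dst="$2"
  if [ -e "$dst" ]; then
    echo "keeping existing $dst"
  else
    cp "$src" "$dst"
  fi
}

# Usage (from the playground directory):
#   copy_if_missing .env.example .env
```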

6. Verify your customized voice agent 🎉

Open localhost:3000 in your browser. You should see a voice agent just like Astra, but with your own customizations.


Voice agent architecture

The voice agent is an excellent starting point for further exploration. It is built from several extensions, some of which are interchangeable. Select the ones that best suit your needs.

| Extension | Feature | Description |
| --- | --- | --- |
| openai_chatgpt | LLM | GPT-4o, GPT-4 Turbo, GPT-3.5 Turbo |
| elevenlabs_tts | Text-to-speech | ElevenLabs text to speech converts text to audio |
| azure_tts | Text-to-speech | Azure text to speech converts text to audio |
| azure_stt | Speech-to-text | Azure speech to text converts audio to text |
| chat_transcriber | Transcriber | A utility extension that forwards chat logs into the channel |
| agora_rtc | Transporter | A low-latency transporter powered by agora_rtc |
| interrupt_detector | Interrupter | A utility extension that helps interrupt the agent |

Voice Agent Diagram

[Image: voice agent diagram]


TEN Service

Discover More

Now that you've created your first AI agent, the creativity doesn't stop here. To develop more capable agents, you'll need a deeper understanding of how TEN works under the hood. Please refer to the TEN service documentation.


Stay Tuned

Before we dive further, be sure to star our repository and get instant notifications for all new releases!



Contribution Guidelines

Contributions are welcome! Please read the contribution guidelines first.


License

This project is licensed under the Apache 2.0 License - see the LICENSE file for details.