This project uses OpenSIPS as a SIP gateway, creating an interface between traditional SIP-based communication systems and AI engines. By handling SIP signaling and passing voice data to external AI models, OpenSIPS acts as a powerful middleware layer. This setup enables a wide range of voice AI applications, from real-time voice assistants and automated customer support to conversational agents and beyond. Because OpenSIPS provides only the gateway functionality, developers remain free to integrate whatever AI models they need for tasks like speech recognition, natural language understanding, or voice synthesis. This modularity makes it well suited for building sophisticated, scalable voice-driven applications without being tied to a specific AI model.
OpenSIPS functions as a back-to-back user agent (B2BUA): on one side it manages SIP interactions with user agents, while on the other it connects to an external application — known as the AI Voice Connector — which communicates with the AI engine. This setup allows OpenSIPS to efficiently relay voice data between user agents and the AI engine, keeping voice interactions responsive.
The AI Voice Connector is a modular Python application built to leverage the OpenSIPS SIP stack, efficiently managing SIP calls and handling the media streams within sessions. It provides hooks to capture RTP data, which it sends to the AI engine for processing. Once the AI engine responds, the AI Voice Connector seamlessly injects the processed data back into the call.
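As a rough illustration of this hook pattern, the sketch below shows how a per-call session might forward captured RTP payloads to an AI engine and queue the engine's audio for injection back into the call. The class and method names are hypothetical, not the connector's actual API.

```python
import asyncio

# Hypothetical sketch of the hook pattern described above; the class and
# method names are illustrative, not the connector's actual API.
class CallSession:
    def __init__(self, engine):
        self.engine = engine              # client wrapping the AI backend
        self.outbound = asyncio.Queue()   # audio queued for injection into the call

    async def on_rtp_payload(self, payload: bytes):
        # Hook invoked for each RTP payload captured from the caller's stream.
        await self.engine.send_audio(payload)

    async def pump_engine_audio(self):
        # Read processed audio from the AI engine and queue it so it can be
        # injected back into the call as outgoing RTP.
        async for chunk in self.engine.receive_audio():
            await self.outbound.put(chunk)
```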
Interactions with AI engines can occur directly as Speech-to-Speech if the AI engine provides real-time endpoints. Alternatively, a Speech-to-Text engine can be employed to transcribe the audio. The transcript is then sent to the AI engine as text, and the AI’s response is processed through a Text-to-Speech engine before being relayed back to the SIP user. This flexible workflow allows seamless integration of either real-time voice interactions or a multi-step process that converts speech to text, processes it, and converts responses back into speech for the end user.
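The multi-step path can be pictured as three chained calls, as in the following sketch. All three helpers are placeholders standing in for real engine clients; none of these names come from the connector itself.

```python
# A minimal sketch of the Speech-to-Text -> LLM -> Text-to-Speech path.
# speech_to_text, query_ai_engine, and text_to_speech are placeholders,
# not functions from the connector's codebase.

async def speech_to_text(audio: bytes) -> str:
    ...  # call the STT engine (e.g. Deepgram) and return the transcript

async def query_ai_engine(transcript: str) -> str:
    ...  # send the transcript to the AI engine and return its text reply

async def text_to_speech(text: str) -> bytes:
    ...  # synthesize the reply (e.g. Deepgram TTS) into audio

async def handle_turn(audio_in: bytes) -> bytes:
    transcript = await speech_to_text(audio_in)
    reply_text = await query_ai_engine(transcript)
    return await text_to_speech(reply_text)
```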
The engine is designed to accommodate various AI models, adapting to different AI "flavors" based on each engine's unique capabilities. The currently supported flavors are:
- Deepgram: convert speech to text using the Deepgram Speech-to-Text engine, send the transcript to OpenAI, then feed the response through the Deepgram Text-to-Speech engine
- OpenAI: use the OpenAI Real-Time Speech-to-Speech engine (see the sketch after this list)
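For orientation, here is a minimal sketch of what a Speech-to-Speech exchange against OpenAI's Realtime API can look like over a websocket. The URL, model name, and event types reflect OpenAI's published Realtime interface at the time of writing and may change; nothing here is taken from the connector's own code.

```python
import base64
import json
import os

import websockets  # third-party: pip install websockets

async def realtime_roundtrip(audio_chunk: bytes):
    """Send one chunk of caller audio and yield synthesized reply audio."""
    url = "wss://api.openai.com/v1/realtime?model=gpt-4o-realtime-preview"
    headers = {
        "Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}",
        "OpenAI-Beta": "realtime=v1",
    }
    # Note: recent websockets releases name this parameter
    # `additional_headers` instead of `extra_headers`.
    async with websockets.connect(url, extra_headers=headers) as ws:
        # Stream caller audio into the session's input buffer,
        # then ask the model to respond.
        await ws.send(json.dumps({
            "type": "input_audio_buffer.append",
            "audio": base64.b64encode(audio_chunk).decode(),
        }))
        await ws.send(json.dumps({"type": "input_audio_buffer.commit"}))
        await ws.send(json.dumps({"type": "response.create"}))
        # Yield synthesized audio deltas for playback toward the caller.
        async for message in ws:
            event = json.loads(message)
            if event.get("type") == "response.audio.delta":
                yield base64.b64decode(event["delta"])
            elif event.get("type") == "response.done":
                break
```

A caller would iterate this generator with `async for` and write each decoded chunk back into the call's outgoing media stream.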
Check out the AI Flavors page for more information.
Engine configuration is done through a separate configuration file or through environment variables. Using a configuration file is recommended, as it allows for more detailed settings. If you use both methods, settings from the configuration file override environment variables. See the Configuration page for all the details.
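That precedence rule can be implemented in a few lines; the sketch below is only illustrative, and the section, option, and variable names are made up rather than taken from the connector's actual schema.

```python
import configparser
import os

# Illustrative sketch of the precedence described above: values from the
# configuration file win over environment variables. The section, option,
# and variable names are made up, not the connector's actual schema.
def get_setting(cfg, section, option, env_var, default=None):
    if cfg.has_option(section, option):      # config file takes precedence
        return cfg.get(section, option)
    return os.environ.get(env_var, default)  # fall back to the environment

cfg = configparser.ConfigParser()
cfg.read("config.ini")  # path is illustrative
api_key = get_setting(cfg, "openai", "key", "OPENAI_API_KEY")
```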
The simplest way to get the project running is using the Docker Compose files found in the docker/ directory. To use them, set up Docker on your host and then run:
```
git clone https://github.com/OpenSIPS/opensips-ai-voice-connector-ce.git
cd opensips-ai-voice-connector-ce/docker
# edit the .env file and adjust the settings accordingly
# alternatively, create a configuration file
docker compose up
```
At this point, you should have the engine up and running. A more detailed guide can be found on the Getting Started page.
Then, you can use a softphone such as Zoiper or Linphone to send a call to OpenSIPS by dialing one of the supported flavors (e.g. openai - see flavor selection). You should be able to talk to an AI assistant: ask it a question and get a response back.
The documentation pages cover the following topics:
- Getting Started - How to get the engine up and running
- Configuration - Information about the configuration file
- Implementation - Implementation details
- AI Flavors - The different AI flavors supported
This project is community-driven, so any contribution is welcome. Feel free to open a pull request for any fix or feature you find useful. You can find technical information about the project on the Implementation page.
The OpenSIPS AI Voice Connector Community Edition source code is licensed under the GNU General Public License v3.0.
All documentation files (i.e. files with the .md extension) are licensed under the Creative Commons 4.0 license.
© 2024 - OpenSIPS Solutions