Examples: RTPEngine speech
What if we could "read" transcribed user conversations directly in HOMER sessions? Well...
Speech recognition for VoIP has been around for ages, and the recent storm of speech recognition and transcription services keeps this an evergreen subject to hack around with. In this guide, we'll assemble one of the many possible setups to create an "intercepting" SIP+RTP proxy able to post-process recorded calls and correlate them for fun and profit using OpenSIPS, RTPEngine and the BING Speech API (more in the future).
OpenSIPS will act as a transparent proxy, hooking media streams using the latest RTPEngine which features recording capabilities via a new set of dedicated controls available directly at dialplan level and dedicated init options:
--recording-dir=FILE Spool directory where PCAP call recording data goes
--recording-method=pcap|proc Strategy for call recording
--recording-format=raw|eth PCAP file format for recorded calls.
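As a rough illustration, the three recording options could be combined as shown below. The spool path `/recording` matches the volume used later in this guide; picking `pcap` and `eth` is only an example, and the helper name `recording_args` is hypothetical. A real RTPEngine invocation also needs its usual interface and control options, which are left out here.

```shell
#!/bin/sh
# Hypothetical helper: print the recording-related rtpengine options,
# one per line, for a given spool directory.
# The pcap/eth values are example choices, not requirements.
recording_args() {
    spool_dir="$1"
    printf '%s\n' \
        "--recording-dir=${spool_dir}" \
        "--recording-method=pcap" \
        "--recording-format=eth"
}

# Example: show the options for the /recording spool directory.
recording_args /recording
```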
In this experiment, we'll leverage this mix to create an intercepting proxy where we can emulate and extend a Sip:Wise dangerous demo on the same subject and do the following:
- Record Complete Calls through our intercepting-proxy
- Post-Process Recordings for Speech Recognition
- Send any Transcription results in HEP format to HOMER
The ingredients of our intercepting recipe:
- OpenSIPS 2.x
- RTPEngine 6.x
- NodeJS + our Speech2Text Spooler
- A Free/Trial BING Speech API Key
- Good news! We prepared a fully working Docker container with all of the elements required for this demo!
- Bad news! You'll need to build it locally to match your kernel version (unless you're on 3.16.0)
For the sake of simplicity, we'll use a container running OpenSIPS + RTPEngine recorder with open relay settings, thus able to proxy SIP towards any target system of choice - in other words, we will use a real SIP account on the other end.
In order for RTPEngine to insert and use its kernel recording modules on a given Docker system, the container must be built for the specific underlying OS kernel version. Since we're at it, we'll build everything from sources! Mind that this might fail when using "virtualized" Docker flavours (Virtuozzo, etc.).
Make sure Docker is installed, and proceed to build your container:
git clone -b dev https://github.com/lmangani/docker-rtpagent-speech
cd docker-rtpagent-speech
docker build -t qxip/docker-rtpengine-speech .
NOTE: If you're running on Debian 8 with kernel 3.16.0-4 you can use the master repository and the prebuilt packages.
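To see which of the two paths applies to your host, a quick check of the running kernel release is usually enough. The sketch below assumes that any release starting with `3.16.0` counts as a match for the prebuilt packages; the helper name `prebuilt_kernel` is hypothetical. Note that `uname -r` reports the host kernel even when run inside a container.

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit 0) when the given kernel release
# looks like the 3.16.0 kernel the prebuilt packages target.
prebuilt_kernel() {
    case "$1" in
        3.16.0|3.16.0-*) return 0 ;;
        *) return 1 ;;
    esac
}

release="$(uname -r)"
if prebuilt_kernel "$release"; then
    echo "kernel $release: the prebuilt packages (master) should apply"
else
    echo "kernel $release: build from sources (dev branch)"
fi
```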
The repository ships with a sample docker-compose file ready to be customized with our parameters.
Adjust the port range parameters to your liking, and enter the details of your HOMER installation:
version: '2.2'
services:
  opensips-rec:
    image: qxip/docker-rtpengine-speech
    privileged: true
    restart: always
    environment:
      ADVERTISED_RANGE_FIRST: 20000
      ADVERTISED_RANGE_LAST: 20100
      HOMER_SERVER: 'YOUR_HOMER_IP'
      HOMER_PORT: 9060
      BING_KEY: 'YOUR_KEY_HERE'
    volumes:
      - /var/lib/mysql
      - /recording
    ports:
      - "5060:5060/udp"
      - "5061:5061/tcp"
      - "20000-20100:20000-20100/udp"
  captagent:
    container_name: captagent
    image: qxip/captagent-docker
    network_mode: "service:opensips-rec"
    environment:
      - ETHERNET_DEV=any
      - CAPTURE_HOST=YOUR_HOMER_IP
      - CAPTURE_PORT=9060
Once you're happy with the settings, launch the container set:
docker-compose up -d
This is the easy part. Configure your existing SIP account to proxy through your shiny new proxy. With a bit of luck you'll get REGISTERED. Next, make a call to your voicemail, yourself or a talking clock (no screaming monkeys for this one, though).
Keep things short for simplicity, maybe add a few breaks between sentences, and hang up!
The call recordings will be picked up and processed by the built-in Node.js app within 30 seconds of hangup.
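The idea of waiting for a recording to settle before processing it can be sketched in a few lines of shell. This is not the logic of the bundled Node.js app, which may work differently; it only illustrates selecting files that have been quiet for at least 30 seconds. The helper name `stable_recordings` is hypothetical, and the sketch assumes GNU `stat` and the `/recording` spool directory from the compose file.

```shell
#!/bin/sh
# Hypothetical helper: list files in a spool directory that have not been
# modified for at least MIN_AGE seconds (default 30).
# Assumes GNU coreutils `stat -c %Y` (modification time in epoch seconds).
stable_recordings() {
    dir="$1"
    min_age="${2:-30}"
    now=$(date +%s)
    for f in "$dir"/*; do
        # Skip directories and the unexpanded glob of an empty directory.
        [ -f "$f" ] || continue
        mtime=$(stat -c %Y "$f")
        if [ $((now - mtime)) -ge "$min_age" ]; then
            printf '%s\n' "$f"
        fi
    done
}

# Example: show recordings under /recording that look finished.
stable_recordings /recording 30
```

Anything still being written shows a recent modification time and is skipped until a later pass.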
Allow some time for this process to take place (you can watch syslog inside the container for action) and then proceed to locate your call session in HOMER or HEPIC. If things went right, a few log entries should magically appear, revealing your conversation (or at least, what Bing Speech made of it!)
Did it work? Any profanity detected? Share your results with us or shout out @sipcapture on twitter!
- Make sure you entered your BING Key and HOMER details correctly.
- Shell into the container and tail /var/log/syslog
- Check for errors related to kernel modules or recording targets
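The last two checks can be combined by piping the container's syslog through a keyword filter. The keyword list below is a guess at what might be relevant, not an official list of RTPEngine messages, and both the helper name `filter_recording_issues` and the `CONTAINER` placeholder are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: keep only log lines that mention kernel modules,
# recording, or failures. The keywords are assumptions; adjust as needed.
filter_recording_issues() {
    # `|| true` keeps the exit status clean when nothing matches.
    grep -iE 'modprobe|module|recording|error|fail' || true
}

# Usage (CONTAINER stands for your container name or ID):
#   docker exec CONTAINER tail -n 200 /var/log/syslog | filter_recording_issues
```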
(C) 2008-2023 QXIP BV