How to Enable Twilio Dual Channel Recording For Better Speech Analytics

Posted by Bryon Mathias-Fuqua on Nov 18, 2016 11:10:59 AM



At Twilio's last SIGNAL conference this past May, they introduced their dual channel recording feature, which means a whole lot in the world of speech analytics. Dual channel recording, or stereo recording, lets you take two different parties, say a caller and an agent, and record them on separate channels of the same audio recording. When you pass that recording to VoiceBase, you can instruct VoiceBase to process the two channels separately so that you get unambiguous 'who said what' information. You can imagine that a caller saying a competitor's name first means something quite different than an agent saying it first. There are huge benefits to using stereo (dual channel) recordings versus mono recordings for transcription, keyword spotting, agent monitoring/script adherence, and training predictive models.


Dual channel recordings can also increase accuracy compared to mono recorded files. This is due to two main factors: crosstalk and background noise. When the agent and caller are speaking over one another, even a human can have a hard time discerning what was said and who said it, and the machine has the same difficulty. When each channel is recorded separately, you can clearly transcribe both sides whether there was crosstalk or not. Dual channel also greatly diminishes the effect of background noise: if one side of the call has a dog barking or is in a crowded area, that noise is isolated to the channel where it occurs, so the other party's channel (~50% of the call) is unaffected.
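To make the "separate channels" idea concrete, here is a minimal Python sketch that splits interleaved 16-bit stereo PCM into two mono streams, one per party. It assumes a 16-bit stereo WAV file. The function names are hypothetical, and which channel carries the caller versus the agent is something you would need to confirm for your own recordings.

```python
import array
import wave
from typing import Tuple


def split_stereo(frames: bytes) -> Tuple[bytes, bytes]:
    """Split interleaved 16-bit stereo PCM into (left, right) mono byte strings."""
    samples = array.array("h")      # "h" = signed 16-bit samples
    samples.frombytes(frames)
    left = samples[0::2]            # even-indexed samples: first channel
    right = samples[1::2]           # odd-indexed samples: second channel
    return left.tobytes(), right.tobytes()


def split_wav_file(path: str) -> Tuple[bytes, bytes]:
    """Read a 16-bit stereo WAV file and return its two channels separately."""
    with wave.open(path, "rb") as wav:
        if wav.getnchannels() != 2 or wav.getsampwidth() != 2:
            raise ValueError("expected a 16-bit stereo recording")
        return split_stereo(wav.readframes(wav.getnframes()))
```

Because a dog barking on one side only ever lands in that side's samples, each of the two returned streams can be transcribed on its own.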

In this video I will go over the new dual channel recording feature that Twilio has added to their API, as well as how to take that recording and send it to VoiceBase with the correct configuration, so you can get that valuable 'who said what' information. I'll also add in some more tips and tricks on how to get the most out of your Twilio integration throughout this video.



Below I've dropped the configs and curl commands used in the video:


1. TwiML that tells Twilio to record the call and which number to connect the call to:

<?xml version="1.0" encoding="UTF-8"?>
<Response>
    <Dial action="" record="record-from-answer-dual">555-555-5555</Dial>
</Response>
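If you generate TwiML from code rather than serving a static file, the same document can be built with Python's standard library. This is only a sketch: `build_dual_channel_twiml` is a hypothetical helper name, and the empty `action` value simply mirrors the snippet above.

```python
import xml.etree.ElementTree as ET


def build_dual_channel_twiml(number: str, action_url: str = "") -> str:
    """Return TwiML that dials `number` and records both legs in dual channel."""
    response = ET.Element("Response")
    dial = ET.SubElement(
        response,
        "Dial",
        {"action": action_url, "record": "record-from-answer-dual"},
    )
    dial.text = number  # the number the call is connected to
    body = ET.tostring(response, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


if __name__ == "__main__":
    print(build_dual_channel_twiml("555-555-5555"))
```

Building the XML with a library rather than string concatenation keeps the `<Response>` and `<Dial>` tags balanced and escapes any special characters in the attribute values.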


2. The first API request (the first curl command) is an upload. It includes the configuration .json file (shown further below), which tells VoiceBase what to do with the recording:

curl \
--header "Authorization: Bearer $TOKEN" \
--form media="" \
--form configuration=@dual-channel.json


And here is the configuration .json file included in that curl command, which tells VoiceBase what to do with the recording:

   "configuration": {
       "executor": "v2",
       "transcripts": {
           "vocabularies": [
                 "terms" : [

3. The second curl command is an API call to VoiceBase asking for the results (transcript, keywords, etc.).

curl{your-media-id} \
 --header "Authorization: Bearer $TOKEN"
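For reference, a rough Python equivalent of that results request might look like the sketch below. The endpoint URL is not shown in the command above, so the sketch takes it as a parameter instead of guessing it. `build_results_request` and `media_url` are hypothetical names, and the function only constructs the request without sending it.

```python
import urllib.request


def build_results_request(media_url: str, token: str) -> urllib.request.Request:
    """Build a GET request for a media item's results using a bearer token.

    `media_url` must be the full results URL for your media ID (assumption:
    supplied by the caller, since the endpoint is not given here).
    """
    return urllib.request.Request(
        media_url,
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )
```

Passing the returned object to `urllib.request.urlopen` would send the request and return the response.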



Ready to try the VoiceBase speech API? You can check out the docs here, or read more about what's possible here. Oh, and below are the Top 10 API Commands To Get Started, those might be helpful.



We can't wait to see what you'll detect.


Topics: Twilio, API, sample code

Written by Bryon Mathias-Fuqua

Bryon has one of the most recognizable faces at VoiceBase. He has played a key role as one of our early sales engineers in onboarding many of our enterprise customers. You may have seen him at Twilio's Signal, DreamForce, Enterprise Connect, or one of our other shows with a new demo to show off! When Bryon disconnects from the speech analytics world you can typically find him in his natural habitat on the beaches of Santa Cruz, California. He is also particularly gifted at the popular 90s craze, Dance Dance Revolution, but tries to keep it "low key".