How to Enable Twilio Dual Channel Recording For Better Speech Analytics

by Byron Mathias-Fuqua

INTRO TO DUAL CHANNEL RECORDING [STEREO]

At Twilio’s last SIGNAL conference, this past May, Twilio introduced its dual channel recording feature, which means a whole lot in the world of speech analytics. Dual channel recording, or stereo recording, essentially allows you to take two different parties, say a caller and an agent, and record them on separate channels of the same audio recording. When you pass that recording to VoiceBase, you can instruct VoiceBase to process the two channels separately so that you get perfect ‘who said what’ information. You can imagine that if a caller says a competitor’s name first, it means something a whole lot different than if an agent says it first. There are huge benefits to using stereo (dual channel) recordings versus mono recordings for transcription, keyword spotting, agent monitoring/script adherence, and training predictive models.

Benefits of Dual Channel Recording

Dual channel recordings can also increase accuracy compared to mono recordings. This is due to two main factors: crosstalk and background noise. When the agent and caller are speaking over one another, even a human can have a hard time discerning what was said and who said it, and the machine has the same difficulty. When each channel is recorded separately, you can clearly transcribe both sides whether there was crosstalk or not. Background noise is also greatly diminished with dual channel: if one side of the call has a dog barking or is in a crowded area, the noise is isolated to that side’s channel, so the other party’s audio, roughly half of the call, is unaffected.
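To make the idea of separate channels concrete, here is a minimal Python sketch of how interleaved stereo audio can be split into two mono streams. It assumes 16-bit PCM samples, and it assumes the agent is on the left channel and the caller on the right; the function name split_stereo and the sample values are hypothetical, and this is an illustration rather than how Twilio or VoiceBase process audio internally.

```python
import array


def split_stereo(interleaved: bytes) -> tuple[bytes, bytes]:
    """Split interleaved 16-bit stereo PCM into (left, right) mono byte strings."""
    samples = array.array("h")  # "h" = signed 16-bit integers
    samples.frombytes(interleaved)
    if len(samples) % 2:
        raise ValueError("stereo data must contain an even number of samples")
    # Stereo PCM alternates left, right, left, right, ...
    left = samples[0::2]
    right = samples[1::2]
    return left.tobytes(), right.tobytes()


# Hypothetical sample values standing in for real speech.
agent = array.array("h", [100, 200, 300])
caller = array.array("h", [-1, -2, -3])

# Interleave the two parties the way a stereo recording stores them.
interleaved = array.array(
    "h", [sample for pair in zip(agent, caller) for sample in pair]
).tobytes()

left, right = split_stereo(interleaved)
```

Because each party ends up in its own stream, noise or crosstalk on one side never overwrites the samples of the other.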

In this video I will go over the new dual channel recording feature that Twilio has added to their API, as well as how to take that recording and send it to VoiceBase with the correct configuration, so you can get that valuable ‘who said what’ information. I’ll also add some tips and tricks throughout the video on how to get the most out of your Twilio integration.

Enjoy!

https://youtu.be/zEwoAkI5oMs

Below I’ve dropped the configs and curl commands used in the video:

1. TwiML to tell Twilio to record the call and who to connect the call to:

<?xml version="1.0" encoding="UTF-8"?>
<Response>
  <Dial action="http://requestb.in/u08rlvu0" record="record-from-answer-dual">555-555-5555</Dial>
</Response>
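If your application generates this TwiML dynamically, a minimal Python sketch using only the standard library might look like the following. The function name build_dial_twiml is hypothetical; the phone number and action URL are the placeholders from the TwiML above, and Twilio's own helper libraries may be a better fit in production.

```python
import xml.etree.ElementTree as ET


def build_dial_twiml(number: str, action_url: str) -> str:
    """Return TwiML that dials `number` and records both parties in dual channel."""
    response = ET.Element("Response")
    dial = ET.SubElement(
        response,
        "Dial",
        {"action": action_url, "record": "record-from-answer-dual"},
    )
    dial.text = number
    body = ET.tostring(response, encoding="unicode")
    # ElementTree omits the XML declaration for str output, so prepend it.
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


twiml = build_dial_twiml("555-555-5555", "http://requestb.in/u08rlvu0")
print(twiml)
```

Building the document with an XML library rather than string concatenation means special characters in the URL or number are escaped for you.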

2. The first API request (the first curl command) is an upload. It includes the configuration .json file (shown further below), which tells VoiceBase what to do with the recording:

curl https://apis.voicebase.com/v2-beta/media \
  --header "Authorization: Bearer $TOKEN" \
  --form media="https://api.twilio.com/your-unique-url.wav" \
  --form configuration=@dual-channel.json

Here is the configuration .json file referenced in that curl command. It labels the left channel as the agent and the right channel as the caller, and adds “VoiceBase” to the transcript vocabulary:

{
  "configuration": {
    "executor": "v2",
    "ingest": {
      "channels": {
        "left": { "speaker": "agent" },
        "right": { "speaker": "caller" }
      }
    },
    "transcripts": {
      "vocabularies": [
        { "terms": ["VoiceBase"] }
      ]
    }
  }
}
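If you would rather generate dual-channel.json than edit it by hand, a minimal Python sketch might look like this. The function name build_dual_channel_config is hypothetical; the keys and values simply mirror the configuration above, so treat it as a convenience wrapper, not an official client.

```python
import json


def build_dual_channel_config(left_speaker: str, right_speaker: str, terms: list[str]) -> dict:
    """Build the dual channel configuration shown above as a Python dict."""
    return {
        "configuration": {
            "executor": "v2",
            "ingest": {
                "channels": {
                    "left": {"speaker": left_speaker},
                    "right": {"speaker": right_speaker},
                }
            },
            "transcripts": {
                "vocabularies": [{"terms": list(terms)}],
            },
        }
    }


config = build_dual_channel_config("agent", "caller", ["VoiceBase"])
config_json = json.dumps(config, indent=2)
print(config_json)

# To produce the file used by the curl command, write config_json to
# a file named dual-channel.json.
```

Swapping the two speaker arguments is all it takes if your recordings put the caller on the left channel instead.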

3. The second curl command is an API call to VoiceBase asking for the results (transcript, keywords, etc.).

curl https://apis.voicebase.com/v2-beta/media/{your-media-id} \
  --header "Authorization: Bearer $TOKEN"
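The same results request can be sketched in Python with the standard library. The function name build_results_request and the token and media ID values are hypothetical placeholders; the URL and Authorization header are taken from the curl command above. The sketch only builds the request, so nothing is sent until you pass it to urllib.request.urlopen.

```python
import urllib.request

VOICEBASE_MEDIA_URL = "https://apis.voicebase.com/v2-beta/media"


def build_results_request(media_id: str, token: str) -> urllib.request.Request:
    """Build (but do not send) the GET request for a media item's results."""
    url = f"{VOICEBASE_MEDIA_URL}/{media_id}"
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )


# Placeholder values; substitute your real media ID and API token.
request = build_results_request("your-media-id", "example-token")
print(request.get_method(), request.full_url)

# To actually fetch the results:
#     with urllib.request.urlopen(request) as response:
#         body = response.read().decode("utf-8")
```

Keeping request construction separate from sending makes it easy to check the URL and headers before spending an API call.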

More Resources

Ready to try the VoiceBase speech API? You can check out the docs here, or read more about what’s possible here. Oh, and below are the Top 10 API Commands To Get Started; those might be helpful.

We can’t wait to see what you’ll detect.

Byron Mathias-Fuqua

Bryon has one of the most recognizable faces at VoiceBase. As one of our early sales engineers, he has played a key role in onboarding many of our enterprise customers. You may have seen him at Twilio’s SIGNAL, Dreamforce, Enterprise Connect, or one of our other shows with a new demo to show off! When Bryon disconnects from the speech analytics world, you can typically find him in his natural habitat on the beaches of Santa Cruz, California. He is also particularly gifted at the popular 90s craze, Dance Dance Revolution, but tries to keep it “low key”.
