Categories
Blog Design FreePBX Knowledge Base

Voice recognition and Asterisk.

This is primarily about Googles new Cloud Speech API and Asterisk recordings.

Having worked on many Voice rec systems including Mitels attendant system, Oranges Wildfire virtual assistance and Lumenvox’s add on for Digium’s Asterisk system one thing none could do was transcribe speech such as voicemails and this is what people want. There was a startup in the UK called Spinvox  but as anyone knows this wasn’t all it seems and when I questioned them while working on a project they clammed up and withdrew our testing account and the rest is history as they say.

So now we are many years on and Google have their second API for this service. The first API was a little flaky to say the least and came up with some amusing translations. The cloud version is much better and does a good job with most voice and also can be localised.

So what have we done. Well we have mixed together some existing code we use and created a “mini voicemail” that records your message converts it to text saves it as a voicemail and emails the resultant Text and recording to you.  In the process we did find a few “gotchas” with the API for example a pause of more than a couple of seconds will result in the translation stopping there, also a big one is that the translation takes as long as the recording is, and the API has a 60 second limit. Both of these can be overcome by limiting the record time in Asterisk to 60 seconds and using sox to remove silence of more than a second.

exten => s,n,Record(catline/${UNIQUEID}.wav,3,60,kaq)
/usr/bin/sox /var/lib/asterisk/sounds/catline/${origdir}.wav ${PATH}${origmailbox}/INBOX/${FILENAME}.flac  lowpass -2 2500 silence -l 1 0.1 1% -1 0.8 1% 

As you can see from these snippits of code above we have used variables where possible to that it can be incorporated easily with existing asterisk systems using GUIs such as Freepbx, We use the voicemail greetings that the user recorded and also use the email address thats linked with their mailbox for simplicity of management.

Now having Voicemails as text is nice but where it comes into its own is with structured mailboxes or simply put questionnaires where the caller is asked a number of predefined questions and these are recorded as one single voicemail. We already do this for some customers but they still have to have some one transcribe teh voicemail to text to input it. The quality of the Google translation means that soon they will be able to just copy the text over. Other applications are only limited by your imagination, Such as automated voice menus for Takeaways or Taxi firms.

To be Continued…HERE