{"id":1978,"date":"2018-06-05T13:21:08","date_gmt":"2018-06-05T12:21:08","guid":{"rendered":"http:\/\/www.cyber-cottage.eu\/?p=1978"},"modified":"2018-06-05T13:23:26","modified_gmt":"2018-06-05T12:23:26","slug":"transcribing-voicemail-with-google-speech-api","status":"publish","type":"post","link":"https:\/\/www.cyber-cottage.co.uk\/?p=1978","title":{"rendered":"Transcribing Voicemail with Google Speech API"},"content":{"rendered":"<p>This is part 2, the rather long-awaited description of how to transcribe voicemails to email and deliver them with text and an attached MP3.<\/p>\n<p>You will need to install the files from https:\/\/zaf.github.io\/asterisk-speech-recog\/ and also have a Google Developers account.<\/p>\n<p>Also create a directory:<\/p>\n<pre>\/var\/lib\/asterisk\/sounds\/catline<\/pre>\n<p>Let's begin.<\/p>\n<ul>\n<li>Script to create the MP3 and the file for transcription<\/li>\n<\/ul>\n<pre>#!\/bin\/bash\r\nPATH=\/var\/spool\/asterisk\/voicemail\/default\/\r\ncallerchan=$1\r\ncallerid=$2\r\norigdate=$3\r\norigtime=$4\r\norigmailbox=$5\r\norigdir=$6\r\nduration=$7\r\napikey=\"YOUR GOOGLE SPEECH API KEY\"\r\nFILENUM=$(\/bin\/ls ${PATH}${origmailbox}\/INBOX |\/bin\/grep txt | \/usr\/bin\/wc -l)\r\n\r\n\r\n##Added to allow 999 messages\r\nif  (( $FILENUM &lt;= 9 ));\r\nthen\r\nFILENAME=msg000${FILENUM}\r\nelif (( $FILENUM &lt;= 99 ));\r\nthen\r\nFILENAME=msg00${FILENUM}\r\nelse\r\nFILENAME=msg0${FILENUM}\r\nfi\r\n\r\nIN=$(\/bin\/grep \"${origmailbox} =&gt;\" \/etc\/asterisk\/voicemail.conf)\r\nset -- \"$IN\"\r\nIFS=\",\"; declare -a Array=($*)\r\nemail=${Array[2]}\r\n\r\n\r\n\/bin\/echo \"[message]\" &gt;&gt; ${PATH}${origmailbox}\/INBOX\/${FILENAME}.txt\r\n\/bin\/echo origmailbox=${origmailbox} &gt;&gt; ${PATH}${origmailbox}\/INBOX\/${FILENAME}.txt\r\n\/bin\/echo \"context=demo\" &gt;&gt; ${PATH}${origmailbox}\/INBOX\/${FILENAME}.txt\r\n\/bin\/echo \"macrocontext=\" &gt;&gt; ${PATH}${origmailbox}\/INBOX\/${FILENAME}.txt\r\n\/bin\/echo \"exten=s\" 
&gt;&gt; ${PATH}${origmailbox}\/INBOX\/${FILENAME}.txt\r\n\/bin\/echo \"priority=11\" &gt;&gt; ${PATH}${origmailbox}\/INBOX\/${FILENAME}.txt\r\n\/bin\/echo callerchan=${callerchan} &gt;&gt; ${PATH}${origmailbox}\/INBOX\/${FILENAME}.txt\r\n\/bin\/echo callerid=${callerid} &gt;&gt; ${PATH}${origmailbox}\/INBOX\/${FILENAME}.txt\r\n\/bin\/echo origdate=${origdate} &gt;&gt; ${PATH}${origmailbox}\/INBOX\/${FILENAME}.txt\r\n\/bin\/echo origtime=${origtime} &gt;&gt; ${PATH}${origmailbox}\/INBOX\/${FILENAME}.txt\r\n\/bin\/echo \"category=\" &gt;&gt; ${PATH}${origmailbox}\/INBOX\/${FILENAME}.txt\r\n\/bin\/echo \"duration=${duration}\" &gt;&gt; ${PATH}${origmailbox}\/INBOX\/${FILENAME}.txt\r\n\r\n\/bin\/nice \/usr\/bin\/sox \/var\/lib\/asterisk\/sounds\/catline\/${origdir}.wav ${PATH}${origmailbox}\/INBOX\/${FILENAME}.flac   silence -l 1 0.1 1% -1 0.3 1% \r\n\r\n\/bin\/nice \/usr\/bin\/lame -b 16 -m m -q 9-resample \/var\/lib\/asterisk\/sounds\/catline\/${origdir}.wav  ${PATH}${origmailbox}\/INBOX\/${FILENAME}.mp3\r\n\r\nvoicemailbody=$(\/usr\/bin\/perl -w \/usr\/src\/asterisk-speech-recog-cloud_api\/cli\/speech-recog-cli.pl -k $apikey -o detailed -r 8000 -n 1  \/var\/spool\/asterisk\/voicemail\/default\/${origmailbox}\/INBOX\/${FILENAME}.flac)\r\n\r\n\/bin\/cp \/var\/lib\/asterisk\/sounds\/catline\/${origdir}.wav ${PATH}${origmailbox}\/INBOX\/${FILENAME}.wav\r\n\r\necho \"You have a new voicemail from ${callerid} it was left on ${origdate} and is ${duration} seconds long ${voicemailbody}\" | \/bin\/mail -s \"A new voicemail has arrived from ${callerid}\" -a \"${PATH}${origmailbox}\/INBOX\/${FILENAME}.mp3\" \"$email\"\r\n\r\n\/bin\/rm -f ${PATH}${origmailbox}\/INBOX\/${FILENAME}.flac\r\n\/bin\/rm -f ${PATH}${origmailbox}\/INBOX\/${FILENAME}.mp3<\/pre>\n<ul>\n<li>Asterisk Dialplan to pass the call to the above script<\/li>\n<\/ul>\n<pre>[vmail2text]\r\nexten =&gt; _XXXX,1,Set(__EXTTOCALL=${EXTEN})\r\nexten =&gt; _XXXX,n,Noop(${EXTTOCALL})\r\nexten =&gt; 
_XXXX,n,Goto(s,1)\r\n\r\nexten =&gt; s,1,Answer()  ; Listen to ringing for 1 seconds\r\nexten =&gt; s,n,Noop(${EXTTOCALL} , ${DIALSTATUS} , ${SV_DIALSTATUS})\r\nexten =&gt; s,n,GotoIf($[\"${DIALSTATUS}\"=\"BUSY\"]?busy:bnext)\r\nexten =&gt; s,n(busy),Set(greeting=busy)\r\nexten =&gt; s,n,Goto(carryon)\r\nexten =&gt; s,n(bnext),GotoIf($[\"${DIALSTATUS}\"=\"NOANSWER\"]?unavail:unext)\r\nexten =&gt; s,n(unavail),Set(greeting=unavail)\r\nexten =&gt; s,n,Goto(carryon)\r\nexten =&gt; s,n(unext),Set(greeting=unavail)\r\nexten =&gt; s,n,Goto(carryon)\r\nexten =&gt; s,n(carryon),Set(origmailbox=${EXTTOCALL})\r\nexten =&gt; s,n,Set(msg=${STAT(e,${ASTSPOOLDIR}\/voicemail\/default\/${origmailbox}\/${greeting}.wav)})\r\nexten =&gt; s,n,Set(__start=0)\r\nexten =&gt; s,n,Set(__end=0)\r\nexten =&gt; s,n,NoOp(${UNIQUEID})\r\nexten =&gt; s,n,Set(origdate=${STRFTIME(${EPOCH},,%a %b %d %r %Z %G)})\r\nexten =&gt; s,n,Set(origtime=${EPOCH})\r\nexten =&gt; s,n,Set(callerchan=${CHANNEL})\r\nexten =&gt; s,n,Set(callerid=${CALLERID(num)})\r\nexten =&gt; s,n,Set(origmailbox=${origmailbox})\r\nexten =&gt; s,n,Answer()\r\nexten =&gt; s,n,GotoIf($[\"${msg}\"=\"1\"]?msgy:msgn)\r\nexten =&gt; s,n(msgy),Playback(${ASTSPOOLDIR}\/voicemail\/default\/${origmailbox}\/${greeting});(local\/catreq\/how_did)\r\nexten =&gt; s,n,Goto(beep)\r\nexten =&gt; s,n(msgn),Playback(vm-intro)\r\nexten =&gt; s,n(beep),System(\/bin\/touch \/var\/lib\/asterisk\/sounds\/catline\/${UNIQUEID}.wav)\r\nexten =&gt; s,n,Playback(beep)\r\nexten =&gt; s,n,Set(__start=${EPOCH})\r\nexten =&gt; s,n,Record(catline\/${UNIQUEID}.wav,3,60,kaq)\r\nexten =&gt; s,n,Playback(beep)\r\nexten =&gt; s,n,Hangup()\r\nexten =&gt; h,1,Noop(${start} ${end})\r\nexten =&gt; h,n,GotoIf($[\"${start}\"!=\"0\"]?ok:end)\r\nexten =&gt; h,n(ok),Set(end=${EPOCH})\r\nexten =&gt; h,n,Set(duration=${MATH(${end}-${start},int)})\r\nexten =&gt; h,n,System(\/usr\/local\/sbin\/makevmal.sh \"${callerchan}\" ${callerid} \"${origdate}\" ${origtime} ${origmailbox} 
${UNIQUEID} ${duration})\r\nexten =&gt; h,n(end),Noop(finished)<\/pre>\n<ul>\n<li>Modified API script. <strong>Note the language and enhanced mode settings<\/strong>\n<ul>\n<li>For these to work you need &#8220;datalogging&#8221; enabled in the Dialogflow API settings<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<pre>#!\/usr\/bin\/env perl\r\n\r\n#\r\n# Render speech to text using Google's Cloud Speech API.\r\n#\r\n# Copyright (C) 2011 - 2016, Lefteris Zafiris &lt;zaf@fastmail.com&gt;\r\n#\r\n# This program is free software, distributed under the terms of\r\n# the GNU General Public License Version 2. See the COPYING file\r\n# at the top of the source tree.\r\n#\r\n# This has been altered to work with Google's new Speech models\r\n#\r\n\r\nuse strict;\r\nuse warnings;\r\nuse File::Temp qw(tempfile);\r\nuse Getopt::Std;\r\nuse File::Basename;\r\nuse LWP::UserAgent;\r\nuse LWP::ConnCache;\r\nuse JSON;\r\nuse MIME::Base64;\r\n\r\nmy %options;\r\nmy $flac;\r\nmy $key;\r\nmy $url        = \"https:\/\/speech.googleapis.com\/v1p1beta1\/speech\";\r\nmy $samplerate = 16000;\r\nmy $language   = \"en-US\";\r\nmy $output     = \"detailed\";\r\nmy $results    = 1;\r\nmy $pro_filter = \"false\";\r\nmy $error      = 0;\r\nmy $thetext = \".\";\r\nmy $score = \".\";\r\ngetopts('k:l:o:r:n:fhq', \\%options);\r\n\r\nVERSION_MESSAGE() if (defined $options{h} || !@ARGV);\r\n\r\nparse_options();\r\n\r\nmy %config = (\r\n        \"encoding\"         =&gt; \"FLAC\",\r\n        \"sampleRateHertz\"      =&gt; $samplerate,\r\n        \"languageCode\"    =&gt; $language,\r\n        \"profanityFilter\" =&gt; $pro_filter,\r\n        \"maxAlternatives\" =&gt; $results,\r\n        \"model\" =&gt; \"phone_call\",\r\n        \"useEnhanced\" =&gt; 'true' \r\n);\r\n\r\nmy $ua = LWP::UserAgent-&gt;new(ssl_opts =&gt; {verify_hostname =&gt; 1});\r\n$ua-&gt;agent(\"CLI speech recognition script\");\r\n$ua-&gt;env_proxy;\r\n$ua-&gt;conn_cache(LWP::ConnCache-&gt;new());\r\n$ua-&gt;timeout(60);\r\n\r\n# send each sound 
file to Google and get the recognition results #\r\nforeach my $file (@ARGV) {\r\n        my ($filename, $dir, $ext) = fileparse($file, qr\/\\.[^.]*\/);\r\n        if ($ext ne \".flac\" &amp;&amp; $ext ne \".wav\") {\r\n                say_msg(\"Unsupported file-type: $ext\");\r\n                ++$error;\r\n                next;\r\n        }\r\n        if ($ext eq \".wav\") {\r\n                if (($file = encode_flac($file)) eq '-1') {\r\n                        ++$error;\r\n                        next;\r\n                }\r\n        }\r\n#       print(\"File $filename\\n\") if (!defined $options{q});\r\n        my $audio;\r\n        if (open(my $fh, \"&lt;\", \"$file\")) {\r\n                $audio = do { local $\/; &lt;$fh&gt; };\r\n                close($fh);\r\n        } else {\r\n                say_msg(\"Cant read file $file\");\r\n                ++$error;\r\n                next;\r\n        }\r\n        my %audio = ( \"content\" =&gt; encode_base64($audio, \"\") );\r\n        my %json = (\r\n                \"config\" =&gt; \\%config,\r\n                \"audio\"  =&gt; \\%audio,\r\n        );\r\n        my $response = $ua-&gt;post(\r\n                \"$url:recognize?key=$key\",\r\n                Content_Type =&gt; \"application\/json\",\r\n                Content      =&gt; encode_json(\\%json),\r\n        );\r\n        if (!$response-&gt;is_success) {\r\n                say_msg(\"Failed to get data for file: $file\");\r\n                ++$error;\r\n                next;\r\n        }\r\n        if ($output eq \"raw\") {\r\n                print $response-&gt;content;\r\n                next;\r\n        }\r\n        my $jdata = decode_json($response-&gt;content);\r\n        if ($output eq \"detailed\") {\r\n                foreach (@{$jdata-&gt;{\"results\"}[0]-&gt;{\"alternatives\"}}) {\r\n                        $score = $_-&gt;{\"confidence\"};\r\n                        $thetext = $_-&gt;{\"transcript\"};\r\n                        }\r\n        
} elsif ($output eq \"compact\") {\r\n                print $_-&gt;{\"transcript\"}.\"\\n\" foreach (@{$jdata-&gt;{\"results\"}[0]-&gt;{\"alternatives\"}});\r\n        }\r\n}\r\n\r\nprint \"\\n\\nThe transcription of message is below:\\n\\n$thetext\\n\\nWe are $score out of 1 sure its correct\\n\\nTranscribed using Googles Cloud Speech API \";\r\n\r\nexit(($error) ? 1 : 0);\r\n\r\nsub parse_options {\r\n# Command line options parsing #\r\n        if (defined $options{k}) {\r\n        # check API key #\r\n                $key = $options{k};\r\n        } else {\r\n                say_msg(\"Invalid or missing API key.\\n\");\r\n                exit 1;\r\n        }\r\n        if (defined $options{l}) {\r\n        # check if language setting is valid #\r\n                if ($options{l} =~ \/^[a-z]{2}(-[a-zA-Z]{2,6})?$\/) {\r\n                        $language = $options{l};\r\n                } else {\r\n                        say_msg(\"Invalid language setting. Using default.\\n\");\r\n                }\r\n        }\r\n        if (defined $options{o}) {\r\n        # check if output setting is valid #\r\n                if ($options{o} =~ \/^(detailed|compact|raw)$\/) {\r\n                        $output = $options{o};\r\n                } else {\r\n                        say_msg(\"Invalid output formatting setting. 
Using default.\\n\");\r\n                }\r\n        }\r\n        if (defined $options{n}) {\r\n        # set number or results #\r\n                $results = $options{n} if ($options{n} =~ \/\\d+\/);\r\n        }\r\n        if (defined $options{r}) {\r\n        # set audio sampling rate #\r\n                $samplerate = $options{r} if ($options{r} =~ \/\\d+\/);\r\n        }\r\n        # set profanity filter #\r\n        $pro_filter = \"true\" if (defined $options{f});\r\n\r\n        return;\r\n}\r\n\r\nsub say_msg {\r\n# Print messages to user if 'quiet' flag is not set #\r\n        my @message = @_;\r\n        warn @message if (!defined $options{q});\r\n        return;\r\n}\r\n\r\nsub VERSION_MESSAGE {\r\n# Help message #\r\n        print \"Speech recognition using Google Cloud Speech API.\\n\\n\",\r\n                \"Usage: $0 [options] [file(s)]\\n\\n\",\r\n                \"Supported options:\\n\",\r\n                \" -k &lt;key&gt;       specify the Speech API key\\n\",\r\n                \" -l &lt;lang&gt;      specify the language to use (default 'en-US')\\n\",\r\n                \" -o &lt;type&gt;      specify the type of output formatting\\n\",\r\n                \"    detailed    print detailed output with info like confidence (default)\\n\",\r\n                \"    compact     print only the transcripted string\\n\",\r\n                \"    raw         raw JSON output\\n\",\r\n                \" -r &lt;rate&gt;      specify the audio sample rate in Hz (default 16000)\\n\",\r\n                \" -n &lt;number&gt;    specify the maximum number of results (default 1)\\n\",\r\n                \" -f             filter out profanities\\n\",\r\n                \" -q             don't print any error messages or warnings\\n\",\r\n                \" -h             this help message\\n\\n\";\r\n        exit(1);\r\n}<\/pre>\n<ul>\n<li>In Freepbx create a Custom Destination as\u00a0 \u00a0 &#8220;vmail2text,s,1&#8221;\u00a0 and if you require certain queues 
to go to specific mailboxes, create one like &#8220;vmail2text,2000,1&#8221; so that calls are sent to mailbox 2000<\/li>\n<li>Then, in each extension that should use transcription, set the &#8220;Optional Destinations&#8221; to the custom destination.<\/li>\n<\/ul>\n<p>And that's it. Enjoy.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>This is part 2, the rather long-awaited description of how to transcribe voicemails to email and deliver them with text and an attached MP3. You will need to install the files from https:\/\/zaf.github.io\/asterisk-speech-recog\/ and also have a Google Developers account. Also create a directory: \/var\/lib\/asterisk\/sounds\/catline Let's begin. Script to create the MP3 and the [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"content-type":"","advanced_seo_description":"","jetpack_seo_html_title":"","jetpack_seo_noindex":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_publicize_message":"","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":true,"jetpack_social_options":{"image_generator_settings":{"template":"highway","default_image_id":0,"font":"","enabled":false},"version":2},"jetpack_post_was_ever_published":false},"categories":[2,11],"tags":[],"class_list":["post-1978","post","type-post","status-publish","format-standard","hentry","category-blog","category-knowledge"],"jetpack_publicize_connections":[],"jetpack_featured_media_url":"","jetpack_shortlink":"https:\/\/wp.me\/p5daZy-vU","jetpack_sharing_enabled":true,"jetpack_likes_enabled":false,"jetpack-related-posts":[],"_links":{"self":[{"href":"https:\/\/www.cyber-cottage.co.uk\/index.php?rest_route=\/wp\/v2\/posts\/1978","targetHints":{"
allow":["GET"]}}],"collection":[{"href":"https:\/\/www.cyber-cottage.co.uk\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.cyber-cottage.co.uk\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.cyber-cottage.co.uk\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.cyber-cottage.co.uk\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=1978"}],"version-history":[{"count":5,"href":"https:\/\/www.cyber-cottage.co.uk\/index.php?rest_route=\/wp\/v2\/posts\/1978\/revisions"}],"predecessor-version":[{"id":1983,"href":"https:\/\/www.cyber-cottage.co.uk\/index.php?rest_route=\/wp\/v2\/posts\/1978\/revisions\/1983"}],"wp:attachment":[{"href":"https:\/\/www.cyber-cottage.co.uk\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=1978"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.cyber-cottage.co.uk\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=1978"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.cyber-cottage.co.uk\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=1978"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}