Your Callers Expect Speech Recognition

By Kyle Henderson

“For billing, press 1. For support, press 2.” We lower the phone from our ear, open up the keypad screen to find the right number to press, and hope we don’t miss a better option while we can’t hear. If this scenario sounds familiar, it’s because for years that’s been the typical IVR menu format. It’s inconvenient, it’s outdated given the prominence of smartphones, and too often it results in an accidental disconnect (maybe that’s just me and my sausage fingers). It’s just not a great customer experience.

Things got a bit better when “press or say” functionality took root in IVR, and many organizations made the move to upgrade from DTMF-only technology. The problem is that many IVRs stopped there, at the point of “it’s good enough.” Newsflash: callers disagree. It’s simply not intuitive to say “two” in place of “support.” Imagine the reaction of a younger generation that doesn’t even remember tactile dial pads and has grown up with Siri and Alexa. They just assume technology can listen to and act on voice commands.

Automatic Speech Recognition (ASR) is the single most important technology that companies can leverage to improve customer satisfaction scores in the IVR, as long as it’s done correctly. Today’s speech technology is capable of delivering a natural, efficient, and modern experience for callers. And because ASR makes it easier for callers to access self-service information, it shortens call times, which saves money.

Helpful Hints
The first step in deploying ASR applications is to determine what style of interface best suits the organization’s needs. Each has a slightly different flavor that you’ll need to consider, but all deliver a better experience than DTMF and “press or say.” Styles include:

  • Directed Dialogue – “You can say check balance, transfer funds, or make a payment.”
  • Multi-token – “What is your itinerary?” – caller response: “Seattle to Dallas on March 31”
  • Mixed Initiative – “Did you say Annapolis?” – caller response: “No, Minneapolis”
  • Statistical Language Model – “How may I help you?”
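
To make the multi-token style above a bit more concrete, here is a minimal Python sketch of pulling several pieces of information out of a single caller response. It works on already-transcribed text, and the ITINERARY pattern and parse_itinerary function are hypothetical names made up for illustration; a production speech engine would rely on a far richer grammar than one regular expression.

```python
import re

# Hypothetical pattern for the itinerary example: origin, destination, and date
# are all captured from one utterance instead of three separate prompts.
ITINERARY = re.compile(
    r"(?:from\s+)?(?P<origin>[a-z ]+?)\s+to\s+(?P<destination>[a-z ]+?)"
    r"\s+on\s+(?P<date>[a-z]+\s+\d{1,2})$",
    re.IGNORECASE,
)


def parse_itinerary(utterance):
    """Return a dict of slots, or None if the utterance does not fit the pattern."""
    match = ITINERARY.match(utterance.strip())
    if match is None:
        return None
    return {name: value.strip() for name, value in match.groupdict().items()}


print(parse_itinerary("Seattle to Dallas on March 31"))
```

When the response doesn’t fit, the function returns None, which is the point where a real application would re-prompt or fall back to asking for one item at a time.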

When you’ve selected the style, there are some best practices to follow in the pre-development phase to ensure your ASR works right and delivers the experience you and your callers are looking for.

1. Create a call flow and prompt list so you can visualize the path callers will take and what they’ll hear. Don’t cut corners on design, because time spent up front makes for a much better application later. Remember to focus on the transactions that benefit callers, not just the organization. For example, don’t ask callers to provide information only for reporting purposes.

2. Define required and optional utterances, or “grammars,” for each prompt in the application. Account for the various words and phrases callers might say to the system. Accurately predicting what a caller will say is tricky, of course, so a well-researched grammar specification is imperative.

3. Simulate calls. Simulate a lot of calls. Make sure what you put down on paper actually sounds good on the phone and feels natural. Then revise the design to address hiccups, and polish.

4. Once the system is taking live calls, perform a tuning cycle. A speech application requires regular tuning to remain relevant and working well. From an investment standpoint and from a customer service perspective, it’s a good idea to ensure the application is always performing at its peak.
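
Steps 2 and 4 can be sketched together in a few lines of Python. This is only an illustration under stated assumptions: it matches transcribed text rather than audio, and GRAMMAR, OPTIONAL_PREFIXES, OPTIONAL_SUFFIXES, recognize, and out_of_grammar are all hypothetical names, not part of any particular speech platform. The idea is that a grammar lists the required phrases for a prompt, tolerates optional filler around them, and a tuning cycle looks at what live callers said that the grammar missed.

```python
from collections import Counter

# Hypothetical grammar for one prompt: each option lists the phrases we expect.
GRAMMAR = {
    "check_balance": {"check balance", "balance", "account balance"},
    "transfer_funds": {"transfer funds", "transfer", "move money"},
    "make_payment": {"make a payment", "payment", "pay my bill"},
}

# Optional words callers often wrap around the required phrase (assumed examples).
OPTIONAL_PREFIXES = ("i want to ", "i would like to ", "can i ")
OPTIONAL_SUFFIXES = (" please",)


def normalize(utterance):
    """Lowercase, collapse whitespace, and strip one optional prefix and suffix."""
    text = " ".join(utterance.lower().split())
    for prefix in OPTIONAL_PREFIXES:
        if text.startswith(prefix):
            text = text[len(prefix):]
            break
    for suffix in OPTIONAL_SUFFIXES:
        if text.endswith(suffix):
            text = text[: -len(suffix)]
            break
    return text


def recognize(utterance):
    """Return the matching option name, or None when the utterance is out of grammar."""
    text = normalize(utterance)
    for option, phrases in GRAMMAR.items():
        if text in phrases:
            return option
    return None


def out_of_grammar(utterances):
    """Tuning aid: count the utterances the grammar missed, most frequent first."""
    misses = Counter(normalize(u) for u in utterances if recognize(u) is None)
    return misses.most_common()


logged = ["balance", "Speak to an agent", "speak to an agent", "transfer"]
print(recognize("I want to check balance please"))
print(out_of_grammar(logged))
```

A frequent miss such as “speak to an agent” is exactly the kind of finding a tuning cycle is meant to surface, so it can be added to the grammar or given its own option.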

There’s one last tip that will save you a lot of time, effort, and stress: don’t try to do it yourself. It takes expertise to create a speech application that works the way you want and that your callers will appreciate. INI has more than 26 years of experience designing and integrating IVR systems and has been developing speech recognition applications since our early days. Contact us and we’d be glad to talk through how speech applications from INI can benefit your organization and, more importantly, boost your customer experience.