Speech recognition has been on the drawing board for some time, but recently major commercial deployments of the technology have become more commonplace. Some have been very successful, but for others, the perception has been somewhat mixed.
Clearly, speech recognition and verification software is becoming part of the business landscape. IDC projects that the worldwide market for voice recognition hardware, software and services will reach $2.5 billion by 2005. Frost and Sullivan and the Meta Group make similar predictions.
Paul Magee, managing director for VeCommerce, providers of speech recognition solutions, comments, "To date, the business world has been sceptical about speech recognition. It has, in the past, not lived up to all the hype. It's up to the industry to challenge that notion by developing voice enabled e-commerce solutions that continue to exceed expectations."
"Speech is certainly out there and it has some great potential for the future, but most people in the market don't understand the ROI it can deliver to their business," says Fausto Marasco, managing director for Premier Technologies. "The take-up and deployment has not been as huge as expected by the industry and companies are still grappling with the question of how it can actually operate in their environment."
The ACA Research Call Centre Industry Benchmark 2002 study found that companies and their customers do not fully accept speech recognition as an effective and efficient customer service tool.
According to Magee, Natural Language Speech Recognition (NLSR) can deliver real business benefits. He says, "NLSR has already been proved in hundreds of operational systems. It has unsurpassed accuracy on large vocabularies, both in speaker-independent and speaker-dependent applications. It delivers the rich, conversational dialogues that customers want and the robust natural language understanding that makes applications comfortable for callers."
Speech is a very resource hungry application. "It can soak up as much CPU power as you can throw at it," says Marasco. "The price of most speech implementations is very expensive and a huge order for most companies to find the budget."
The technology will steadily develop and mature, and implementation costs will come down. Marasco comments, "The quality of the process in speech applications will become far more accurate and far more robust and able to handle conversational models far more effectively than they can do today."
"The potential savings of these solutions are tremendous," asserts Magee. "Studies performed to compare speech recognition systems to traditional call or contact centres have shown there are tremendous savings to be realised using this technology. Some of our customers are reporting savings of up to 85 per cent for the cost of a routine transaction or enquiry, with automation rates in excess of 50 per cent. When you also consider the cost of running 1800 or 13 numbers, this can be significant."
Many organisations surveyed in the Call Centre Benchmark Study said they would rather speak to their customers than have them speak to a machine. And if the Telstra experience with its phone directory services is anything to go by, it would seem they are right.
But the interesting fact about the Telstra implementation is that the company received a significant return on its investment within three months, according to Michael Frajman, business development manager for Telstra's call centre solution. Since converting to the new system, Telstra has saved up to 30 per cent of call handling time, which means shorter queue times and tremendous savings on staffing costs.
Another key issue is that a major speech recognition solution does not work straight out of the box. According to Lynda Kate Smith, VP and CMO for Nuance, it needs to be tuned and developed with the ability to learn, depending on the range of customers who ultimately end up using the system. She says, "As well as implementing and building the system, it's vitally important that the potential caller base is informed about and educated on the new system."
The Telstra system was originally implemented to handle the 2000 most common business names. Telstra's automated directory assistance line, which provides callers with telephone numbers via voice recognition, does not cope with less common names and words spoken by customers. Smith advises, "More education and explanation in the market place about the limitations of the initial system may have prevented the tarnished image it has received in the minds of the general public and business."
But it is a vast improvement compared to the old days of 013, where people were in phone queues for ages or couldn't get through due to congestion. Now, if the system does not recognise a name, the caller is put through to a live operator very quickly.
Telstra made much of its savings because, ultimately, staff spend less time on the phone. Even when the system has not put the customer through to the number, it makes the information available to the operator, reducing the need to query the database while the caller is on the phone.
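The hand-off described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Telstra's implementation: the recogniser is replaced by a lookup in a small in-memory table, the names `DIRECTORY`, `CallResult` and `handle_call` are hypothetical, the phone numbers are made up, and passing the caller's words to the operator is only one guess at what "the information" might be.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical stand-in for the list of supported business names.
# The numbers are invented for illustration.
DIRECTORY = {
    "acme plumbing": "03 5550 0101",
    "city florist": "03 5550 0102",
}


@dataclass
class CallResult:
    number: Optional[str]            # number read back to the caller, if any
    to_operator: bool                # True when the call goes to a live operator
    operator_context: Optional[str]  # what the caller asked for, shown to the operator


def handle_call(utterance: str) -> CallResult:
    """Answer automatically when the name is known, otherwise hand off."""
    # Normalise case and spacing (assumes speech is already converted to text).
    heard = " ".join(utterance.lower().split())
    number = DIRECTORY.get(heard)
    if number is not None:
        return CallResult(number=number, to_operator=False, operator_context=None)
    # Unrecognised name: transfer at once and pass along what was heard,
    # so the operator does not have to ask the caller again.
    return CallResult(number=None, to_operator=True, operator_context=heard)
```

In this sketch the operator receives the caller's request along with the call, which is where the saving in handling time would come from.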
The future
Companies are now actively working with speech recognition and verification technologies across several new applications. In the not too distant future, speech technology will allow airline customers to check on scheduled flight times and banking customers to use their 'voiceprint' to check their account balance from the road.
Companies like Speechworks and Nuance are developing systems that can pick up key words in a sentence, which will allow a more conversational model for speech recognition.
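As a rough illustration of what picking up key words in a sentence means, the Python sketch below scans a caller's sentence for a handful of key words and maps the first one found to a request. It assumes the speech has already been converted to text, and it is not a description of how Speechworks or Nuance products work; the key words, the request labels and the function name `spot_intent` are all hypothetical.

```python
import re
from typing import Optional

# Hypothetical key words and the request each one signals.
KEYWORD_INTENTS = {
    "balance": "account_balance",
    "flight": "flight_times",
    "taxi": "book_taxi",
}


def spot_intent(transcript: str) -> Optional[str]:
    """Return the request signalled by the first key word found, or None."""
    # Split the sentence into lower-case words, ignoring punctuation.
    for word in re.findall(r"[a-z']+", transcript.lower()):
        if word in KEYWORD_INTENTS:
            return KEYWORD_INTENTS[word]
    return None


print(spot_intent("Um, could you tell me my account balance please?"))
```

The caller does not need to follow a fixed menu: any sentence containing "balance" is treated as a balance enquiry, and a sentence with no key word returns nothing, at which point a real system would presumably ask again or transfer the call.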
One area where speech recognition can be used very cost-effectively to improve service is IVR and auto-attendant applications. People are used to keying in numbers on a touchtone phone. "Deploying speech recognition to support the touchtone can offer greater convenience, particularly for callers from mobile phones," says Andrew Weiss from VisibleVoice, systems integrators for self-service and voice applications.
Smith from Nuance says, "One of the greatest areas of improvement in recent years is in the development of 'text-to-speech' engines. Initially, when customers were accessing a computer application, the responses were slow and mechanical. But now a more 'human' sounding voice can be delivered."
Utilising text-to-speech gives greater flexibility and scope for IVR applications, compared to having a range of human recordings. Weiss says, "Text-to-speech can enhance the customer's experience by providing dynamic, detailed and personalised information in a real-time automated voice."
The technology also allows access to emails, faxes, SMS text messages and a wide variety of information contained within a data application via the telephone. Smith says, "Particularly useful for the busy executive or rep on the road is the ability to retrieve written messages and notes via their mobile phone."
This greater access can be delivered to customers who need to access the organisation after hours. Weiss comments, "Not only can a remote user receive these messages, but they can also be returned by the spoken message being transformed back into written text. Along with access to these applications, the user can also gain access to web and database text."
Ideally, as more customers can access the applications themselves at their own convenience, overhead costs associated with front office facilities and reception decrease.
In the United States, the Kelsey Group's Voice and Wireless Commerce research team conducted a detailed Return on Investment (ROI) Study in the first quarter of 2002. Key findings of the study included the following:
The study also found that the use of speech recognition to supplement the touchtone application led to a significant reduction in both the percentage of calls abandoned and the number that were transferred to a live agent.
One local industry that has used speech recognition to great effect is the taxi industry. Combining speech recognition technology with caller number identification, they have achieved tremendous efficiencies while increasing their levels of service.
Organisations like Black Cabs in Melbourne and Regent Taxis in Queensland have reported that the new systems have increased their capacity to handle calls by 50-100 per cent, particularly in crucial peak times. "A couple of years ago you could be on a phone queue for a taxi for ages or not get through at all. The speech systems they have implemented have greatly enhanced the phone service, while reducing costs," says Weiss.
Though speech recognition has only recently emerged from the sphere of science fiction into the boardroom, it appears to have a prosperous future ahead of it.