Last, you build a large table of the most commonly occurring patterns of words people are likely to speak. Algorithms are created that combine all these sources of information to come up with the right answer in a specific situation. In the past few years, scientists at IBM and elsewhere have been learning how to adapt their voice recognition engines more quickly to a specific person or sound environment. Nuance's newly released Dragon NaturallySpeaking 10 PC speech recognition software translates speech into text with up to 99% accuracy.
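To make the idea of a table of common word patterns concrete, here is a minimal sketch in Python. It counts which word tends to follow which in a few sample sentences, then combines those counts with a score for how closely each candidate word matched the sound. The function names, the sample sentences, the acoustic scores, and the add-one smoothing are all assumptions chosen for illustration; nothing here is drawn from IBM's or Nuance's actual engines.

```python
import math
from collections import Counter, defaultdict


def build_bigram_table(sentences):
    """Count how often each word follows another in the sample sentences."""
    table = defaultdict(Counter)
    for sentence in sentences:
        words = ["<s>"] + sentence.lower().split()  # "<s>" marks a sentence start
        for prev, curr in zip(words, words[1:]):
            table[prev][curr] += 1
    return table


def bigram_probability(table, prev, curr, vocab_size):
    """Add-one smoothed estimate of how likely `curr` is to follow `prev`."""
    counts = table.get(prev, Counter())
    return (counts[curr] + 1) / (sum(counts.values()) + vocab_size)


def pick_word(table, prev, acoustic_scores, vocab_size):
    """Pick the candidate with the best combined sound and word-pattern score."""
    def combined(word):
        # Adding logarithms is the same as multiplying the two probabilities.
        return (math.log(acoustic_scores[word])
                + math.log(bigram_probability(table, prev, word, vocab_size)))
    return max(acoustic_scores, key=combined)


# Hypothetical training sentences and acoustic scores, invented for this sketch.
samples = [
    "recognize speech today",
    "please recognize speech",
    "wreck a nice beach",
]
table = build_bigram_table(samples)
vocab_size = len({w for s in samples for w in s.lower().split()})

# The sound alone slightly favors "beach", but the word table favors "speech".
candidates = {"speech": 0.4, "beach": 0.6}
best = pick_word(table, "recognize", candidates, vocab_size)
print(best)
```

In this toy case the word table outweighs the slightly better acoustic match, which is the kind of combined judgment the paragraph above describes. A production recognizer would draw on far larger tables and longer word histories.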
Nuance is the giant of the speech recognition industry, with products for nearly every niche. Annual sales are expected to top $900 million this year. Steve Chambers, president of the company's mobile speech and consumer-services division, says this breadth of experience has made it possible for the company to collect a huge treasure trove of speech samples from people with different languages and accents, which helps it improve its technology rapidly. "The technology is unlike others in research land. It has to be used to improve. The name of the game is scale and usage," he says.
Even without Nuance's scale in this field, IBM Research has managed to produce very effective speech recognition software. Vlingo evaluated IBM's technology against Nuance's and a couple of others. Dave Grannan, Vlingo's chief executive, says IBM had the best combination of processing speed and accuracy in his company's tests. Another attraction: He didn't fear that IBM might some day decide to get into his business. Nuance, on the other hand, competes with Vlingo. "Because IBM Research is not a go-to-market part of IBM, there wasn't a competitive issue with them," he says.
Nahamoo's group is focusing on commercial opportunities right now. But IBM researchers are also exploring areas where the social impact could be huge. One example, spearheaded by scientists in India, is what the company calls the "spoken Web." In a handful of villages in the state of Andhra Pradesh, IBM is helping locals create Web pages and search the Web purely by voice. A plumber or farmer goes to a kiosk equipped with mobile phones and builds a Web page promoting his or her products, produce, or services by speaking the answers to 10 or so questions. Other villagers can then use a mobile phone to speak commands to search for those Web sites; they hear the search results rather than see them.
If successful, the technology could help open up the Internet to the world's hundreds of millions of illiterate people. "It has the potential to transform these regions," says Paul Bloom, IBM Research's business executive for the communications sector.
Hamm is a senior writer for BusinessWeek in New York and author of the Globespotting blog.