Businessweek Archives

Computer Dictation: It Rights It Wrong


Technology & You

New, low-cost voice recognition software is promising, but often hard of hearing

If this column starts out with some generation here and there, it's because I am dictating it to 9 computer. Translation: If this column starts out with some gibberish here and there, it's because I am dictating it to my computer.

Welcome to the world of mass-market voice-recognition technology. A trio of low-cost programs that create text as you speak into a microphone are now available. They are intriguing in their own right and offer an early glimpse of revolutionary possibilities. Enabling machines to respond to speech is the holy grail of the computer age. Truly "user-friendly" computers, whether the ones we work on all day or those that get built into our cars, should do our bidding at the sound of our voices, almost instinctively. Why should we type in our thoughts or go to a manual to figure out whether the precise command we need is "call up" or "get"?

In short, we need a less devious version of Hal, the talkative computer in 2001: A Space Odyssey. Programs that convert speech to text make a start in the right direction. First the computer must convert the analog sound waves of speech into the digital signals of computer processing. Then the software must compare the digital signals against the program's vocabulary and figure out from the words nearby, or context, whether to select "for," "four," or "fore." Faster computers, improved microphones, larger memories, and fatter hard drives have helped, along with some clever programming. Early versions, some costing $1,000 or more, helped lawyers and doctors and provided a test bed for programmers. The software eventually broadened out to general office applications, such as spreadsheets and word processing. Now, stripped-down models of the office programs are available for as little as $60 (table), with a headset microphone included. IBM's Simply Speaking is leading the way, with Kurzweil's VoicePad and Dragon Systems' DragonDictate Singles close behind.
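To make the "words nearby" step concrete, here is a minimal Python sketch of choosing among "for," "four," and "fore" from the word that came just before. It is a toy, not how any of the reviewed programs actually work: the pair counts are invented for illustration, and the names HOMOPHONES, PAIR_COUNTS, and pick_homophone are hypothetical.

```python
# Toy illustration only: the counts below are invented for this sketch,
# not taken from any of the reviewed programs.
HOMOPHONES = {"for", "four", "fore"}

# Hypothetical counts of how often (previous word, candidate) occur together.
PAIR_COUNTS = {
    ("waiting", "for"): 50,
    ("the", "four"): 30,
    ("at", "four"): 20,
    ("shouted", "fore"): 5,
}


def pick_homophone(previous_word, candidates=HOMOPHONES):
    """Return the candidate seen most often after previous_word."""
    key = previous_word.lower()
    # sorted() makes ties resolve alphabetically, so results are repeatable.
    return max(sorted(candidates),
               key=lambda word: PAIR_COUNTS.get((key, word), 0))


print(pick_homophone("waiting"))
print(pick_homophone("at"))
```

A real recognizer would weigh far more context than one preceding word, and would combine it with how closely the sound matched each candidate.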

I found these programs to be devilishly difficult. Just getting the microphone to work with your computer's sound system is no mean feat. If you want to try one of these, be sure to check the company's Web site first to make sure you have the appropriate sound board and system. The programs all recommend a Pentium. Generally figure on a 100 MHz or faster processor and at least 16 MB of RAM. Once you get set up, the programs claim, you can dictate efficiently. That wasn't my experience.

All three of these programs "learn" your speech. Special routines in which you recite selected words and commands help get you started. But the real training comes while you dictate, speaking in what is called discrete speech, with a pause between each word. When the program misunderstands a word, you must use the proper correction routine if the computer is to recognize the word next time. Even though the programs claimed accuracy in the 90% range out of the box and 95% or better with practice, don't get your hopes up. I found it took hours of using the programs, mainly to compose my E-mail, before I got even close to 90%. With 90% accuracy, for every message the length of this paragraph, you have to go back and correct 13 words.
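The arithmetic behind that last figure is simple enough to sketch in Python. The word count of 130 is an approximate count of the paragraph above, treated here as an assumption, and the function name expected_errors is hypothetical.

```python
def expected_errors(word_count, accuracy_percent):
    """Whole words left to correct at a given accuracy, rounded down."""
    # Integer arithmetic avoids floating-point rounding surprises.
    return word_count * (100 - accuracy_percent) // 100


# Assumption: the paragraph above runs to roughly 130 words.
print(expected_errors(130, 90))  # 13
print(expected_errors(130, 95))  # 6
```

Even at the 95% the vendors claim with practice, a message of that length still leaves about half a dozen words to fix.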

The correction process itself takes practice. If you recognize the error right away, you speak a command, such as "Oops" with DragonDictate, and a menu with a numbered list of similar-sounding words appears on the screen. Say, "Take 2," for example, if the right word is there, and the program replaces the original word. If the word isn't on the list, which was fairly often for me, you can spell it one letter at a time with DragonDictate and Kurzweil VoicePad using the military alphabet (Alpha, Bravo, Charlie, etc.). With IBM's Simply Speaking you have to type it in. So don't expect to be completely liberated from your keyboard.
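Spelling by military alphabet amounts to a lookup from code word to letter, which a few lines of Python can illustrate. This is only a sketch: the column names just Alpha, Bravo, and Charlie, so the rest of the list is assumed to be the standard radio alphabet, and the names CODE_WORDS, LETTER_FOR, and spell are hypothetical.

```python
# Assumption: the programs use the standard radio spelling alphabet;
# the column itself names only Alpha, Bravo, and Charlie.
CODE_WORDS = [
    "Alpha", "Bravo", "Charlie", "Delta", "Echo", "Foxtrot", "Golf",
    "Hotel", "India", "Juliett", "Kilo", "Lima", "Mike", "November",
    "Oscar", "Papa", "Quebec", "Romeo", "Sierra", "Tango", "Uniform",
    "Victor", "Whiskey", "X-ray", "Yankee", "Zulu",
]

# Each code word stands for its own first letter.
LETTER_FOR = {word.lower(): word[0].lower() for word in CODE_WORDS}


def spell(spoken):
    """Turn a string of spoken code words into the word they spell."""
    return "".join(LETTER_FOR[word.lower()] for word in spoken.split())


print(spell("Delta Romeo Alpha Golf Oscar November"))  # dragon
```

Six spoken words for a six-letter name shows why the method feels awkward, even if it beats reaching for the keyboard.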

If you don't recognize an error until just before you're ready to send your E-mail or print a letter, the process is more involved. I appreciated the IBM feature that lets you double-click on a word and hear the computer replay what you said. Sometimes the word on the screen was so far off, I couldn't remember what word I had used.

All in all, I can't recommend these programs unless you type fewer than 30 words a minute or frequently need hands-free dictation. I got my best results with the Dragon Systems program. I suggest you stay away from the IBM program. Using the military alphabet for spelling may be awkward, but it's better than no spelling feature at all, which is the case with Simply Speaking. If you want to refer someone to a Web site, for example, you have to type it in.

If you have a problem, such as repetitive strain injury, that prevents you from using a keyboard, voice recognition software could probably help you. But get a version that is more accurate than the bare-bones offerings and works with various programs. More robust versions of the Kurzweil program, for example, start at $200.

You can use the Internet to get a taste of voice recognition. IBM offers a download of a program that lets you navigate the Web by voice. The program, VoiceType Connection, costs $12.95, seems to work well, and has a spelling feature. Kurzweil offers trial downloads of VoicePad. But, in both cases, remember: You will need a microphone. The Andrea NC-50 recommended by IBM costs $30.

For most people, waiting for improvements in voice dictation is probably a good idea. Dragon, for example, recently announced that it will soon offer a product that works with continuous speech, meaning that you don't have to dictate in the slow, stilted, "discrete" manner. And retailers such as CompUSA and J&R Electronics say the dictation-only software is selling well. That means a growing consumer market, expanding profits, and more intense competition to get innovations to market fast.

BY G. DAVID WALLACE

