HOW TO "WRECK A NICE BEACH"
The human voice can be represented as a pattern of changing audio frequencies (above). Speech-recognition systems compare a speaker's voice with stored patterns of known syllables and words. The computer makes comparisons--sometimes thousands of them--until it finds the stored pattern that matches most closely.
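The comparison step can be pictured with a small Python sketch. It is only an illustration: the three-number "patterns" and the word list are invented for the example, and real systems compare far richer representations of sound.

```python
import math

# Hypothetical stored patterns: each known word is reduced to a short list
# of frequency-band energies. The numbers are made up for illustration.
TEMPLATES = {
    "wreck": [0.9, 0.4, 0.1],
    "beach": [0.2, 0.8, 0.5],
    "speech": [0.3, 0.7, 0.6],
}

def distance(a, b):
    """Euclidean distance between two equal-length patterns."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def closest_match(pattern, templates=TEMPLATES):
    """Return the stored word whose pattern lies nearest to the input."""
    return min(templates, key=lambda word: distance(pattern, templates[word]))

# A made-up incoming sound that sits very near the "wreck" pattern.
heard = [0.85, 0.45, 0.15]
print(closest_match(heard))
```

The smaller the distance, the closer the match. When two stored patterns are almost equally near the incoming sound, this step alone cannot tell them apart.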
But this method has limitations. For example, the word "wreck" closely matches the first syllable of "recognize." When patterns are this close, the computer must make a best guess. And, as with human hearing, the correct choice usually depends on the context of the sentence or phrase. That's where advances in linguistic methods help. These methods use statistical models to estimate how likely it is that a person will say "recognize" as opposed to "wreck a nice."
The final choice often depends on the application. For instance, if a computer were trained to make hotel reservations, it would understand a customer who says, "I want to check in." But if it were programmed to take fast-food orders, it might hear that as "I want two chicken."
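One way to picture how likelihood and application combine is the rough Python sketch below. Every number in it is an assumption chosen for illustration: the sound-match scores, the per-application phrase likelihoods, and the names ACOUSTIC, DOMAIN_PRIOR and best_guess do not come from any real system.

```python
# Hypothetical sound-match scores: how closely each candidate phrase
# matches what was heard (higher is closer). Made up for illustration.
ACOUSTIC = {
    "I want to check in": 0.48,
    "I want two chicken": 0.52,
}

# Hypothetical likelihood of each phrase in each application,
# also made up for illustration.
DOMAIN_PRIOR = {
    "hotel": {"I want to check in": 0.30, "I want two chicken": 0.001},
    "fast_food": {"I want to check in": 0.001, "I want two chicken": 0.20},
}

def best_guess(acoustic, domain):
    """Pick the phrase with the highest (sound match x phrase likelihood)."""
    prior = DOMAIN_PRIOR[domain]
    # Phrases unknown to the application get a tiny fallback likelihood.
    return max(acoustic, key=lambda phrase: acoustic[phrase] * prior.get(phrase, 1e-6))

print(best_guess(ACOUSTIC, "hotel"))
print(best_guess(ACOUSTIC, "fast_food"))
```

Although the sound alone slightly favors "two chicken" in this made-up case, the hotel system still settles on "check in" because that phrase is far more likely in its setting.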