Automatic recognition and understanding of spoken language - A first step toward natural human-machine communication
01 August 2000
The promise of a powerful computing device to help people in productivity as well as in recreation can only be realized with proper human-machine communication. Automatic recognition and understanding of spoken language is the first and probably the most important step toward natural human-machine interaction. Research in this fascinating field in the past few decades has produced remarkable results, leading to many exciting expectations as well as new challenges. In this paper, we summarize the development of spoken language technology from both a vertical (the chronology) and a horizontal (the spectrum of technical approaches) perspective. We highlight the introduction of statistical methods in dealing with language-related problems, as it represents a paradigm shift in the research field of spoken language processing. Statistical methods are designed to allow the machine to learn, directly from data, structural regularities in the speech signal for the purpose of automatic speech recognition and understanding. Today, research results in spoken language processing have led to a number of successful applications, ranging from dictation software for personal computers and telephone-call processing systems for automatic call routing to automatic subcaptioning for television broadcasts. We analyze the technical successes that support these applications. Along with an assessment of the state of the art in this broad technical field, we also discuss the limitations of the current technology, which are presented as a basis to inspire future advances.