Word Recognition Using Whole Word and Subword Models
01 January 1989
One of the key issues in designing a speech recognition system is the selection of the fundamental unit for recognition. The choice of unit for a recognition task generally depends on the size of the vocabulary to be recognized and the availability of sufficient training data for creating effective reference models. In this paper, we address the problem of how to select and construct a set of fundamental unit statistical models suitable for speech recognition. We discuss a unified framework for creating effective basic models of speech. We also compare three types of fundamental units, namely whole word, phoneme-like, and acoustic segment units, in a 1109-word vocabulary speech recognition task. We point out the relative advantages of each type of speech unit based on the results of a series of recognition experiments.